jessekv 4 days ago

This paragraph is the key:

> The QUERY method provides a solution that spans the gap between the use of GET and POST. As with POST, the input to the query operation is passed along within the content of the request rather than as part of the request URI. Unlike POST, however, the method is explicitly safe and idempotent, allowing functions like caching and automatic retries to operate.
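
For concreteness, a QUERY request per the draft's examples looks roughly like this on the wire. The sketch below just assembles the raw request text; "example/query" is a placeholder media type and build_query_request is purely illustrative:

```python
# Sketch: assemble the raw on-the-wire form of a QUERY request, modeled on
# the draft's examples. "example/query" is a placeholder media type.
def build_query_request(host, target, query, accept="text/csv"):
    body = query.encode("utf-8")
    head = (
        f"QUERY {target} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: example/query\r\n"
        f"Accept: {accept}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body

raw = build_query_request("example.org", "/contacts",
                          "select surname, givenname, email limit 10")
print(raw.decode().splitlines()[0])  # QUERY /contacts HTTP/1.1
```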

  • nonethewiser 4 days ago

    But doesn't this beg the question: why not GET with query params?

    I'm not necessarily against it - I've had the urge to send a body on a GET request before (can't recall or justify the use-case, however).

    Reasons I can think of:

    - browsers limit URL (and therefore query param) size

    - general dev ergonomics

    The practical limits (i.e., not spec-mandated limits) on query param size seem fair, but it's worth mentioning there are practical limits to body size as well, introduced by web servers, proxies, cloud API gateways, etc., or ultimately hardware. From what I can see this is something like 2KB for query params in the lower-end case (the classic 2083-char limit, from Internet Explorer) and 10-100MB (but highly variable and potentially only hardware-bound) for body size.
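
    To make the asymmetry concrete, a hedged sketch (the endpoint is made up and the 2083 figure is just the classic cap): the same 500-id filter expressed as a query string easily blows past typical URL limits, while as a body it is nowhere near any practical body limit.

```python
# Sketch: one filter with 500 ids, expressed as a query string vs. a body.
# The endpoint is a placeholder; limits cited are illustrative, not normative.
import json
from urllib.parse import urlencode

ids = list(range(500))
url = "https://api.example.com/items?" + urlencode({"id": ids}, doseq=True)
body = json.dumps({"id": ids})

print(len(url), len(body))
# The urlencoded URL exceeds the classic 2083-char cap long before any
# practical body-size limit (typically tens of MB) would come into play.
assert len(url) > 2083
```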

    In both cases it's worth stating that the spec is being informed by practical realities outside of the spec. In terms of the spec itself, there is no limit to either body size or query param length. How much should the spec be determined by particular implementation details of browsers, cloud services, etc.?

    • treve 4 days ago

      With a request body you can also rely on all the other standard semantics that request bodies afford, such as different content-types and content encodings. Query parameters are also often considered less secure for things like PII, and (less important these days) query parameters don't really define a character encoding.

      But generally the most important reason is that you can take advantage of the full range of mimetypes, and yes, practically speaking there's a limit on how much you should stuff in a query parameter.

      This has resulted in tons of protocols using POST, such as GraphQL. QUERY is a nice middle ground.

    • chrsig 4 days ago

      Servers have to limit header size as well, since content-length isn't available. Without a bounded header size the client can just send indefinitely.

      Huge query strings are also just impossible to read, and URL encoding is full of parser alignment issues. I have to imagine that QUERY would support a JSON request body.

    • nine_k 4 days ago

      Size can be a very practical concern if your query parameter is e.g. a picture.

      Not showing the URL in various logs can be a concern if your query parameters are sensitive.

      • nonethewiser 4 days ago

        A picture as an input for a GET seems like a very strange use-case. Can you elaborate?

        I do agree on the quirkiness of encoding a bunch of data in the URL. It's nice to decouple the endpoint from the input at a more fundamental level.

        • bakje 4 days ago

          Use cases like reverse image search come to mind

    • cryptonector 4 days ago

      Reasons:

        - URI length limits (not just
          browsers, but also things
          like Jersey)
        
        - q-params are ugly and limiting
          by comparison to an arbitrarily
          large and complex request body
          with a MIME type
      
      However, q-params are super convenient because the query is then part of the URI, which you can then cut-n-paste around. Of course, a convenient middle of the road is to use a QUERY which returns not just query results but also a URI with the query expressed as q-params so you can cut-n-paste it.
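
      That middle road could look something like this (sketch only; make_shareable_uri and the /search path are made up for illustration):

```python
# Sketch: a server handling QUERY could echo back a cut-and-pasteable GET
# URI carrying the same query as q-params. Sorting the params keeps the
# echoed URI stable regardless of how the query body orders its fields.
from urllib.parse import urlencode

def make_shareable_uri(base: str, query: dict) -> str:
    return base + "?" + urlencode(sorted(query.items()))

uri = make_shareable_uri("https://example.org/search",
                         {"status": "open", "assignee": "alice"})
print(uri)  # https://example.org/search?assignee=alice&status=open
```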
      • EE84M3i 3 days ago

        Query params are also extremely commonly logged all over the place and so putting sensitive information in them is almost always a bad idea.

        • cryptonector 3 days ago

          Of course. The security considerations section does mention this as a feature of `QUERY`, saying that the `Location:` returned by the server should not encode all the details of the request.

          However it's also true that q-params effectively form part of the UI. I'm certain you've edited URIs before -- I have, and I know not-so-knowledgeable people who do it too.

          Striking a balance here is not easy. With `QUERY` the server can decide how much of the query to encode into the `Location:`, if any of it at all. The server might use knowledge of the "schema" that the query refers to, or it might use the syntax of the query (if it supports indicating sensitive portions), or it might only "link-shorten" the whole query.

    • MBCook 4 days ago

      This avoids changing the definition of GET. Who knows how many middle boxes would mess with things if you did that, because they “know” what GET means and so their thing is “safe”.

      Until GET changes.

      People aren’t using QUERY yet so that problem doesn’t exist.

  • TacticalCoder 4 days ago

    > Unlike POST, however, the method is explicitly safe and idempotent, allowing functions like caching and automatic retries to operate.

    A shitload of answers to GET requests, although cacheable, are stale. If you issue a GET for a page which contains stuff like "Number of views: xxx" or "Number of users logged in: yyy" or "Last updated on: yyyy-mm-dd", there goes idempotency.

    Some GET requests are actually idempotent ("Give me the value of AAPL at close on 2021-07-21") but many aren't.

    Stale data won't break much but there's a world between "it's an idempotent call" and "I'm actually likely to be seeing stale data if I'm using a cached value".

    I mean... Enter in your browser "https://example.org", that's a GET. And that's definitely not idempotent for most sites we're all using on a daily basis.

    • kbolino 4 days ago

      Idempotence does not mean immutability, it means that two or more identical operations have the same effect on the resource as a single one. Since GET operations, by virtue of also being safe, generally have no effect at all, this is almost always trivially true. Just because the resource's content changed for some other reason doesn't mean GET is not idempotent.

      • paulddraper 4 days ago

        Correct.

        GET, DELETE, HEAD, OPTIONS, PUT are idempotent.

        POST is not. (Thus the existence of Idempotency-Key, etc.)

    • Timon3 4 days ago

      Idempotency in the context of HTTP requests isn't about the response you receive, but about the state of the resource on the server. You're supposed to be able to call GET /{resource}/{id} any amount of times without the resource {id} being different at the end. You can't do the same for POST, as POST /{resource} could create a new entity with every call.

      A view counter also doesn't break this, as the view counter isn't the resource you're interacting with. As long as you're not modifying the resource on a `GET` call, the request is idempotent.
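
      A toy sketch of that distinction (an in-memory "server", not real HTTP; the names are illustrative): repeating a GET leaves the resource state unchanged, while repeating a POST keeps creating entities.

```python
# Toy in-memory server state: GET is idempotent with respect to resource
# state, POST is not. The view counter is ancillary, not part of a resource.
store = {}
next_id = 0
views = 0

def get(rid):
    global views
    views += 1                 # response-side bookkeeping, not resource state
    return store.get(rid)

def post(data):
    global next_id
    next_id += 1
    store[next_id] = data      # every call changes server state
    return next_id

post({"name": "a"})
before = dict(store)
get(1); get(1); get(1)         # idempotent: resources unchanged after N calls
assert store == before
post({"name": "a"}); post({"name": "a"})
assert len(store) == 3         # not idempotent: three entities now exist
```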

      • kbolino 4 days ago

        Strictly speaking, an inline view counter, which is incremented on each GET and included in the response, would break idempotence. Similarly, a PUT request that implicitly modifies a "last updated" field would also break idempotence. These are pretty minor violations, though, which arguably don't change any "semantics" of the response.

        • Timon3 4 days ago

          If the view counter is part of the resource itself, then yes, incrementing it on GET breaks the idempotence contract - but it should be obvious that by breaking idempotence, you're breaking idempotence. If it's part of the response without being part of the resource, you're not breaking idempotence - otherwise things like updating a JS library would also break idempotence, but it doesn't, since it's not part of the resource.

          • kbolino 3 days ago

            Updating a JS library has nothing to do with idempotence. The JS libraries used in <script> tags by HTML pages are absolutely part of the resource. But they aren't changed by GET requests, they're (usually) changed by the site admin out-of-band. Idempotence is not immutability; the content of a resource can change between two identical GET requests, it just can't change because of those requests.

            • Timon3 3 days ago

              > Updating a JS library has nothing to do with idempotence. The JS libraries used in <script> tags by HTML pages are absolutely part of the resource.

              No, they absolutely aren't, and this is a really important distinction. Not everything contained in the response is part of the resource.

              The resource is essentially the data living on the server. It doesn't matter what the response looks like or how it's formatted - as long as the same data is transferred, you're referring to the same resource (e.g. `/{resource}/{id}.html` and `/{resource}/{id}.json` can be different representations of the same resource). This means that changing ancillary response data, i.e. non-resource response data, doesn't change the resource, because it's not part of the resource.

              If you were correct, the resource would have to contain the JS libraries used. They would have to be part of the data model. Have you ever seen an application like that? Where all JS libraries, CSS files and so on are duplicated into every single resource, and updating the files means updating every single resource entry in your database? Where a JSON API also serves all the JS libraries used in the HTML representation of the resource? And mind you, we're not talking about a "HTML page builder" or something, but about any CRUD application. Usually these things live outside of the resources, in templates or similar.

              • kbolino 3 days ago

                Ok, I think we're talking past each other a bit.

                If on index.html, the script src is "example-1.2.3.js" and you change it to "example-1.2.4.js" then yes, you have changed the resource identified by index.html.

                If instead you have a script src of "example.js" and you simply change the content served at example.js, this does not change the resource index.html.

                This rarely has anything to do with method idempotence, because these changes are usually made out-of-band (classically, by uploading new files to an FTP server).

                • Timon3 3 days ago

                  No, this is simply not true. If you change "example-1.2.3.js" to "example-1.2.4.js" in the response to `/{resource}/{id}`, it does not change the underlying resource. The included script is simply a part of the representation of the resource, but it's not the resource itself. The representation of the resource can change without the resource itself changing.

                  https://httpwg.org/specs/rfc9110.html#rfc.section.3.2

                  You can change "index.html" as much as you want to, as long as the resource it represents stays the same.

                  "index.html" is not a resource, it's a representation of a resource. If you still disagree, please explain how "index.html" and "index.json" can be representations of the same resource.

                  • kbolino 3 days ago

                    They are not the same resource. Every resource is uniquely identified by its URI (normalized, ignoring the fragment), and so https://example.com/index.html and https://example.com/index.json are therefore not the same resource.

                    Perhaps what you are thinking of is content negotiation. I can request https://example.com/index.html and (despite the name) get JSON back, either because the server is cheeky or because I said "Accept: application/json" or similarly expressed a preference for JSON over HTML. Assuming both JSON and HTML exist at the same URI, these would be two representations of the same resource (accessed from the same URI, but with different headers). Broadly speaking, however, this mechanism is not often used outside of certain specialized protocols, and so it generally doesn't matter: changing the HTML bytes of index.html changes the resource because HTML is the only representation. Per the contract that specifies method idempotence and safety, GET index.html should therefore never cause the HTML content I receive back to change. However, the HTML content can change between requests for other reasons.

                    Whether changing only one representation of a resource is the same as changing the "whole" resource somehow is a bit moot, because the specific representation is what's consumed by the client and stored by caches. The ETag and cache parameters are tied to the specific representation as well, and can't be "smeared" across all representations. Practically speaking, the resource is inseparable from its representation, and thus two different representations are treated by well behaved HTTP clients and caches the same as two different resources, even if a higher-level protocol (SOAP, WebDAV, etc.) might treat them as semantically equivalent.

                    This is my read of RFCs 9110 and 9111; if you have strong evidence otherwise, I'm open to it.

    • stackghost 4 days ago

      I suppose the counterargument is that "users logged in" or "views" is stale all the time, because it could have changed while the response was in flight.

      If you really need live data for "views" or what have you, then perhaps the front end should be querying the backend repeatedly via a separate POST.

  • TYPE_FASTER 4 days ago

    Yes. I was skeptical at first. Why add another method? This explains why.

    • solardev 4 days ago

      GraphQL reads are like that. You POST a query with the content inside the request body. This would be a nicer verb for it.

      • ako 4 days ago

        OData provides GraphQL-like functionality, but using GET + URL parameters; for large queries that's really hard to read or edit. Using the request body makes sense for readability, so OData also offers the option to query with POST + a request body.

giobox 4 days ago

I was just working with OpenSearch's API recently, which sort of abuses the semantics of the HTTP spec by using GETs with a body to perform search queries, largely to solve a similar problem. A "QUERY" method would probably easily replace the GET-with-a-body used by OpenSearch etc. I'd go as far as to argue that's largely what this new QUERY method is: official recognition for a "GET" request with a body.
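
For reference, stdlib HTTP clients will happily construct such a request. A sketch with Python's urllib (the host is a placeholder and nothing is actually sent here; we only build the request object):

```python
# Sketch: build (but don't send) a GET request carrying a JSON body, in the
# style clients use against OpenSearch's _search endpoint. Host is made up.
import json
import urllib.request

query = {"query": {"match": {"title": "http query method"}}}
req = urllib.request.Request(
    "https://search.example.com/my-index/_search",
    data=json.dumps(query).encode(),
    headers={"Content-Type": "application/json"},
    method="GET",  # without this, attaching data would flip the method to POST
)
print(req.get_method(), len(req.data))
```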

> https://opensearch.org/docs/latest/api-reference/search/

  • BugsJustFindMe 4 days ago

    It's not abuse of the spec. The spec says that GET is allowed to have a body. Whether an endpoint must honor it is left undefined, and this has led people to wrongly believe that that means endpoints are not allowed to expect it. But certainly the endpoint itself defines what it does with your requests, and that's what actually matters.

    QUERY is just GET once more with feeling, because people are worried about existing software that made the wrong decision about denying or stripping GET bodies and would need to be updated to allow them through, and detecting an unimplemented verb is I guess maybe simpler than detecting some middle layer being a dick and dropping your data.

    • treve 4 days ago

      This is not a correct interpretation of the spec, as far as I can tell. I did some more research on this + primary sources on my blog if you're interested in why and why people are confused about this:

      https://evertpot.com/get-request-bodies/

      • BugsJustFindMe 4 days ago

        Your post puts a tremendous amount of weight on how you personally feel about using it across load balancers that you don't control and whatnot, but you also still acknowledge that it doesn't violate the spec.

        > It’s true that undefined behavior means that you as a server developer can define it

        Good. Agreed. Done.

        Someone else complying with the spec in an incompatible way doesn't make your compliance wrong. It just reduces the circumstances where it will work. But those circumstances are...well... circumstantial. And I think it's important to establish that there's a meaningful difference between "won't work as expected in certain common but not universal circumstances" and "not allowed to at all".

        • treve 4 days ago

          You either didn't read what I wrote, or are homing in on the things that are ambiguous vs the things that aren't. (or I'm just a bad writer). One thing is not ambiguous:

          The intent, as stated by one of the primary authors of the HTTP specification, is that a body on GET has "no semantic meaning" and it "is never useful to do so".

          You can do something different and assign meaning to the body, but it's no longer HTTP. So you are correct, it's not disallowed, just pointless. I'd also recommend you read the latest version of the HTTP spec. It's a bit more direct about this.

          You can disagree with the authors of the HTTP spec of course, but I'm not interested in having that argument with you.

pgris2 4 days ago

I don't like using the body. GETs are easily shareable, bookmarkable, even editable by humans, etc., thanks to query strings. I'd rather have a GET with a better (shorter) serialization for parameters and a standardized max length of 16k or something like that.

  • imbnwa 4 days ago

    They discuss this desire in the draft:

    >A server MAY create or locate a resource that identifies the query operation for future use. If the server does so, the URI of the resource can be included in the Location header field of the response (see Section 10.2.2 of [HTTP]). This represents a claim that a client can send a GET request to the indicated URI to repeat the query operation just performed without resending the query parameters.

  • lokar 4 days ago

    HTTP is used in many contexts where no human will see the request

mtrovo 4 days ago

I can't stop thinking about how HTTP REST is just an abuse of the original design of HTTP, and how the whole thing of using HTTP verbs to add meaning to requests is not a very good abstraction for RPC calls.

The web evolved from being a tool to access documents on directories to this whole apps in the cloud thing and we kept using the same tree abstraction to sync state to the server, which doesn't make a lot of sense in lots of places.

Maybe we need a better abstraction to begin with, something like discoverable native RPCs using a protocol designed for it like thrift or grpc.

  • jarjoura 4 days ago

    HTTP as a pure transport protocol keeps coming back to the default, because it works. Its superpower is that it pushes stateless (and secure) design from end to end. You have to fight a losing battle to get around that, so if you play along, you can end up with better power efficiency, resiliency and scalability.

    REST is just very simple to understand and easy to prototype with. There's better abstractions on top of HTTP, like GraphQL and gRPC (as you mentioned), but you can layer those on after you have a working solution and are looking for more performance.

    HTTP3 is on the way this decade and I'm excited for its promises. Given how long it took for HTTP2 to standardize, I'm not optimistic it will be soon, but it does mean we have a path forward.

    • Two4 4 days ago

      HTTP2 has already been mostly replaced by HTTP3, iirc. We now mostly have a split between HTTP/1.1 and 3, if I'm remembering an article I read correctly.

  • cryptonector 4 days ago

    > The web evolved from being a tool to access documents on directories to this whole apps in the cloud thing and we kept using the same tree abstraction to sync state to the server, which doesn't make a lot of sense in lots of places.

    The first part is correct, but hard disagree on the rest. HTTP makes a lot of sense for RPC-ish things because a) it can do those things better than RPC, b) HTTP can do things that RPC typically can't (like: content type negotiations and conversions, caching, online / indefinite length content transmission, etc).

    HTTP is basically a filesystem access protocol with extra semantics made possible by a) headers, b) MIME types, and if you think of some "files" as active/dynamic resources rather than static resources, then presto, you have a pretty good understanding of HTTP. ("Dynamic" means code processes a request and produces response content, possibly altering state, that a plain filesystem can't possibly do. An RDBMS is an example of "dynamic", while a filesystem is an example of "static".)

    REST is quite fine. It's very nice actually, and much nicer than RPC. And everything that's nice about REST and not nice about RPC is to do with those extensible headers and MIME types, and especially semantics and cache controls.

    But devs always want to generate APIs from IDLs, so RPC is always so tempting.

    As for RPC, there's nothing terribly wrong with it as long as one can generate async-capable APIs from IDLs. The thing everyone hates about RPC is that typically it gets synchronous interfaces for distributed computations, which is a mistake. But RPC protocols do not imply synchronous interfaces -- that's just a mistake in the codegen tools designs.

    Ultimately the non-RESTful thing about RPCs that sucks is that

      - no URIs (see below response to
        your point about discovery)
    
      - nothing is exposed about RPC
        idempotence, which makes
        "routing" fraught
      
      - lack of 3xx redirects, which
        makes "routing" hard
      
      - lack of cache controls
      
      - lack of online streaming
        ("chunked" encoding w/
         indefinite content-length)
    
    Conversely, the things that make HTTP/REST good are:

      - URIs!!
      
      - idempotence is explicitly part
        of the interface because it is
        part of the protocol
      
      - generic status codes including
        redirects
      
      - content type negotiation
      
      - conditional requests
      
      - byte range requests
      
      - request/response body streaming
    
    > Maybe we need a better abstraction to begin with, something like discoverable native RPCs using a protocol designed for it like thrift or grpc.

    That's been tried. ONC RPC and DCE RPC for example had a service discovery system. It's not enough, and it can't be enough. You really need URIs. And you really need URIs to be embeddable in contents like HTML, XML, JSON, etc. -- by convention if need be (e.g., in JSON it can only be by convention / schema). You also need to be able to send extra metadata in request/response headers, including URIs.

    (Re: URIs and URIs headers, HATEOAS really depends on very smart user-agents, which basically haven't materialized because it turns out that HTML+JS is enough to make UIs good, and so URIs in headers are not useful for UIs, but they are useful for APIs.)

    It took me a long time to understand all of this, that REST is right and typical RPCs are lame. Many of the points are very subtle, and you might have to build a RESTful application that uses many of these features of HTTP/REST in order to come around -- that's a lot to ask for!

    The industry seems to be constantly vacillating between REST and RPC. SOAP came and went; no one misses it. gRPC is the RPC of the day, but I think in the end the only nice thing about it is binary encodings and schema, and that it won't survive in the long run.

  • TZubiri 4 days ago

    En.wikipedia.org/wiki/constitutionalism

apitman 4 days ago

Looks interesting. Sadly it will likely be hamstrung by CORS rules. When designing an API, I frequently send all non-GET requests as POSTs with content type text/plain in order for them to classify as simple requests and avoid CORS preflights, which add an entire round trip per request. Obviously only safe if you're doing proper authorization with a token or something. Another fun bit is that you have to put the token in the query string, going against best practices for things like OAuth2, because the Authorization header isn't CORS approved. CORS enforcement is an abomination.
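
For anyone wondering what "classify as simple requests" means mechanically, here's a simplified sketch of the Fetch-spec conditions (the real rules also constrain header values, Range, and a few other fields, and header lookup here is deliberately naive):

```python
# Sketch of the CORS "simple request" test: a safelisted method plus only
# safelisted headers, where Content-Type may only take one of three values.
SAFE_METHODS = {"GET", "HEAD", "POST"}
SAFE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded", "multipart/form-data", "text/plain",
}
SAFE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}

def needs_preflight(method: str, headers: dict) -> bool:
    if method not in SAFE_METHODS:
        return True
    if headers.get("Content-Type", "text/plain") not in SAFE_CONTENT_TYPES:
        return True
    # Any non-safelisted header (e.g. Authorization) forces a preflight.
    return any(h.lower() not in SAFE_HEADERS for h in headers)

assert not needs_preflight("POST", {"Content-Type": "text/plain"})
assert needs_preflight("POST", {"Content-Type": "application/json"})
assert needs_preflight("QUERY", {})          # a new verb is never "simple"
assert needs_preflight("GET", {"Authorization": "Bearer x"})
```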

  • cryptonector 4 days ago

    Because of the semantics of QUERY, CORS rules should apply as they do to GETs -- one would think, anyway. The Internet-Draft ought to say something about CORS though, that's for sure.

arcuri82 4 days ago

It looks great. It covers a few pain points in modelling APIs where you need to retrieve data idempotently but the URL doesn't identify a specific resource, i.e., "The content returned in response to a QUERY cannot be assumed to be a representation of the resource identified by the target URI".

This is a "work in progress". Is there any estimation of when it will be finalized by? Something like during 2025, and frameworks/libraries starting to support it by something like 2026? Just to have a reference, anyone remember how long it took for PATCH?

  • cryptonector 4 days ago

    This appears to be a Working Group item of the HTTPbis WG [0] which has a Zulip stream [1], a mailing list [2][3], a site [4], and a GitHub organization [5], thus lots of ways to send feedback. The Internet-Draft itself has a GitHub repository [6]. For this I think a GitHub issue/issues [7] would be best; some already exist now for issues in this thread.

      [0] https://datatracker.ietf.org/wg/httpbis/about/
      [1] https://zulip.ietf.org/#narrow/stream/225-httpbis
      [2] mailto:httpbisa@ietf.org (I think)
      [3] https://mailarchive.ietf.org/arch/browse/httpbisa/
      [4] https://httpwg.org/
      [5] https://github.com/httpwg
      [6] https://github.com/httpwg/http-extensions
          https://github.com/httpwg/http-extensions/blob/main/draft-ietf-httpbis-safe-method-w-body.xml
      [7] https://github.com/httpwg/http-extensions/issues
  • treve 4 days ago

    Many things never need explicit support, because good HTTP citizens allow unknown HTTP methods (and treat them basically as POST). This is true of PHP and fetch(), for example.

    Node.js needs explicit support, this got added a few months ago. There's a good chance that as long as your server supports it, you can start using it today.
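
    Python's stdlib is similarly liberal: http.server dispatches any verb to a do_<METHOD> handler and http.client will send any verb, so QUERY works end to end today. A sketch (the handler and query payload are made up):

```python
# Sketch: a stdlib server handling QUERY, and a stdlib client sending it.
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_QUERY(self):                       # any verb dispatches to do_<METHOD>
        length = int(self.headers["Content-Length"])
        query = self.rfile.read(length).decode()
        body = json.dumps({"you_sent": query}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):             # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_port)
conn.request("QUERY", "/contacts", body="surname = 'Smith'",
             headers={"Content-Type": "example/query"})
resp = conn.getresponse()
payload = json.loads(resp.read())
server.shutdown()
print(resp.status, payload)
```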

jkrems 4 days ago

> When doing so, caches SHOULD first normalize request content to remove semantically insignificant differences, thereby improving cache efficiency, by: [...]

That part sounds like it's asking for trouble. I'm curious if this will make it to the final draft. If the client mis-identifies which parts of the request body are semantically insignificant, the result would be immediate cache poisoning and fun hard-to-debug bugs.

If it's meant as a "MAY", then that seems kind of meaningless: If the client for some reason knows that one particular aspect of the request body is insignificant, it could just generate request bodies that are normalized in the first place..?

  • cryptonector 4 days ago

    Request body normalization is not really feasible, not without having the normalization function be specified by the request body's MIME type, and MIME types specify no such thing. Besides, normalization of things like JSON is fiendishly tricky, and may not be feasible in the case of many MIME types. IMO this should be removed from the spec.
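
    A quick illustration of why even JSON normalization makes semantic judgment calls (naive_normalize is a strawman for illustration, not anything from the draft):

```python
# A naive JSON "normalization": drop insignificant whitespace, sort object
# keys. Already it has to decide what counts as insignificant -- and array
# order, duplicate keys, and numeric representation make it worse.
import json

def naive_normalize(body: str) -> str:
    return json.dumps(json.loads(body), sort_keys=True,
                      separators=(",", ":"))

a = '{"limit": 10, "q": "smith"}'
b = '{ "q" : "smith", "limit" : 10 }'
assert naive_normalize(a) == naive_normalize(b)   # whitespace/key order folded

# But is array order insignificant? Only the query language knows:
c = '{"sort": ["surname", "givenname"]}'
d = '{"sort": ["givenname", "surname"]}'
assert naive_normalize(c) != naive_normalize(d)   # treated as different queries
```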

    Instead the server should normalize if it can and wants to, and the resulting URI should be used by the cache. The 3xx approach might work well for this, whereas having the server immediately assign a Content-Location: URI as I propose elsewhere in this thread would not allow for server-side normalization in time for a cache.

  • orf 4 days ago

    Yeah, that’s nuts and is obviously flawed behaviour that can interact poorly with any number of things - not least of all any kind of checksums within the response.

    I’m surprised to see that in a RFC.

    Edit: it’s only for the cache key:

    > Note that any such normalization is performed solely for the purpose of generating a cache key; it does not change the request itself.

    Still super dangerous.

    Edit edit: I just typed out a long message on the GitHub issue tracker for this, but submitting it errored and I’ve lost all the content. Urgh

westurner 4 days ago

There is currently no way to determine whether there's, for example, an application/json representation of a URL that returns text/html by default.

OPTIONS and PROPFIND don't get it.

There should be an HTTP method (or a .well-known url path prefix) to query and list every Content-Type available for a given URL.

From https://x.com/westurner/status/1111000098174050304 :

> So, given the current web standards, it's still not possible to determine what's lurking behind a URL given the correct headers? Seems like that could've been the first task for structured data on the internet

  • treve 4 days ago

    Servers can return something like:

        Link: </foo>; rel="alternate" type="application/json"
    
    You could return multiple of these as a response to a HEAD request for example.

    You could also use the Accept header in response to a HTTP OPTIONS request, or as part of a 415 error response.

        Accept: application/json, text/html
    
    https://httpwg.org/specs/rfc9110.html#field.accept (yes Accept can be used in responses)

    .well-known is not a good place for this kind of thing. That discovery mechanism should only be used in situations where only a domainname is known, and a feature or endpoint needs to be discovered for a full domain. In most cases you want a link.

    The building blocks for most of this stuff are certainly there. A lot of wheels are being reinvented all the time.

    • westurner 4 days ago

      > https://httpwg.org/specs/rfc9110.html#field.accept (yes Accept can be used in responses)

      I think that would solve it.

      HTTP servers SHOULD send one or more Accept: {content_type} HTTP headers in the HTTP Response to an HTTP OPTIONS Request in order to indicate which content types can be requested from that path.

      • westurner 3 days ago

        IDK where that should be added though? Maybe REST and W3C LDP and similar

        The OpenAPI docs could suggest same

the_arun 4 days ago

To take advantage of `Cache-Control`, don't we need the URI to be unique? E.g. /Contacts?id=abcd with the GraphQL query in the body, where the id is a hash/signature of the query string?

  • treve 4 days ago

    No, the key is not just the URI. It's potentially the URI + method + Some request headers + some response headers. The Vary header lets you control which request headers should be considered.

    This method specifically includes the request body in that key.
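
    A sketch of such a composite key (illustrative, not any cache's real implementation; header names are assumed pre-lowercased):

```python
# Sketch: cache key = method + target URI + the request headers named by the
# response's Vary + (for QUERY) a digest of the request body.
import hashlib

def cache_key(method, uri, req_headers, vary, body=b""):
    parts = [method, uri]
    for name in sorted(v.strip().lower() for v in vary.split(",") if v.strip()):
        parts.append(f"{name}={req_headers.get(name, '')}")
    parts.append(hashlib.sha256(body).hexdigest())
    return "|".join(parts)

k1 = cache_key("QUERY", "/contacts", {"accept": "text/csv"},
               "Accept", b"select email limit 10")
k2 = cache_key("QUERY", "/contacts", {"accept": "application/json"},
               "Accept", b"select email limit 10")
assert k1 != k2   # Vary: Accept splits the cache entries
```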

cryptonym 4 days ago

Can be useful for some APIs.

The use-case 4.2 with both Content-Location and Location feels weird to me. Not sure you would want multiple URLs with different meanings. Isn't it harder to keep it idempotent if we are generating URLs for both request and result? Not sure Location is generally meaningful in a 200... that may impact other RFCs.

Could be interesting to see a sample where query is created but result will be available later. That's probably just the 303 See Other with a Retry-After?

  • cryptonector 3 days ago

    Content-Location refers to a specific response to the given query.

    Location refers to a URI that denotes the given query.

    It's a bit strange, since I don't think it's likely that a server will want to persist any particular response to a query for very long. But it's not unreasonable to support that.

dehrmann 4 days ago

Years ago, I had an interview question that was essentially "model a restful query API." I'm not sure if it was the gotcha or if it was just a bad question, but REST isn't a good fit. GET lacking a request body doesn't help, and except for long-running queries, modeling things as resources doesn't work. You can either design an API that isn't very restful or do confusing things to squeeze it into that box.

k__ 4 days ago

So, QUERY is a POST but with GET semantics?

Sounds like GraphQL won.

  • marcosdumay 4 days ago

    It's way more than GraphQL.

    Searching and analyzing large things is a real need. It won't go away by changing your protocol.

  • paulddraper 4 days ago

    GraphQL queries are a specific usecase for HTTP QUERY.

    (Today they rely on HTTP POST.)

cryptonector 4 days ago

So basically a GET with a request body, with the query as the request body rather than as URI local-part query-parameters.

  • TZubiri 4 days ago

    So a POST with different semantics?

    • cryptonector 4 days ago

      Yes. A POST that is idempotent, safe, cacheable (POST normally isn't any of those).

      Alternatively this is a standardization of "GET-with-request-body" (which is NOT allowed by the spec but which some implement anyways).
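      Since HTTP method names are just extensible tokens, a QUERY request needs no special protocol support; here's a minimal sketch of building one by hand (the media types are arbitrary choices, not requirements):

```python
def build_query_request(host, path, body,
                        content_type="application/sql", accept="text/csv"):
    """Format a raw HTTP/1.1 QUERY request. The method is an ordinary
    token, so existing tooling can usually emit it as-is."""
    payload = body.encode()
    return (
        f"QUERY {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Accept: {accept}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode() + payload

req = build_query_request("example.org", "/contacts",
                          "select surname, givenname, email limit 10")
```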

mg 4 days ago

The first example shows a GET request and states that it is a "common and interoperable query", but that "if the query parameters extend to several kilobytes or more of data it may not be, because many implementations place limits on their size".

So how about simply removing those limits on GET requests?

  • pimlottc 4 days ago

    The installed base of HTTP processing devices is ridiculously huge. There is no practical way to change the existing behavior of GET.

  • IncreasePosts 4 days ago

    There are thousands of pieces of software which have those limitations built in, intentionally or unintentionally. Using a new verb avoids that issue.

thrance 4 days ago

I often find myself creating endpoints like `POST /<resource>/query`, which should absolutely be a GET if they could have bodies. Why add yet more to the web specs when we could get rid of the "no body" exception on GET?

  • nonethewiser 4 days ago

    > I often find myself creating endpoints like `POST /<resource>/query`

    Why?

packetlost 4 days ago

I wonder how caching will work when the body has the potential to be huge? You could hash it to save on RAM, but I imagine that would get "expensive" at scale.

  • nine_k 4 days ago

    Huge as in 50 kB? Fine. Huge as in 50 MB? Well, pass it through maybe if your NVMe device is running out of space.

  • cryptonym 4 days ago

    A robust cache implementation would probably put a limit on request content-length, either denying the request or forwarding it as-is and bypassing cache.

    • packetlost 4 days ago

      I mean, yeah? The standard does say caches *MAY* cache the content. It also says the cache server should try to do semantic interpretation of the body if possible, which I think is a misstep. IMO it would be better to have canonical representations of whatever format you're using and then let the cache layer rely on hashing or similar mechanisms. More complicated edge servers/proxies are not what we want, because that leads to vulnerabilities.
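      The canonicalize-then-hash idea above could look like this for JSON bodies (a sketch; the field names and the choice of SHA-256 are assumptions):

```python
import hashlib
import json

def canonical_body_digest(body_bytes):
    """Reduce a JSON query body to a canonical form (sorted keys, no
    insignificant whitespace), then hash it so the cache key stays
    small even when the request body is large."""
    obj = json.loads(body_bytes)
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()

# Differently formatted but semantically identical bodies share a digest:
d1 = canonical_body_digest(b'{"limit": 10, "fields": ["email"]}')
d2 = canonical_body_digest(b'{ "fields":["email"], "limit":10 }')
```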

PaulDavisThe1st 4 days ago

IETF 2029: The HTTP ASK Method

IETF 2034: The HTTP TELLME Method

IETF 2038: The HTTP WHATIF Method

  • cryptonector 4 days ago

    Those sound like good April 1st RFC fodder.

MuffinFlavored 4 days ago

A simple query with a direct response:

    QUERY /contacts HTTP/1.1
    Host: example.org
    Content-Type: example/query
    Accept: text/csv
    
    select surname, givenname, email limit 10
Not quite full SQL (no JOIN or WHERE in any examples I see)

Hmm... as long as you handle authentication/authorization correctly, why is it bad?

It's a way to pluck certain JSON fields that were otherwise going to be returned? Kind of like one of the benefits of GraphQL? Will this catch on?

  • cjpearson 4 days ago

    I think it would be a good idea to remove these examples from the draft as it gives people the impression that this method is SQL-over-HTTP. A more typical body might be a JSON object with various filter criteria.
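    For example, a filter-style body (field names invented for illustration) might look like:

        QUERY /contacts HTTP/1.1
        Host: example.org
        Content-Type: application/json
        Accept: application/json

        {"filter": {"surname": "Smith"}, "fields": ["surname", "givenname", "email"], "limit": 10}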

  • SahAssar 4 days ago

    The request body is just an example, this does not specify the query language, just the http method.

  • cryptonector 3 days ago

    The example uses a non-existent MIME type named `example/query` as the request's `Content-Type`. We don't know what the syntax and semantics of queries of `example/query` type are, and we don't need to know because this is just an example!

    But the intent is clear that one could have `application/sql`, `application/sparql`, and many other MIME types for representations of queries, and if one were to use one of those then the request body should be valid when interpreted in the context of the request's `Content-Type`. Thus you absolutely could have the full power of SQL, if the server were to support it.

  • pimlottc 4 days ago

    The WHERE is implied by the endpoint (“contacts”)

    • MuffinFlavored 4 days ago

      That would be the FROM, in my opinion. WHERE would be the `?` filters, etc.

      • pimlottc 4 days ago

        Oops, what am I thinking, you’re right