Query String vs Resource Path for Filtering Criter

2020-07-11 08:18发布

问题:

Background

I have 2 resources: courses and professors.

A course has the following attributes:

  • id
  • topic
  • semester_id
  • year
  • section
  • professor_id

A professor has the the following attributes:

  • id
  • faculty
  • super_user
  • first_name
  • last_name

So, you can say that a course has one professor and a professor may have many courses.

If I want to get all courses or all professors I can: GET /api/courses or GET /api/professors respectively.


Quandary

My quandary comes when I want to get all courses that a certain professor teaches.

I could use either of the following:

  • GET /api/professors/:prof_id/courses
  • GET /api/courses?professor_id=:prof_id

I'm not sure which to use though.


Current solution

Currently, I'm using an augmented form of the latter. My reasoning is that it is more scale-able if I want to add in filtering/sorting criteria.

I'm actually encoding/embedding JSON strings into the query parameters. So, a (decoded) example might be:

GET /api/courses?where={professor_id: "teacher45", year: 2016}&order={attr: "topic", sort: "asc"}

The request above would retrieve all courses that were (or are currently being) taught by the professor with the provided professor_id in the year 2016, sorted according to topic title in ascending ASCII order.

I've never seen anyone do it this way though, so I wonder if I'm doing something stupid.


Closing Questions

Is there a standard practice for using the query string vs the resource path for filtering criteria? What have some larger API's done in the past? Is it acceptable, or encouraged to use use both paradigms at the same time (make both endpoints available)? If I should indeed be using the second paradigm, is there a better organization method I could use besides encoding JSON? Has anyone seen another public API using JSON in their query strings?


Edited to be less opinion based. (See comments)

回答1:

As already explained in a previous comment, REST doesn't care much about the actual form of the link that identifies a unique resource unless either the RESTful constraints or the hypertext transfer protocol (HTTP) itself is violated.

Regarding the use of query or path (or even matrix) parameters is completely up to you. There is no fixed rule when to use what but just individual preferences.

I like to use query parameters especially when the value is optional and not required as plenty of frameworks like JAX-RS i.e. allow to define default values therefore. Query parameters are often said to avoid caching of responses which however is more an urban legend then the truth, though certain implementations might still omit responses from being cached for an URI containing query strings.

If the parameter defines something like a specific flavor property (i.e. car color) I prefer to put them into a matrix parameter. They can also appear within the middle of the URI i.e. /api/professors;hair=grey/courses could return all cources which are held by professors whose hair color is grey.

Path parameters are compulsory arguments that the application requires to fulfill the request in my sense of understanding otherwise the respective method handler will not be invoked on the service side in first place. Usually this are some resource identifiers like table-row IDs ore UUIDs assigned to a specific entity.

In regards to depicting relationships I usually start with the 1 part of a 1:n relationship. If I face a m:n relationship, like in your case with professors - cources, I usually start with the entity that may exist without the other more easily. A professor is still a professor even though he does not hold any lectures (in a specific term). As a course wont be a course if no professor is available I'd put professors before cources, though in regards to REST cources are fine top-level resources nonetheless.

I therefore would change your query

GET /api/courses?where={professor_id: "teacher45", year: 2016}&order={attr: "topic", sort: "asc"}

to something like:

GET /api/professors/teacher45/courses;year=2016?sort=asc&onField=topic

I changed the semantics of your fields slightly as the year property is probably better suited on the courses rather then the professors resource as the professor is already reduced to a single resource via the professors id. The courses however should be limited to only include those that where held in 2016. As the sorting is rather optional and may have a default value specified, this is a perfect candidate for me to put into the query parameter section. The field to sort on is related to the sorting itself and therefore also belongs to the query parameters. I've put the year into a matrix parameter as this is a certain property of the course itself, like the color of a car or the year the car was manufactured.

But as already explained previously, this is rather opinionated and may not match with your or an other folks perspective.



回答2:

I could use either of the following:

  • GET /api/professors/:prof_id/courses
  • GET /api/courses?professor_id=:prof_id

You could. Here are some things to consider:

Machines (in particular, REST clients) should be treating the URI as an opaque thing; about the closest they ever come to considering its value is during resolution.

But human beings, staring that a log of HTTP traffic, do not treat the URI opaquely -- we are actually trying to figure out the context of what is going on. Staying out of the way of the poor bastard that is trying to track down a bug is a good property for a URI Design to have.

It's also a useful property for your URI design to be guessable. A URI designed from a few simple consistent principles will be a lot easier to work with than one which is arbitrary.

There is a great overview of path segment vs query over at Programmers

https://softwareengineering.stackexchange.com/questions/270898/designing-a-rest-api-by-uri-vs-query-string/285724#285724

Of course, if you have two different URI, that both "follow the rules", then the rules aren't much help in making a choice.

Supporting multiple identifiers is a valid option. It's completely reasonable that there can be more than one way to obtain a specific representation. For instance, these resources

/questions/38470258/answers/first
/questions/38470258/answers/accepted
/questions/38470258/answers/top

could all return representations of the same "answer".

On the /other hand, choice adds complexity. It may or may not be a good idea to offer your clients more than one way to do a thing. "Don't make me think!"

On the /other/other hand, an api with a bunch of "general" principles that carry with them a bunch of arbitrary exceptions is not nearly as easy to use as one with consistent principles and some duplication (citation needed).

The notion of a "canonical" URI, which is important in SEO, has an analog in the API world. Mark Seemann has an article about self links that covers the basics.

You may also want to consider which methods a resource supports, and whether or not the design suggests those affordances. For example, POST to modify a collection is a commonly understood idiom. So if your URI looks like a collection

POST /api/professors/:prof_id/courses

Then clients are more likely to make the associate between the resource and its supported methods.

POST /api/courses?professor_id=:prof_id

There's nothing "wrong" with this, but it isn't nearly so common a convention.

GET /api/courses?where={professor_id: "teacher45", year: 2016}&order={attr: "topic", sort: "asc"}

I've never seen anyone do it this way though, so I wonder if I'm doing something stupid.

I haven't either, but syntactically it looks a little bit like GraphQL. I don't see any reason why you couldn't represent a query that way. It would make more sense to me as a single query description, rather than breaking it into multiple parts. And of course it would need to be URL encoded, etc.

But I would not want to crazy with that right unless you really need to give to your clients that sort of flexibility. There are simpler designs (see Roman's answer)