Correct http status code for resource which requir

2019-03-07 20:16发布

站内文章 / 前沿技术

12 0

祖国的老花朵

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

There seems to be a lot of confusion about the correct http status code to return if the user tries to access a page which requires the user to login.

So basically what status code will be send when I show the login page?

I'm pretty sure we need to use a status code in the 4xx range.

I'm not talking about HTTP authentication here, so that's at least 1 status code we aren't going to use (401 Unauthorized).

Now what should we use? The answers (also here on SO) seem to vary:

According to the answer here we should use 403 Forbidden.

But in the description of the status code is:

Authorization will not help and the request SHOULD NOT be repeated.

Well that doesn't look like the right one. Since authorization WOULD help.

So let´s check out some other answer. The answer here even doesn't use the 4xx range at all but rather uses 302 Found

The description of the 302 Found status code:

The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.

I think that also isn't what I want. Since it is not the requested resource which resides under a different URI. But rather a completely different resource (login page vs authenticated content page).

So I moved along and picked another answer surprisingly with yet another solution.

This answer suggest we choose 400 Bad Request.

The description of this status code is:

The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.

I think the server understood the request just fine, but just refuses to give access before the user is authenticated.

Another answer also says a 403 response is correct, however it ends with:

If this is a public facing website where you are trying to deny access based on a session cookie [that's what I do], 200 with an appropriate body to indicate that log in is needed or a 302 temporary redirect to a log in page is often best.

So 403 is correct, but 200 or 302 is THE BEST.

Hey! That's what I am looking for: THE BEST solution. But shouldn't the best be the same as the correct one? And why would it be the best?

Thanks to all who have made it this far into this question :)

I know I shouldn't worry too much about it. And I think this question is more hypothetical (not really, but used it because of lack of a better word).

But this question is haunting me for some time now.

And if I would have been a manager (who just picked up some cool sounding words as they always do) I would have said: but, but, but, but restfulness is important. :-)

So: what is the right way™ of using a status code in the above situation (if any)?

tl;dr

What is the correct http status code response when a user tries to access a page which requires login?

回答1:

If the user has not provided any credentials and your API requires them, return a 401 - Unauthorized. That will challenge the client to do so. There's usually little debate about this particular scenario.

If the user has provided valid credentials but they are insufficient to access the requested resource (perhaps the credentials were for a freemium account but the requested resource is only for your paid users), you have a couple of options given the looseness of some of the HTTP code definitions:

Return 403 - Forbidden. This is more descriptive and is typically understood as, "the supplied credentials were valid but still were not enough to grant access"
Return 401 - Unauthorized. If you're paranoid about security, you might not want to give the extra information back to the client as was returned in (1) above
Return either 401 or 403 but with helpful information in the response body describing the reasons why access is being denied. Again, that information might be more than you would want to provide in case it helps attackers somewhat.

Personally, I've always used #1 for the scenario where valid credentials have been passed but the account they're associated with doesn't have access to the requested resource.

回答2:

You ask for "the best", "the right way", and "the correct", in turn, which makes answering this question difficult because those criteria are not necessarily interchangeable and may, in fact, conflict -- especially where RESTfulness is concerned.

The "best" answer depends on your application. Are you building a Plain Old Browser-Based (POBB) web-application? Are you building a native client (ex. iOS or Android) and hitting a service over the Web? Are you making heavy use of AJAX to drive web-page updates? Is curl the intended client?

Let's assume you are building a traditional web application. Let's look at how Google does it (output chopped for brevity):

$ curl -v http://gmail.com/
< HTTP/1.1 301 Moved Permanently
< Location: http://mail.google.com/mail/
< Content-Type: text/html; charset=UTF-8
< Content-Length: 225
< ...

Google first redirects us to the "true" URL for GMail (using a 302 redirect).

$ curl -v http://mail.google.com/mail/
< HTTP/1.1 302 Moved Temporarily
< Location: https://accounts.google.com/ServiceLogin?service=mail&passive=true&rm=false&continue=http://mail.google.com/mail/&scc=1&ltmpl=default&ltmplcache=2
< Content-Type: text/html; charset=UTF-8
< Content-Length: 352
< ...

And then it redirects us to the login page (using a 302 redirect).

$ curl -v 'https://accounts.google.com/ServiceLogin?service=mail&passive=true&rm=false&continue=http://mail.google.com/mail/&scc=1&ltmpl=default&ltmplcache=2'
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< ...

The login page itself is delivered with the 200 status code!

Why this way?

From a user-experience perspective, if a user goes to a page they can't view because they are not authenticated, you want to take the user to a page that allows them to correct this (via logging in). In this example, the login page stands alone and is just another page (which is why 200 is appropriate).

You could throw up a 4XX page with an explanation and a link to the login page. That might, in fact, seem more RESTful. But it's a worse user experience.

Ok, but is there a case where something like 403 makes sense? Absolutely.

First, though, note that 403 isn't well-defined in the specification. In order to understand how it should be used, you need to look at how it's implemented in the field.

403 is commonly used by web servers like Apache and IIS as the status code for pages returned when the browser requests a directory listing (a URI ending in "/") but the server has directory listings disabled. In this case, 403 is really a specialized 404, and there isn't much you can do for the user except let him/her know what went wrong.

However, here's an example of a site that uses the 403 to both signal to the user that he/she doesn't have sufficient privilege and what action to take to correct the situation (check out the full response for details):

curl -v http://www.w3.org/Protocols/rfc2616/
< HTTP/1.1 403 Forbidden
< Content-Type: text/html; charset=iso-8859-1
< Content-Length: 1564
< ...

(As an aside, 403 is also seen in web-based APIs, like Twitter's API; here, 403 means "The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits.")

As an improvement, let's assume, however, that you don't want to redirect the user to a login page, or force the user to follow a link to the login page. Instead, you want to display the login form on the page that the user is prevented from seeing. If they successfully authenticate, they see the content when the page reloads; if they fail, they get the login form again. They never navigate to another URL.

In this case, a status code of 403 makes a lot of sense, and is homologous to the 401 case, with the caveat that the browser won't pop up a dialog asking the user to authenticate -- the form is in the page itself.

This approach to authentication is not common, but it could make sense, and is IMHO preferable to the pop-up-a-javascript-modal-to-log-in solutions that developers try to implement.

It comes down to the question, do you want to redirect or not?

Additional: thoughts about the 401 status code...

The 401 status code -- and associated basic/digest authentication -- has many things going for it. It's embraced by the HTTP specification, it's supported by every major browser, it's not inherently un-RESTful... The problem is, from a user experience perspective, it's very very unattractive. There's the un-stylable, cryptic pop-up dialog, lack of an elegant solution for logging out, etc. If you (or your stakeholders/clients) can live with those issues (a big if) then it might qualify as the "correct" solution.

回答3:

Agreed. REST is just a style, not a strict protocol. Many public web services deviate from this style. You can build your service to return whatever you want. Just make sure your clients know how what return codes to expect.

Personally, I have always used 401 (unauthorized) to indicate an unauthenticated user has requested a resource that requires a login. I then require the client application to guide the user to the login.

I use 400 (bad request) in response to a logon attempt with invalid credentials.

HTTP 302 (moved) seems more appropriate for web applications where the client is a browser. Browsers typically follow the re-direct address in the response. This can be useful for guiding the user to a logon page.

回答4:

I'm not talking about HTTP authentication here, so that's at least 1 status code we aren't going to use (401 Unauthorized).

Wrong. 401 is part of Hypertext Transfer Protocol (RFC 2616 Fielding, et al.), but not limited to HTTP authentication. Furthermore, it's the only status code indicating that the request requires user authentication.

302 & 200 codes could be used and is easier to implement in some scenarios, but not all. And if you want to obey the specs, 401 is the only correct answer there is.

And 403 is indeed the most wrong code to return. As you correctly stated...

Authorization will not help and the request SHOULD NOT be repeated.

So this is clearly not suitable to indicate that authorization is an option.

I would stick to the standard: 401 Unauthorized

UPDATE

To add a little more info, lifting the confusion related to...

The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource.

If you think that's going to stop you from using a 401, you have to remember there's more:

"The field value consists of at least one challenge that indicates the authentication scheme(s) and parameters applicable to the Request-URI."

This "indicating the authentication scheme(s)" means you can opt-in for other auth-schemes!

The HTTP protocol (RFC 2616) defines a simple framework for access authentication schemes, but you don't HAVE to use THAT framework.

In other words: you're not bound to the usual WWW-Auth. You only just MUST indicate HOW your webapp does it's authorization and offer the according data in the header, that's all. According to the specs, using a 401, you can choose your own poison of authorization! And that's where your "webapp" can do what YOU want it to do when it comes to the 401 header and your authorization implementation.

Don't let the specs confuse you, thinking you HAVE to use the usual HTTP authentication scheme. You don't! The only thing the specs really enforce: you just HAVE/MUST identify your webapp's authentication scheme and pass on related parameters to enable the requesting party to start potential authorization attempts.

And if you're still unsure, I can put all this into a simple but understandable perspective: let's say you're going to invent a new authorization scheme tomorrow, then the specs allow you to use that too. If the specs would have restricted implementation of such newer authorization technology implementations, those specs would've been modified ages ago. The specs define standards, but they do not really limit the multitude of potential implementations.

回答5:

Your "TL;DR" doesn't match the "TL" version.

The proper response for requesting a resource that you need authorization to request, is 401.

302 is not the proper response, because, in fact, the resource is not available some place else. The original URL was correct, the client simply didn't have the rights. If you follow the redirect, you do not actually get what you're looking for. You get dropped in to some ad hoc workflow that has nothing to do with the resource.

403 is incorrect. 403 is the "can't get there from here" error. You simply can't see this, I don't care who you are. Some would argue 403 and 404 are similar. The difference is simply with 403, the server is saying "yea, I have it, but you can't", whereas 404 says "I know nothing about what you're talking about." Security wonks would argue that 404 is "safer". Why tell them something they don't need to know.

The problem you are encountering has nothing to do with REST or HTTP. Your problem is trying to set up some stateful relationship between the client and server, manifested in the end via some cookie. The whole resource -> 302 -> Login page is all about user experience using the hack that's known as the Web Browser, which happens to be both, in stock form, a lousy HTTP client and a lousy REST participant.

HTTP has an authorization mechanism. The Authorization header. The user experience around it, in a generic browser, is awful. So no one uses it.

So there is not proper HTTP response (well there is, 401, but don't/can't use that). There is not proper REST response, as REST typically relies on the underlying protocol (HTTP in this case, but we've tackled that already).

So. 302 -> 200 for the login page is all she wrote. That's what you get. If you weren't using the browser, or did everything via XHR or some other custom client, this wouldn't be an issue. You'd just use Authorization header, follow the HTTP protocol, and leverage a scheme like either DIGEST or what AWS uses, and be done. Then you can use the appropriate standards to answer questions like these.

回答6:

As you point out, 403 Forbidden is explicitly defined with the phrase "Authorization will not help", but it is worth noting that the authors were almost certainly referring here to HTTP authorization (which will indeed not help as your site uses a different authorization scheme). Indeed, given that the status code is a signal to the user agent rather than the user, such a code would be correct insofar as any authorization the agent attempts to provide will not assist any further with the required authorization process (c.f. 401 Unauthorized).

However, if you take that definition of 403 Forbidden literally and feel it is still inappropriate, perhaps 409 Conflict might apply? As defined in RFC 2616 §10.4.10:

   The request could not be completed due to a conflict with the current
   state of the resource. This code is only allowed in situations where
   it is expected that the user might be able to resolve the conflict
   and resubmit the request. The response body SHOULD include enough
   information for the user to recognize the source of the conflict.
   Ideally, the response entity would include enough information for the
   user or user agent to fix the problem; however, that might not be
   possible and is not required.

There is indeed a conflict with the current state of the resource: the resource is in a "locked" state and such conflict can only be "resolved" through the user providing their credentials and resubmitting the request. The body will include "enough information for the user to recognize the source of the conflict" (it will state that they are not logged-in) and indeed will also include "enough information for the user or user agent to fix the problem" (i.e. a login form).

回答7:

Your Answer:

401 Unauthorized especially if you do not care or will not be redirecting people to a login page

-or-

302 Found to imply there was the resource but they need to provide credentials to be returned to it. Do this only if you will be using a redirect and make sure to provide appropriate information in the body of the response.

Other Suggestions:

401 Unauthorized is generally used for resources the user does not have access to after handling authentication.

403 Forbidden is a little obscure to me in honesty. I use it when I lock down resources from the file system level, and like your post said, "authorization does not help".

400 Bad Request is inappropriate as needing to login does not represent malformed syntax.

回答8:

I believe 401 is the correct status code to return from failed authorization. Reference RFC 2616 section-14.8 It reads "A user agent that wishes to authenticate itself with a server-- usually, but not necessarily, after receiving a 401 response"