Proper use of HTTP status codes in a “validation” server

Posted 2019-01-22 03:00

Among the data my application sends to a third-party SOA server are complex XML documents. The server owner provides the XML schemas (.xsd) and, since the server rejects invalid XML with a meaningless message, I need to validate the documents locally before sending them.

I could use a stand-alone XML schema validator, but those are slow, mainly because of the time required to parse the schema files. So I wrote my own schema validator (in Java, if that matters) in the form of an HTTP server that caches already-parsed schemas.
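The caching idea can be sketched roughly as follows. This is not the asker's actual code: the class name `SchemaCache`, the string key, and passing the schema as text rather than as a file path are all assumptions made to keep the example self-contained. It relies on `javax.xml.validation.Schema` being immutable and thread-safe, which is what makes sharing one compiled instance across requests reasonable.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: caches compiled schemas under a caller-chosen key.
class SchemaCache {
    private final Map<String, Schema> compiled = new ConcurrentHashMap<>();

    // Returns the cached Schema for key, compiling xsdText only on a cache miss.
    // Throws IllegalArgumentException if xsdText cannot be compiled as a schema.
    Schema get(String key, String xsdText) {
        Schema cached = compiled.get(key);
        if (cached != null) {
            return cached;
        }
        Schema fresh = compile(xsdText);
        // If another thread won the race, keep its instance and drop ours.
        Schema earlier = compiled.putIfAbsent(key, fresh);
        return earlier != null ? earlier : fresh;
    }

    int size() {
        return compiled.size();
    }

    private static Schema compile(String xsdText) {
        // SchemaFactory is not thread-safe, so each compilation gets its own.
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            return factory.newSchema(new StreamSource(new StringReader(xsdText)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("not a valid XML schema", e);
        }
    }
}
```

A `Validator` obtained from `schema.newValidator()` is not thread-safe, so a server built on this would create one per request while reusing the cached `Schema`.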

The problem is: many things can go wrong in the course of the validation process. Other than unexpected exceptions and successful validation:

  • the server may not find the schema file specified
  • the file specified may not be a valid schema file
  • the XML is invalid against the schema file

Since it's an HTTP server, I'd like to give the client meaningful status codes. Should the server answer 400 (Bad Request) in all of the above cases? Or do they have nothing to do with HTTP, so that it should answer 200 with a message in the body? Any other suggestions?

Update: the main application is written in Ruby, which doesn't have a good XML Schema validation library, so a separate validation server is not over-engineering.

7 answers
SAY GOODBYE
#2 · 2019-01-22 03:34

Status code 422 ("Unprocessable Entity"), defined for WebDAV in RFC 4918, sounds close enough:

"The 422 (Unprocessable Entity) status code means the server understands the content type of the request entity (hence a 415(Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if an XML request body contains well-formed (i.e., syntactically correct), but semantically erroneous, XML instructions."

叛逆
#3 · 2019-01-22 03:41

Amazon could serve as a model for how to map HTTP status codes to real application-level conditions: http://docs.amazonwebservices.com/AWSImportExport/latest/API/index.html?Errors.html (see the "Amazon S3 Status Codes" heading)

别忘想泡老子
#4 · 2019-01-22 03:42

I'd go with 400 Bad Request and a more specific message in the body (possibly with a secondary error code in a header, like X-Parse-Error: 10451, for easier processing).
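A minimal sketch of that reply shape, assuming the server is built on the JDK's bundled `com.sun.net.httpserver` API (the question does not say which HTTP stack it uses). The class name `ParseRejection` is hypothetical, and the numeric code is simply whatever the application defines.

```java
import com.sun.net.httpserver.HttpExchange;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical value object for a "400 plus secondary error code" reply.
class ParseRejection {
    final int status = 400;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    ParseRejection(int parseErrorCode, String detail) {
        // Machine-readable code in a header, human-readable detail in the body.
        headers.put("X-Parse-Error", Integer.toString(parseErrorCode));
        headers.put("Content-Type", "text/plain; charset=utf-8");
        this.body = detail;
    }

    // Writes the reply through the JDK's built-in HTTP server API.
    void writeTo(HttpExchange exchange) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        headers.forEach(exchange.getResponseHeaders()::set);
        // A length of -1 tells the server that no response body follows.
        exchange.sendResponseHeaders(status, bytes.length == 0 ? -1 : bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            if (bytes.length > 0) {
                out.write(bytes);
            }
        }
    }
}
```

Inside a handler this would be used as `new ParseRejection(10451, "unexpected element at line 3").writeTo(exchange);`, so a client can branch on the header without parsing the message text.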

贼婆χ
#5 · 2019-01-22 03:44

That sounds like a neat idea, but the HTTP status codes don't really provide an "operation failed" case. I would return HTTP 200 with an X-Validation-Result: true/false header, using the body for any text or "reason" as necessary. Save the HTTP 4xx for HTTP-level errors, not application-level errors.
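One consequence of this convention is that a client can no longer rely on the status code alone: a 200 only means the HTTP exchange worked, and the verdict has to be read from the header. A rough client-side sketch, where the class and method names are hypothetical and only the header name comes from the suggestion above:

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical client-side helper for the "always 200, verdict in a header" convention.
class ValidationReply {
    // Returns the verdict, or empty when the reply carries none
    // (an HTTP-level failure, or a missing or unrecognised header value).
    static Optional<Boolean> verdict(int status, String xValidationResult) {
        if (status != 200 || xValidationResult == null) {
            return Optional.empty();
        }
        switch (xValidationResult.trim().toLowerCase(Locale.ROOT)) {
            case "true":
                return Optional.of(Boolean.TRUE);
            case "false":
                return Optional.of(Boolean.FALSE);
            default:
                return Optional.empty();
        }
    }
}
```

The three-way result (valid, invalid, no verdict) is the point: "the validator said no" and "the validator could not be asked" stay distinguishable.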

It's kind of a shame and a double standard, though. Many applications use HTTP authentication, and they're able to return HTTP 401 Unauthorized or 403 Forbidden from the application level. It would be convenient and sensible to have some sort of blanket HTTP 4xx Request Rejected that you could use.

贪生不怕死
#6 · 2019-01-22 03:47

Say you're posting XML files to a resource, eg like so:

    POST /validator
    Content-Type: application/xml

If the request entity fails to parse as the media type it was submitted as (i.e. as application/xml), 400 Bad Request is the right status.

If it parses syntactically as the media type it was submitted as but doesn't validate against the desired schema, or otherwise has semantics that make it unprocessable by the resource it's submitted to, then 422 Unprocessable Entity is the best status. You should probably accompany it with more specific error information in the response body. Note that 422 is technically defined in WebDAV, an extension to HTTP, but it is widely used in HTTP APIs and is more appropriate than any other HTTP error status when a submitted entity has a semantic error.
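The 400-versus-422 split described above can be sketched with the standard JAXP classes: parse first to check well-formedness, then validate the resulting tree against the schema. The class name `XmlStatus` and both method names are hypothetical, and treating I/O or parser-configuration problems as 500 is an added assumption rather than something this answer prescribes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper that picks a status code for a submitted XML document.
class XmlStatus {
    // Compiles XSD text; wraps the checked exception to keep the example short.
    static Schema compile(String xsdText) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsdText)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("not a valid XML schema", e);
        }
    }

    static int statusFor(String xml, Schema schema) {
        Document document;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // schema validation needs a namespace-aware DOM
            document = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return 400; // body is not well-formed XML
        } catch (ParserConfigurationException | IOException e) {
            return 500; // assumption: anything else is a server-side problem
        }
        try {
            // Validators are not thread-safe, so a new one is created per call.
            schema.newValidator().validate(new DOMSource(document));
            return 200;
        } catch (SAXException e) {
            return 422; // well-formed, but does not satisfy the schema
        } catch (IOException e) {
            return 500;
        }
    }
}
```

In a real reply the message from the caught `SAXException` would be a natural candidate for the "more specific error information" in the response body.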

If it's being submitted as a media type which implies a particular schema on top of XML (e.g. as application/xhtml+xml), then you can use 400 Bad Request if it fails to validate against that schema. But if its media type is plain XML, then I'd argue that the schema isn't part of the media type, although it's a bit of a grey area; if the XML file specifies its schema, you could maybe interpret validation as being part of the syntactic requirements for application/xml.

If you're submitting the XML files via a multipart/form-data or application/x-www-form-urlencoded form submission, then you'd have to use 422 Unprocessable Entity for all problems with the XML file; 400 would only be appropriate if there's a syntactic problem with the basic form upload.

We Are One
#7 · 2019-01-22 03:58

It is perfectly valid to map error situations in the validation process to meaningful HTTP status codes.

I assume you send the XML file to your validation server as the body of a POST request, using the URI to select the schema to validate against.

So here are some suggestions for error mappings:

  • 200: XML content is valid
  • 400: XML content was not well-formed, headers were inconsistent, or the request did not match RFC 2616 syntax
  • 401: schema was not found in cache and server needs credentials to use for authentication against the 3rd party SOA backend in order to obtain the schema file
  • 404: Schema file not found
  • 409: the XML content was invalid against the specified schema
  • 412: Specified file was not a valid XML schema
  • 500: any unexpected exception in your validation server (NullPointerExceptions et al.)
  • 502: the schema was not found in cache and the attempt to request it from the 3rd party SOA server failed.
  • 503: validation server is restarting
  • 504: see 502 with reason=timeout
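The mapping above could be kept in one place as an enum, so that handlers report an outcome and the status code is looked up rather than scattered through the code. The constant names are made up for illustration; the numbers are exactly the ones listed in this answer. Other answers in this thread argue for 422 rather than 409 for schema violations, which would be a one-line change here.

```java
// Hypothetical enum: one constant per situation listed above.
enum ValidationOutcome {
    VALID(200),
    MALFORMED_REQUEST(400),
    SCHEMA_CREDENTIALS_REQUIRED(401),
    SCHEMA_NOT_FOUND(404),
    XML_INVALID(409),
    SCHEMA_INVALID(412),
    UNEXPECTED_ERROR(500),
    SCHEMA_FETCH_FAILED(502),
    RESTARTING(503),
    SCHEMA_FETCH_TIMEOUT(504);

    private final int status;

    ValidationOutcome(int status) {
        this.status = status;
    }

    int status() {
        return status;
    }

    // True for the 4xx outcomes, i.e. those the client can fix by changing the request.
    boolean isClientError() {
        return status >= 400 && status < 500;
    }
}
```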