Why is HTTP/SOAP considered to be "thick"?

Posted 2019-01-21 10:47

Question:

I've heard some opinions that the SOAP/HTTP web service call stack is "thick" or "heavyweight," but I can't really pinpoint why. Would it be considered thick because of the serialization/deserialization of the SOAP envelope and the message? Is that really a heavy-weight operation?

Or is it just considered "thick" compared to a raw/binary data transfer over a fixed connection?

Or is it some other reason? Can anyone shed some light on this?

Answer 1:

SOAP is designed to be abstract enough to use other transports besides HTTP. That means, among other things, that it does not take advantage of certain aspects of HTTP (mostly RESTful usage of URLs and methods, e.g. PUT /customers/1234 or GET /customers/1234).
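
To make the contrast concrete, here is a minimal Ruby sketch that only builds the two kinds of request without sending anything. The /customers/1234 path comes from the example above; the /CustomerService endpoint, the GetCustomer operation, and the urn:example:customers namespace are made up for illustration, and the envelope assumes SOAP 1.1 over HTTP.

```ruby
require 'net/http'

# REST style: the HTTP method and the URL together identify the operation.
def rest_request
  Net::HTTP::Get.new("/customers/1234")
end

# SOAP style: operations are typically POSTed to a single endpoint, and the
# operation name travels in the SOAPAction header and inside the XML body.
# The endpoint path, operation name, and namespace below are hypothetical.
def soap_request
  req = Net::HTTP::Post.new("/CustomerService")
  req["Content-Type"] = "text/xml; charset=utf-8"
  req["SOAPAction"] = '"urn:example:customers/GetCustomer"'
  req.body = <<~XML
    <?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <GetCustomer xmlns="urn:example:customers">
          <id>1234</id>
        </GetCustomer>
      </soap:Body>
    </soap:Envelope>
  XML
  req
end

# Print the request line and body size of each request for comparison.
[rest_request, soap_request].each do |req|
  puts "#{req.method} #{req.path} (#{req.body.to_s.bytesize} body bytes)"
end
```

In the REST request an intermediary can tell what is being asked from the request line alone; in the SOAP request it has to look inside the header or the body.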

SOAP also bypasses existing TCP/IP mechanisms for the same reason - to be transport-independent. Again, this means it can't take advantage of the transport, such as sequence management, flow control, service discovery (e.g. accept()ing a connection on a well-known port means the service exists), etc.

SOAP uses XML for all of its serialization - while that means that data is "universally readable" with just an XML parser, it introduces so much boilerplate that you really need a SOAP parser in order to function efficiently. And at that point, you (as a software consumer) have lost the benefit of XML anyways; who cares what the payload looks like over the wire if you need libSOAP to handle it anyways.

SOAP requires WSDL in order to describe interfaces. The WSDL itself isn't a problem, but it tends to be advertised as much more "dynamic" than it really is. In many cases, a single WSDL is created, and producer/consumer code is auto-generated from that, and it never changes. Overall, that requires a lot of tooling around without actually solving the original problem (how to communicate between different servers) any better. And since most SOAP services run over HTTP, the original problem was already mostly solved to begin with.



Answer 2:

SOAP and WSDL are extremely complicated standards, which have many implementations that support different subsets of the standards. SOAP does not map very well to a simple foreign function interface in the same way that XML-RPC does. Instead, you have to understand XML namespaces, envelopes, headers, WSDL, XML schemas, and so on to produce correct SOAP messages. All you need to do to call an XML-RPC service is to define an endpoint and call a method on it. For example, in Ruby:

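require 'xmlrpc/client'  # on Ruby 2.4 and later this comes from the xmlrpc gem

# Point the client at the service endpoint, then call a remote method by name.
server = XMLRPC::Client.new2("http://example.com/api")
result = server.call("add", 1, 2)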

Besides XML-RPC, there are other techniques that can also be much simpler and more lightweight, such as plain XML or JSON over HTTP (frequently referred to as REST, though that implies certain other design considerations). The advantage of something like XML or JSON over HTTP is that it's easy to use from JavaScript or even just a dumb web page with a form submission. It can also be scripted easily from the command line with tools like curl. It works with just about any language, as HTTP libraries, XML libraries, and JSON libraries are available almost everywhere, and even if a JSON parser is not available, it is very easy to write your own.
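
As a rough illustration of how little machinery JSON over HTTP needs, the sketch below encodes the same add call as a JSON document and decodes a reply using only Ruby's json library. The field names (method, params, result) are assumptions made for this example, not part of any standard mentioned here.

```ruby
require 'json'

# Encode a hypothetical "add" call as a JSON request body.
def encode_call(name, *args)
  JSON.generate({ "method" => name, "params" => args })
end

# Decode a hypothetical JSON reply and pull out its "result" field.
def decode_result(body)
  JSON.parse(body).fetch("result")
end

puts encode_call("add", 1, 2)        # the request body to POST
puts decode_result('{"result": 3}')  # the value extracted from a sample reply
```

The request body is a single short line that could just as easily be typed by hand into a curl command, which is the kind of low ceremony this answer is pointing at.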

Edit: I should clarify that I am referring to how conceptually heavyweight SOAP is, as opposed to how heavyweight it is in terms of the raw amount of data. I think that the raw amount of data is less important (though it adds up quickly if you need to handle lots of small requests), while how conceptually heavyweight it is matters a great deal, because it means that there are a lot more places where something can go wrong, where there can be an incompatibility, etc.



Answer 3:

I agree with the first poster, but would like to add to it. The definition of thick and thin is relative. With alternatives like JSON or REST emerging, SOAP looks heavy on the surface for "hello world" examples. As you might already know, what makes SOAP, and WS 2.0 in general, heavy is the enterprise/robustness features. JSON is not secure in the same way that WS 2.0 can be. I have not heard SOAP referred to as thick, but many non-XML nuts look at these specifications as heavy or thick. To be clear, I am not speaking for or against either, as they both have their place. XML is more verbose and human readable, and thus "thicker". The last piece is that some people view HTTP, as a persistent-connection protocol, to be heavy given newer web trends like AJAX, as opposed to serving up one big page. The connection overhead is large given there is really no benefit.

In summary, there is no real reason other than that someone wants to call SOAP/HTTP thick; it is all relative. Few standards are perfect for all scenarios. If I had to guess, some smart web developer thinks he is being oh so smart by talking about how thick XML technologies are and how super JSON is. Each has its place.



Answer 4:

SOAP's signal-to-noise ratio is too low. For a simple conversation there's too much structural overhead with no data value; and there's too much explicit configuration required (as compared to implicit configuration, like JSON).

It didn't start out that way, but it ended up being a poster-child for what happens to a good idea when a standards committee gets involved.



Answer 5:

1 - XML schemas, which are a key part of the WSDL spec, are really, really big and complicated. In practice, tools that do things like map XML schema to programming language constructs end up supporting only part of the XML schema features.

2 - The WS-* specs, e.g., WS-Security and WS-SecureConversation, are again big and complicated. They are almost designed so that no one with fewer resources than Microsoft or IBM would ever be able to implement them completely.



Answer 6:

First of all, it depends a lot on how your services are implemented (i.e. you can do a lot to reduce the payload by just being careful of how your method signatures are done).

That said, not only the SOAP envelope but the message itself can be a lot bulkier in XML than in a streamlined binary format. Just choosing the right class and member names can reduce it a lot...

Consider the following examples of serialized returns from methods returning a collection of stuff. Just choosing the right [serialization] name for classes/wrappers and members can make a big difference in the verbosity of the serialized SOAP request/response if you're returning repeated data (e.g. lists/collections/arrays).

Brief / short names:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfShortIDName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://tempuri.org/">
  <ShortIDName>
    <id>0</id>
    <name>foo 0</name>
  </ShortIDName>
  <ShortIDName>
    <id>1</id>
    <name>foo 1</name>
  </ShortIDName>
  <ShortIDName>
    <id>2</id>
    <name>foo 2</name>
  </ShortIDName>
  ...
</ArrayOfShortIDName>

Long names:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfThisClassHasALongClassNameIDName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://tempuri.org/">
  <ThisClassHasALongClassNameIDName>
    <MyLongMemberNameObjectID>0</MyLongMemberNameObjectID>
    <MyLongMemberNameObjectName>foo 0</MyLongMemberNameObjectName>
  </ThisClassHasALongClassNameIDName>
  <ThisClassHasALongClassNameIDName>
    <MyLongMemberNameObjectID>1</MyLongMemberNameObjectID>
    <MyLongMemberNameObjectName>foo 1</MyLongMemberNameObjectName>
  </ThisClassHasALongClassNameIDName>
  <ThisClassHasALongClassNameIDName>
    <MyLongMemberNameObjectID>2</MyLongMemberNameObjectID>
    <MyLongMemberNameObjectName>foo 2</MyLongMemberNameObjectName>
  </ThisClassHasALongClassNameIDName>
  ...
</ArrayOfThisClassHasALongClassNameIDName>
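
To put rough numbers on the two samples above, here is a small Ruby sketch that generates both shapes for n items and compares their sizes. It reuses the element names from the samples and leaves out the XML declaration and namespace attributes, since those are identical in both cases; the serialize helper is written for this comparison and is not the output of any SOAP toolkit.

```ruby
# Serialize n (id, name) items in the shape of the two samples above, with
# configurable element names, so the byte counts can be compared.
def serialize(n, array_tag:, item_tag:, id_tag:, name_tag:)
  items = (0...n).map do |i|
    "  <#{item_tag}>\n" \
    "    <#{id_tag}>#{i}</#{id_tag}>\n" \
    "    <#{name_tag}>foo #{i}</#{name_tag}>\n" \
    "  </#{item_tag}>\n"
  end
  "<#{array_tag}>\n#{items.join}</#{array_tag}>\n"
end

def short_names(n)
  serialize(n, array_tag: "ArrayOfShortIDName", item_tag: "ShortIDName",
               id_tag: "id", name_tag: "name")
end

def long_names(n)
  serialize(n, array_tag: "ArrayOfThisClassHasALongClassNameIDName",
               item_tag: "ThisClassHasALongClassNameIDName",
               id_tag: "MyLongMemberNameObjectID",
               name_tag: "MyLongMemberNameObjectName")
end

# Compare the two serializations at a few collection sizes.
[1, 100, 10_000].each do |n|
  s = short_names(n).bytesize
  l = long_names(n).bytesize
  puts format("%6d items: short=%d bytes, long=%d bytes (%.1fx)", n, s, l, l.to_f / s)
end
```

Because every element name is repeated in both the opening and closing tag of every item, the difference grows linearly with the size of the collection.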


Answer 7:

I considered it "thick" because of the relatively large overhead involved with packaging and unpacking a message (serializing and deserializing).

Consider a web service with a web method called Add that takes two 32-bit integers. The caller packages up two integers and receives a single integer in reply. Where there's really only 96 bits of information being transmitted, the addition of the SOAP packets will probably add around 3,000 or more extra bits in each direction. That is roughly a 30x increase.
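
The estimate can be checked with a quick Ruby sketch. It packs the two request integers as raw bytes and compares that with a minimal SOAP 1.1 request body. The parameter names a and b and the tempuri.org namespace are placeholders, and HTTP headers are not counted, so a real exchange would carry more than this.

```ruby
# Raw request payload: two signed 32-bit little-endian integers (64 bits).
# The reply would add one more 32-bit integer, giving 96 bits in total.
RAW_REQUEST = [1, 2].pack("l<l<")

# A minimal SOAP 1.1 request for a hypothetical Add(a, b) operation.
SOAP_REQUEST = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
      <Add xmlns="http://tempuri.org/">
        <a>1</a>
        <b>2</b>
      </Add>
    </soap:Body>
  </soap:Envelope>
XML

# Size of a string in bits.
def bits(str)
  str.bytesize * 8
end

puts "raw request:  #{bits(RAW_REQUEST)} bits"
puts "SOAP request: #{bits(SOAP_REQUEST)} bits"
puts "ratio:        #{(bits(SOAP_REQUEST).to_f / bits(RAW_REQUEST)).round(1)}x"
```

Toolkit-generated envelopes usually declare more namespaces than this hand-written one, so the figure it prints is closer to a lower bound than to a typical value.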

Added to this is the relatively slow performance associated with serializing and deserializing the message into UTF-8 (or whatever) XML. Admittedly it's pretty fast these days, but it's certainly not trivial.



Answer 8:

I think it's mainly that the SOAP envelope adds a large amount of overhead to constructing the message, especially for the common case of a simple request with only a few, not-deeply-structured parameters. Compare that to a REST style web service where the parameters are simply included in the URL query.
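
For comparison, here is what the REST-style version of a simple two-parameter call might look like in Ruby. The host, the /add path, and the parameter names a and b are placeholders chosen for this sketch.

```ruby
require 'uri'

# Build a REST-style URL where the parameters ride in the query string.
# The host, path, and parameter names are hypothetical.
def add_url(a, b)
  uri = URI("http://example.com/add")
  uri.query = URI.encode_www_form(a: a, b: b)
  uri.to_s
end

puts add_url(1, 2)  # prints http://example.com/add?a=1&b=2
```

The whole request fits in a single URL, with no envelope to construct or parse.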

Then add to that the complexity of WSDL and the typical "enterprise" library implementations...