SOAP. Simple Object Access Protocol. SOAP. SOAP stands for Simple Object Access Protocol Made up of three major parts A messaging framework An encoding standard An RPC (remote procedure call) framework
SOAP Simple Object Access Protocol
SOAP • SOAP stands for Simple Object Access Protocol • Made up of three major parts • A messaging framework • An encoding standard • An RPC (remote procedure call) framework • It is possible to use just the messaging framework, or the messaging framework plus the encoding standard, without using the RPC mechanism (though the RPC mechanism is where much of the power lies). • SOAP is based entirely on XML
SOAP: Messaging framework • Just defines a generic document type using XML • This document type represents the abstraction of a message • Virtually any type of message you can think of can be packaged as a SOAP message. • However, doing so without RPC mechanisms takes only very small advantage of the features defined in the SOAP standard
General (Basic) Structure SOAP Message • Envelope • Defines the content of the message • Header (optional) • Contains destination information, versioning, extensions • Good place for security • Body • Contains payload SOAP Envelope SOAP Header SOAP Body Payload Document(s) SOAP Fault
General (Basic) Structure SOAP Message <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" > <soap:Header> ... ... </soap:Header> <soap:Body> <!-- User request code here --> <soap:Fault> ... ... </soap:Fault> </soap:Body> </soap:Envelope>
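The Envelope/Header/Body nesting above can be sketched in Python with the standard library. This is a minimal illustration, not a full SOAP toolkit; the helper name `build_envelope` and the `urn:example:prices` payload namespace are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_ENV)

def build_envelope(payload, header_entries=()):
    """Wrap a payload element in an Envelope; the optional Header comes first."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if header_entries:
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        header.extend(header_entries)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(payload)
    return envelope

# Hypothetical payload: a GetPrice request in a made-up namespace
payload = ET.Element("{urn:example:prices}GetPrice")
ET.SubElement(payload, "Item").text = "Lever2000"
envelope = build_envelope(payload)
print(ET.tostring(envelope, encoding="unicode"))
```

Because no header entries are passed, the Header element is omitted, which matches its optional status.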
SOAP encoding • The second component of SOAP is a standard for how to represent common datatypes as SOAP types. This is known as the encoding style. • SOAP does this in a language-agnostic way, much like CORBA (but not in binary form) • For example, SOAP stipulates that an array of three integers be represented as: <SOAP-ENC:Array SOAP-ENC:arrayType="xsd:int[3]"><SOAP-ENC:int>8</SOAP-ENC:int><SOAP-ENC:int>5</SOAP-ENC:int><SOAP-ENC:int>9</SOAP-ENC:int></SOAP-ENC:Array>
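As a rough sketch of the array representation above, the following Python produces the same element structure. The function name `encode_int_array` is an assumption; the `xsd` prefix inside the attribute value is plain text to the XML library, so the enclosing envelope would still have to declare `xmlns:xsd`.

```python
import xml.etree.ElementTree as ET

SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"  # SOAP 1.1 Section 5 namespace
ET.register_namespace("SOAP-ENC", SOAP_ENC)

def encode_int_array(values):
    """Encode a list of ints as a SOAP-ENC:Array with an arrayType attribute."""
    array = ET.Element(f"{{{SOAP_ENC}}}Array")
    # arrayType carries the element type and the array length
    array.set(f"{{{SOAP_ENC}}}arrayType", f"xsd:int[{len(values)}]")
    for value in values:
        ET.SubElement(array, f"{{{SOAP_ENC}}}int").text = str(value)
    return array

array = encode_int_array([8, 5, 9])
print(ET.tostring(array, encoding="unicode"))
```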
SOAP RPC • The third part of SOAP is an RPC mechanism that turns messages into method calls • We have a generic message structure + data. It requires just a little more work to turn the message into a function call. • Must be a way to represent parameters and return values, exceptions, etc.
SOAP RPC cartoon VB application Java application InvoiceVB-Structure InvoiceJava-Structure SOAP client SOAP Server SOAP Message The client application thinks it's making a procedure call to a remote module
SOAP protocol bindings • Question: how are SOAP messages transmitted? • Answer: using existing protocols (HTTP, SMTP, etc.) • This has some obvious advantages vs. defining its own protocol • Piggybacks on the security model and general robustness of the underlying protocol • This has some disadvantages also • What are these? • SOAP defines bindings to different protocols that specify how SOAP is used with that protocol to send messages. • HTTP is the most popular
Inside http • http is a simple, flexible protocol • Some examples GET http://people.cs.uchicago.edu/~asiegel/lottery/lotto.html POST /path/script.cgi HTTP/1.0 From: frog@jmarshall.com User-Agent: HTTPTool/1.0 Content-Type: application/x-www-form-urlencoded Content-Length: 32 home=Cosby&favorite+flavor=flies POST /path/script.cgi HTTP/1.0 From: frog@jmarshall.com User-Agent: HTTPTool/1.0 Content-Type: text/xml Content-Length: 32 <greeting>hello world</greeting>
Testing http • Good idea to play around with http by connecting to a server and issuing http commands • There are two typical ways to do this: • Using telnet, which allows arbitrary commands to be passed to a server • telnet people.cs.uchicago.edu 80 • Note that expect can be useful in automating this • Using a socket library in a programming language (see sock.py on website) • Question: how does the server obtain the uploaded data in each case?
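The socket approach can be sketched as follows. This is not the course's sock.py (which is not reproduced here) but a minimal stand-in under assumed names: `build_post` assembles the raw request bytes and `send_raw` writes them to a server. Only `send_raw` needs network access.

```python
import socket

def build_post(path, body, content_type):
    """Assemble a raw HTTP/1.0 POST; Content-Length is the body size in bytes."""
    data = body.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(data)}\r\n"
        "\r\n"  # blank line separates headers from body
    )
    return head.encode("ascii") + data

def send_raw(host, port, request):
    """Send raw bytes and read until the server closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)

request = build_post("/path/script.cgi", "<greeting>hello world</greeting>", "text/xml")
print(request.decode("ascii"))
```

The server receives the uploaded data as the bytes after the blank line, and uses Content-Length to know how many to read.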
Role of SOAP • Note that the http + XML is the important thing here • SOAP only helps standardize the meaning of the messages that are sent • In terms of datatypes for rpc • In terms of headers, faults, etc. • Note that it is still possible to bypass SOAP and define your own xml-based protocol, retaining many of the advantages of SOAP.
Sorting out the APIs • In Java the following directly related APIs are available: • SAAJ • SOAP with Attachments API for Java • Provides a relatively low-level interface that allows one to programmatically construct/decompose SOAP messages and send them to a web server • Intended more for tool writers. Good for learning. • JAX-RPC • Java API for XML-based RPC • Java’s RMI-style framework over SOAP • Compare RMI, CORBA, etc. • Hides SOAP internals from the developer • Apache XML-RPC for Java • An API implementing remote calls over XML-RPC • XML-RPC is an alternative protocol to SOAP
Envelope • MUST be the root element of the SOAP message • MUST be associated with SOAP envelope namespace • http://schemas.xmlsoap.org/soap/envelope/ • http://www.w3.org/2001/06/soap-envelope in SOAP 1.2 (Oct 15, 2002) • SOAP serialization namespace • Encoding Style attributes can contain a URI describing how the data should be serialized. • Two usual styles (more on this later) • "SOAP Section 5" encoding: http://www.w3.org/2001/06/soap-encoding • Literal encoding: (no namespace used – or set to empty string) • SOAP message MUST NOT contain • DTD • Processing Instructions.
Envelope versioning • Version determined by the namespace associated with the Envelope element • SOAP 1.1 Envelope version: http://schemas.xmlsoap.org/soap/envelope/ • If any other namespace is used, assume it's a version problem • Versioning problems must generate a SOAP Fault • Example SOAP fault: HTTP/1.0 500 Internal Server Error Content-Type: text/xml; charset="utf-8" Content-length: 311 <env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"> <env:Body> <env:Fault> <faultcode>env:VersionMismatch</faultcode> <faultstring>SOAP Envelope Version Mismatch</faultstring> </env:Fault> </env:Body> </env:Envelope>
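The namespace-based version check can be sketched like this. The function name `envelope_version_ok` is an assumption, and a real server would go on to return a VersionMismatch fault like the one shown above rather than just a boolean.

```python
import xml.etree.ElementTree as ET

SOAP_11_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def envelope_version_ok(xml_text):
    """True only if the root element is Envelope in the SOAP 1.1 namespace."""
    root = ET.fromstring(xml_text)
    return root.tag == f"{{{SOAP_11_ENV}}}Envelope"

good = '<e:Envelope xmlns:e="http://schemas.xmlsoap.org/soap/envelope/"><e:Body/></e:Envelope>'
# Hypothetical namespace standing in for any unrecognized envelope version
other = '<e:Envelope xmlns:e="urn:example:other-envelope"><e:Body/></e:Envelope>'

print(envelope_version_ok(good), envelope_version_ok(other))
```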
Envelope Versioning Fault in SOAP 1.2 Note that 1.2 Envelope Version Fault Response is versioned 1.1 (or whatever incoming request is) • SOAP 1.2 (Oct 15, 2002) has defined an Upgrade element in the header for the versioning fault: <?xml version="1.0" ?> <env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"> <env:Header> <upg:Upgrade xmlns:upg="http://www.w3.org/2002/06/soap-upgrade" > <envelope qname="ns1:Envelope" xmlns:ns1="http://www.w3.org/2002/06/soap-envelope" /> </upg:Upgrade> </env:Header> <env:Body> <env:Fault> <faultcode>env:VersionMismatch</faultcode> <faultstring>Version Mismatch</faultstring> </env:Fault> </env:Body> </env:Envelope>
Header • Optional • If present, must be the first child of the SOAP Envelope XML element, and contains the header entries • Uses same namespace as Envelope • Often contains meta-information regarding the method call. • Examples: • Security • No security mechanisms yet, but soon • Transaction IDs
Header • actor attribute defines the URI for which the header elements are intended (i.e. who should process a Header element) • mustUnderstand attribute specifies whether the recipient is required to process the element (default is “0” if not present) • encodingStyle attribute used to describe how data (such as binary integers) are marshaled into characters in the XML document <env:Header> <t:TransactionID xmlns:t="http://www.cs.uchicago.edu/dangulo/transact" env:mustUnderstand="1" env:actor="http://www.cs.uchicago.edu/dangulo/transact" > 42 </t:TransactionID> <m:localizations xmlns:m="http://www.cs.uchicago.edu/dangulo/localize/" env:actor="http://www.cs.uchicago.edu/dangulo/currency" > <m:language>en</m:language> <m:currency>USD</m:currency> </m:localizations> </env:Header>
actor Attribute • The SOAP message often gets passed through several intermediaries before being processed • For example, a SOAP proxy service might get the message before the target SOAP service • Header may contain information for both • intermediary service • target service • actor attribute specifies which service should process a specific Header element • actor attribute is replaced by role attribute in SOAP 1.2
Intermediary Services • SOAP requires that an intermediary strip off Header elements specified for that intermediary before passing the message to the next service in the chain. • If information in a Header element targeted for an intermediary is also needed by another service in the chain • The intermediary service may insert additional Header elements with an actor attribute that specifies the downstream service • In fact, any service may insert any Header elements that it deems necessary • If a Header element has no actor attribute • It is assumed to be destined for the final recipient • This is equivalent to adding an actor attribute with the URL of the final recipient
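The stripping rule for intermediaries can be sketched as below. The function name `strip_headers_for` is an assumption, and the sample message reuses the actor URIs from the Header example purely for illustration. Entries with no actor attribute are left in place for the final recipient.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR = f"{{{SOAP_ENV}}}actor"

def strip_headers_for(envelope, my_actor_uri):
    """Remove and return the Header entries addressed to this intermediary."""
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    mine = [entry for entry in header if entry.get(ACTOR) == my_actor_uri]
    for entry in mine:
        header.remove(entry)  # must not be forwarded down the chain
    return mine

MESSAGE = (
    '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"><env:Header>'
    '<t:TransactionID xmlns:t="http://www.cs.uchicago.edu/dangulo/transact" '
    'env:actor="http://www.cs.uchicago.edu/dangulo/transact">42</t:TransactionID>'
    '<m:localizations xmlns:m="http://www.cs.uchicago.edu/dangulo/localize/" '
    'env:actor="http://www.cs.uchicago.edu/dangulo/currency"/>'
    '</env:Header><env:Body/></env:Envelope>'
)
envelope = ET.fromstring(MESSAGE)
processed = strip_headers_for(envelope, "http://www.cs.uchicago.edu/dangulo/transact")
remaining = list(envelope.find(f"{{{SOAP_ENV}}}Header"))
print(len(processed), len(remaining))
```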
mustUnderstand Attribute • Also put on a Header element • If its value is "1" • recipient is required to understand and make proper use of the information supplied by that element • intended for situations where recipient can't do its job unless it knows what to do with the specific information supplied by this particular element • Examples of use • Client is upgraded to a new version which includes extra information • username • security
mustUnderstand Attribute • If the recipient does not understand this element • Must respond with a SOAP Fault HTTP/1.0 500 Internal Server Error Content-Type: text/xml; charset="utf-8" Content-length: 287 <env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"> <env:Body> <env:Fault> <faultcode>env:MustUnderstand</faultcode> <faultstring>SOAP Must Understand Error</faultstring> <faultactor>http://www.cs.uchicago.edu/dangulo/transact</faultactor> </env:Fault> </env:Body> </env:Envelope> • faultactor indicates where fault took place • We'll look at Faults in more detail later • Attribute values change to "true" / "false" in SOAP 1.2
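A recipient's mustUnderstand check might look like the following sketch. The function name `not_understood` and the set of understood tags are assumptions; for brevity it ignores the actor attribute and only handles the SOAP 1.1 value "1".

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
MUST_UNDERSTAND = f"{{{SOAP_ENV}}}mustUnderstand"

def not_understood(envelope, understood_tags):
    """Return tags of mandatory Header entries this recipient cannot process."""
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    return [
        entry.tag
        for entry in header
        if entry.get(MUST_UNDERSTAND) == "1" and entry.tag not in understood_tags
    ]

MESSAGE = (
    '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"><env:Header>'
    '<t:TransactionID xmlns:t="http://www.cs.uchicago.edu/dangulo/transact" '
    'env:mustUnderstand="1">42</t:TransactionID>'
    '</env:Header><env:Body/></env:Envelope>'
)
envelope = ET.fromstring(MESSAGE)
TRANSACTION_TAG = "{http://www.cs.uchicago.edu/dangulo/transact}TransactionID"

# A non-empty result means the recipient must answer with a MustUnderstand fault
print(not_understood(envelope, understood_tags=set()))
```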
Marshalling / Serialization VB application Java application Data here is binary Data here is binary Data here is text InvoiceVB-Structure InvoiceJava-Structure Must Marshall or Serialize Must UnMarshall or DeSerialize SOAP client SOAP Server SOAP Message • To be interoperable, we use XML • XML is character data (text), not binary • End points use binary • Must Marshall (Serialize) and UnMarshall (DeSerialize) on the ends
Encoding Style • Specifies how to Serialize/DeSerialize • Scoped • Applied to the element it was declared in as well as any sub-elements • Can go on any element • We'll cover this later
Body • Message to exchange. • Most often for RPC calls and error reporting. • Immediate child element of SOAP Envelope XML element • follows Header, if present • Uses same namespace as Envelope and Header • Contains serialized method arguments. • Remote method name • Used to name the method call’s XML element • Must immediately follow the SOAP body opening XML tag. • SOAP Fault goes in the Body (of a response) too • The only Body elements actually defined in the SOAP specification are the SOAP Fault elements • Other elements are user defined
Example • A simple SOAP XML document requesting the price of soap (leaving off the required namespace declarations) <env:Envelope> <env:Body> <m:GetPrice> <Item>Lever2000</Item></m:GetPrice> </env:Body> </env:Envelope> • Note that namespace qualifiers are not required on elements in the Body.
Client/Server… • In order for SOAP to work • Client must have code running that is responsible for building the SOAP request. • Server must also be responsible for • Understanding the SOAP request • Invoking the specified method • Building the response message • Returning it to the client. • These details are up to you. • There already exist SOAP implementations for languages such as C++, Perl, VB, and Java.
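The server-side steps (understand the request, invoke the method, build the response) can be sketched as a small dispatcher. Everything named here is hypothetical: the `METHODS` table, the `get_price` function and its price data, the `urn:example:prices` namespace, and the `GetPriceResponse`/`return` element names are illustrative choices, not something the SOAP specification fixes.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def get_price(item):
    prices = {"Lever2000": "1.25"}  # hypothetical data for the example
    return prices[item]

METHODS = {"GetPrice": get_price}  # remote method name -> local function

def dispatch(request_xml):
    """Parse a request, call the named method, and build a response envelope."""
    body = ET.fromstring(request_xml).find(f"{{{SOAP_ENV}}}Body")
    call = body[0]                                # first child names the method
    method_name = call.tag.rsplit("}", 1)[-1]     # drop the namespace part
    args = [child.text for child in call]         # children are the arguments
    result = METHODS[method_name](*args)
    response = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    response_body = ET.SubElement(response, f"{{{SOAP_ENV}}}Body")
    answer = ET.SubElement(response_body, method_name + "Response")
    ET.SubElement(answer, "return").text = str(result)
    return ET.tostring(response, encoding="unicode")

REQUEST = (
    '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">'
    '<env:Body><m:GetPrice xmlns:m="urn:example:prices">'
    '<Item>Lever2000</Item></m:GetPrice></env:Body></env:Envelope>'
)
print(dispatch(REQUEST))
```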
Binding • SOAP is transport independent • SOAP usually transported over HTTP • SOAP can be transported over any protocol • e.g. SMTP (e-mail) • GSI (Grid Security Infrastructure, from Globus) • HTTPS • pure sockets • HTTP is the default binding
SOAPAction HTTP header • When using SOAP over HTTP, must include SOAPAction header • SOAPAction HTTP request header field indicates that it is a SOAP HTTP request (contains a SOAP message) • The value • Indicates the intent of the request in a manner readily accessible to the HTTP server. • Is a URI • Is up to the application – not specified by SOAP specs • Doesn't have to be resolvable • An HTTP client must use SOAPAction header field when issuing a SOAP HTTP Request. • An HTTP server must not process an HTTP request as a SOAP HTTP request if it does not contain a SOAPAction header field. • It may be used by firewalls to filter request messages • It may be used by servers to facilitate dispatching of SOAP messages to internal message handlers • It should not be used as an insecure form of access authorization.
SOAPAction HTTP header • Example POST /xt/services/ColorRequest HTTP/1.0 Content-Length: 442 Host: localhost Content-type: text/xml; charset=utf-8 SOAPAction: "/getColor" <?xml version="1.0" encoding="UTF-8"?> <env:Envelope env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> ...
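A client-side sketch of the HTTP binding, using Python's standard `http.client`. The helper names `soap_headers` and `post_soap` are assumptions; `post_soap` needs a reachable server, so only the header construction is exercised here.

```python
import http.client

def soap_headers(body_bytes, soap_action):
    """HTTP headers for a SOAP 1.1 request; the SOAPAction value is a quoted URI."""
    return {
        "Content-Type": "text/xml; charset=utf-8",
        "Content-Length": str(len(body_bytes)),
        "SOAPAction": f'"{soap_action}"',
    }

def post_soap(host, path, envelope_xml, soap_action, port=80):
    """POST the envelope and return (status, response body bytes)."""
    body = envelope_xml.encode("utf-8")
    connection = http.client.HTTPConnection(host, port, timeout=10)
    try:
        connection.request("POST", path, body=body,
                           headers=soap_headers(body, soap_action))
        response = connection.getresponse()
        return response.status, response.read()
    finally:
        connection.close()

headers = soap_headers(b"<env:Envelope/>", "/getColor")
print(headers)
```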
SOAP Messages with Attachments • SOAP messages often have attachments, such as pictures • The attachments don't have to be XML encoded, but may be binary • The SOAP message becomes the root of a Multipart/Related MIME structure • The SOAP message refers to the attachment using a URI with the cid: protocol • cid = "content ID"
SOAP Messages with Attachments MIME-version: 1.0 Content-Type: Multipart/Related; ... --MIME_boundary Content-Type: text/xml; ... <?xml version="1.0" ?> <env:Envelope ... <someTag href="cid:attached.gif@company.com"/> ... </env:Envelope> --MIME_boundary Content-Type: image/gif Content-Transfer-Encoding: binary Content-ID: <attached.gif@company.com> ... binary gif image ...
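The Multipart/Related layout can be sketched with Python's standard `email` package. The function name and the placeholder image bytes are assumptions. Note one difference from the slide: `MIMEImage` applies base64 transfer encoding by default rather than sending raw binary.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_soap_with_attachment(envelope_xml, gif_bytes, content_id):
    """Root part is the SOAP envelope; the image part is referenced via cid:."""
    message = MIMEMultipart("related", type="text/xml")
    message.attach(MIMEText(envelope_xml, "xml", "utf-8"))   # SOAP root part
    image = MIMEImage(gif_bytes, "gif")                       # base64 by default
    image.add_header("Content-ID", f"<{content_id}>")
    message.attach(image)
    return message

CONTENT_ID = "attached.gif@company.com"
ENVELOPE = (
    '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"><env:Body>'
    f'<someTag href="cid:{CONTENT_ID}"/></env:Body></env:Envelope>'
)
# b"GIF89a" is placeholder data, not a complete image
message = build_soap_with_attachment(ENVELOPE, b"GIF89a", CONTENT_ID)
print(message.as_string())
```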
Encoding • One type of encoding specified in "section 5" of the SOAP spec • No default encoding (not even "section 5" encoding) • Encoding rules exist to define mapping between abstract data types and XML syntax (binary to character mapping) • Encoding style is specified with the encodingStyle attribute
Encoding • The encodingStyle attribute can be placed on any element – allowing mixed encoding styles • Two values most often used (anything possible): • "SOAP Section 5" encoding: http://www.w3.org/2001/06/soap-encoding • Literal encoding: (no namespace used – or set to empty string) • Also can do base64 encoding • Standards and tools for encoding have not solidified yet • Best bet is to use either Section 5 or literal for the entire message • Can turn off the currently scoped style using an empty string as the URI ("") • parent scoped style becomes default again
Response • No Special HTTP Response headers (doesn't use SOAPAction: header) • Only special SOAP element is the Fault element and its children • Otherwise, looks like a normal SOAP message <?xml version="1.0" encoding="UTF-8"?> <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema"> <soap:Body><m:FavoriteColorResponseMsg xmlns:m="http://www.cs.uchicago.edu/dangulo/soap-methods"> <answer xsi:type="xsd:string">Red...No, Blue...Aarrgh!</answer> </m:FavoriteColorResponseMsg> </soap:Body> </soap:Envelope>
Fault • The <Fault> element is in the body of the SOAP message • 0 or 1 <Fault> elements may be in the message • The following subelements may be in the <Fault> element
Fault Code (faultcode) • One of two required elements in the Fault element • Other required element is faultstring • Must be associated with SOAP envelope namespace • A Server fault code could mean something like a back-end database couldn't be reached • Might try resending without modification • Fault codes are extensible using "dot" notation • Server.BridgeKeeperAbsent • Server.BridgeKeeperAbsent.ThrownInGorge
HTTP Headers with Faults • Response code can only be 2xx or 500 • If message is received and understood, the response should use 2xx
HTTP Headers with Faults • If message cannot be processed for any reason • server does not understand the message • message is improperly formatted • message is missing information • message cannot be processed for any other reason • Response should be "500 Internal Server Error" • 500 response should be followed by a SOAP envelope which includes its own fault code • Reasoning: the error is internal to the server as far as HTTP is concerned
Example Fault with HTTP and SOAP HTTP/1.0 500 Internal Server Error Content-Type: text/xml; charset="utf-8" Content-length: 287 <env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"> <env:Body> <env:Fault> <faultcode>env:MustUnderstand</faultcode> <faultstring>SOAP Must Understand Error</faultstring> <faultactor>http://www.cs.uchicago.edu/dangulo/transact</faultactor> </env:Fault> </env:Body> </env:Envelope>
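On the client side, a fault response like this one is typically turned into an error. A minimal sketch, assuming a hypothetical `SoapFault` exception class and `raise_for_fault` helper; the fault's child elements are unqualified in SOAP 1.1, so they are looked up without a namespace.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

class SoapFault(Exception):
    """Carries the faultcode, faultstring and optional faultactor of a Fault."""
    def __init__(self, code, string, actor=None):
        super().__init__(f"{code}: {string}")
        self.code, self.string, self.actor = code, string, actor

def raise_for_fault(xml_text):
    """Return the parsed envelope, or raise SoapFault if the Body holds a Fault."""
    root = ET.fromstring(xml_text)
    fault = root.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    if fault is None:
        return root
    raise SoapFault(fault.findtext("faultcode"),
                    fault.findtext("faultstring"),
                    fault.findtext("faultactor"))

FAULT_RESPONSE = (
    '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"><env:Body>'
    '<env:Fault><faultcode>env:MustUnderstand</faultcode>'
    '<faultstring>SOAP Must Understand Error</faultstring>'
    '<faultactor>http://www.cs.uchicago.edu/dangulo/transact</faultactor>'
    '</env:Fault></env:Body></env:Envelope>'
)
try:
    raise_for_fault(FAULT_RESPONSE)
except SoapFault as error:
    print(error.code, "|", error.string, "|", error.actor)
```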
Bridge of Death Example • Request <?xml version="1.0" encoding="UTF-8"?> <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema"> <soap:Body> <m:FavoriteColorRequestMsg xmlns:m="http://www.cs.uchicago.edu/dangulo/soap-methods/"> <question xsi:type="xsd:string"> What is your favorite color? </question> </m:FavoriteColorRequestMsg> </soap:Body> </soap:Envelope>
Bridge of Death Response Example • Response <?xml version="1.0" encoding="UTF-8"?> <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema"> <soap:Body><m:FavoriteColorResponseMsg xmlns:m="http://www.cs.uchicago.edu/dangulo/soap-methods"> <answer xsi:type="xsd:string">Red...No, Blue...Aarrgh!</answer> </m:FavoriteColorResponseMsg> </soap:Body> </soap:Envelope>
Data Encoding • When sending data over a network • Data must comply with the underlying transmission protocol • Data must be formatted in such a way that both the sending and receiving entities understand its meaning • Even if endpoints are different platforms or languages • Model for SOAP encoding is based on XML data encoding • Encoding style given in Section 5 of the SOAP specification used to be the most common encoding style • Commonly called "SOAP-Section-5 encoding" • namespace: http://schemas.xmlsoap.org/soap/encoding/ • Commonly aliased as SOAP-ENC:
Data Encoding and Schemas • In SOAP, Schemas are used as references to definitions of data elements • Aren't used to validate SOAP message data in standard SOAP processing • However, there's nothing stopping you from doing that • References to Schemas are often used as namespaces in order to disambiguate a serialized data element
Data Encoding and Schemas • SOAP Section 5 uses all of the built-in data types defined in the "XML Schema Part 2: Datatypes" specification (at w3.org) • These data types need to be disambiguated • namespace: http://www.w3.org/2001/XMLSchema • Commonly aliased as xsd: • used with the data type names • e.g. xsd:string • A datum is given a data type using the type attribute • This attribute must also be disambiguated • namespace: http://www.w3.org/2001/XMLSchema-instance • Commonly aliased as xsi: • e.g. <dialog xsi:type="xsd:string">What is your favorite color?</dialog>
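Tagging a value with xsi:type can be sketched as follows. The mapping from Python types to XSD type names is an assumption made for this example (for instance, Python's `int` is unbounded while `xsd:int` is 32-bit), and the enclosing document would still need to declare the `xsd` prefix.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)

# Assumed mapping from Python types to XML Schema built-in type names
XSD_TYPES = {bool: "xsd:boolean", int: "xsd:int", float: "xsd:double", str: "xsd:string"}

def encode_simple(name, value):
    """Serialize one simple value as an element carrying an xsi:type attribute."""
    element = ET.Element(name)
    element.set(f"{{{XSI}}}type", XSD_TYPES[type(value)])
    if isinstance(value, bool):
        element.text = "true" if value else "false"  # XSD boolean lexical form
    else:
        element.text = str(value)
    return element

dialog = encode_simple("dialog", "What is your favorite color?")
print(ET.tostring(dialog, encoding="unicode"))
```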
Other Common Namespaces • Envelope • namespace: http://schemas.xmlsoap.org/soap/envelope/ • Common aliases: env or SOAP-ENV or SOAP or soap • Example (we've seen this before) <?xml version="1.0" encoding="UTF-8"?> <SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> • Other schemas are commonly used (1999, 2000). • I may have some in my slides! • Aliases used are commonly the same • Doesn't matter to implementations because the SOAP message contains a reference to the correct schema
Data Types • The data type for a given value is never undefined in SOAP • SOAP distinguishes between simple types and complex types • A simple type does not contain any named parts, it just contains a single piece of data • Example: string • Example: int • A complex type contains multiple pieces of data that have some relation to each other • Similar to structs or classes or arrays • Individual pieces of data may be accessed by using • an ordinal position in a sequence of values (like arrays) • values that are keys to an associative array (like hash tables) • the names of the constituent parts (like C structs) • There is always a way to distinguish a specific data value within a complex value • referred to as the "accessor" • A named subcomponent of a complex type may be a complex type itself
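The simple/complex distinction and the idea of accessors can be sketched with a recursive encoder. This is an illustration rather than the Section 5 rules: the function name `encode_value` and the generic `item` element name for positional members are assumptions, and type attributes are left out for brevity.

```python
import xml.etree.ElementTree as ET

def encode_value(name, value):
    """Dicts become named accessors, sequences become positional ones."""
    element = ET.Element(name)
    if isinstance(value, dict):                 # struct-like: accessed by name
        for key, item in value.items():
            element.append(encode_value(key, item))
    elif isinstance(value, (list, tuple)):      # array-like: accessed by position
        for item in value:
            element.append(encode_value("item", item))
    else:                                       # simple type: a single datum
        element.text = str(value)
    return element

member = encode_value("member", {
    "name": "Sir Robin",
    "position": "Knight",
    "quests": ["Holy Grail"],                   # complex type nested in a complex type
})
print(ET.tostring(member, encoding="unicode"))
```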
References Lots of unnecessary duplication • We briefly saw how to declare these in DTDs • Let's see how to use these <roundTableMembers> <member> <name>King Arthur</name> <position>King</position> </member> <member> <name>Sir Robin</name> <position>Knight</position> <king>King Arthur</king> </member> <member> <name>Sir Galahad</name> <position>Knight</position> <king>King Arthur</king> </member> ...