SOAP, WSDL, HTTP, XSD? What the?

Man alive, this web services stuff is confusing!

This article was originally posted on our RPG-XML Suite product website.

Every time I turn around, I hear about a new acronym that is supposed to somehow fit into the so-called next generation of communication. The number of acronyms and concepts related to web services can seem diabolical to the novice.

But you don’t have to quit your RPG programming job to dive into web services.

Web services have been around long enough now to establish which technologies will and won’t catch on. In the past few years, I have dealt with boatloads of web services, and I have found the methods involved vary. In other words, no set web services methodology exists. Countless specifications claim to prepare us for the next must-have revision, but they are useful only if my business partners choose to adopt them.

When I started my web service XML journey, I was a sponge. I learned every new web service-related technology as it came out, and I tried to find a need that fit the solution in front of me. Sound backwards? It is. So in this article, I take you back to the basics to show how these technologies ease the pain of interplatform communication and how the RPG/System i environment has similar infrastructure.

In the beginning, HTTP gained widespread acceptance as a standard for the client to communicate with the server, and programmers realized that they could piggyback other communication methodologies to facilitate B2B communication. But what should businesses pass to each other? Should they pass HTML form fields on the URL? That would be too messy and not relational enough for complex data sets. Should they use existing text-based format standards such as EDI X12? We needed an easy-to-use, platform- and language-independent transmission technology, and what’s more independent than a textual document that describes itself? Enter XML.

XML Labels Data

XML is simply a way to label and hold transmitted data so that the receiving party can adequately parse it for the content within. Figure 1 shows an example of labeling data with XML tags, and Figure 2 shows a simple RPG program that uses qualified data structures. I think of an XML document as equivalent to an RPG qualified data structure: both hold and name each piece of data. The main difference between the two is how they are stored in memory. In memory, XML stays the same as in Figure 1 because the naming of the data is actually part of the data. However, the RPG data structure content is relatively positioned and fixed-width (Figure 3), so the data does not describe itself outside of the RPG language.
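
As a rough illustration of reading labeled data, here is a minimal Python sketch that uses the standard library's xml.etree.ElementTree. The order document is a hypothetical stand-in, not the article's Figure 1: the element and attribute names order, ordId, due, dropShip, shipTo, and item come from this article, while the values and the name, sku, and qty attributes are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; all values are made up for illustration.
ORDER_XML = """
<order ordId="A1001" due="2008-03-15" dropShip="true">
  <shipTo name="Acme Hardware"/>
  <item sku="HAMMER-16" qty="2"/>
  <item sku="NAIL-BOX" qty="10"/>
</order>
""".strip()

def read_order(xml_text):
    """Pull values out by name, much like reading the subfields
    of a qualified data structure."""
    order = ET.fromstring(xml_text)
    return {
        "ordId": order.get("ordId"),
        "due": order.get("due"),
        "shipToName": order.find("shipTo").get("name"),
        "items": [(i.get("sku"), int(i.get("qty"))) for i in order.findall("item")],
    }

order = read_order(ORDER_XML)
print(order["ordId"], order["due"], order["items"])
```

Because the names travel with the data, the receiving code looks values up by label rather than by byte position.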

Now for some issues: What happens when I send an alpha character in the order number and my business partner isn’t expecting it? How do we know which date format to expect in the due attribute field? Can I repeat the <item> element within the <order> tag, or do I need to send one item per order? Business partners need to ask each other questions and discover details about how to format and organize passed data. XML Schema Definitions (XSDs) are the answer.

XML Schema Definitions

XSDs let you apply rules to an XML document, dictating what it can look like, the values fields can contain, how one tag relates to another, and more. Initially, document type definitions (DTDs) were created to facilitate the definition of document rules, but DTDs are going the way of the buffalo because they just don’t address enough needs. That said, many implementations of DTDs still exist, and you should learn their syntax. However, for all future development, I recommend you use XSDs to ensure your web services work with the latest web service developments.

Figure 4 shows an XSD declaring the rules for the order XML document in Figure 1. Wow, that is a lot of text, considering that this is an elementary structure with very few elements and attributes. Be thankful that WebSphere Development Studio Client (WDSc) comes with excellent tooling that provides a visual environment to build and maintain XSDs (Figure 5 and Figure 6). However, even if you use the visual tooling, you can benefit from knowing how XSDs work from the ground up. Roll up your sleeves, here we go.

XSDs define XML elements and attributes much like RPG D-specs define variables. An XML element can be simpleType or complexType. The main difference is that complexType allows you to define child XML elements and attributes, whereas simpleType does not. In Figure 4, elements order, shipTo, and item are complexType because they have child elements within them and/or attributes further defining them. If you define child elements, a grouping tag (most commonly <sequence>) houses the child element definitions. Once you get down to a simpleType element, notice that the primitive data type is declared (e.g., type="string").

The minOccurs and maxOccurs attributes state whether an element must exist and how many times it can be repeated. Both default to 1 when omitted. If there is no limit to how many times an element can be repeated, specify maxOccurs="unbounded". You can also use minOccurs as a roundabout way to define an element as required (a value of 1 or more) or optional (a value of 0), similar to the way you can define an attribute of an element as required. If an attribute of an element is required, specify use="required" in the attribute definition.
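
To make the occurrence rules concrete, here is a small Python sketch that reads them out of a schema with the standard library's xml.etree.ElementTree. The schema is a hypothetical, simplified stand-in and not the article's Figure 4 (for brevity, shipTo and item are plain strings here rather than complexType elements). The sketch only inspects the rules; it does not validate a document, which would take a schema-aware library.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # XML Schema namespace

# Hypothetical, simplified schema for an order.
ORDER_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="shipTo" type="xs:string"/>
        <xs:element name="item" type="xs:string" minOccurs="1" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="ordId" type="xs:string" use="required"/>
      <xs:attribute name="due" type="xs:date"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
""".strip()

def occurrence_rules(xsd_text):
    """Map each child element in the <sequence> to (minOccurs, maxOccurs).
    Both default to "1" when the schema omits them."""
    schema = ET.fromstring(xsd_text)
    sequence = schema.find(".//" + XS + "sequence")
    return {
        el.get("name"): (el.get("minOccurs", "1"), el.get("maxOccurs", "1"))
        for el in sequence
    }

def required_attributes(xsd_text):
    """List attribute names declared with use="required"."""
    schema = ET.fromstring(xsd_text)
    return [a.get("name") for a in schema.iter(XS + "attribute")
            if a.get("use") == "required"]

rules = occurrence_rules(ORDER_XSD)
required = required_attributes(ORDER_XSD)
print(rules)
print(required)
```

Here item may repeat without limit, shipTo must appear exactly once, and ordId is the only attribute a sender must supply.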

When developing XSDs, you may wonder when to use attributes and when to use elements. There isn’t really a right or wrong way in most cases. Many XML implementations go the “all elements no attributes” route; however, I advocate defaulting all fields to attributes unless an element is needed. I came to that conclusion by thinking of a DB2 record in a physical file. Each record has attributes (e.g., fields) within that make up its data. An order header record has fields (e.g., ordId, due, dropShip), and when we need to hold data for order line items, we branch off to a new order detail physical file record.

One benefit of using attributes instead of elements is that you need only a little more than half the bytes to define the attribute, because it doesn’t need both a beginning and ending tag. For example, rather than use <customerNumber>11232</customerNumber>, you can use customerNumber="11232". Whichever route you choose, consistency is more important than anything else. Develop a standard in your organization and follow it.
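
The size difference is easy to check. This short Python snippet simply counts the UTF-8 bytes of the two forms from the example above; it ignores the separating space an attribute needs inside its parent tag.

```python
# The same field written as an element and as an attribute.
element_form = "<customerNumber>11232</customerNumber>"
attribute_form = 'customerNumber="11232"'

element_bytes = len(element_form.encode("utf-8"))
attribute_bytes = len(attribute_form.encode("utf-8"))

print(element_bytes, attribute_bytes)  # prints: 38 22
```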

Why Namespaces?

As XML progresses as the de facto standard for data transmission, element-naming conflicts will become inevitable when you bring two business-defined XML documents into a single file. Additionally, when your own standards change, you may need to support old versions along with the new. Namespaces were added to XML to address these concerns. Currently, I see very little necessary XML namespace usage outside of the XML specifications themselves (e.g., in WSDL documents to separate XSD, SOAP, and WSDL element specification definitions). Namespaces will become more useful as XML acceptance grows and companies start rewriting first rounds of web service implementations.

Namespaces work much like the prefix keyword in RPG programs for redefining fields for a file on the F-spec. Figure 7 shows two physical files that have some same-named fields (i.e., ID, CRTDT). If you use both of these files in the same program, you get compile-time errors stating that field “ID” is defined differently (i.e., 15 packed vs. 15 alpha). You can address this problem with the prefix keyword (Figure 8).

To qualify XML elements in a document, specify the xmlns attribute (Figure 9). The syntax for specifying namespaces is

xmlns:namespace-prefix="namespaceURI"

Omitting namespace-prefix declares that this is the default namespace for the document, which means you don’t have to fully qualify elements belonging to that namespace. You use the namespaceURI portion as a convention to uniquely identify an organization’s elements and attributes, not to look up information about the namespace or execute the URL. However, namespaceURI sometimes points to an actual web page that contains the XSD for the declared namespace.

The <phone> element in Figure 9 is prefixed or qualified with a namespace, because the makeup of the vendor’s <phone> element is different from your company’s <phone> element. When the parser gets to <vendor:phone …>, it knows to look for attributes areaCode, phoneNumber, and ext instead of trying to find the phone number between the begin and end phone tags. If you have only one namespace in a document, you can omit all xmlns declarations. If I have two or more namespaces, I usually pick as the default the one with the most elements to be used. I save more bits and bytes by not having to qualify the majority of the elements.
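
Here is a minimal Python sketch of how a parser tells the two <phone> elements apart. The document and both namespace URIs are hypothetical (this is not the article's Figure 9); the attribute names areaCode, phoneNumber, and ext are the ones mentioned above. The default namespace stands for your company, and the vendor prefix qualifies the vendor's elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a default namespace and one prefixed namespace.
DOC = """
<contacts xmlns="http://example.com/mycompany"
          xmlns:vendor="http://example.com/vendor">
  <phone>507-555-0100</phone>
  <vendor:phone areaCode="507" phoneNumber="5550199" ext="12"/>
</contacts>
""".strip()

# Prefixes used for lookups in this script; only the URIs have to match the document.
NS = {"my": "http://example.com/mycompany", "vendor": "http://example.com/vendor"}

root = ET.fromstring(DOC)
our_phone = root.find("my:phone", NS)
vendor_phone = root.find("vendor:phone", NS)

# ElementTree stores each tag as {namespaceURI}localname.
print(our_phone.tag)
print(vendor_phone.tag)
print(our_phone.text, vendor_phone.get("areaCode"), vendor_phone.get("ext"))
```

Note that the parser matches on the URI, not the prefix, so the sender and receiver are free to pick different prefixes for the same namespace.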

Simple Object Access Protocol

So far, my examples have shown how XML and XSDs define business data sent over the wire, but I haven’t said much about how actual programming fits into the picture. Simple Object Access Protocol (SOAP) was created to describe and/or implement the rules for program-to-program communication over HTTP using XML. SOAP is often perceived as complicated because of its obscurity — one possible reason for the fact that most implementations use a very small portion of the specification. Additionally, SOAP tools weren’t available immediately, and the details of SOAP were never meant for human eyes to decipher. Rather, SOAP was meant to be under-the-covers technology. Though SOAP isn’t as useful as the other specifications, its tooling is getting better.

For most SOAP implementations, you simply add a SOAP envelope and body to the beginning and end of an XML document. Remember, SOAP is nothing more than text. Figure 10 shows a non-SOAP example of a simple price-calculating XML request, and Figure 11 shows the SOAP equivalent. This example uses a style named Remote Procedure Call (RPC) Encoded — a common SOAP implementation. This approach to SOAP is undesirable because it causes the data type to be sent along with the data (hence, “encoded”), adding more to the already bloated document. Document Literal (Figure 12) is the preferable SOAP style.
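
The idea of wrapping an XML document in an envelope and body can be sketched in a few lines of Python. This assumes SOAP 1.1 (whose envelope namespace is shown in the code) and uses a hypothetical getPrice request rather than the article's Figure 10; the function name wrap_in_soap is likewise invented for the example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def wrap_in_soap(payload_xml):
    """Wrap a business XML document in a SOAP 1.1 Envelope and Body."""
    ET.register_namespace("soap", SOAP_NS)  # serialize with the "soap" prefix
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    body.append(ET.fromstring(payload_xml))
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical price request.
plain_request = '<getPrice item="HAMMER-16" qty="2"/>'
soap_request = wrap_in_soap(plain_request)
print(soap_request)
```

The payload itself is untouched; it simply becomes the single child of the Body element.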

When you use Document Literal, you specify the data types for the XML in an XSD within Web Service Description Language (WSDL), which I explain in a moment. Up to this point, we have defined XML as our way to label data for transport via HTTP, and we have XSDs to define how that XML can be formed. We added a SOAP envelope and body to facilitate purist web services. We can now compose the RPG program necessary to create this web service on our System i. Once the programming is complete, we need a way to relay the details of how to make use of our web service. For this task, we need WSDL.

Web Service Description Language

You use WSDL at development time to describe a web service program with procedure names, input/output parameters, the URL of the web service, and the enveloping mechanisms and transport to be used (i.e., SOAP over HTTP). If you are the one creating the web service, you create a WSDL file that other programmers use when they develop code on their end to consume your web service. WSDL files are not used at runtime. They are simply a method for most programming languages to “stub out” their code — in other words, produce a code template that can be filled in with business logic.

In concept, WSDL files are beautiful things: They free you from having to verbally describe the web service to a trading partner. Your trading partner can simply point a browser to a WSDL on your server and know every technical aspect about how to use your web service. The WSDL states the exact URL of the web service it defines (e.g., http://myiSeries.com/cgi-bin/ws1) and describes what data is required for input and output, all in one document. I liken a WSDL to an RPG service program. Figure 13 shows how you can transpose an RPG prototype into WSDL’s equivalent portType tag. Note that both the prototype and portType refer to the actual structures used for input and output, through the likeds keyword and the message attribute, respectively. We have to look elsewhere in the WSDL document to find how the data types in the message attribute are defined, just as we need to look elsewhere in our program to see how the structure specified in likeds is defined. Figure 14 shows how the message tags of a WSDL relate to RPG data structures. They both define the name and data type of information passed into and out of the interfaces.

Now that we’ve defined the names of the subprocedures (i.e., portTypes), we need to state where they are located. To call an object in an RPG service program, you need to know in which library it resides; in WSDL, you use the service element to define the URL of the web service (Figure 15).

Notice the port element and associated binding attribute. The binding portion of the WSDL document connects the <service> to the <portType> and defines the enveloping mechanism and transport to be used (Figure 16). The information in the binding tag could have been placed in either the service tag or the portType, but the creators of the specification separated it out for modularity’s sake.

The type attribute in the binding tag contains the name of the previously defined portType (i.e., ORDSV). There are two operation tags named the same as the operations defined in the portType tag — this is where the defining of the envelope (i.e., SOAP) takes place.
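
To show how the four portions hang together, here is a Python sketch that reads a trimmed, hypothetical WSDL 1.1 skeleton and follows the chain from service to binding to portType. The portType name ORDSV and the URL come from this article; the operation, message, binding, and service names are invented, and the message parts and the SOAP body details inside the binding are left out for brevity, so this is not a complete WSDL.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",       # WSDL 1.1
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",  # WSDL SOAP binding
}

# Trimmed, hypothetical skeleton: message, portType, binding, service.
WSDL = """
<definitions name="ORDSV" targetNamespace="http://example.com/ordsv"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.com/ordsv">
  <message name="getOrderRequest"/>
  <message name="getOrderResponse"/>
  <portType name="ORDSV">
    <operation name="getOrder">
      <input message="tns:getOrderRequest"/>
      <output message="tns:getOrderResponse"/>
    </operation>
    <operation name="addOrder"/>
  </portType>
  <binding name="ORDSVBinding" type="tns:ORDSV">
    <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="getOrder"/>
    <operation name="addOrder"/>
  </binding>
  <service name="ORDSVService">
    <port name="ORDSVPort" binding="tns:ORDSVBinding">
      <soap:address location="http://myiSeries.com/cgi-bin/ws1"/>
    </port>
  </service>
</definitions>
""".strip()

def summarize(wsdl_text):
    """Collect the pieces a consumer needs: operations, binding, and URL."""
    root = ET.fromstring(wsdl_text)
    port_type = root.find("wsdl:portType", NS)
    binding = root.find("wsdl:binding", NS)
    port = root.find("wsdl:service/wsdl:port", NS)
    return {
        "portType": port_type.get("name"),
        "operations": [op.get("name") for op in port_type.findall("wsdl:operation", NS)],
        "bindingType": binding.get("type"),   # points back at the portType
        "portBinding": port.get("binding"),   # points at the binding
        "url": port.find("soap:address", NS).get("location"),
    }

summary = summarize(WSDL)
print(summary)
```

Code generators that "stub out" a client do essentially this walk, then emit one callable per operation.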

A raw WSDL is even more painful to look at than a raw XSD is. Luckily, WDSc offers visual drag-and-drop WSDL tooling. Figure 17 shows the four portions of the WSDL we discussed. From this screen in WDSc, you can completely compose a WSDL without having to look at the raw text.

Once you have a complete WSDL document in hand, you can fully test it using WDSc. Right-click the WSDL document and select Web Services|Test with Web Services Explorer to open a view that lets you call a procedure and specify necessary input parameters. Click Submit to send the request to the web service specified in the <service> portion of the WSDL. When the process is completed, you can access the request and response XML documents — helpful when debugging or trying to determine what exactly is being sent across the wire.

I always like to know what is going on at the lowest level, so Figure 18 shows a raw HTTP web service request, and Figure 19 shows the response. In this case, SOAP is not involved.
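
For readers who want to see the shape of such a request, this Python sketch assembles the raw bytes of an HTTP POST that carries a plain XML body with no SOAP envelope. It only builds the text and sends nothing. The host and path echo the example URL used in this article, and the getPrice payload is hypothetical.

```python
def build_raw_request(host, path, xml_body):
    """Assemble the bytes of an HTTP/1.1 POST carrying an XML document."""
    body = xml_body.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: text/xml; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"  # byte count of the body only
        "Connection: close\r\n"
        "\r\n"                              # blank line ends the headers
    )
    return head.encode("ascii") + body

raw = build_raw_request("myiSeries.com", "/cgi-bin/ws1",
                        '<getPrice item="HAMMER-16" qty="2"/>')
print(raw.decode("utf-8"))
```

Everything before the blank line is HTTP; everything after it is the XML document, exactly as the receiving program will read it.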

Universal Description, Discovery, and Integration

Say you’ve developed your first web service and everything works great, so your company makes plans to build more. Before long, the web services have multiplied like rabbits, and people continually ask you for URLs of WSDLs. To maintain your sanity, you need a central repository for WSDL documents. This is where a Universal Description, Discovery, and Integration (UDDI) server comes into play. UDDI servers were initially meant to serve the public sector. Companies could post descriptions and prices for their web services on central servers (both IBM and Microsoft hosted public servers). The boom of web services in the early part of the century died off and UDDI died with it, at least in the public sector. Today, UDDI is used mostly on intranets, which is unfortunate considering the number of big-name companies, like UPS and Google, that now offer public web services.

Where Do I Start?

“Web services” is a loosely defined term, and you should use what works best for you and your trading partners. Web services will take time to grow on you, so here is the approach I recommend. At first, simply send XML via HTTP without SOAP. Don’t define it first via XSD, don’t use namespaces, don’t create a WSDL, and don’t publish it to a UDDI repository. Be mindful of all these technologies so you can plan for the future, but test them incrementally.
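
That first step, XML over HTTP with nothing else, needs very little client code. The Python sketch below prepares such a request with the standard library's urllib.request. The URL echoes the example used in this article, the payload and the function name build_xml_post are hypothetical, and the lines that would actually send the request are commented out because they need a reachable server.

```python
import urllib.request

def build_xml_post(url, xml_text):
    """Prepare (but do not send) a plain XML-over-HTTP POST."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )

request = build_xml_post("http://myiSeries.com/cgi-bin/ws1",
                         '<getPrice item="HAMMER-16" qty="2"/>')
print(request.get_method(), request.full_url)

# To send it (assumes a server is listening at that URL):
# with urllib.request.urlopen(request, timeout=10) as response:
#     reply_xml = response.read().decode("utf-8")
```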

After you are comfortable with the first approach, determine what you want to pass for data and create an XSD with WDSc. Create an instance of the XML by right-clicking the XSD, then choosing the Generate|XML File option. This will give you the full structure of XML to be passed and let you put it into your RPG program. When you’ve mastered this approach, use WDSc’s graphical editors to create a WSDL from the ground up, and test it by right-clicking the WSDL document and selecting Web Services|Test with Web Services Explorer.