Thursday, April 10, 2008

Reprint: Consuming Web Service complex types in ColdFusion

In previous posts I have referenced an excellent article written by Doug James and Larry Afrin from the Medical University of South Carolina. The link for that article has since gone dead, but someone helpfully posted a link to the article via a web archive. I am going to reproduce it here so I have an easy reference for it and so it's close to my earlier post on array types.


Notes on Interfacing ColdFusion MX to External Web Services Requiring Complex-within-Complex XML Documents as Input


Authors:
Doug James (jamesd@musc.edu)
Larry Afrin, MD (afrinl@musc.edu)
Hollings Cancer Center
Medical University of South Carolina
March 31, 2005

<cfacknowledgement>
The authors would like to acknowledge Macromedia's Tom Jordahl (who we understand is sort
of the principal developer and "guru" of ColdFusion's web services functionality) both for his
assistance in helping them understand how ColdFusion handles certain complex web service
interactions and for his critical review of the following document.
</cfacknowledgement>

<cfdisclaimer>
While the authors have tried to ensure the accuracy and utility of the information below, they
offer the following information with no warranties whatsoever and hereby explicitly state that their employer has had absolutely nothing to do with the development of this document and
therefore also bears no liabilities with regard to how the information presented here may be used.

The authors are also quite sure that despite Tom Jordahl's review of this document, Macromedia
takes no responsibility for this information, either. By their posting of this information, the
authors are not offering themselves as support resources for other developers wrestling with the
problems addressed herein. Requests for assistance in this area that are communicated to the
authors may or may not be acknowledged or answered, solely at the authors' whim. After all, we
do have to attend first to our day jobs (which often spill over into our night jobs, too.) ;-)
</cfdisclaimer>

=======================================================

CFMX is able to communicate with web services hosted in arbitrary computing environments as long as the SOAP standards are followed. The SOAP standards require that all input arguments to a web service be passed as XML documents. The expected format of the XML document for any given input argument is defined either directly in the service's WSDL (Web Services Description Language) document, or by reference in that WSDL document to another data element definition document.

Before proceeding further to discuss how CFMX handles "complex"-type web service input arguments, it is helpful to review how CFMX handles a <cfinvoke> (or equivalent) request in general. CFMX first retrieves the target service's WSDL, then runs this WSDL through the WSDL2Java tool (see below for more information), which outputs Java code defining Java-based classes equivalent to the XML data structures defined in the WSDL. This Java code is then compiled. The input arguments provided to <cfinvoke> are then mapped to the Java types generated by the compilation, and finally the Java stub functions for those types are called. In this fashion, CFMX *automatically* converts input values referenced in <cfinvoke> or
<cfinvokeargument> tags to XML documents of the appropriate formats based on input argument format information provided in the service's WSDL. The burden on the coder, of
course, is to ensure the input values are structured in accordance with what CFMX expects to
find based on the translation by WSDL2Java from an XML-based data structure to a Java-based
data structure.

A similar process is followed to handle the service's return value and other output arguments.

This process is quite straightforward for "simple" type input arguments, such as simple strings or numeric values. Because ColdFusion variables, strictly speaking, are untyped, CFMX
automatically converts simple input ColdFusion variables to the equivalent string-based simple
XML structures.

However, some web service input arguments are XML documents of "complex" type. As briefly
described in the ColdFusion MX 7 documentation at: http://livedocs.macromedia.com/coldfusion/7/htmldocs/00001554.htm for complex-type arguments, CFMX expects to find a ColdFusion structure provided as the value for the input argument. Unfortunately, the ColdFusion MX documentation only provides clear examples of how to compose a ColdFusion structure that corresponds merely to a relatively simple form of a "complex" XML document in which the child elements of an outer "complex" parent element are simple scalar values, as illustrated in Example 1:

---------Example 1:------------------------------------
WSDL snippet:

<s:complexType name="Employee">
<s:sequence>
<s:element minOccurs="1" maxOccurs="1" name="fname" type="s:string" />
<s:element minOccurs="1" maxOccurs="1" name="lname" type="s:string" />
<s:element minOccurs="1" maxOccurs="1" name="age" type="s:int" />
</s:sequence>
</s:complexType>

Sample XML document:

<Employee>
<fname>John</fname>
<lname>Smith</lname>
<age>25</age>
</Employee>
CFML snippet:
<!--- Create a structure using CFScript, then call the web service. --->
<cfscript>
stUser = structNew();
stUser.fname = "John";
stUser.lname = "Smith";
stUser.age = 25;
ws = createObject("webservice", "http://somehost/echosimple.asmx?wsdl");
ws.echoStruct(stUser);

</cfscript>
-------------------------------------------------------

While the above is helpful for "simple" complex-type input arguments, the ColdFusion MX
documentation provides no examples of how to compose a ColdFusion structure that corresponds to a more complex form of a complex-type XML document in which the child elements of an outer "complex" parent element are themselves "complex" parent elements (so-called "complex-within-complex" XML documents), as illustrated in Example 2:

---------Example 2:------------------------------------

WSDL snippet:

<s:complexType name="Employee">
<s:sequence>
<s:element minOccurs="1" maxOccurs="1" name="fname" type="s:string" />
<s:element minOccurs="1" maxOccurs="1" name="lname" type="s:string" />
<s:element minOccurs="0" maxOccurs="unbounded" name="nickname" type="s:nickname" />
<s:element minOccurs="1" maxOccurs="1" name="age" type="s:int" />
<s:element ref="s:address" minOccurs="0" maxOccurs="unbounded" name="address" />
</s:sequence>
<xsd:attribute name="employeeGender" type="string" use="required"/>
</s:complexType>

<s:complexType name="nickname">
<s:simpleContent>
<s:extension base="string" />
</s:simpleContent>
</s:complexType>

<s:complexType name="address">
<s:sequence>
<s:element minOccurs="1" maxOccurs="1" name="street" type="s:string" />
<s:element minOccurs="1" maxOccurs="1" name="city" type="s:string" />
<s:element minOccurs="1" maxOccurs="1" name="state" type="s:state" />
<s:element minOccurs="1" maxOccurs="1" name="zip" type="s:zip" />
</s:sequence>
<xsd:attribute name="addressType" type="string" use="required"/>
</s:complexType>
Sample XML document:
<Employee employeeGender="Male">
<fname>John</fname>
<lname>Smith</fname>
<nickname>Jack</nickname>
<nickname>Johnny</nickname>
<age>25</age>
<address addressType="Home">
<street>25 Main Street</street>
<city>Townville</city>
<state>Anystate</state>
<zip>99999</zip>
</address>
</Employee>
CFML snippet:
<!--- Create a structure using CFScript, then call the web service. --->
<cfscript>
stUser = structNew();
stUser.fname = "John";
stUser.lname = "Smith";
stUser.age = 25;
// If .x corresponds to *element* <x>, then what syntax is used to specify an *attribute*?
// How should employeeGender get set up in this struct? Read on to find out.
// And what about the *two* nicknames? stUser.nickname = ... clearly won't work.
// Read on to find out how this is handled.
// And what about <address>? Does that get coded simply as stUser.address = structNew(),
// stUser.address.street = "25 Main Street", etc. etc.????? The answer is "No."
// Read on to find out more.
ws = createObject("webservice", "http://somehost/echosimple.asmx?wsdl");
ws.echoStruct(stUser);
</cfscript>
-------------------------------------------------------
There are just a few key (pardon the pun) principles you need to understand in order to determine how to compose a CF structure that CFMX will map to the properly formatted XML document needed as an input argument to an arbitrary web service:

(1) Principle #1: Any given key in a CF structure will be mapped by CFMX into *either* an
element name *or* an attribute name depending on what role the WSDL says that particular
name should play at that level in the document. In other words, you do not need to differentiate
between attributes and elements in the CFML structure you create; ColdFusion will automatically
handle for you the proper mapping of each key into either an attribute or element as required by
the WSDL.

Following along with Example 2 above, then, stUser.age will get translated into an <age> child *element* within the <Employee> parent element, and stUser.employeeGender will get translated as the employeeGender *attribute* within the <Employee> element.

(2) Principle #2: Elements in a WSDL that can occur more than once (cf. the "address" element in Example 2 above) are represented in the CF struct as an array. OK, but an array of what? Well, it depends on how the WSDL defines the subelements.

Again, following along with Example 2 above, <address> would be coded into the stUser
structure as follows:

  stUser.address = arrayNew(1);
  stUser.address[1] = structNew();
  stUser.address[1].street = "25 Main Street";
  stUser.address[1].city = "Townville";
  stUser.address[1].state = "Anystate";
  stUser.address[1].zip = "99999";
  stUser.address[1].addressType = "Home";

and the nicknames would be coded as follows:

  stUser.nickname = arrayNew(1);
  stUser.nickname[1] = structNew();
  stUser.nickname[1].value = "Jack";
  stUser.nickname[2] = structNew();
  stUser.nickname[2].value = "Johnny";

Hey! Where did ".value" come from? Read on:

(3) Principle #3: As the complexity of the format of the required input XML document increases, it may
become increasingly difficult to "guess," using the above two principles, how the corresponding CF structure should be coded. (Similarly, when working with web service return variables, or
output arguments, of complex-within-complex type, it may become difficult understanding why
CFMX has translated the output into the rather complex structure revealed by <cfdump>.) To help work through this, the coder should apply the WSDL2Java tool (see below for more
information) against the target service's WSDL and carefully examine the output of this tool.
WSDL2Java is an open source utility that takes a WSDL as input and outputs the Java beans
corresponding to the methods of the web service described by the provided WSDL; importantly,
the inputs and outputs for these methods are also defined in the corresponding beans. Upon
careful examination, the WSDL2Java output -- i.e., the Java code defining each bean -- clearly
identifies the expected layouts of the CF structures corresponding to the inputs and outputs of the service's various methods.

For example, one of the things that becomes clear upon examining the WSDL2Java output has to
do with the use of ".value" above. This is complicated, so read carefully: If an element (or
sub-element) of an input argument defined in the WSDL is declared in the WSDL to be of
"complex" type, but the corresponding definition in the WSDL of that complex datatype declares
that the value of that datatype is in fact only a simple scalar value, then CFMX assumes the value to be passed as input for that element (i.e., the value placed between the <x> and </x> tags) will be found in the corresponding CF structure (at the appropriate level) in association with a key named "value".
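Putting the two principles above together with the "value" rule, the structure for Example 2 might be composed as in the sketch below. It is assembled purely from the rules just described and has not been run against a real service; the echoStruct method and the somehost address are the same placeholders used in Example 2.

```cfml
<cfscript>
// Outer structure corresponding to the <Employee> element.
stUser = structNew();
stUser.fname = "John";
stUser.lname = "Smith";
stUser.age = 25;

// employeeGender is an attribute in the WSDL, but it is set like any other key.
stUser.employeeGender = "Male";

// nickname can repeat, so it is an array. Each entry is a complex type that
// holds only a scalar, so the scalar goes under the key "value".
stUser.nickname = arrayNew(1);
stUser.nickname[1] = structNew();
stUser.nickname[1].value = "Jack";
stUser.nickname[2] = structNew();
stUser.nickname[2].value = "Johnny";

// address can repeat as well. Each entry is a structure holding its
// sub-elements plus its addressType attribute.
stUser.address = arrayNew(1);
stUser.address[1] = structNew();
stUser.address[1].addressType = "Home";
stUser.address[1].street = "25 Main Street";
stUser.address[1].city = "Townville";
stUser.address[1].state = "Anystate";
stUser.address[1].zip = "99999";

// Placeholder endpoint and method, as in Example 2.
ws = createObject("webservice", "http://somehost/echosimple.asmx?wsdl");
ws.echoStruct(stUser);
</cfscript>
```

If the service rejects the call, comparing this layout against the WSDL2Java output for the service's actual WSDL is the next step.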

Below is another example of a web service input argument formatting problem that was solved by examining the output of the WSDL2Java tool. This is a real-world example of using CFMX to interface to a UDDI server. UDDI stands for Universal Description, Discovery and Integration. UDDI servers are sort of the Domain Name System (DNS) of the web services world: they store information about web services, including their addresses. A given web service could have many implementations around the world. An application that wants to make use of these implementations doesn't have to know their addresses as long as (1) the service and its implementations are registered in UDDI, and (2) the application knows the address of at least one UDDI server. (Part of the UDDI standard includes, a la the Network News Transfer Protocol (NNTP, used by Usenet discussion groups), a replication protocol so that UDDI servers exchange database updates with each other in order for them all to stay in sync.)

UDDI servers can have a number of programmatic interfaces (e.g., URL-parameterized HTTP GETs), but the most robust interface they support is a web service interface. That's right: you can use web services to query UDDI servers about web services. However, the WSDL for the web service one uses to query a UDDI server is quite complex. More information about UDDI can be found at http://uddi.org.

Here's an example of using the "find_business" method defined in the UDDI WSDL to retrieve information from the target UDDI server about the first 50 businesses known to that server whose names begin with "A":

---------Example 3:------------------------------------

WSDL snippet:
<xsd:complexType name="find_business">
<xsd:sequence>
<xsd:element ref="uddi:name" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element ref="uddi:identifierBag" minOccurs="0"/>
<xsd:element ref="uddi:categoryBag" minOccurs="0"/>
<xsd:element ref="uddi:tModelBag" minOccurs="0"/>
<xsd:element ref="uddi:discoveryURLs" minOccurs="0"/>
<xsd:element ref="uddi:findQualifiers" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="generic" type="string" use="required"/>
<xsd:attribute name="maxRows" type="int" use="optional"/>
</xsd:complexType>

<xsd:complexType name="name">
<xsd:simpleContent>
<xsd:extension base="string">
<xsd:attribute ref="xml:lang" use="optional"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
CFML snippet:
<cfscript>
// define the main outer structure, findBiz in this case, can be named anything one chooses
findBiz = structNew();

//generic and maxRows are attributes (required and optional, respectively) of the uddi:find_business tag
findBiz.generic = "2.0";
findBiz.maxRows = 50;

//uddi:find_business tag accepts zero *or more* "name" values, so an array is used
findBiz.name = arrayNew(1);
findBiz.name[1] = structNew();

//uddi:name tag is a complex type with a string value, so 'value' is used as the key to the structure
findBiz.name[1].value = "A";
</cfscript>

<!---
Important note:
In the <cfinvoke> below, the URL to the UDDI WSDL is the value of the "webservice"
parameter; the "method" parameter specifies the UDDI inquiry method called "find_business",
which returns a uddi:businessList object in the returnVariable "busList".

In the UDDI WSDL, the "find_business" method is defined as requiring an input parameter
named "body" of complex type "find_business" which is also defined in the WSDL. The
parameter name "body" shown in the <cfinvoke> below could also have been set up as a
<cfinvokeargument name="body" value="#findBiz#"> tag within the <cfinvoke>.

The actual version 2.0 UDDI WSDL is at http://uddi.org/wsdl/inquire_v2.wsdl. This address
cannot be referenced as the "webservice" parameter in the <cfinvoke> tag because this particular
WSDL does not define the addresses of any actual UDDI servers. Thus, to query a UDDI server,
one has to copy this "generic" WSDL to somewhere else and modify it by adding in the proper
WSDL code to identify at least one actual UDDI server. The <cfinvoke> is then pointed at this
modified WSDL. CFMX will read the modified WSDL from this location, and in this modified
WSDL CFMX will find where it has to go to contact the actual UDDI service being targeted by
the coder.

The WSDL code that identifies an actual UDDI server is as follows. This happens to be the
real code for pointing at the main IBM UDDI server. This code gets inserted just ahead of the
closing </definitions> tag of the "generic" UDDI WSDL.

<service name="InquireSoap">
<port name="InquireSoap" binding="tns:InquireSoap">
<soap:address location="http://uddi.ibm.com/ubr/inquiryapi" />
</port>
</service>
--->

<cfinvoke
webservice = "http://www.webhost.com/modified_uddi_v2.wsdl"
method = "find_business"
body = #findBiz#
returnvariable="busList">
</cfinvoke>
-------------------------------------------------------
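
The same call can presumably also be made with createObject, as in Example 1, instead of <cfinvoke>. The following is a minimal sketch, assuming the modified WSDL sits at the same placeholder address and that the single "body" argument can be passed positionally. The variable name uddi is arbitrary.

```cfml
<cfscript>
// Build a proxy object from the modified UDDI WSDL (placeholder address).
uddi = createObject("webservice", "http://www.webhost.com/modified_uddi_v2.wsdl");

// Pass the findBiz structure built above as the "body" argument.
busList = uddi.find_business(findBiz);
</cfscript>

<!--- Dump the result to see how CFMX mapped the returned uddi:businessList. --->
<cfdump var="#busList#">
```

The <cfdump> output is worth reading alongside the WSDL2Java output, since the returned structure follows the same mapping rules as the input.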


And another example from the world of UDDI, this time using the get_businessDetail method:

---------Example 4:------------------------------------

WSDL snippet:
<xsd:complexType name="get_businessDetail">
<xsd:sequence>
<xsd:element ref="uddi:businessKey" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="generic" type="string" use="required"/>
</xsd:complexType>

<xsd:simpleType name="businessKey">
<xsd:restriction base="string"/>
</xsd:simpleType>
CFML snippet:
<cfscript>
// busDetail is outer main structure
busDetail = structNew();

// generic is the required attribute of the "get_businessDetail" tag
busDetail.generic = "2.0";

//businessKey tag can occur many times, so it is mapped to an array
busDetail.businessKey = arrayNew(1);

/*
Because the "get_businessDetail" tag has attributes and sub-elements, the sub-elements have to
be mapped into structures as well, and, similar to the uddi:name above, the sub-element
"businessKey" just contains a string, so the key name 'value' is used. */

for (x = 1; x LTE arrayLen(busKeys); x = x + 1) {
busDetail.businessKey[x] = structNew();
busDetail.businessKey[x].value = busKeys[x]; //busKeys is an array predefined from another process
}
</cfscript>

<!---
As in Example 3, the UDDI WSDL declares that an input parameter named "body", of
complex type "get_businessDetail", is required for the "get_businessDetail" method.
--->
<cfinvoke
webservice = "http://www.webhost.com/modified_uddi_v2.wsdl"
method = "get_businessDetail"
body = #busDetail#
returnvariable="busEntity">
</cfinvoke>
-------------------------------------------------------

MORE INFORMATION ABOUT WSDL2Java:

WSDL2Java is a Java program that was created by the Apache group and is included in the CFMX distribution with the Axis package, the Apache group's implementation of the W3C SOAP standard.

To run WSDL2Java from the command line:

  1. The current directory needs to be the ColdFusion installation's 'lib' directory:
    • Windows: C:\CFusionMX\lib
    • RedHat Linux: /opt/coldfusionmx/lib

  2. Set classpath to include the following jar files:
    • RedHat Linux:
      • axis.jar
      • saaj.jar
      • jaxrpc.jar
      • xercesImpl.jar
      • wsdl4j.jar
      • commons-logging-1.0.2.jar
      • commons-discovery.jar
      • xml-apis.jar
      • activation.jar (This jar file can be found by adding runtime/lib/jrun.jar to the end of the classpath. Alternatively, it can be downloaded from Sun's Java Activation Framework web site.)

    • Windows:
      • SET CLASSPATH=axis.jar;saaj.jar;jaxrpc.jar;xercesImpl.jar;wsdl4j.jar;commons-logging-1.0.2.jar;commons-discovery.jar;activation.jar;xml-apis.jar

  3. Run the WSDL2Java program:
    Usage: java org.apache.axis.wsdl.WSDL2Java [options] WSDL-URI
      • The '-v' option prints informational messages.
      • The '-o' option is the output directory for emitted files.
      • WSDL-URI can be either local or remote.

    Example: java org.apache.axis.wsdl.WSDL2Java -v -o C:\wsdl2java_output http://uddi.org/wsdl/inquire_v2.wsdl

  4. More information about WSDL2Java can be found on the Apache Web Service web site, currently http://ws.apache.org/axis/java/user-guide.html#WSDL2JavaBuildingStubsSkeletonsAndDataTypesFromWSDL.

