Thursday, December 20, 2012

XSLT: Select Distinct in XSL 1.0

cThe further one dives into XSLT, it may become necessary to extract a list of unique values from an XML document. This is commonly done in SQL through the SELECT DISTINCT statement, unfortunately, there is no direct equivalent in XSLT 1.0.

In order to perform this sort of functionality, one must leverage some of the more advanced aspects of XSLT including the preceding-sibling:: or another such "axis" as it's known in XSL.

To better understand, lets look at an example.  Given the following XSD snippet:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="houseCategory" abstract="true"/>
    <xsd:element name="houseCategoryText" type="xsd:string" substitutionGroup="houseCategory"/>
    <xsd:element name="houseCategoryFlag" type="xsd:boolean" substitutionGroup="houseCategory"/>
    <xsd:element name="housePurchaseDateRepresentation" abstract="true"/>
    <xsd:element name="housePurchaseDate" type="xsd:date" substitutionGroup="housePurchaseDateRepresentation"/>
    <xsd:element name="housePurchaseDateTime" type="xsd:dateTime" substitutionGroup="housePurchaseDateRepresentation"/>
</xsd:schema>

The following XSLT will extract a list of unique substitutionGroup attribute values from above and list them in the output: 

<?xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="*">
    <xsl:for-each
        select="/xsd:schema/xsd:element/@substitutionGroup[not(. = ../preceding-sibling::xsd:element/@substitutionGroup/.)]">
        
        <xsl:element name="uniqueSubstitutionGroup">
            <xsl:value-of select="."/>
        </xsl:element>
        
    </xsl:for-each>    
  </xsl:template>
</xsl:stylesheet>

The resulting output would appear something like the following:

<?xml version="1.0" encoding="utf-8"?>
<uniqueSubstitutionGroup>houseCategory</uniqueSubstitutionGroup>
<uniqueSubstitutionGroup>housePurchaseDateRepresentation</uniqueSubstitutionGroup>

The information surrounding this question was sourced in part from information provided on Stack Overflow here

Wednesday, October 24, 2012

XSLT: Namespace From Prefix in XSL 1.0

When transforming XML Schema (XSD) files in XSLT 1.0, it quickly becomes necessary to resolve the namespace from the prefix of any given element or attribute.  Sometimes these elements and attributes are not stored as actual elements and attributes, which makes it necessary to resolve them in a different manner. For example, in the following XSD snippet, iso_639-3:LanguageCodeSimpleType is not a node, element nor is it an attribute, rather it is a text value of the type attribute.

<xsd:schema 
  xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
  xmlns:iso_639-3="http://niem.gov/niem/iso_639-3/2.0">
    <xsd:attribute name="languageCode" 
      type="iso_639-3:LanguageCodeSimpleType"/>
</xsd:schema>

One way to resolve what namespace the iso_639-3 prefix is associated with, is to extract the prefix from the text and use the following simple XSLT 1.0 function/template.

<xsl:template name="namespaceFromPrefix">
  <xsl:param name="sPrefix"/>
  <xsl:value-of select="/*/namespace::*[name() = $sPrefix]"/>
</xsl:template>

If one were to pass in the prefix iso_639-3 the return value would be http://niem.gov/niem/iso_639-3/2.0

Wednesday, October 17, 2012

Schematron: Using Schematron Client-Side!

As browsers continue to evolve and incorporate more standards, it is starting to become possible to better leverage XML directly in client-side browsers. 

A great series of articles (4 in all) has been posted by a colleague about how to leverage XML in forms capture and even use Schematron to help validate it!  Check it out http://udiminished.blogspot.com/2011/12/simplify-with-xml-data-model-part-1.html

Friday, September 21, 2012

XSLT: Extract a Branch from an XML Tree

Often you may wish to extract a portion of an XML file including all of its children elements so as to better deal with it or further transform or handle it elsewhere. 

For example, in web-service-based development, it is common for a developer to “extract” the payload from a request or response envelope.  This can also be in situations where a NIEM developer wishes to obtain the payload portion of a LEXS package and further process, store or display it. 

With XSLT, it only takes the following few lines of code to pull a branch of XML out of a larger package.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="xml" indent="yes"/>
    
    <xsl:template match="/uml">
        <xsl:apply-templates select="XMI" mode="RecursiveDeepCopy" />
    </xsl:template> 
    
    <xsl:template match="@*|node()" mode="RecursiveDeepCopy">
        <xsl:copy>
            <xsl:copy-of select="@*" />
            <xsl:apply-templates mode="RecursiveDeepCopy" />
        </xsl:copy> 
    </xsl:template> 

</xsl:stylesheet>

In the above example, the XMI/* elements and attributes are being extracted from the /uml/ container element. Before the transform, the XML would look something like this:

<uml>
  <XMI>
    <XMI.header/>
    <XMI.content/>
  </XMI>
</uml>

And after it would look like this:

<XMI>
  <XMI.header/>
  <XMI.content/>
</XMI>

For information on similar transforms, simply Google or Bing "Identity transforms."

Saturday, September 8, 2012

XSD: Extending Code Lists with xsd:union

In certain circumstances it is necessary to add elements to an existing NIEM enumeration (or code list).  In these situations one may choose to simply recreate a new list with all the same elements already defined in a NIEM code type and simply add those which do not yet exist.  However, when the code list is larger than a few elements (such as a state code list with at least 50 valid values), using xsd:union as an option becomes more appealing.

The xsd:union provides a way to combine simple data types together to form a larger and more comprehensive data type.  An example would be simply adding “ZZ” to a list of US Postal Service State (USPS) Codes to communicate an unknown or invalid state.  This can be accomplished by extending the existing USPS code list in several steps.

Step 1 – Create a New Simple Type With New Values

<!-- Simple code value to add ZZ as a valid value -->
<xsd:simpletype name="USStateCodeDefaultSimpleType">
  <xsd:restriction base="xsd:token">
   <xsd:enumeration value="ZZ">
    <xsd:annotation>
     <xsd:documentation>UNKNOWN</xsd:documentation>
    </xsd:annotation>
   </xsd:enumeration>
  </xsd:restriction>
</xsd:simpletype>

Step 2 – Use xsd:union to Join the Values with Existing Values

<!-- New simple time combining my custom enum with the standard usps one --> 
<xsd:simpleType name="LocationStateCodeSimpleType">
  <xsd:union memberTypes="usps:USStateCodeSimpleType my:USStateCodeDefaultSimpleType"/>
</xsd:simpleType>

Step 3 – Wrap the New Simple Data Type in a Complex Type

<!-- New complexType required to add s:id and s:idref to the definition -->
<xsd:complexType name="LocationStateCodeType">
  <xsd:simpleContent>
    <xsd:extension base="aoc_code:LocationStateCodeSimpleType"> 
      <xsd:attributeGroup ref="s:SimpleObjectAttributeGroup"/>
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>

Step 4 – Create Element Instantiating the New Code List

<!-- Element declaration allowing use of our new data type -->
<xsd:element name="NewStateCode" type="my:LocationStateCodeType" substitutionGroup="nc:LocationStateCode"/>

Now any place an nc:LocationStateCode can be use, our extended code list can be used instead.

Wednesday, July 25, 2012

XSLT: Convert Standard U.S. Date into xsd:date Format

When working with XML in the United States (U.S.), one will often find dates which have been formatted in the traditional U.S. Short Format even though XML Schema (XSD) enforces a more locale-neutral format.  This means often converting data from:

<SomeUsDate>1/10/2001</SomeUsDate>

Into:

<SomeXsdDate>2001-01-10</SomeXsdDate>

If one is using XSLT 2.0, this can simply be done by including and calling the function within the FunctX library here

In XSLT 1.0, a very limited set of string manipulation functionality exists. Even so, it is possible (although convoluted) to convert a typical U.S.-formatted date into an XML-Schema enforced date. A possible solution is listed below:

<xsl:template name="aoc:txfDateFormat">
         <xsl:param name="UsDate"/>

         <xsl:choose>

             <!-- Test to see if date contains a date with slashes in it. -->
             <xsl:when test="contains($UsDate, '/')">
                 <xsl:choose>

                     <!-- 2 Digit Month, 2 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=2 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=2)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-',substring-before($UsDate, '/'), '-', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 1 Digit Month, 2 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=1 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=2)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-0',substring-before($UsDate, '/'), '-', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 2 Digit Month, 1 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=2 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=1)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-',substring-before($UsDate, '/'), '-0', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 1 Digit Month, 1 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=1 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=1)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-0',substring-before($UsDate, '/'), '-0', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>
                 </xsl:choose>
             </xsl:when>

             <!-- Omit element if not. -->
             <xsl:otherwise/>
         </xsl:choose>
   </xsl:template>