Showing posts with label XSLT. Show all posts

Thursday, December 20, 2012

XSLT: Select Distinct in XSL 1.0

As one dives further into XSLT, it often becomes necessary to extract a list of unique values from an XML document. In SQL this is commonly done with the SELECT DISTINCT statement; unfortunately, there is no direct equivalent in XSLT 1.0.

To get this behavior, one must leverage some of the more advanced aspects of XSLT, such as preceding-sibling:: or another "axis," as it is known in XPath. The idea is to keep a value only when no earlier sibling carries the same value.

To better understand, let's look at an example. Given the following XSD snippet:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="houseCategory" abstract="true"/>
    <xsd:element name="houseCategoryText" type="xsd:string" substitutionGroup="houseCategory"/>
    <xsd:element name="houseCategoryFlag" type="xsd:boolean" substitutionGroup="houseCategory"/>
    <xsd:element name="housePurchaseDateRepresentation" abstract="true"/>
    <xsd:element name="housePurchaseDate" type="xsd:date" substitutionGroup="housePurchaseDateRepresentation"/>
    <xsd:element name="housePurchaseDateTime" type="xsd:dateTime" substitutionGroup="housePurchaseDateRepresentation"/>
</xsd:schema>

The following XSLT will extract a list of unique substitutionGroup attribute values from above and list them in the output: 

<?xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="*">
    <xsl:for-each
        select="/xsd:schema/xsd:element/@substitutionGroup[not(. = ../preceding-sibling::xsd:element/@substitutionGroup/.)]">
        
        <xsl:element name="uniqueSubstitutionGroup">
            <xsl:value-of select="."/>
        </xsl:element>
        
    </xsl:for-each>    
  </xsl:template>
</xsl:stylesheet>

The resulting output would appear something like the following:

<?xml version="1.0" encoding="utf-8"?>
<uniqueSubstitutionGroup>houseCategory</uniqueSubstitutionGroup>
<uniqueSubstitutionGroup>housePurchaseDateRepresentation</uniqueSubstitutionGroup>

The information in this post was sourced in part from an answer provided on Stack Overflow.
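The preceding-sibling:: test compares each value against every earlier sibling, which can become slow on large schemas. A commonly used alternative in XSLT 1.0 is key-based ("Muenchian") grouping. The sketch below is not taken from the post above; the key name bySubstitutionGroup and the wrapper element uniqueSubstitutionGroups are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xsd">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every xsd:element that has a substitutionGroup attribute,
       keyed by the value of that attribute (key name is hypothetical) -->
  <xsl:key name="bySubstitutionGroup"
           match="xsd:element[@substitutionGroup]"
           use="@substitutionGroup"/>

  <xsl:template match="/">
    <!-- Hypothetical wrapper so the result has a single root element -->
    <uniqueSubstitutionGroups>
      <!-- Keep an element only if it is the first one in its key group -->
      <xsl:for-each select="/xsd:schema/xsd:element[@substitutionGroup][generate-id() = generate-id(key('bySubstitutionGroup', @substitutionGroup)[1])]">
        <uniqueSubstitutionGroup>
          <xsl:value-of select="@substitutionGroup"/>
        </uniqueSubstitutionGroup>
      </xsl:for-each>
    </uniqueSubstitutionGroups>
  </xsl:template>
</xsl:stylesheet>
```

Run against the XSD snippet above, this should list houseCategory and housePurchaseDateRepresentation once each, in document order.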

Wednesday, October 24, 2012

XSLT: Namespace From Prefix in XSL 1.0

When transforming XML Schema (XSD) files in XSLT 1.0, it quickly becomes necessary to resolve the namespace from the prefix of any given element or attribute. Sometimes a prefixed name does not appear as an actual element or attribute name, which makes it necessary to resolve it in a different manner. For example, in the following XSD snippet, iso_639-3:LanguageCodeSimpleType is neither an element nor an attribute; rather, it is the text value of the type attribute.

<xsd:schema 
  xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
  xmlns:iso_639-3="http://niem.gov/niem/iso_639-3/2.0">
    <xsd:attribute name="languageCode" 
      type="iso_639-3:LanguageCodeSimpleType"/>
</xsd:schema>

One way to resolve which namespace the iso_639-3 prefix is bound to is to extract the prefix from the text and pass it to the following simple XSLT 1.0 named template.

<xsl:template name="namespaceFromPrefix">
  <xsl:param name="sPrefix"/>
  <xsl:value-of select="/*/namespace::*[name() = $sPrefix]"/>
</xsl:template>

If one were to pass in the prefix iso_639-3, the return value would be http://niem.gov/niem/iso_639-3/2.0
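The prefix-extraction step is not shown above. A minimal sketch of one way to do it, using substring-before() on the type attribute and then calling the template, might look like this. The surrounding stylesheet and the text output method are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <!-- Template from the post: looks up the prefix among the namespace
       nodes of the document element -->
  <xsl:template name="namespaceFromPrefix">
    <xsl:param name="sPrefix"/>
    <xsl:value-of select="/*/namespace::*[name() = $sPrefix]"/>
  </xsl:template>

  <xsl:template match="/">
    <!-- Only consider type values that actually carry a prefix -->
    <xsl:for-each select="/xsd:schema/xsd:attribute[contains(@type, ':')]">
      <xsl:call-template name="namespaceFromPrefix">
        <!-- Everything before the first colon is the prefix -->
        <xsl:with-param name="sPrefix" select="substring-before(@type, ':')"/>
      </xsl:call-template>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note that /*/namespace::* only sees namespaces that are in scope on the document element. If a prefix is declared further down the tree, selecting namespace::* relative to the node that carries the value would be the safer lookup.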

Friday, September 21, 2012

XSLT: Extract a Branch from an XML Tree

Often you may wish to extract a portion of an XML file, including all of its child elements, so that it can be transformed or handled elsewhere.

For example, in web-service-based development, it is common for a developer to “extract” the payload from a request or response envelope. The same need arises when a NIEM developer wishes to obtain the payload portion of a LEXS package and further process, store or display it.

With XSLT, it only takes the following few lines of code to pull a branch of XML out of a larger package.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="xml" indent="yes"/>
    
    <xsl:template match="/uml">
        <xsl:apply-templates select="XMI" mode="RecursiveDeepCopy" />
    </xsl:template> 
    
    <xsl:template match="@*|node()" mode="RecursiveDeepCopy">
        <xsl:copy>
            <xsl:copy-of select="@*" />
            <xsl:apply-templates mode="RecursiveDeepCopy" />
        </xsl:copy> 
    </xsl:template> 

</xsl:stylesheet>

In the above example, the XMI element, together with its attributes and descendants, is extracted from the /uml container element. Before the transform, the XML would look something like this:

<uml>
  <XMI>
    <XMI.header/>
    <XMI.content/>
  </XMI>
</uml>

And after it would look like this:

<XMI>
  <XMI.header/>
  <XMI.content/>
</XMI>
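When the branch does not need any per-node processing on the way out, xsl:copy-of performs the same deep copy in a single instruction. The following is a shorter sketch of the same extraction, offered as an alternative rather than as the method used above:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="xml" indent="yes"/>

    <!-- Deep-copy the XMI child of /uml, including attributes and descendants -->
    <xsl:template match="/uml">
        <xsl:copy-of select="XMI"/>
    </xsl:template>

</xsl:stylesheet>
```

The recursive template approach remains the better starting point if individual nodes need to be renamed, filtered or otherwise changed during the copy.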

For information on similar transforms, search the web for "identity transform."

Wednesday, July 25, 2012

XSLT: Convert Standard U.S. Date into xsd:date Format

When working with XML in the United States (U.S.), one will often find dates which have been formatted in the traditional U.S. Short Format even though XML Schema (XSD) enforces a more locale-neutral format.  This means often converting data from:

<SomeUsDate>1/10/2001</SomeUsDate>

Into:

<SomeXsdDate>2001-01-10</SomeXsdDate>

If one is using XSLT 2.0, this can be done simply by including the FunctX library and calling its date-conversion function.

XSLT 1.0 offers only a very limited set of string manipulation functions. Even so, it is possible (although convoluted) to convert a typical U.S.-formatted date into a valid xsd:date value. A possible solution is listed below:

<xsl:template name="aoc:txfDateFormat">
         <xsl:param name="UsDate"/>

         <xsl:choose>

             <!-- Test whether the value contains slashes (month/day/year). -->
             <xsl:when test="contains($UsDate, '/')">
                 <xsl:choose>

                     <!-- 2 Digit Month, 2 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=2 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=2)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-',substring-before($UsDate, '/'), '-', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 1 Digit Month, 2 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=1 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=2)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-0',substring-before($UsDate, '/'), '-', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 2 Digit Month, 1 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=2 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=1)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-',substring-before($UsDate, '/'), '-0', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 1 Digit Month, 1 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=1 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=1)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-0',substring-before($UsDate, '/'), '-0', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>
                 </xsl:choose>
             </xsl:when>

             <!-- Omit element if not. -->
             <xsl:otherwise/>
         </xsl:choose>
   </xsl:template>
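The four branches above differ only in where a leading zero is added. As a more compact sketch, the XSLT 1.0 format-number() function can do the zero-padding instead. The template name usDateToXsdDate and the variable names below are assumptions for illustration; the element names are the ones from the example at the top of this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical helper: returns yyyy-mm-dd for an m/d/yyyy input string -->
  <xsl:template name="usDateToXsdDate">
    <xsl:param name="UsDate"/>
    <xsl:variable name="month" select="substring-before($UsDate, '/')"/>
    <xsl:variable name="rest" select="substring-after($UsDate, '/')"/>
    <xsl:variable name="day" select="substring-before($rest, '/')"/>
    <xsl:variable name="year" select="substring-after($rest, '/')"/>
    <!-- number(x) = number(x) is false for NaN, so nothing is emitted
         unless all three parts are numeric -->
    <xsl:if test="number($month) = number($month) and number($day) = number($day) and number($year) = number($year)">
      <xsl:value-of select="concat($year, '-', format-number($month, '00'), '-', format-number($day, '00'))"/>
    </xsl:if>
  </xsl:template>

  <xsl:template match="SomeUsDate">
    <SomeXsdDate>
      <xsl:call-template name="usDateToXsdDate">
        <xsl:with-param name="UsDate" select="normalize-space(.)"/>
      </xsl:call-template>
    </SomeXsdDate>
  </xsl:template>
</xsl:stylesheet>
```

Given <SomeUsDate>1/10/2001</SomeUsDate> as input, this produces <SomeXsdDate>2001-01-10</SomeXsdDate>. Unlike the template above, this sketch still writes an (empty) element when the input is not a slash-separated date, and it does not expand two-digit years or check that the month and day are in range.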

Wednesday, October 20, 2010

XSLT: Transform XML into nc:ContactInformation Structure

This is a short post to show how to leverage XSLT to convert a simple and generic XML file into NIEM-conformant XML as it pertains to the nc:ContactInformation block. 

This is a very common situation where a "non-NIEM" data stream is received and needs to be converted to a conformant structure.  Take the following sample non-NIEM XML instance:

<?xml version="1.0" encoding="UTF-8" ?>     
<SomeBatchOfStuff>
   <Person>
       <Name>John Doe</Name>
       <PhoneNumber>212-111-2222</PhoneNumber>
   </Person>
   <Person>
       <Name>Sally Smith</Name>
       <PhoneNumber>212-333-4444</PhoneNumber>
   </Person>
</SomeBatchOfStuff>

If this very logical structure needed to be converted into nc:Person and nc:ContactInformation elements (with an nc:PersonContactInformationAssociation object to link the two together), the following XSLT could be used:

<?xml version="1.0" encoding="UTF-8" ?>     
<xsl:stylesheet  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns0="SomeNonConformantDocumentNamespace" version="1.0" exclude-result-prefixes="xs">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/">
        <xsl:variable name="var1_instance_InputSchema" select="."/>
        <MyNIEMConformantDocument xmlns="MyNIEMDocumentNamespace" xmlns:i="http://niem.gov/niem/appinfo/2.0" xmlns:nc="http://niem.gov/niem/niem-core/2.0" xmlns:niem-xsd="http://niem.gov/niem/proxy/xsd/2.0" xmlns:s="http://niem.gov/niem/structures/2.0">
            <!-- Loop through the Persons and create a NIEM Conformant Person -->
            <xsl:for-each select="$var1_instance_InputSchema/SomeBatchOfStuff/Person">
                <xsl:variable name="NonConformantPerson" select="."/>
                <nc:Person>
                    <xsl:attribute name="s:id">
                        <xsl:value-of select="generate-id(.)"/>
                    </xsl:attribute>
                    <nc:PersonName>
                        <xsl:for-each select="$NonConformantPerson/Name">
                            <nc:PersonFullName>
                                <xsl:value-of select="string(.)"/>
                            </nc:PersonFullName>
                        </xsl:for-each>
                    </nc:PersonName>
                </nc:Person>
            </xsl:for-each>
            
            <!-- Loop through the phone numbers and create a NIEM Conformant Contact Information -->
            <xsl:for-each select="$var1_instance_InputSchema/SomeBatchOfStuff/Person/PhoneNumber">
                <nc:ContactInformation>
                    <xsl:attribute name="s:id">
                        <xsl:value-of select="generate-id(.)"/>
                    </xsl:attribute>
                    <nc:ContactTelephoneNumber>
                        <nc:FullTelephoneNumber>
                            <nc:TelephoneNumberFullID>
                                <xsl:value-of select="string(.)"/>
                            </nc:TelephoneNumberFullID>
                        </nc:FullTelephoneNumber>
                    </nc:ContactTelephoneNumber>
                </nc:ContactInformation>
            </xsl:for-each>
            
            <!-- Loop through the Persons and create a NIEM Conformant Contact Information Association for each -->
            <xsl:for-each select="$var1_instance_InputSchema/SomeBatchOfStuff/Person">
                <nc:PersonContactInformationAssociation>
                    <nc:PersonReference>
                    <xsl:attribute name="s:ref">
                        <xsl:value-of select="generate-id(.)"/>
                    </xsl:attribute>
                    </nc:PersonReference>
                    <nc:ContactInformationReference>
                        <xsl:attribute name="s:ref">
                            <xsl:value-of select="generate-id(./PhoneNumber)"/>
                        </xsl:attribute>
                    </nc:ContactInformationReference>
                </nc:PersonContactInformationAssociation>
            </xsl:for-each>
        </MyNIEMConformantDocument>
    </xsl:template>                    
</xsl:stylesheet>

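The XSLT relies heavily on the generate-id() function: because the function returns the same identifier every time it is given the same input node, the s:ref values in each association match the s:id values written earlier. The exact identifier strings (d0e3 and so on) depend on the XSLT processor. The result is the following NIEM-conformant XML file: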

<?xml version="1.0" encoding="UTF-8" ?>   
<MyNIEMConformantDocument xmlns="MyNIEMDocumentNamespace" xmlns:i="http://niem.gov/niem/appinfo/2.0" xmlns:nc="http://niem.gov/niem/niem-core/2.0" xmlns:niem-xsd="http://niem.gov/niem/proxy/xsd/2.0" xmlns:s="http://niem.gov/niem/structures/2.0" xmlns:ns0="SomeNonConformantDocumentNamespace">
   <nc:Person s:id="d0e3">
      <nc:PersonName>
         <nc:PersonFullName>John Doe</nc:PersonFullName>
      </nc:PersonName>
   </nc:Person>
   <nc:Person s:id="d0e12">
      <nc:PersonName>
         <nc:PersonFullName>Sally Smith</nc:PersonFullName>
      </nc:PersonName>
   </nc:Person>
   <nc:ContactInformation s:id="d0e8">
      <nc:ContactTelephoneNumber>
         <nc:FullTelephoneNumber>
            <nc:TelephoneNumberFullID>212-111-2222</nc:TelephoneNumberFullID>
         </nc:FullTelephoneNumber>
      </nc:ContactTelephoneNumber>
   </nc:ContactInformation>
   <nc:ContactInformation s:id="d0e17">
      <nc:ContactTelephoneNumber>
         <nc:FullTelephoneNumber>
            <nc:TelephoneNumberFullID>212-333-4444</nc:TelephoneNumberFullID>
         </nc:FullTelephoneNumber>
      </nc:ContactTelephoneNumber>
   </nc:ContactInformation>
   <nc:PersonContactInformationAssociation>
      <nc:PersonReference s:ref="d0e3"/>
      <nc:ContactInformationReference s:ref="d0e8"/>
   </nc:PersonContactInformationAssociation>
   <nc:PersonContactInformationAssociation>
      <nc:PersonReference s:ref="d0e12"/>
      <nc:ContactInformationReference s:ref="d0e17"/>
   </nc:PersonContactInformationAssociation>
</MyNIEMConformantDocument>

Thursday, February 11, 2010

XSLT: Using the generate-id() Function

NIEM utilizes ID and IDREF attributes heavily throughout the data standard. While this is native to the W3C specification for XML Schema files (.XSD) and in no way “unique” to NIEM, it is used much more heavily in NIEM than in many other national and international standards.

When converting or transforming to NIEM from another data standard, it quickly becomes necessary to generate unique identifiers in a common and consistent manner for key “noun” elements such as Persons, Places, Vehicles, and the like. A number of home-grown functions are scattered around the Internet to do this; however, a native XSLT function called generate-id() already exists to perform this task.

Say the following non-NIEM-conformant XML payload is provided to a system processing citation data:

<CitationBatch>
  <Citation>
    <CitationNumber>123456</CitationNumber>
    <CitationDefendant>
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
      <PhoneNumber>123-456-7890</PhoneNumber>
    </CitationDefendant>
    <!-- Remainder Omitted -->
  </Citation>
</CitationBatch>

Within NIEM the <CitationDefendant> element above is termed the <j:CitationSubject> and includes a <nc:RoleOfPersonReference> rather than embedding all person information as child elements within the citation.  Additionally, the phone number for any given person is contained within a <nc:ContactInformation> element. 

The XSLT generate-id() function accepts a specific XML node as its input parameter and, within a single transformation, will consistently provide the same unique ID for that node no matter where or how many times it is called from within the XSLT. The ID is not guaranteed to be the same from one run to the next. For example, take the following XSLT snippets:

<xsl:for-each select="$xmlInputFile/CitationBatch/Citation">
        <xsl:variable name="xmlCiteNode" select="."/>
        <j:CitationSubject>
            <nc:RoleOfPersonReference>
                <xsl:attribute name="s:ref">
                    <xsl:value-of select="generate-id($xmlCiteNode/CitationDefendant)"/>
                </xsl:attribute>
            </nc:RoleOfPersonReference>
        </j:CitationSubject>
    </xsl:for-each>
    ....
    ....
    ....
    <xsl:for-each select="$xmlInputFile/CitationBatch/Citation/CitationDefendant">
        <xsl:variable name="xmlCiteSubjectNode" select="."/>
        <nc:Person>
            <xsl:attribute name="s:id">
                <xsl:value-of select="generate-id($xmlCiteSubjectNode)"/>
            </xsl:attribute>
        </nc:Person>
    </xsl:for-each>
    ....
    ....

Even though the generate-id() function is called in two places within the transform, using two different variable names, it returns exactly the same ID in both places, because the XPath expressions for both variables resolve to the same element in the input document. The ID is still unique to that element. The output of the above would appear as the following:

....
....
<j:CitationSubject>
    <nc:RoleOfPersonReference s:ref="d0e8"/>
</j:CitationSubject>
....
....
<nc:Person s:id="d0e8"/>
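For reference, the two fragments can be assembled into a complete stylesheet along the following lines. This is a sketch only: the CitationDocument wrapper element is hypothetical, the namespace URIs are the ones used elsewhere on this blog, and attribute value templates (the curly braces) are used in place of xsl:attribute for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:j="http://niem.gov/niem/domains/jxdm/4.0"
    xmlns:nc="http://niem.gov/niem/niem-core/2.0"
    xmlns:s="http://niem.gov/niem/structures/2.0">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <!-- Hypothetical wrapper element, not part of NIEM -->
    <CitationDocument>
      <!-- One reference per citation, pointing at its defendant -->
      <xsl:for-each select="CitationBatch/Citation">
        <j:CitationSubject>
          <nc:RoleOfPersonReference s:ref="{generate-id(CitationDefendant)}"/>
        </j:CitationSubject>
      </xsl:for-each>
      <!-- One person per defendant, identified by the same generated ID -->
      <xsl:for-each select="CitationBatch/Citation/CitationDefendant">
        <nc:Person s:id="{generate-id(.)}"/>
      </xsl:for-each>
    </CitationDocument>
  </xsl:template>
</xsl:stylesheet>
```

Run against a well-formed CitationBatch document like the sample above, each s:ref value should match the s:id of the corresponding nc:Person. The identifier strings themselves vary by XSLT processor.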

This powerful function within XSLT dramatically eases ID and IDREF usage within XML and makes implementation of transforms to NIEM relatively trivial.

Monday, January 11, 2010

XSLT: Transform Date and Time Elements into nc:DateTime

While NIEM practitioners tend to merge Date and Time elements together into single nc:DateTime elements, we often find that the outside world separates these into two fields in their XML data packages.  For example, if someone were to use Java XForms or Microsoft InfoPath to capture data in an electronic form, it is common to separate these out into their component parts.

For example, assume a NIBRS report form exists and has discrete date and time values. Using XSLT to merge these is quite simple and can be done using the concat() function as shown here:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet 
    version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns="SomeNibrsOffenseReportNamespace"
    exclude-result-prefixes="ns">
    
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    
    <xsl:template match="/">
        <xsl:variable name="sInputSchema" select="."/>
        
        <OffenseReportDocument xmlns="SomeNiemOffenseReportNamespace" xmlns:j="http://niem.gov/niem/domains/jxdm/4.0" xmlns:nc="http://niem.gov/niem/niem-core/2.0">
          <j:Offense>
            <nc:ActivityDate>
              <nc:DateTime>
                <xsl:value-of select="concat(string($sInputSchema/ns:NibrsForm/ns:OffenseDate), 'T', string($sInputSchema/ns:NibrsForm/ns:OffenseTime))"/>
              </nc:DateTime>
            </nc:ActivityDate>
          </j:Offense>
        </OffenseReportDocument>
    </xsl:template>
</xsl:stylesheet>

In the above example, the concat() function allows us to merge the date (e.g. ‘2010-01-01’), the letter ‘T’, and the time (e.g. ‘12:00:00’) into a single string (‘2010-01-01T12:00:00’), which in turn becomes the content of the nc:DateTime element.

1-13-10 – Edit for typo