Thursday, December 20, 2012

XSLT: Select Distinct in XSL 1.0

The further one dives into XSLT, the more likely it becomes that one will need to extract a list of unique values from an XML document. This is commonly done in SQL with the SELECT DISTINCT statement; unfortunately, there is no direct equivalent in XSLT 1.0.

To achieve this, one must use one of the more advanced features of XSLT: an XPath "axis" such as preceding-sibling::, which allows each value to be compared against the values that came before it.

To better understand, let's look at an example.  Given the following XSD snippet:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="houseCategory" abstract="true"/>
    <xsd:element name="houseCategoryText" type="xsd:string" substitutionGroup="houseCategory"/>
    <xsd:element name="houseCategoryFlag" type="xsd:boolean" substitutionGroup="houseCategory"/>
    <xsd:element name="housePurchaseDateRepresentation" abstract="true"/>
    <xsd:element name="housePurchaseDate" type="xsd:date" substitutionGroup="housePurchaseDateRepresentation"/>
    <xsd:element name="housePurchaseDateTime" type="xsd:dateTime" substitutionGroup="housePurchaseDateRepresentation"/>
</xsd:schema>

The following XSLT will extract a list of unique substitutionGroup attribute values from above and list them in the output: 

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="*">
    <xsl:for-each
        select="/xsd:schema/xsd:element/@substitutionGroup[not(. = ../preceding-sibling::xsd:element/@substitutionGroup/.)]">
        
        <xsl:element name="uniqueSubstitutionGroup">
            <xsl:value-of select="."/>
        </xsl:element>
        
    </xsl:for-each>    
  </xsl:template>
</xsl:stylesheet>

The resulting output would appear something like the following:

<?xml version="1.0" encoding="utf-8"?>
<uniqueSubstitutionGroup>houseCategory</uniqueSubstitutionGroup>
<uniqueSubstitutionGroup>housePurchaseDateRepresentation</uniqueSubstitutionGroup>

This approach was sourced in part from information provided on Stack Overflow.
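
For larger documents, a commonly used alternative in XSLT 1.0 is key-based grouping (often called Muenchian grouping), which avoids rescanning the preceding siblings for every value. The following is only a sketch of that technique applied to the same XSD snippet; the key name bySubstitutionGroup and the wrapper element uniqueSubstitutionGroups are illustrative names and are not part of the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xsd">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every xsd:element that has a substitutionGroup by that value.
       The key name is an assumption made for this sketch. -->
  <xsl:key name="bySubstitutionGroup"
      match="xsd:element[@substitutionGroup]"
      use="@substitutionGroup"/>

  <xsl:template match="/">
    <!-- A single wrapper element keeps the result well-formed. -->
    <uniqueSubstitutionGroups>
      <!-- Keep an element only if it is the first one in its key group. -->
      <xsl:for-each
          select="/xsd:schema/xsd:element[@substitutionGroup][generate-id() = generate-id(key('bySubstitutionGroup', @substitutionGroup)[1])]">
        <uniqueSubstitutionGroup>
          <xsl:value-of select="@substitutionGroup"/>
        </uniqueSubstitutionGroup>
      </xsl:for-each>
    </uniqueSubstitutionGroups>
  </xsl:template>
</xsl:stylesheet>
```

Run against the XSD snippet above, this should list the same two values, houseCategory and housePurchaseDateRepresentation, inside the wrapper element.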

Wednesday, October 24, 2012

XSLT: Namespace From Prefix in XSL 1.0

When transforming XML Schema (XSD) files in XSLT 1.0, it quickly becomes necessary to resolve the namespace bound to a prefix.  Sometimes the prefixed name is not the name of an actual element or attribute, so it has to be resolved in a different manner. For example, in the following XSD snippet, iso_639-3:LanguageCodeSimpleType is neither an element nor an attribute name; it is the text value of the type attribute.

<xsd:schema 
  xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
  xmlns:iso_639-3="http://niem.gov/niem/iso_639-3/2.0">
    <xsd:attribute name="languageCode" 
      type="iso_639-3:LanguageCodeSimpleType"/>
</xsd:schema>

One way to resolve which namespace the iso_639-3 prefix is bound to is to extract the prefix from the text and pass it to the following simple XSLT 1.0 named template. Note that it only considers the namespaces in scope on the document element.

<xsl:template name="namespaceFromPrefix">
  <xsl:param name="sPrefix"/>
  <xsl:value-of select="/*/namespace::*[name() = $sPrefix]"/>
</xsl:template>

If one were to pass in the prefix iso_639-3, the return value would be http://niem.gov/niem/iso_639-3/2.0
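
As a rough illustration of how the template might be called, the sketch below extracts the prefix from each type attribute with substring-before() and passes it in as sPrefix. The surrounding stylesheet, the text output method and the " -> " separator are assumptions made for this example only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <!-- The template from the post, unchanged. -->
  <xsl:template name="namespaceFromPrefix">
    <xsl:param name="sPrefix"/>
    <xsl:value-of select="/*/namespace::*[name() = $sPrefix]"/>
  </xsl:template>

  <xsl:template match="/">
    <!-- Visit every attribute declaration whose type is a prefixed name. -->
    <xsl:for-each select="/xsd:schema/xsd:attribute[contains(@type, ':')]">
      <xsl:value-of select="@type"/>
      <xsl:text> -> </xsl:text>
      <xsl:call-template name="namespaceFromPrefix">
        <!-- Everything before the colon is the prefix. -->
        <xsl:with-param name="sPrefix" select="substring-before(@type, ':')"/>
      </xsl:call-template>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the XSD snippet above, this should print the type value followed by the namespace URI bound to its prefix.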

Wednesday, October 17, 2012

Schematron: Using Schematron Client-Side!

As browsers continue to evolve and incorporate more standards, it is starting to become possible to better leverage XML directly in client-side browsers. 

A great series of articles (four in all) has been posted by a colleague about how to leverage XML in forms capture and even use Schematron to help validate it!  Check it out: http://udiminished.blogspot.com/2011/12/simplify-with-xml-data-model-part-1.html

Friday, September 21, 2012

XSLT: Extract a Branch from an XML Tree

Often you may wish to extract a portion of an XML file, including all of its child elements, in order to transform or handle it elsewhere.

For example, in web-service-based development, it is common for a developer to “extract” the payload from a request or response envelope.  The same applies when a NIEM developer wishes to obtain the payload portion of a LEXS package and further process, store or display it.

With XSLT, it only takes the following few lines of code to pull a branch of XML out of a larger package.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="xml" indent="yes"/>
    
    <xsl:template match="/uml">
        <xsl:apply-templates select="XMI" mode="RecursiveDeepCopy" />
    </xsl:template> 
    
    <xsl:template match="@*|node()" mode="RecursiveDeepCopy">
        <xsl:copy>
            <xsl:copy-of select="@*" />
            <xsl:apply-templates mode="RecursiveDeepCopy" />
        </xsl:copy> 
    </xsl:template> 

</xsl:stylesheet>

In the above example, the XMI element, along with its attributes and descendants, is extracted from the uml container element. Before the transform, the XML would look something like this:

<uml>
  <XMI>
    <XMI.header/>
    <XMI.content/>
  </XMI>
</uml>

After the transform, it would look like this:

<XMI>
  <XMI.header/>
  <XMI.content/>
</XMI>

For information on similar transforms, search the web for "identity transform."
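
When the branch needs no modification on the way out, xsl:copy-of can perform the same deep copy in a single instruction. The sketch below assumes the same uml/XMI input as above; the recursive template remains the better starting point when individual nodes must be changed or filtered during the copy.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="/uml">
        <!-- Deep-copies XMI with all of its attributes and descendants. -->
        <xsl:copy-of select="XMI"/>
    </xsl:template>

</xsl:stylesheet>
```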

Saturday, September 8, 2012

XSD: Extending Code Lists with xsd:union

In certain circumstances it is necessary to add values to an existing NIEM enumeration (or code list).  In these situations one may choose to simply create a new list that repeats all the values already defined in the NIEM code type and adds those which do not yet exist.  However, when the code list is larger than a few values (such as a state code list with at least 50 valid values), xsd:union becomes a more appealing option.

The xsd:union element provides a way to combine simple data types into a larger, more comprehensive data type.  An example would be adding “ZZ” to the list of U.S. Postal Service (USPS) state codes to communicate an unknown or invalid state.  This can be accomplished by extending the existing USPS code list in several steps.

Step 1 – Create a New Simple Type With New Values

<!-- Simple code value to add ZZ as a valid value -->
<xsd:simpleType name="USStateCodeDefaultSimpleType">
  <xsd:restriction base="xsd:token">
   <xsd:enumeration value="ZZ">
    <xsd:annotation>
     <xsd:documentation>UNKNOWN</xsd:documentation>
    </xsd:annotation>
   </xsd:enumeration>
  </xsd:restriction>
</xsd:simpleType>

Step 2 – Use xsd:union to Join the Values with Existing Values

<!-- New simple type combining my custom enum with the standard USPS one -->
<xsd:simpleType name="LocationStateCodeSimpleType">
  <xsd:union memberTypes="usps:USStateCodeSimpleType my:USStateCodeDefaultSimpleType"/>
</xsd:simpleType>

Step 3 – Wrap the New Simple Data Type in a Complex Type

<!-- New complexType required to add s:id and s:idref to the definition -->
<xsd:complexType name="LocationStateCodeType">
  <xsd:simpleContent>
    <xsd:extension base="my:LocationStateCodeSimpleType">
      <xsd:attributeGroup ref="s:SimpleObjectAttributeGroup"/>
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>

Step 4 – Create Element Instantiating the New Code List

<!-- Element declaration allowing use of our new data type -->
<xsd:element name="NewStateCode" type="my:LocationStateCodeType" substitutionGroup="nc:LocationStateCode"/>

Now any place an nc:LocationStateCode can be used, our extended code list can be used instead.
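
To try the union mechanism in isolation, without the NIEM imports, a minimal self-contained schema along the following lines may help. Everything in it is a stand-in invented for this sketch: the http://example.com/union-sketch namespace, the ex prefix, the type and element names, and the two-value list that plays the role of the existing code list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://example.com/union-sketch"
    targetNamespace="http://example.com/union-sketch"
    elementFormDefault="qualified">

  <!-- Stand-in for the existing code list (only two values shown). -->
  <xsd:simpleType name="StateCodeSimpleType">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="AZ"/>
      <xsd:enumeration value="CA"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- The additional value to be allowed. -->
  <xsd:simpleType name="StateCodeDefaultSimpleType">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="ZZ"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A value is valid if it is valid against either member type. -->
  <xsd:simpleType name="ExtendedStateCodeSimpleType">
    <xsd:union memberTypes="ex:StateCodeSimpleType ex:StateCodeDefaultSimpleType"/>
  </xsd:simpleType>

  <xsd:element name="StateCode" type="ex:ExtendedStateCodeSimpleType"/>
</xsd:schema>
```

With this schema, an ex:StateCode element containing AZ, CA or ZZ should validate, while any other value should be rejected.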

Wednesday, July 25, 2012

XSLT: Convert Standard U.S. Date into xsd:date Format

When working with XML in the United States (U.S.), one will often find dates formatted in the traditional U.S. short format (M/D/YYYY), even though XML Schema (XSD) enforces a more locale-neutral format (YYYY-MM-DD).  This often means converting data from:

<SomeUsDate>1/10/2001</SomeUsDate>

Into:

<SomeXsdDate>2001-01-10</SomeXsdDate>

If one is using XSLT 2.0, this can be done simply by including the FunctX library and calling the relevant date function.

In XSLT 1.0, only a very limited set of string manipulation functions exists. Even so, it is possible (although convoluted) to convert a typical U.S.-formatted date into the format XML Schema requires. A possible solution is listed below:

<xsl:template name="aoc:txfDateFormat">
         <xsl:param name="UsDate"/>

         <xsl:choose>

             <!-- Test to see if date contains a date with slashes in it. -->
             <xsl:when test="contains($UsDate, '/')">
                 <xsl:choose>

                     <!-- 2 Digit Month, 2 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=2 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=2)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-',substring-before($UsDate, '/'), '-', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 1 Digit Month, 2 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=1 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=2)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-0',substring-before($UsDate, '/'), '-', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 2 Digit Month, 1 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=2 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=1)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-',substring-before($UsDate, '/'), '-0', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>

                     <!-- 1 Digit Month, 1 Digit Day -->
                     <xsl:when
                         test="string-length(substring-before($UsDate, '/'))=1 and (string-length(substring-before(substring-after($UsDate, '/'), '/'))=1)">
                         <nc:Date>
                             <xsl:value-of
                                 select="concat(substring-after(substring-after($UsDate, '/'), '/'),'-0',substring-before($UsDate, '/'), '-0', substring-before(substring-after($UsDate, '/'), '/'))"
                             />
                         </nc:Date>
                     </xsl:when>
                 </xsl:choose>
             </xsl:when>

             <!-- Omit element if not. -->
             <xsl:otherwise/>
         </xsl:choose>
   </xsl:template>
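
The four branches above differ only in where a leading zero is inserted, so a more compact variant can let format-number() do the padding. The sketch below is an alternative under stated assumptions, not a drop-in replacement: it matches the SomeUsDate element from the earlier example and emits SomeXsdDate rather than nc:Date, and it assumes input of the form M/D/YYYY with numeric parts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="SomeUsDate">
    <!-- Split the value on its two slashes. -->
    <xsl:variable name="month" select="substring-before(., '/')"/>
    <xsl:variable name="rest" select="substring-after(., '/')"/>
    <xsl:variable name="day" select="substring-before($rest, '/')"/>
    <xsl:variable name="year" select="substring-after($rest, '/')"/>

    <!-- number(x) = number(x) is false for NaN, so the element is
         omitted when any part is missing or not numeric. -->
    <xsl:if test="number($month) = number($month)
                  and number($day) = number($day)
                  and number($year) = number($year)">
      <SomeXsdDate>
        <!-- format-number pads month and day to two digits. -->
        <xsl:value-of select="concat($year, '-',
                                     format-number($month, '00'), '-',
                                     format-number($day, '00'))"/>
      </SomeXsdDate>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Given the SomeUsDate element shown earlier, containing 1/10/2001, this produces a SomeXsdDate element containing 2001-01-10. Like the original, it does not check that the month and day fall within valid ranges.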

Friday, March 18, 2011

Editorial: Tablets in the Public Sector - Part 1

As 2011 has been termed by some in the media "The Year of the Tablet" (Fox, CIO & NYT), this is the first part in a series of articles meant to articulate the role Tablet PCs could potentially take in the Public Sector and how our NIEM community might react to this trend.

While not directly related to Schematron or XSLT, we would be remiss in ignoring modern trends in our application of these technologies.  The fact is that unless our tools and technologies are adapted to support these trends, they risk being left behind.

This first article will focus on the Tablet and the various roles it can potentially take in the Public Sector.  Some in the media have simply divided the device's role into two oversimplified categories: Content Creation and Content Consumption.  While it is true that devices of any type can do these two things to differing degrees, it is probably easier to understand Tablet technologies in terms of actual public sector use cases.

In the Boardroom

In meetings, the Tablet most often replaces paper handouts.  One large city estimated that over $1 million per year could be saved in paper and printing costs by simply eliminating meeting handouts in favor of electronic meeting materials.  An electronic medium also makes it easy to store, preserve and reexamine the materials at a later date, or to share them with those who were unable to physically attend the meeting.

An example of where this use case is already becoming a reality is in the NLETS board, where each of the board members is issued a tablet upon which they are provided meeting materials and access to any other critical information prior to and during each meeting.

It is also possible for the Tablet to take the role of the "notebook," the medium where one takes notes about the meeting; however, more often than not, the lack of a stylus or keyboard (barring separate accessories) for rapid data entry makes this impractical at this point in time.

Current Advantages

- Size & Weight
- Paper Cost Savings
- Ease of Later Referencing
- Ease of Real-Time Update

Current Disadvantages

- Poor Annotation Capabilities

In the Courtroom

There are a number of situations where paper was the predominant medium for hundreds of years, only to be replaced very recently with electronic options.  In the late 1990s, attorneys began to adopt notebook PCs, yet these have never been fully accepted in the courtroom.  Many courtrooms were never built with a PC in mind, so the lack of power outlets, desk space and network connectivity quickly became impediments.

With the advent of Tablet technology, there is renewed interest from defense and prosecution alike.  An example was recently documented by the Mac Litigator, which highlighted the use of a Tablet in a four-day jury trial.  Many of the issues with a laptop are overcome by the use of an unobtrusive, always-connected, long-lasting Tablet.

Current Advantages

- Size & Desk Footprint
- Paper Cost Savings
- Battery Life
- Real-Time Updates
- Real-Time Reference

Current Disadvantages

- Poor Data Input Capabilities
- Possible Courtroom Distraction
- Enterprise Security Risk

In the Field

The public sector employs a large number of individuals involved in on-site investigations of various kinds.  Whether it is a city building inspector, a fire marshal or a police detective, they all have similar needs to gather and process data in the field.  For years now, investigators of this kind have been using laptop computers to address this need; however, the laptop has always been relegated to the car or the office, as it would get in the way of the investigator at the work site.  Tablets seem to fit this niche well, as their size, weight and connectivity are better suited to the job.

Some agencies are already beginning to investigate and adopt Tablet technology in this manner.  The Knox County Sheriff replaced a number of their detectives' laptops with Tablets, as described in an article on the NLETS website.

Wireless Tablet technology coupled with GPS, maps and satellite imagery makes Tablets extremely relevant to those public sector employees working in the construction field.  With a Tablet, a building inspector or DOT foreman can walk onto a job site and quickly determine where there may be trouble.  Tablets can also give real-time access to building codes and municipal ordinances as well as architectural images, plans and blueprints.

Current Advantages

- Size & Weight
- Real-Time Reference
- Real-Time Transmission

Current Disadvantages

- Hardware Price Point
- Poor Data Input Capabilities
- Lack of "Rugged" Models
- Enterprise Security Risks

In the Classroom

It seems that whenever the topic of Tablets or eBooks is brought up, a spotlight is placed on the classroom and the typical college student hefting around hundreds of pounds' worth of textbooks.  While this makes sense on the surface and in 10-second sound bites, the classroom setting has experienced significant struggles in making Tablets and eBooks a reality.  A recent article by the Chronicle of Higher Education points out a number of the difficulties experienced across several different institutions.

Regardless of the problems currently being experienced, as they are addressed, the classroom is a very likely place for Tablet technology to take a strong future foothold.  As a Melbourne Trinity College study points out, even with the disadvantages, the usage of Tablets is still expected to increase at the University level in the upcoming years.

Current Advantages

- Real-Time Learning Feedback
- Size & Weight
- Battery Life

Current Disadvantages

- Hardware Price Point
- eTextbook Prices Still High
- Poor Annotation Capabilities
- Potential Classroom Distraction


In summary, Tablets can be seen and used in a number of environments within the public sector and one should not be surprised as they begin to become less of a consumer toy and more of an industry tool. 

Please bear in mind that we fully acknowledge that the editorial opinions stated in the article above are based on industry generalizations and we fully understand that for every generalization or rule there is an exception.