Showing posts with label XML. Show all posts

Wednesday, October 2, 2013

jQuery: NIEM Parsing with jQuery

This series of articles will look at how to leverage jQuery to access and manipulate NIEM-conformant XML payloads. The first article looks simply at how to access, parse and read the data contained in an nc:AddressFullText element, strip whitespace from it, and place it in a text area on a web page. Later articles will explain how to handle multiple occurrences of a single element and eventually map the address on a web page.
jQuery is a powerful scripting library used by many modern developers in creating and processing web pages. We will be using it as a client-side script, where the user's browser works on an XML stream directly without doing any work on a server. This is one approach to leveraging the power of a client's own browser in the current "AJAX" development strategy.
The first step in leveraging jQuery is including the appropriate library on your web page. There are both compressed and uncompressed versions of this library available on jquery.com, and in many cases you can simply link to a hosted copy of the library on the web. For this exercise, let's assume we have a copy of the library hosted on the same server (and in the same directory) as the HTML page we're writing. In this case, we will include the script by using the following:
<!-- jQuery Include (change to match the file name and location where you have the jQuery library stored) -->
<script src="jquery-1.10.2.js"></script>
 
 
With the library included on our HTML page, we can access all of its features from within our on-page JavaScript. In order to use a stream of XML data, we'll first need to parse it. In jQuery we simply use the $.parseXML(string) command, which returns a parsed XML document. As we don't want our users to receive unhandled exceptions, and there is never a guarantee that the XML we process is free of errors, we want to be sure to use try/catch when parsing. Here is what the parsing snippet would look like:
//the try & catch block support for .parseXML() requires jQuery 1.9 or newer!
  try{
       var xmlDoc = $.parseXML(xml);
       var $xml = $(xmlDoc);
     } catch (err) {
       alert("Error parsing XML!  Likely invalid XML or undefined namespace prefix.");
       return;
     }
 
Once we have successfully parsed the XML document, we can access portions of the document using a number of jQuery-supported methods. One of the simplest methods (and the one we will use today) is the xmlDocument.find('pattern') method. It simply returns an array-like collection of all the elements that match the pattern passed into the function. For example, if we want to "find" all of the elements called PersonName, we would simply pass it 'PersonName'. Here, we will assume we're working with nc:Address elements that carry a plain-text address in nc:AddressFullText, and our expanded code will look something like this:
$address = $xml.find("nc\\:AddressFullText, AddressFullText");

You will notice the selector string we passed in contains two comma-separated patterns. This is to work around a compatibility issue between Microsoft Internet Explorer (IE) and the other major browsers (Chrome, Firefox, Safari, etc.). IE allows us to use an escape-sequenced colon to define our element prefix (\\:), however the other browsers do not. By passing both a prefixed and a simple element name, both browser types will accept and process the find properly.
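If this workaround is needed for more than one element, it can be wrapped in a small helper so the prefixed and unprefixed forms always stay in sync. This is only a sketch: niemSelector is a name made up for this illustration and is not part of jQuery.

```javascript
// Hypothetical helper (not a jQuery API): builds the combined selector used above.
// The doubled backslash in the source yields the single backslash that escapes the colon.
function niemSelector(prefix, localName) {
  return prefix + "\\:" + localName + ", " + localName;
}

// Intended use with the parsed document from earlier:
//   $xml.find(niemSelector("nc", "AddressFullText"))
console.log(niemSelector("nc", "AddressFullText")); // prints: nc\:AddressFullText, AddressFullText
```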
Since we are processing a plain text field, we will likely have some extra whitespace (spaces and line-feeds) that we don't want. We can simply use a string.replace('pattern') with a regex to clear out what we don't want and save the text value contained in our element by using the .text() method as shown in the following.
//Clear Whitespaces and save text value of the element in a string variable
textAddress = $address.text().replace(/^(\s*)|(\s*)$/g, '').replace(/\s+/g, ' ');
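
To see what the two replace() calls do without a browser, here is a small stand-alone sketch. The function name cleanWhitespace is an assumption for illustration, and the first regex is a slightly simplified equivalent of the one above.

```javascript
// Clean a multi-line text value the same way the one-liner above does.
function cleanWhitespace(text) {
  return text
    .replace(/^\s+|\s+$/g, "") // drop leading and trailing whitespace
    .replace(/\s+/g, " ");     // squeeze inner runs (including line feeds) to one space
}

// Same shape as the text inside the sample nc:AddressFullText element.
var raw = "\n        One Microsoft Way\n        Redmond, WA 98052\n    ";
console.log(cleanWhitespace(raw)); // prints: One Microsoft Way Redmond, WA 98052
```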
 
Once we have the value saved in a variable, we can now use it in any way we want directly in our script. The following is the full HTML and JavaScript/jQuery code for this example including a text field for entering NIEM-conformant XML and a text field for the parsed data.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>nc:Address jQuery Parse Example</title>

    <!-- jQuery Include (change to match the file name and location where you have the jQuery library stored) -->
    <script src="jquery-1.10.2.js"></script> 

    <!-- On Page JavaScript -->
    <script>
        
        function resolveAddress() {
            var textAddress;
            var $address;
            var xml = textField.value;

            //the try & catch block support for .parseXML() requires jQuery 1.9 or newer!
            try{
                var xmlDoc = $.parseXML(xml);
                var $xml = $(xmlDoc);
            } catch (err) {
                alert("Error parsing XML!  Likely invalid XML or undefined namespace prefix.");
                return;
            }

            //jQuery .find()
            //find any and all values stored in the (possibly repeating) element nc:AddressFullText
            //The following returns an ELEMENT and requires we use .text() to access the return value
            $address = $xml.find("nc\\:AddressFullText, AddressFullText");
            //Clear Whitespaces
            textAddress = $address.text().replace(/^(\s*)|(\s*)$/g, '').replace(/\s+/g, ' ');
            //Output Value
            textFindOutput.value = textAddress;


        }
    </script>

</head>
<body>
    <div>
    <p style="font-family:'Segoe UI'">
        Enter or paste in a valid nc:Address block and click on the "Parse Data" button.
    </p>

    <textarea id="textField" style="width:100%;height:100px">
<myExchange xmlns:nc='http://niem.gov/niem/niem-core/2.0'>
  <nc:Address>
    <nc:AddressFullText>
        One Microsoft Way
        Redmond, WA 98052
    </nc:AddressFullText>
  </nc:Address>
</myExchange>
    </textarea>

    <p><button id="go" onclick="resolveAddress()">Parse Data</button></p>
    </div>
    <div>
        <p style="font-family:'Segoe UI'">jQuery .find("nc\\:AddressFullText") using .text() to access the <u>text</u> found.</p>
        <textarea id="textFindOutput" style="width:100%;height:100px"></textarea>
    </div>

     

</body>
</html>
Additionally, you can simply access a sample of this code working here.

Thursday, December 20, 2012

XSLT: Select Distinct in XSL 1.0

The further one dives into XSLT, the more likely it becomes that one will need to extract a list of unique values from an XML document. This is commonly done in SQL through the SELECT DISTINCT statement. Unfortunately, there is no direct equivalent in XSLT 1.0.

In order to perform this sort of functionality, one must leverage some of the more advanced aspects of XSLT, including preceding-sibling:: or another such "axis", as it's known in XSL.

To better understand, let's look at an example.  Given the following XSD snippet:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="houseCategory" abstract="true"/>
    <xsd:element name="houseCategoryText" type="xsd:string" substitutionGroup="houseCategory"/>
    <xsd:element name="houseCategoryFlag" type="xsd:boolean" substitutionGroup="houseCategory"/>
    <xsd:element name="housePurchaseDateRepresentation" abstract="true"/>
    <xsd:element name="housePurchaseDate" type="xsd:date" substitutionGroup="housePurchaseDateRepresentation"/>
    <xsd:element name="housePurchaseDateTime" type="xsd:dateTime" substitutionGroup="housePurchaseDateRepresentation"/>
</xsd:schema>

The following XSLT will extract a list of unique substitutionGroup attribute values from above and list them in the output: 

<?xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="*">
    <xsl:for-each
        select="/xsd:schema/xsd:element/@substitutionGroup[not(. = ../preceding-sibling::xsd:element/@substitutionGroup/.)]">
        
        <xsl:element name="uniqueSubstitutionGroup">
            <xsl:value-of select="."/>
        </xsl:element>
        
    </xsl:for-each>    
  </xsl:template>
</xsl:stylesheet>

The resulting output would appear something like the following:

<?xml version="1.0" encoding="utf-8"?>
<uniqueSubstitutionGroup>houseCategory</uniqueSubstitutionGroup>
<uniqueSubstitutionGroup>housePurchaseDateRepresentation</uniqueSubstitutionGroup>
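
For readers more at home in script than XPath, the predicate's logic ("keep a value only if no earlier sibling has the same value") can be sketched in JavaScript. This is an analogy under the assumption that the attribute values have already been collected into an array in document order; distinctValues is a made-up name.

```javascript
// Keep a value only when no earlier entry carries the same value,
// mirroring the not(. = ../preceding-sibling::...) predicate.
function distinctValues(values) {
  return values.filter(function (value, index) {
    return values.indexOf(value) === index; // true only for the first occurrence
  });
}

// substitutionGroup values in document order, taken from the XSD snippet above
var groups = [
  "houseCategory",
  "houseCategory",
  "housePurchaseDateRepresentation",
  "housePurchaseDateRepresentation"
];
console.log(distinctValues(groups)); // logs the two distinct group names
```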

The information surrounding this question was sourced in part from information provided on Stack Overflow here.

Saturday, September 8, 2012

XSD: Extending Code Lists with xsd:union

In certain circumstances it is necessary to add elements to an existing NIEM enumeration (or code list).  In these situations one may choose to simply recreate a new list with all the same elements already defined in a NIEM code type and simply add those which do not yet exist.  However, when the code list is larger than a few elements (such as a state code list with at least 50 valid values), using xsd:union as an option becomes more appealing.

The xsd:union provides a way to combine simple data types together to form a larger and more comprehensive data type.  An example would be simply adding “ZZ” to the list of US Postal Service (USPS) state codes to communicate an unknown or invalid state.  This can be accomplished by extending the existing USPS code list in several steps.

Step 1 – Create a New Simple Type With New Values

<!-- Simple code value to add ZZ as a valid value -->
<xsd:simpleType name="USStateCodeDefaultSimpleType">
  <xsd:restriction base="xsd:token">
   <xsd:enumeration value="ZZ">
    <xsd:annotation>
     <xsd:documentation>UNKNOWN</xsd:documentation>
    </xsd:annotation>
   </xsd:enumeration>
  </xsd:restriction>
</xsd:simpleType>

Step 2 – Use xsd:union to Join the Values with Existing Values

<!-- New simple type combining my custom enum with the standard usps one -->
<xsd:simpleType name="LocationStateCodeSimpleType">
  <xsd:union memberTypes="usps:USStateCodeSimpleType my:USStateCodeDefaultSimpleType"/>
</xsd:simpleType>

Step 3 – Wrap the New Simple Data Type in a Complex Type

<!-- New complexType required to add s:id and s:idref to the definition -->
<xsd:complexType name="LocationStateCodeType">
  <xsd:simpleContent>
    <xsd:extension base="my:LocationStateCodeSimpleType">
      <xsd:attributeGroup ref="s:SimpleObjectAttributeGroup"/>
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>

Step 4 – Create Element Instantiating the New Code List

<!-- Element declaration allowing use of our new data type -->
<xsd:element name="NewStateCode" type="my:LocationStateCodeType" substitutionGroup="nc:LocationStateCode"/>

Now any place an nc:LocationStateCode can be used, our extended code list can be used instead.
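
The effect of the union on validation can be sketched in a few lines of JavaScript: a value is accepted when any one of the member types accepts it. The code list below is a deliberately tiny, illustrative subset rather than the real USPS list, and the function name is made up.

```javascript
// Illustrative subset only; the real USPS code list is much longer.
var uspsCodes = ["CA", "NY", "WA"];
// The single extra value contributed by USStateCodeDefaultSimpleType.
var defaultCodes = ["ZZ"];

// A union type accepts a value when at least one member type accepts it.
function isValidLocationStateCode(code) {
  return uspsCodes.indexOf(code) !== -1 || defaultCodes.indexOf(code) !== -1;
}

console.log(isValidLocationStateCode("WA")); // true
console.log(isValidLocationStateCode("ZZ")); // true
console.log(isValidLocationStateCode("XX")); // false
```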

Wednesday, October 20, 2010

XSLT: Transform XML into nc:ContactInformation Structure

This is a short post to show how to leverage XSLT to convert a simple and generic XML file into NIEM-conformant XML as it pertains to the nc:ContactInformation block. 

This is a very common situation where a "non-NIEM" data stream is received and needs to be converted to a conformant structure.  Take the following sample non-NIEM XML instance:

<?xml version="1.0" encoding="UTF-8" ?>     
<SomeBatchOfStuff>
   <Person>
       <Name>John Doe</Name>
       <PhoneNumber>212-111-2222</PhoneNumber>
   </Person>
   <Person>
       <Name>Sally Smith</Name>
       <PhoneNumber>212-333-4444</PhoneNumber>
   </Person>
</SomeBatchOfStuff>

If this very logical structure needed to be converted into nc:Person and nc:ContactInformation elements (with an nc:PersonContactInformationAssociation object to link the two together), the following XSLT could be used:

<?xml version="1.0" encoding="UTF-8" ?>     
<xsl:stylesheet  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns0="SomeNonConformantDocumentNamespace" version="1.0" exclude-result-prefixes="xs">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/">
        <xsl:variable name="var1_instance_InputSchema" select="."/>
        <MyNIEMConformantDocument xmlns="MyNIEMDocumentNamespace" xmlns:i="http://niem.gov/niem/appinfo/2.0" xmlns:nc="http://niem.gov/niem/niem-core/2.0" xmlns:niem-xsd="http://niem.gov/niem/proxy/xsd/2.0" xmlns:s="http://niem.gov/niem/structures/2.0">
            <!-- Loop through the Persons and create an NIEM Conformant Person -->
            <xsl:for-each select="$var1_instance_InputSchema/SomeBatchOfStuff/Person">
                <xsl:variable name="NonConformantPerson" select="."/>
                <nc:Person>
                    <xsl:attribute name="s:id">
                        <xsl:value-of select="generate-id(.)"/>
                    </xsl:attribute>
                    <nc:PersonName>
                        <xsl:for-each select="$NonConformantPerson/Name">
                            <nc:PersonFullName>
                                <xsl:value-of select="string(.)"/>
                            </nc:PersonFullName>
                        </xsl:for-each>
                    </nc:PersonName>
                </nc:Person>
            </xsl:for-each>
            
            <!-- Loop through the phone numbers and create a NIEM Conformant Contact Information -->
            <xsl:for-each select="$var1_instance_InputSchema/SomeBatchOfStuff/Person/PhoneNumber">
                <nc:ContactInformation>
                    <xsl:attribute name="s:id">
                        <xsl:value-of select="generate-id(.)"/>
                    </xsl:attribute>
                    <nc:ContactTelephoneNumber>
                        <nc:FullTelephoneNumber>
                            <nc:TelephoneNumberFullID>
                                <xsl:value-of select="string(.)"/>
                            </nc:TelephoneNumberFullID>
                        </nc:FullTelephoneNumber>
                    </nc:ContactTelephoneNumber>
                </nc:ContactInformation>
            </xsl:for-each>
            
            <!-- Loop through the Persons and create a NIEM Conformant Contact Information Association linking each person to their phone number -->
            <xsl:for-each select="$var1_instance_InputSchema/SomeBatchOfStuff/Person">
                <nc:PersonContactInformationAssociation>
                    <nc:PersonReference>
                        <xsl:attribute name="s:ref">
                            <xsl:value-of select="generate-id(.)"/>
                        </xsl:attribute>
                    </nc:PersonReference>
                    <nc:ContactInformationReference>
                        <xsl:attribute name="s:ref">
                            <xsl:value-of select="generate-id(./PhoneNumber)"/>
                        </xsl:attribute>
                    </nc:ContactInformationReference>
                </nc:PersonContactInformationAssociation>
            </xsl:for-each>
        </MyNIEMConformantDocument>
    </xsl:template>                    
</xsl:stylesheet>

The XSLT heavily leverages the XSLT generate-id() function in order to work its magic and result in the following NIEM-conformant XML file:

<?xml version="1.0" encoding="UTF-8" ?>   
<MyNIEMConformantDocument xmlns="MyNIEMDocumentNamespace" xmlns:i="http://niem.gov/niem/appinfo/2.0" xmlns:nc="http://niem.gov/niem/niem-core/2.0" xmlns:niem-xsd="http://niem.gov/niem/proxy/xsd/2.0" xmlns:s="http://niem.gov/niem/structures/2.0" xmlns:ns0="SomeNonConformantDocumentNamespace">
   <nc:Person s:id="d0e3">
      <nc:PersonName>
         <nc:PersonFullName>John Doe</nc:PersonFullName>
      </nc:PersonName>
   </nc:Person>
   <nc:Person s:id="d0e12">
      <nc:PersonName>
         <nc:PersonFullName>Sally Smith</nc:PersonFullName>
      </nc:PersonName>
   </nc:Person>
   <nc:ContactInformation s:id="d0e8">
      <nc:ContactTelephoneNumber>
         <nc:FullTelephoneNumber>
            <nc:TelephoneNumberFullID>212-111-2222</nc:TelephoneNumberFullID>
         </nc:FullTelephoneNumber>
      </nc:ContactTelephoneNumber>
   </nc:ContactInformation>
   <nc:ContactInformation s:id="d0e17">
      <nc:ContactTelephoneNumber>
         <nc:FullTelephoneNumber>
            <nc:TelephoneNumberFullID>212-333-4444</nc:TelephoneNumberFullID>
         </nc:FullTelephoneNumber>
      </nc:ContactTelephoneNumber>
   </nc:ContactInformation>
   <nc:PersonContactInformationAssociation>
      <nc:PersonReference s:ref="d0e3"/>
      <nc:ContactInformationReference s:ref="d0e8"/>
   </nc:PersonContactInformationAssociation>
   <nc:PersonContactInformationAssociation>
      <nc:PersonReference s:ref="d0e12"/>
      <nc:ContactInformationReference s:ref="d0e17"/>
   </nc:PersonContactInformationAssociation>
</MyNIEMConformantDocument>
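
The id/ref linking is the part of this transform that tends to trip people up, so here is the same idea sketched with plain JavaScript objects. Everything in it is an assumption for illustration: the object shapes, the toNiem name, and the simple "P1"/"C1" identifiers that stand in for whatever generate-id() would produce.

```javascript
// Hypothetical in-memory version of the transform: plain objects in, linked objects out.
function toNiem(people) {
  var result = { persons: [], contacts: [], associations: [] };
  people.forEach(function (person, i) {
    var personId = "P" + (i + 1);  // stands in for generate-id(.) on the Person
    var contactId = "C" + (i + 1); // stands in for generate-id(./PhoneNumber)
    result.persons.push({ id: personId, fullName: person.name });
    result.contacts.push({ id: contactId, telephone: person.phoneNumber });
    // The association carries only references, never the data itself.
    result.associations.push({ personRef: personId, contactRef: contactId });
  });
  return result;
}

var out = toNiem([
  { name: "John Doe", phoneNumber: "212-111-2222" },
  { name: "Sally Smith", phoneNumber: "212-333-4444" }
]);
console.log(out.associations);
```

The point of the design is that the same identifier is computed in two places (once for s:id, once for s:ref), which is exactly why the XSLT can rely on generate-id() returning the same value for the same node within a single transformation.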

Friday, November 13, 2009

Schematron: Validating NIEM Documents Against Non-Conformant Code Lists

Schematron rules and assertions are based upon XPath statements, which allow for a number of powerful XML querying capabilities. Two XPath capabilities leveraged and outlined in this section are the doc() function and XPath predicates, which together allow us to validate data captured in a NIEM XML instance against an external code list of any kind.

Let's assume a scenario where we would like to validate an exchange document’s category against a predefined list of enumerated values.  This list is maintained by an outside party in a format other than NIEM and changes on a fairly regular basis. 

Traditionally, a NIEM practitioner would take this list and define an enumeration within an extension schema to enforce this code list.  Each time the third party makes a change to that code list, an updated NIEM extension schema would be created and redistributed.  This maintenance-intensive process could become overwhelming.  In this scenario the team therefore chose to simply adopt the third-party list, keep it in the following non-conformant format, and rely on Schematron to perform the validation:

<?xml version="1.0" encoding="UTF-8"?>
<!-- List of Valid code Values -->
<CategoryList>
  <Category>a</Category>
  <Category>b</Category>
</CategoryList>

As shown in the above, the valid categories include the values “a” and “b”.  An example of a NIEM-conformant XML payload would look something like the following:

<ns:SomeDocument 
    xmlns:nc="http://niem.gov/niem/niem-core/2.0"    
    xmlns:ns="http://www.niematron.org/SchematronTestbed"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.niematron.org/SchematronTestbed  ./SomeDocument.xsd">
  <nc:DocumentCategoryText>A</nc:DocumentCategoryText>
  <!-- Remaining Document Elements Omitted -->
</ns:SomeDocument>

In this example, the developers would like to perform the validation ignoring case, therefore the Schematron rule to validate the nc:DocumentCategoryText against the third-party-provided list would look something like the following:

<pattern id="eDocumentCategory">
  <title>Verify the document category matches the external list of valid categories.</title>
  <rule context="/ns:SomeDocument">
    <let name="sText" value="lower-case(nc:DocumentCategoryText)"/>
    <assert test="count(doc('./CategoryList.xml')/CategoryList/Category[. = $sText]) &gt; 0">
      Invalid document category.
    </assert>
  </rule>
</pattern>

Let's look at some of the key statements in the above Schematron example, breaking it into individual parts. 

  • lower-case(nc:DocumentCategoryText) – This statement, encapsulated in a <let> element, converts the text in the NIEM payload to lower case, thereby ignoring deviations from the code list due to case.  The result is stored in a temporary variable referenced as $sText.
  • doc('./CategoryList.xml')/… – This effectively points the parser at the third-party provided file (in this example assumed to be in the same directory as the .sch file) so that elements from that file can be referenced using XPath in addition to elements in the source payload document. 
  • …/Category[. = $sText] – Square brackets ([ and ]) in an XPath statement enclose a predicate.  Any number of predicates can be used to help filter the nodes selected by an XPath, but in this case the expression tells the parser to select all of the Category elements whose value equals the value held in the variable $sText.
  • count(…) &gt; 0 – The XPath count() function returns the number of nodes selected by the expression.  If no matching category existed, count() would return zero, so we check that the value is greater than zero, meaning a match exists in the external code list.
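
Put together, the four parts above amount to a case-insensitive lookup. A rough JavaScript equivalent, assuming the contents of CategoryList.xml have already been loaded into an array (the array and function name here are stand-ins, not part of any Schematron tooling), might look like this:

```javascript
// Stand-in for the Category values read from CategoryList.xml.
var categoryList = ["a", "b"];

function isValidCategory(documentCategoryText) {
  var sText = documentCategoryText.toLowerCase();          // lower-case(...)
  var matches = categoryList.filter(function (category) {  // Category[. = $sText]
    return category === sText;
  });
  return matches.length > 0;                               // count(...) > 0
}

console.log(isValidCategory("A")); // true, because "A" is lower-cased to "a"
console.log(isValidCategory("C")); // false, "c" is not in the list
```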

Friday, November 6, 2009

Schematron: Enforce String Patterns in Schematron

In the general area of XML schemas, XSD “patterns” are commonly used to enforce special string formatting constraints.  This is a very powerful tool when a document recipient wishes to ensure that the sender provides string data in a consistent format.  A common example of a string constraint is validating the structure of a Social Security Number (SSN).  This would be expressed in a typical schema in the following manner:

<xsd:simpleType name="SsnSimpleType">
    <xsd:restriction base="xsd:string">
        <xsd:pattern value="[0-9]{3}[\-][0-9]{2}[\-][0-9]{4}" />
    </xsd:restriction>
</xsd:simpleType>

As with most parts of NIEM, much of the model is based on inheritance, which makes enforcement of simple data types, such as that shown above, cumbersome and awkward.  Semantically, the correct element for an SSN would be under:

nc:Person/nc:PersonSSNIdentification/nc:IdentificationID

Since nc:PersonSSNIdentification is an nc:IdentificationType, if one were to enforce SSN formatting on nc:IdentificationID, any other part of the schema that is derived from nc:IdentificationType would also need to abide by the same rules (e.g. Driver License Number, State ID Number, Document Identification, etc.).  In the past this situation led to one thing. . . extension.

With Schematron, extension for this purpose can be avoided.  Rather than enforcing the string constraints in the XSD file, the IEPD publisher can enforce this constraint within the Schematron rules document.  The following is an example of the Schematron code required to accomplish this:

<pattern id="ePersonSSN">
  <title>Verify person social security number is in the correct format.</title>
  <rule context="/ns:SomeDocument/nc:Person/nc:PersonSSNIdentification">
    <assert test=
      "count(tokenize(nc:IdentificationID,'[0-9]{3}-[0-9]{2}-[0-9]{4}')) 
      - 1 = 1">
       Social security number must be in the proper format (e.g. 111-22-3333).
    </assert>
  </rule>
</pattern>

By using the Schematron approach, the semantically equivalent element is preserved in the schema and only the appropriate identifier is subjected to the constraint.
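
For comparison, the same 3-2-4 digit shape can be checked in JavaScript with an anchored regular expression. This is a sketch, not the rule above translated literally: it uses a direct match rather than the tokenize-and-count technique, and the anchors (^ and $) make it reject values that merely contain an SSN-shaped substring.

```javascript
// Anchored with ^ and $ so the whole value must be SSN-shaped, not just part of it.
var ssnPattern = /^[0-9]{3}-[0-9]{2}-[0-9]{4}$/;

function isSsnFormat(value) {
  return ssnPattern.test(value);
}

console.log(isSsnFormat("123-45-6789"));    // true
console.log(isSsnFormat("11-222-3333"));    // false, groups are the wrong lengths
console.log(isSsnFormat("ID 123-45-6789")); // false, extra leading text
```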

This approach can be further extended to address any number of string constraints.  Another example would be ensuring an identification number contains 5 or more digits.  This could be done by using the following XPath count() expression instead:

count(tokenize(nc:IdentificationID, '\d')) &gt; 5
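
Why does counting tokens work? Splitting on every digit always leaves one more token than there were digits. The JavaScript sketch below assumes split() behaves like tokenize() for these inputs, which it should for simple ASCII digits. It also shows that the count by itself says nothing about non-digit characters, so a stricter anchored pattern is included for cases where the value must be digits only.

```javascript
// Number of tokens left after splitting on each digit (digits + 1).
function tokenCount(value) {
  return value.split(/\d/).length;
}

console.log(tokenCount("12345"));  // 6, so the "> 5" test passes
console.log(tokenCount("1234"));   // 5, so the "> 5" test fails
console.log(tokenCount("12a456")); // 6, passes even though it contains a letter

// Stricter alternative: the whole value must be five or more digits.
function isDigitsMinFive(value) {
  return /^\d{5,}$/.test(value);
}

console.log(isDigitsMinFive("12345"));  // true
console.log(isDigitsMinFive("12a456")); // false
```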

This very powerful approach to constraining strings is yet another reason to take a really good look at Schematron in conjunction with your NIEM IEPDs.

Wednesday, November 4, 2009

Schematron: Correct nc:DateRepresentation Usage

The inherent flexibility of NIEM proves to be incredibly beneficial when used correctly; however, this benefit can also be one of its largest banes.  Sometimes this flexibility can lead to confusion when implementers attempt to deploy a NIEM exchange which is “valid” according to the XSD, yet not what the recipient is expecting. 

One such example is NIEM’s usage of substitution groups, where a variety of data elements are legal according to the schema, but rarely are all of these legal options accounted for by the recipient’s adapter.  Take NIEM’s DateType as an example.  It employs the explicit substitution group (abstract data element) of nc:DateRepresentation, which can be one of several different data types.  This representation can be replaced with a date (2009-01-01), a date/time (2009-01-01T12:00:00), a year and a month (2009-01), etc. 

Let's assume for a minute that a document has two different dates: a document filed date, and a person’s birth date.  The publisher’s intention is that the filed date be a “timestamp” which includes both a date and a time, while the birth date is simply a date including a month, day and year.  A valid sample XML payload would look something like the following:

<?xml version="1.0" encoding="UTF-8"?>
<ns:SomeDocument>
  <nc:DocumentFiledDate>
    <nc:DateTime>2009-01-01T01:00:00</nc:DateTime>
  </nc:DocumentFiledDate>
  <nc:Person>
    <nc:PersonBirthDate>
      <nc:Date>1970-01-01</nc:Date>
    </nc:PersonBirthDate>
  </nc:Person>
</ns:SomeDocument>

The Schematron code to enforce the publisher’s intentions could appear as the following:

<pattern id="eDocumentDateTime">
  <title>Verify the document filed date includes a date/time.</title>
  <rule context="ns:SomeDocument/nc:DocumentFiledDate">
    <assert test="nc:DateTime">
      A date and a time must be provided as the document filed date.
    </assert>
  </rule>
</pattern>
<pattern id="ePersonBirthDate">
  <title>Ensure the person's birth date is an nc:Date.</title>
  <rule context="ns:SomeDocument/nc:Person/nc:PersonBirthDate">
    <assert test="nc:Date">
      A person's birth date must be a full date.
    </assert>
  </rule>
</pattern>

This is a great example of how Schematron can help clarify a publisher’s intent as NIEM-conformant services are developed and deployed.

Wednesday, October 21, 2009

Schematron: License Plate State is Required when a Number Exists

A common practice in transportation and law enforcement is to document a vehicle’s license plate number.  In many situations, this plate number must be accompanied by the state which issued the license plate. 

In NIEM, a vehicle’s license plate is contained within the nc:ConveyanceRegistrationPlateIdentification element, which is an nc:IdentificationType.  Using schema cardinality, one could make the state required by simply assigning minOccurs="1" to the nc:IdentificationJurisdiction element; however, this can often cause more problems than it solves for two key reasons:

  1. Making jurisdiction required through schema cardinality makes it required globally throughout the exchange, even in scenarios where it doesn’t apply, because many other elements in a typical NIEM exchange are also of type nc:IdentificationType.
  2. nc:IdentificationJurisdiction is an abstract data element that can be replaced with any number of elements, not all of which are enumerated state values.  Some are country codes, some are province codes for other countries and others are simply free-text. 

This presents another ideal use case for Schematron.  The following example code segment ensures an NCIC plate issuing state is included any time a Plate Identification exists:

<pattern id="eVehiclePlateState">
  <title>Ensure a plate state is included with a plate number.</title>
  <rule context="ns:MyDocument/nc:Vehicle/nc:ConveyanceRegistrationPlateIdentification">
    <assert test="j:IdentificationJurisdictionNCICLISCode">
      A plate state must be included with vehicle license plate.
    </assert>
  </rule>
</pattern>

The same segment can be modified to enforce any of the available jurisdiction code lists.  For example, an exchange in Canada may wish to check for the existence of j:IdentificationJurisdictionCanadianProvinceCode instead of j:IdentificationJurisdictionNCICLISCode.

Monday, October 12, 2009

Schematron: Use Phase for Errors and Warnings

Schematron allows for grouping of rules not only by XPath, but also through association using the “Phase” element. While this has long been recommended as an approach for improving validation performance and unit testing, Phase also serves as an excellent way to group together and differentiate between critical errors and simple warnings.

For example, an agency might choose to have some minimum data restrictions surrounding Officers and Agencies on an electronic citation that cannot be overlooked, and at the same time have warnings surrounding statute codes that do not match the known state code values. In Schematron the following would be the code matching this scenario:

<!-- Rules resulting in just warnings (should not prevent submission) -->
<phase id="Warnings">
  <active pattern="validStatute"/>
</phase>
<!-- Rules resulting in errors (must prevent submission) -->
<phase id="Errors">
  <active pattern="minOfficerData"/>
  <active pattern="minAgencyData"/>
</phase>

The pattern attribute is an IDREF to a related pattern ID somewhere in the document. A developer can then create code that prevents submission only when validation errors come from one of the above phases (here, “Errors”). Schematron validation engines typically have command line switches or parameters to specify which phase should be run. For example, when the Schematron stylesheets are run through Saxon, the parameter “phase=x” is used, where x is one of the phase ids listed above or “#ALL” if all phases should be processed.
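
On the application side, the submission decision can then be as small as the sketch below. The shape of the failures array is purely an assumption for illustration (each failed assertion tagged with the id of the phase that produced it); real validators report results in their own formats.

```javascript
// Hypothetical result shape: one entry per failed assertion, tagged with its phase id.
function canSubmit(failures) {
  // Only failures from the "Errors" phase block submission; warnings pass through.
  return !failures.some(function (failure) {
    return failure.phase === "Errors";
  });
}

var failures = [
  { phase: "Warnings", pattern: "validStatute", message: "Unknown statute code." }
];
console.log(canSubmit(failures)); // true, warnings alone do not block

failures.push({ phase: "Errors", pattern: "minOfficerData", message: "Officer data is incomplete." });
console.log(canSubmit(failures)); // false, an error is now present
```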

Wednesday, October 7, 2009

Schematron: Officer Has a Last Name

This is the first in a series of code example articles that will be posted to give NIEM developers a head start in using Schematron. This example will show how to perform a test across multiple branches or nodes of a typical NIEM schema, since a law enforcement officer is a role played by a person in NIEM schemas. Take the following example XML code:

<ns:SomeDocument>
<j:Citation>  
   <j:CitationIssuingOfficial>
     <nc:RoleOfPersonReference s:ref="P1"/>  
   </j:CitationIssuingOfficial>
</j:Citation>
<nc:Person s:id="P1">
   <nc:PersonName><nc:PersonSurName>Smith</nc:PersonSurName></nc:PersonName>
</nc:Person>
</ns:SomeDocument>

One way in which to test for the existence of a last name is to match the ID with the officer's REF and test to be sure the string length is at least 1, as shown in the following example (using XSLT 2 & ISO Schematron):

<pattern id="eOfficerData"> 
    <let name="sOfficerRef" value="ns:SomeDocument/j:Citation/j:CitationIssuingOfficial/nc:RoleOfPersonReference/@s:ref"/> 
    <rule context="ns:SomeDocument/nc:Person"> 
        <report test="@s:id = $sOfficerRef and string-length(nc:PersonName/nc:PersonSurName) &lt; 1"> 
            Officers last name must be provided. 
        </report> 
    </rule> 
</pattern>

In theory the same test can be done using the XPath id() function; however, use of the id() function is HIGHLY dependent on the parser's capabilities.
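
The ref-to-id join at the heart of this rule can also be pictured with plain JavaScript objects. The structure below is a simplified, made-up stand-in for the XML sample (it is not a NIEM API), and the function returns true in exactly the situation where the report above would fire.

```javascript
// Simplified stand-in for the sample document: the citation points at a person by id.
var sampleDoc = {
  citation: { issuingOfficialRef: "P1" },
  persons: [{ id: "P1", surName: "Smith" }]
};

// True when the person referenced as the issuing official has no last name.
function officerMissingLastName(doc) {
  return doc.persons.some(function (person) {
    return person.id === doc.citation.issuingOfficialRef && // @s:id = $sOfficerRef
      (person.surName || "").length < 1;                    // string-length(...) < 1
  });
}

console.log(officerMissingLastName(sampleDoc)); // false, the officer has a surname
```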

Monday, October 5, 2009

To ISO or Not to ISO

Schematron comes in two common flavors, Schematron v1.5 (older) and ISO Schematron (newer). While there are a number of differences that can be read about on the official Schematron website, it can sometimes be confusing to work out which version is "best". The following table presents some of the differences between the two:
| Feature | v1.5 | ISO |
|---|---|---|
| Variables | Not Supported | let element available |
| Query Language | XSLT 1.0/XPath 1.0 | XSLT 1.0/XPath 1.0, XSLT 2.0/XPath 2.0, EXSLT, STX, XSLT 1.1, etc. |
| Abstract Patterns & Inheritance | Not Supported | Supported |
| value-of Element (helpful in debugging and error messaging) | Not Supported | Supported |
| xsl:key Element | Supported | Not Supported (Workaround Exists) |
| flag Attribute | Not Supported | Supported |
| SVRL | Not Supported | Supported |
| include Element | Not Supported | Supported |

My suggestion would be to go with ISO Schematron, for the following reasons:
  • Support for variables is important when working with ID and IDREF (more about this in a later blog).
  • XPath 2.0 functions provide a number of goodies that would be hard to pass up.
  • ISO Schematron is a recognized ISO standard. . . which should count for something in a standards-based community.
Feel free to comment if you know any other key differences or disagree with my suggestion. Update: 2009-10-05 - Correct typo.