SeqXML is a package that contains several classes to parse fixed-layout sequential streams and to convert them to XML and back.
In order to parse the stream, the parser needs to have intimate knowledge about the structure of the stream. The structure of the stream must be described with an interface definition.
This document describes
First download the zip file from the SourceForge project page.
Installation is simple: put the jar file SeqXML.jar on your classpath. Please verify that the Xerces parser (version 1.3.x or later) is on your classpath as well.
Test the installation by typing
java com.softwareag.benelux.test.CreateXMLStream <url to employee idl file>
The idl file is included in the zip file and must be placed in a directory that can be reached by your Web server.
After executing this test program, the output should look as follows:
Creating test data ...
de Grijs Rudolf 20-11-1961De Raaf 13 Culemborg 4312DJ 0345 54 41 57 rudolf.de_grijs@softwareag.com
Convert to XML ...
<?xml version="1.0" encoding="UTF-8"?>
<employee><lastname>de Grijs</lastname><firstname>Rudolf</firstname><birthdate>20-11-1961</birthdate><address><street>De Raaf 13</street><city>Culemborg</city><postalcode>4312DJ</postalcode></address><address><street></street><city></city><postalcode></postalcode></address><contact><phone>0345 54 41 57</phone><phone></phone><email>rudolf.de_grijs@softwareag.com</email></contact></employee>
And back again
de Grijs Rudolf 20-11-1961De Raaf 13 Culemborg 4312DJ 0345 54 41 57 rudolf.de_grijs@softwareag.com
Sequential streams have been around for decades. Think, for example, of work files or of a message that is passed between applications. Nowadays XML is the format of choice for exchanging messages between (remote) applications (SOAP is a good example) and for representation-independent content.
The SeqXML package bridges the two: it handles sequential streams and converts them to and from XML.
SeqXML has three major classes: ParseSeqStream, SeqToXml and XmlToSeq.
The current version of SeqXML can only parse character data, since it is the only format that can be handled in a platform-independent manner. Future versions of SeqXML might support Java native data types (todo).
Furthermore, the size of every field needs to be fixed, i.e. variable-length fields are not supported. Neither are variable-sized groups or vectors. As a consequence, the sequential stream always has a fixed size (todo).
This does not hold for the XML stream, but the structure of the XML file must exactly match the interface definition (todo).
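Because every field has a fixed length, the position of a field follows directly from the lengths of the fields before it. The sketch below illustrates this idea in plain Java. It does not use SeqXML; the class FixedWidthDemo, its methods pad and slice, and the two-field layout are assumptions made up for the example.

```java
final class FixedWidthDemo {

    // Pad (or truncate) a value to exactly `length` characters,
    // which is what a fixed layout requires of every field.
    static String pad(String value, int length) {
        if (value.length() >= length) {
            return value.substring(0, length);
        }
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < length) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // Extract the field that starts at `offset` and is `length` characters long,
    // dropping the padding.
    static String slice(String record, int offset, int length) {
        return record.substring(offset, offset + length).trim();
    }

    public static void main(String[] args) {
        // Layout assumed for the example: LastName (A30) followed by FirstName (A10).
        String record = pad("de Grijs", 30) + pad("Rudolf", 10);
        System.out.println(record.length());       // 40
        System.out.println(slice(record, 30, 10)); // Rudolf
    }
}
```

The stream is always 40 characters long here, whatever the values are. This is the property that makes an interface definition sufficient to locate every field.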
All three classes use an interface definition in order to parse the sequential data. This means that you must provide every instance with this interface definition. All three classes implement interface IdlParser in order to guarantee a consistent manner of initialization.
Class ParseSeqStream is used to handle sequential streams with a fixed layout. After you have initialized the class with the proper idl file, you must assign the sequential stream (or create an empty stream) using method createBuffer().
Next you can get the root of the parse tree with method getRoot(), which returns a FieldInterface (instance of Member). From here you can address all fields in the sequential stream with the methods of class FieldInterface.
As mentioned in the previous paragraph, method getRoot() returns the root element of the parsed structure. While the idl file is parsed, an in-memory representation is built in which every group is represented by class Member (a decorated HashMap) and every field is represented by class Field.
An instance of class Member can occur more than once, effectively creating a repeating group. The same can be done with a field (= array). A Member can contain other Members and Fields. This way the layout of the sequential stream is defined by this data structure.
A Field instance defines the location of a value in the sequential stream, i.e. if you have an instance of a Field then you can extract or assign a value. The next example illustrates this data structure.
1 LastName (A30) /* string of length 30
1 FirstName (A10)
1 Address
2 Street (A30)
2 City (A30)
2 PostalCode (A10)
If you use the above interface definition, the following in-memory representation is built:
Employee (Member)
  LastName (Field)
  FirstName (Field)
  Address (Member)
    Street (Field)
    City (Field)
    PostalCode (Field)
If you invoke getRoot(), you get a reference to "Employee". From there you can get to the other Members/Fields. As you can see, a Member is always an intermediate node.
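To make the idea of a tree of groups and fields more concrete, here is a minimal sketch in plain Java. It is not the SeqXML implementation: the class LayoutTreeDemo, its Node type and the hard-coded offsets are assumptions for illustration only. The offsets are derived from the field lengths in the interface definition above (30, 10, 30, 30 and 10 characters).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LayoutTreeDemo {

    // A node is either a group (it has children) or a field (it has a length > 0).
    static final class Node {
        final String name;
        final int offset; // start position in the stream
        final int length; // 0 for groups
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String name, int offset, int length) {
            this.name = name;
            this.offset = offset;
            this.length = length;
        }

        Node add(Node child) {
            children.put(child.name, child);
            return this;
        }

        // Follow one step of an explicit path.
        Node get(String key) {
            Node child = children.get(key);
            if (child == null) {
                throw new IllegalArgumentException("No such element: " + key);
            }
            return child;
        }

        // Read the value of a field from the stream, without padding.
        String read(String stream) {
            return stream.substring(offset, offset + length).trim();
        }
    }

    // Hand-built tree for the Employee layout shown above.
    static Node employee() {
        Node address = new Node("Address", 40, 0)
                .add(new Node("Street", 40, 30))
                .add(new Node("City", 70, 30))
                .add(new Node("PostalCode", 100, 10));
        return new Node("Employee", 0, 0)
                .add(new Node("LastName", 0, 30))
                .add(new Node("FirstName", 30, 10))
                .add(address);
    }

    public static void main(String[] args) {
        String stream = String.format("%-30s%-10s%-30s%-30s%-10s",
                "de Grijs", "Rudolf", "De Raaf 13", "Culemborg", "4312DJ");
        Node root = employee();
        System.out.println(root.get("Address").get("City").read(stream)); // Culemborg
    }
}
```

Only the leaf nodes carry a length, so only they can read from or write to the stream. The groups exist purely to give every field an addressable path.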
Interface FieldInterface looks as follows (not complete):
public interface FieldInterface {
    FieldInterface getElement(String key) throws SeqXMLException;
    FieldInterface[] getElements(String key) throws SeqXMLException;
    void setValue(String newValue) throws SeqXMLException;
    String getValue() throws SeqXMLException;
}
Method getElement is used to get a Member or Field. Use method getElements if your data structure contains repeating groups and/or arrays. This interface does not contain an iterator. The only way to get to a Field is through an explicit path, e.g.
postalCode = root.getElement("Address").getElement("PostalCode");
Assume there are two addresses and you need the second one (delivery address). In that case you can get to the city field as follows
deliveryCity = root.getElements("Address")[1].getElement("City");
Depending on whether you initialized the buffer with a value (createBuffer(String)) or created an empty buffer, you can read from or write to the buffer using a Field object, e.g.
System.out.println("Delivery city is " + deliveryCity.getValue());
// Modifying this value
deliveryCity.setValue("Moscow");
Class SeqToXml is used to convert a fixed sequential stream to XML. Just as with the other two classes, you first need to initialize this class with an interface definition.
Next you need to assign the (fixed) sequential stream and then you can convert this stream to XML:
seq2xml.setSeqStream(charstream);
Document doc = seq2xml.CreateXMLFile("employee");
In the last step you must provide a name for the root element. Using the stream from the previous example, serializing Document doc gives:
<?xml version="1.0" encoding="utf-8"?>
<employee>
<LastName>..</LastName>
<FirstName>..</FirstName>
<Address>
<Street>...</Street>
<City>Moscow</City>
<PostalCode>...</PostalCode>
</Address>
</employee>
The conversion is straightforward: every field is converted to an element, and the value of the field becomes the value of the corresponding element (actually the text node of the element).
Please note that there is an element for every field, even if the field contains no value (in a fixed stream, this is all spaces or zeroes).
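The field-to-element mapping described above can be sketched with the standard Java DOM API alone. This is an illustration, not the SeqToXml source: the class FieldsToElementsDemo, the flat two-field layout and the choice to trim the padding are assumptions (the trimming is suggested by the sample output at the start of this document, where empty fields appear as empty elements).

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class FieldsToElementsDemo {

    // Hypothetical flat layout: field names and their fixed lengths.
    static final String[] NAMES = {"LastName", "FirstName"};
    static final int[] LENGTHS = {30, 10};

    // Create one element per field; the trimmed field value becomes its text node.
    static Document toXml(String stream, String rootName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement(rootName);
        doc.appendChild(root);
        int offset = 0;
        for (int i = 0; i < NAMES.length; i++) {
            String value = stream.substring(offset, offset + LENGTHS[i]).trim();
            Element element = doc.createElement(NAMES[i]);
            element.appendChild(doc.createTextNode(value)); // empty fields still get an element
            root.appendChild(element);
            offset += LENGTHS[i];
        }
        return doc;
    }

    // Serialize the document to a string without the XML declaration.
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // FirstName is left empty on purpose: it still produces an element.
        String stream = String.format("%-30s%-10s", "de Grijs", "");
        System.out.println(serialize(toXml(stream, "employee")));
    }
}
```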
XmlToSeq is the inverse function of SeqToXml, i.e. XmlToSeq can convert a DOM document to a sequential stream.
After initializing XmlToSeq, you can convert an XML document to a sequential stream with one of
xml2seq.ConvertToSeq(String)
xml2seq.ConvertToSeq(org.w3c.dom.Document)
xml2seq.ConvertToSeq(java.net.URL)
After you have successfully converted the document, you can get the result with
String result = xml2seq.getSeqStream();
If the interface definition does not match the structure of the XML document, ConvertToSeq(..) will fail.
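The reverse direction can be sketched in the same way: walk the layout, look up the element for every field, and pad its text to the fixed length. Again this is plain Java and not the XmlToSeq source. The class ElementsToFieldsDemo, the two-field layout and the use of IllegalArgumentException to signal a mismatch are assumptions; how SeqXML itself reports the failure is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class ElementsToFieldsDemo {

    // Hypothetical flat layout: field names and their fixed lengths.
    static final String[] NAMES = {"LastName", "FirstName"};
    static final int[] LENGTHS = {30, 10};

    // Parse an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    // Build the fixed-width stream; fail if the document does not match the layout.
    static String toSeq(Document doc) {
        StringBuilder stream = new StringBuilder();
        for (int i = 0; i < NAMES.length; i++) {
            NodeList matches = doc.getDocumentElement().getElementsByTagName(NAMES[i]);
            if (matches.getLength() != 1) {
                throw new IllegalArgumentException("Expected exactly one <" + NAMES[i] + "> element");
            }
            String value = matches.item(0).getTextContent();
            if (value.length() > LENGTHS[i]) {
                throw new IllegalArgumentException("Value too long for " + NAMES[i]);
            }
            stream.append(value);
            for (int pad = value.length(); pad < LENGTHS[i]; pad++) {
                stream.append(' '); // pad with spaces up to the fixed length
            }
        }
        return stream.toString();
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<employee><LastName>de Grijs</LastName><FirstName>Rudolf</FirstName></employee>");
        String stream = toSeq(doc);
        System.out.println("[" + stream + "] length=" + stream.length());
    }
}
```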
The next example shows you in one program how you can build a sequential stream, convert it to XML and convert it back again.
The following interface definition will be used (http://127.0.0.1/data/employee.idl)
1 lastname (A40)
1 firstname (A20)
1 birthdate (A10)
1 address (/2) /* i.e. group occurs two times
2 street (A30)
2 city (A10)
2 postalcode (A10)
1 contact
2 phone (A20/2)
2 email (A60)
String charstream = null;
try {
// build sequential stream
java.net.URL url = new java.net.URL("http://127.0.0.1/data/employee.idl");
ParseSeqStream testdata = new ParseSeqStream();
testdata.initialize(url); // build in-memory representation
testdata.createBuffer(); // create an empty buffer
// pass a string if you want to read the data!
IField root = testdata.getRoot(); // get the root. From here you can get all fields ...
root.getElement("lastname").setValue("de Grijs");
root.getElement("firstname").setValue("Rudolf");
root.getElement("birthdate").setValue("20-11-1961");
IField[] Address = root.getElements("address"); // address is a group
Address[0].getElement("street").setValue("De Raaf 13"); // only assign first address
Address[0].getElement("city").setValue("Culemborg");
Address[0].getElement("postalcode").setValue("4312DJ");
IField contact = root.getElement("contact");
IField[] phones = contact.getElements("phone"); // phone is an array with two occurrences
phones[0].setValue("0345 54 41 57");
contact.getElement("email").setValue("rudolf.de_grijs@softwareag.com");
System.out.println("Test data ...\n" + testdata); // toString() writes out the result ..
charstream = testdata.getSeqStream();
// convert to XML
SeqToXml seq2xml = new SeqToXml();
seq2xml.initialize(url); // initialize parser with interface definition
seq2xml.setSeqStream(charstream); // assign previous character stream
org.w3c.dom.Document doc = seq2xml.CreateXMLFile("employee"); // create XML document
java.io.StringWriter xmlOut = new java.io.StringWriter(); // serialize into a string
org.apache.xml.serialize.XMLSerializer ser = new org.apache.xml.serialize.XMLSerializer(xmlOut,
new org.apache.xml.serialize.OutputFormat());
ser.serialize(doc);
String xmlString = xmlOut.toString();
System.out.println(xmlString); // write out the result
// And now we will convert the data back ...
XmlToSeq xml2seq = new XmlToSeq();
xml2seq.initialize(url);
xml2seq.ConvertToSeq(doc); // convert previous document back. Result is the same as charstream.
System.out.println("And back again\n" + xml2seq.getSeqStream());
}
catch(Exception exc) {
System.out.println(exc);
}
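Since all fields and occurrence counts in the employee interface definition are fixed, the length of the sequential stream can be worked out in advance. The small sketch below does the arithmetic; the class EmployeeLengthDemo is made up for the example and is not part of SeqXML.

```java
final class EmployeeLengthDemo {

    // Sum the field lengths of the employee interface definition,
    // multiplying by the occurrence count where a group or field repeats.
    static int recordLength() {
        int address = 30 + 10 + 10; // street, city, postalcode
        int contact = 2 * 20 + 60;  // phone (A20/2), email (A60)
        return 40 + 20 + 10         // lastname, firstname, birthdate
                + 2 * address       // address (/2)
                + contact;
    }

    public static void main(String[] args) {
        System.out.println(recordLength()); // 270
    }
}
```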
This simple example shows you how easy it is to
The syntax of the interface definition follows the syntax of the interface definition as it is used by EntireX SDK tools.
There are only a few differences:
1 multidimens (/3,2)
can be rewritten to
1 multidimens(/3)
2 secdimens (/2)
Furthermore class IField contains various methods to convert the value of a field to Java native data types.
group-definition:: <level> <name> [<occurrence count>]
field-definition:: <level> <name> (<type><length>[<occurrence count>])
<level>:: {number+} The level is a number that corresponds to the nesting level of the group or field.
<name>:: any combination of letters or numbers. Whitespace (spaces, tabs, linefeeds and carriage returns) is not allowed, since it is used as the delimiter.
<type>:: {A|N} A is alphanumeric and N is numeric.
<length>:: {number+|number+.number+}
As you can see, you can also specify a precision: the number before the decimal point defines the number of digits before the decimal point, and the number after it defines the number of digits after the decimal point.
<occurrence count>:: {number+} where number >= 0
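As an illustration of the grammar, the sketch below parses a single definition line with a regular expression. It is not the parser that SeqXML uses. The class IdlLineDemo and its Definition type are assumptions, as is the notation for the occurrence count (a slash followed by a number, as written in the examples in this document), the handling of a trailing /* comment, and the default of one occurrence when no count is given.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdlLineDemo {

    // <level> <name> [ ( [<type><length>] [/<count>] ) ] [ /* comment ]
    private static final Pattern LINE = Pattern.compile(
            "^\\s*(\\d+)\\s+([^\\s(]+)\\s*"
            + "(?:\\(\\s*(?:([AN])(\\d+(?:\\.\\d+)?))?(?:/(\\d+))?\\s*\\))?"
            + "\\s*(?:/\\*.*)?$");

    static final class Definition {
        final int level;
        final String name;
        final String type;     // "A" or "N"; null for a group
        final String length;   // e.g. "30" or "5.2"; null for a group
        final int occurrences; // assumed to be 1 when no count is given

        Definition(int level, String name, String type, String length, int occurrences) {
            this.level = level;
            this.name = name;
            this.type = type;
            this.length = length;
            this.occurrences = occurrences;
        }

        boolean isGroup() {
            return type == null;
        }
    }

    static Definition parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid definition: " + line);
        }
        int occurrences = m.group(5) == null ? 1 : Integer.parseInt(m.group(5));
        return new Definition(Integer.parseInt(m.group(1)), m.group(2),
                m.group(3), m.group(4), occurrences);
    }

    public static void main(String[] args) {
        Definition phone = parse("2 phone (A20/2)");
        System.out.println(phone.name + " type=" + phone.type
                + " length=" + phone.length + " occurs=" + phone.occurrences);
        System.out.println(parse("1 address (/2)").isGroup()); // true
    }
}
```

A line such as 1 multidimens (/3,2) is rejected by this pattern, which matches the rule that a second dimension has to be written as a nested group.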
This section describes what needs to be done
Up till now there are no known bugs. But that can change if someone is willing to pick up item 3 from the previous paragraph ;0).