SDSU CS 635 Advanced Object-Oriented Design & Programming
Spring Semester, 2002
Assignment 3
© 2002, All Rights Reserved, SDSU & Roger Whitney
San Diego State University -- This page last updated 12-Feb-02

Assignment 3

Due Feb 26.

The goal of assignment 3 is to implement the Composite and the Visitor patterns. Keep in mind that in programs this small, using patterns may not be reasonable.

1. Read an XML description from a file and build a composite tree with nodes of type Document, text and header.

2. Create a visitor to print out just the headers as HTML headers in the order they appear in the XML file. The headers are to be separated by <br>.

3. Create a visitor to print out the tree as a single HTML document. The headers and text should appear in the document in the same order as they appear in the XML file.
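
As a rough illustration of how the two patterns fit together, here is a minimal Java sketch. It is not a model solution. All class and method names (CompositeVisitorSketch, Visitor, DocumentNode, HeaderNode, TextNode, HeaderVisitor, headersOf) are hypothetical, the <h1> header level is an assumption, and the tree is built by hand rather than read from XML. It shows only the header-listing visitor from task 2, and it does not escape HTML special characters.

```java
import java.util.ArrayList;
import java.util.List;

// All names here are hypothetical; the assignment does not prescribe them.
class CompositeVisitorSketch {

    interface Visitor {
        void visitDocument(DocumentNode node);
        void visitHeader(HeaderNode node);
        void visitText(TextNode node);
    }

    // Component: every node in the tree accepts a visitor.
    interface Node {
        void accept(Visitor visitor);
    }

    // Composite: keeps its children in insertion order.
    static class DocumentNode implements Node {
        final List<Node> children = new ArrayList<>();

        DocumentNode add(Node child) {
            children.add(child);
            return this;
        }

        public void accept(Visitor visitor) {
            visitor.visitDocument(this);
        }
    }

    // Leaf for a <header> element.
    static class HeaderNode implements Node {
        final String text;
        HeaderNode(String text) { this.text = text; }
        public void accept(Visitor visitor) { visitor.visitHeader(this); }
    }

    // Leaf for a <text> element.
    static class TextNode implements Node {
        final String text;
        TextNode(String text) { this.text = text; }
        public void accept(Visitor visitor) { visitor.visitText(this); }
    }

    // Task 2: headers only, separated by <br>. The <h1> level is an assumption.
    static class HeaderVisitor implements Visitor {
        private final StringBuilder out = new StringBuilder();

        public void visitDocument(DocumentNode node) {
            // Visit children in document order, descending into nested documents.
            for (Node child : node.children) child.accept(this);
        }

        public void visitHeader(HeaderNode node) {
            if (out.length() > 0) out.append("<br>");
            out.append("<h1>").append(node.text).append("</h1>");
        }

        public void visitText(TextNode node) {
            // Text nodes are skipped by this visitor.
        }

        String result() { return out.toString(); }
    }

    static String headersOf(Node root) {
        HeaderVisitor visitor = new HeaderVisitor();
        root.accept(visitor);
        return visitor.result();
    }

    public static void main(String[] args) {
        Node root = new DocumentNode()
            .add(new HeaderNode("This is an example"))
            .add(new TextNode("Not much here"))
            .add(new DocumentNode().add(new HeaderNode("Nested header")));
        System.out.println(headersOf(root));
    }
}
```

A visitor for task 3 would implement the same Visitor interface and emit text nodes as well. The node classes do not need to change, which is the point of the pattern.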

Grading


                                    Percent of Grade
Working Code                        15%
Unit Tests                          15%
Comments                            10%
Quality of Code                     30%
Proper implementation of Patterns   30%

We will be using XML to define trees. The XML is simple enough that you may wish to parse it directly. I will explain how to use the SAX interface to XML parsers. SAX and an XML parser are part of the standard VW5i4 Smalltalk library. The Java XML parser and documentation can be found at: http://java.sun.com/xml/jaxp/index.html. An XML parser for C++ can be found at: http://xml.apache.org/xerces-c/index.html. Documentation and examples can also be found at that site.



The XML

The XML we will use can be described using a BNF-like syntax as:

CS635Document ::= <CS635Document>(CS635Document* | (header [text*] )* | text*)*</CS635Document>
header ::= <header>char+</header>
text ::= <text>char+</text>

The proper way to define XML is with a DTD (Document Type Definition). Here is the DTD.

<!DOCTYPE CS635Document [
   <!ELEMENT CS635Document ( (CS635Document* | ( header , text*)* | text* )* )>
   <!ELEMENT header (#PCDATA)>
   <!ELEMENT text (#PCDATA)>
]>

So an example XML file is:
<?xml version="1.0" ?> 
<CS635Document>
   <header>This is an example</header>
   <text>Not much here</text>
   <CS635Document>
      <text>Just text here</text>
   </CS635Document>
</CS635Document>
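
One possible way to turn a file like this into a tree is to keep a stack of the elements that are currently open while the SAX callbacks (described in the SAX section of this document) arrive. The Java sketch below is only an illustration under assumptions: TreeBuilderSketch, TreeNode, outline and parse are hypothetical names, and a single generic node class stands in for the separate Document, header and text classes the assignment asks for. It uses qName because the default JAXP parser is not namespace-aware.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TreeBuilderSketch extends DefaultHandler {

    // Hypothetical generic node; a real solution would use one class per tag.
    static class TreeNode {
        final String tag;
        final StringBuilder text = new StringBuilder();
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String tag) { this.tag = tag; }

        // Compact textual form of the subtree, used here only for display.
        String outline() {
            if (!tag.equals("CS635Document")) return tag + "[" + text + "]";
            List<String> parts = new ArrayList<>();
            for (TreeNode child : children) parts.add(child.outline());
            return tag + "(" + String.join(", ", parts) + ")";
        }
    }

    private final Deque<TreeNode> open = new ArrayDeque<>(); // elements not yet closed
    private TreeNode root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        TreeNode node = new TreeNode(qName);
        if (open.isEmpty()) {
            root = node;                      // first start tag is the top-level document
        } else {
            open.peek().children.add(node);   // attach to the innermost open element
        }
        open.push(node);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        TreeNode current = open.peek();
        // Skip the whitespace that sits between the children of a CS635Document.
        if (current != null && !current.tag.equals("CS635Document")) {
            current.text.append(ch, start, length);
        }
    }

    static TreeNode parse(String xml) {
        TreeBuilderSketch handler = new TreeBuilderSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.root;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" ?>\n"
            + "<CS635Document>\n"
            + "   <header>This is an example</header>\n"
            + "   <text>Not much here</text>\n"
            + "   <CS635Document>\n"
            + "      <text>Just text here</text>\n"
            + "   </CS635Document>\n"
            + "</CS635Document>";
        System.out.println(parse(xml).outline());
    }
}
```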


If you wish to validate the XML, add the DTD to the file to get:

<?xml version="1.0" ?>
<!DOCTYPE CS635Document [
<!ELEMENT CS635Document ( (CS635Document* | ( header , text*)* | text* )* )>
<!ELEMENT header (#PCDATA)>
<!ELEMENT text (#PCDATA)> ]>
<CS635Document>
   <header>This is an example</header>
   <text>Not much here</text>
   <CS635Document>
      <text>Just text here</text>
   </CS635Document>
</CS635Document>
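
For Java users, validation against an embedded DTD can be switched on through the parser factory. The sketch below is a hedged illustration, not part of the assignment: ValidationSketch, PROLOG and isValid are hypothetical names. By default DefaultHandler only receives validity errors and does nothing with them, so the sketch overrides error to rethrow them.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ValidationSketch {

    // XML declaration plus the DTD from this assignment as an internal subset.
    static final String PROLOG =
        "<?xml version=\"1.0\" ?>\n"
        + "<!DOCTYPE CS635Document [\n"
        + "<!ELEMENT CS635Document ( (CS635Document* | ( header , text*)* | text* )* )>\n"
        + "<!ELEMENT header (#PCDATA)>\n"
        + "<!ELEMENT text (#PCDATA)> ]>\n";

    // Returns true only if the document is well formed and valid against its DTD.
    static boolean isValid(String xml) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // ask for a DTD-validating parser

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // turn a reported validity error into a failure
            }
        };

        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String good = PROLOG
            + "<CS635Document><header>This is an example</header>"
            + "<text>Not much here</text></CS635Document>";
        String bad = PROLOG
            + "<CS635Document><footer>not declared</footer></CS635Document>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```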


SAX

SAX, the Simple API for XML, is one way to use an XML parser to parse XML. In SAX you create a class to which the parser hands parts of the XML. In Smalltalk this class is a subclass of XML.SAXDriver. In Java it implements the interface org.xml.sax.ContentHandler. In C++ the class is derived from DefaultHandler. There are a number of methods that the XML parser will call on this class. The important ones for this assignment are:

startDocument
public void startDocument()
virtual void startDocument (    ) 
Called when the parser starts to parse a document

endDocument
public void endDocument ()
virtual void    endDocument ()
Called when the parser ends a document

startElement: namespaceURI localName: localName qName: name attributes: attributes
public void startElement(java.lang.String uri,
                         java.lang.String localName,
                         java.lang.String qName,
                         Attributes atts)
                  throws SAXException
virtual void    startElement (const XMLCh *const uri,const XMLCh *const localname,const XMLCh *const qname,const Attributes &attrs)
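Called when the parser starts to parse a start tag (<header>, <text> or <CS635Document>). We are only interested in the name of the tag. In our case the localName and qName are the same and will be the name of the tag. (In Java this holds only when the parser is namespace-aware. The default JAXP parser is not, and then localName is empty and qName holds the tag name.)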

public void endElement(java.lang.String uri,
                       java.lang.String localName,
                       java.lang.String qName)
                throws SAXException
virtual void    endElement (const XMLCh *const uri,const XMLCh *const localname,const XMLCh *const qname)
Called when the parser parses an end tag (</header>, </text> or </CS635Document>). We are only interested in the name of the tag. In our case the localName and qName are the same and will be the name of the tag.

characters: aString
public void characters(char[] ch,  int start, int length) throws SAXException
virtual void characters (const XMLCh *const chars, const unsigned int length)
The parser passes the text between the start and end tags to our code using this method.
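
To see the order in which these methods are called, it can help to record each callback. The Java sketch below is an illustration only: SaxTraceSketch, events and trace are hypothetical names. It makes the parser namespace-aware so that localName carries the tag name, and it drops whitespace-only text. A parser is allowed to deliver one run of text in several characters calls, so real code should accumulate text rather than assume a single call.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTraceSketch extends DefaultHandler {

    // One entry per callback, in the order the parser made the calls.
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + localName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + localName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("chars:" + text); // whitespace-only runs are not recorded
        }
    }

    static List<String> trace(String xml) {
        SaxTraceSketch handler = new SaxTraceSketch();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed for localName to be filled in
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : trace("<Sample><a>hi</a><b>mom</b></Sample>")) {
            System.out.println(event);
        }
    }
}
```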

Simple Example
I will use the following DTD to define the XML for this example.

<!DOCTYPE Sample [
<!ELEMENT Sample (  ( a* , b*)*)>
<!ELEMENT a (#PCDATA)>
<!ELEMENT b (#PCDATA)> ]>

This states that the first tag in the document must be:

<Sample>
</Sample>

The Sample tag can contain tags a and b. The tags a and b can occur in any order and appear multiple times. The tags a and b contain only text. So the following are valid:

<Sample>
   <a>hi</a>
   <b>mom</b>
   <b> how are</b>
   <a> you </a>
</Sample>

<Sample>
   <a>cat</a>
</Sample>


Smalltalk Example

The SAXDriverExample class implements the content handler interface. The TextNode class represents the tags a and b.

SAXDriverExample
Smalltalk.CS635 defineClass: #SAXDriverExample
   superclass: #{XML.SAXDriver}
   indexedType: #none
   private: false
   instanceVariableNames: 'currentNode root '
   classInstanceVariableNames: ''
   imports: ''
   category: 'Assignment-3'!

CS635.SAXDriverExample comment:
'SAXDriverExample is an example of a SAX content handler in Smalltalk 
Instance Variables:
   currentNode   <TextNode>   represents the tag the parser is currently parsing.
   root   <Collection of TextNode>   Contains the elements or tags in the XML document. Does not contain the top level tag.
'!

!CS635.SAXDriverExample methodsFor: 'content handler'!

characters: aString
   currentNode add: aString!

startDocument
   root := OrderedCollection new!

startElement: namespaceURI localName: localName qName: name attributes: attributes 
   localName = 'a' | (localName = 'b') 
      ifTrue: 
         [currentNode := TextNode new.
         root add: currentNode]! !


TextNode

Smalltalk.CS635 defineClass: #TextNode
   superclass: #{Core.Object}
   indexedType: #none
   private: false
   instanceVariableNames: 'text '
   classInstanceVariableNames: ''
   imports: ''
   category: 'Assignment-3'!

CS635.TextNode comment:
'TextNode represents the text in a tag. In this example both tags a and b just contain text. This example does not need this class. SAXDriver could use a collection instead. However, you will have to provide classes to represent each type of tag in the XML, so I use TextNode.
Instance Variables:
   text   <String>   text of the XML tag'!

!CS635.TextNode methodsFor: 'accessing'!

add: aString
   text isNil ifTrue:[text := String new].
   text := text , aString!

printOn: aStream
   aStream
      nextPut: $:; 
      nextPutAll: text;
      nextPut: $:! !


Using the classes

|page builder exampleDispatcher |
page := '<?xml version="1.0" ?>
<!DOCTYPE Sample [
<!ELEMENT Sample (  ( a* , b*)*)>
<!ELEMENT a (#PCDATA)>
<!ELEMENT b (#PCDATA)> ]>
<Sample>
   <a>hi</a>
   <b>mom</b>
   <b>how are</b>
   <a>you</a>
</Sample>'.
builder := SAXDriverExample new.
exampleDispatcher := SAXDispatcher new contentHandler: builder.
XMLParser 
   processDocumentString: page 
   beforeScanDo: 
      [:parser | 
      parser
         saxDriver:(exampleDispatcher); 
         validate: true].
builder inspect.

To read from a file use:

| builder exampleDispatcher |

builder := SAXDriverExample new.
exampleDispatcher := SAXDispatcher new contentHandler: builder.
XMLParser 
   processDocumentInFilename: 'page' 
   beforeScanDo: 
      [:parser | 
      parser
         saxDriver:(exampleDispatcher); 
         validate: true].
builder inspect.

Here the file page contains the XML document. If you do not want to include the DTD in the XML document, use parser validate: false.

Java Example
The following shorter example shows how to call the parser.

import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;
import java.util.Vector;
import java.io.File;
   
public class SAXDriverExample extends DefaultHandler
   {
   // Collects every run of character data the parser reports
   private Vector<String> root;
   
   public static void main(String[] argv)
      {
      SAXDriverExample handler = new SAXDriverExample();
   
      // Use the default (non-validating) parser
      SAXParserFactory factory = SAXParserFactory.newInstance();
      try 
         {
         SAXParser saxParser = factory.newSAXParser();
         saxParser.parse( new File("sample"), handler );
         } 
      catch (Exception e) 
         {
         e.printStackTrace();
         }
      System.out.println( handler.root());
      }
      
   // Called once, before any other content handler method
   public void startDocument()
      {
      root = new Vector<String>();
      }
   
   // Called for text between tags, including the whitespace between elements
   public void characters(char[] ch, int start, int length) 
      {
      root.addElement( new String( ch, start, length));
      }
   
   public Vector<String> root()
      {
      return root;
      }
   }
The file sample contains:

<?xml version="1.0" ?>
<Sample>
   <a>hi</a>
   <b>mom</b>
   <b>how are</b>
   <a>you</a>
</Sample>
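
Because the Java example uses a non-validating parser, its characters method also receives the whitespace between the tags. A closer Java counterpart to the Smalltalk SAXDriverExample would start a new text buffer for each a or b tag and collect text only while inside one. The sketch below shows one way this might look. TextNodeSketch, current and parse are hypothetical names, and a StringBuilder stands in for the Smalltalk TextNode class.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TextNodeSketch extends DefaultHandler {

    // One buffer per <a> or <b> element, in document order.
    private final List<StringBuilder> root = new ArrayList<>();

    // Buffer for the element being parsed; null while outside <a> and <b>.
    private StringBuilder current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // qName is used because the default parser is not namespace-aware.
        if (qName.equals("a") || qName.equals("b")) {
            current = new StringBuilder();
            root.add(current);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        current = null; // text after an end tag belongs to no <a> or <b>
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length); // may be called more than once per tag
        }
    }

    static List<String> parse(String xml) {
        TextNodeSketch handler = new TextNodeSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        List<String> texts = new ArrayList<>();
        for (StringBuilder buffer : handler.root) texts.add(buffer.toString());
        return texts;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" ?>\n"
            + "<Sample>\n"
            + "   <a>hi</a>\n"
            + "   <b>mom</b>\n"
            + "   <b>how are</b>\n"
            + "   <a>you</a>\n"
            + "</Sample>";
        System.out.println(parse(xml));
    }
}
```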

Copyright ©, All rights reserved.
2002 SDSU & Roger Whitney, 5500 Campanile Drive, San Diego, CA 92182-7700 USA.
OpenContent license defines the copyright on this document.
