sdsu.util
Class SimpleTokenizer

java.lang.Object
  |
  +--sdsu.util.SimpleTokenizer

public class SimpleTokenizer
extends java.lang.Object

This class performs simple parsing of strings or streams. The input is a sequence of ASCII characters, which is divided into tokens, whitespace, and comments. A comment starts with the comment character and continues to the next newline (\n) character. Comments are removed from the input and are never returned as part of a token. A token is the string from the current location to the next separator or whitespace character. Characters defined as whitespace (by default tab, newline, and space) help delineate tokens but are not part of tokens; that is, whitespace characters are removed after a token is found. If a token must contain a whitespace character, a possible separator, or the comment character, the token must be placed between two quote characters. A quoted token can contain a quote character.
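
The rules above can be illustrated with a small self-contained sketch. It is not the SimpleTokenizer source: the class name TokenSketch, the comment character '#', the quote character '"', and the separator ',' are all assumptions chosen for the example. The sketch also simplifies by dropping empty tokens and by not allowing a quote character inside a quoted token.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only; NOT the sdsu.util.SimpleTokenizer implementation.
class TokenSketch {
    // Assumed special characters; the library's real defaults may differ.
    static final char COMMENT = '#';
    static final char QUOTE = '"';
    static final String SEPARATORS = ",";
    static final String WHITESPACE = " \t\n";

    // Splits input into tokens following the rules in the class description.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuote = false;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (inQuote) {
                // Inside quotes every character is literal until the closing quote.
                if (c == QUOTE) {
                    inQuote = false;
                } else {
                    current.append(c);
                }
                i++;
            } else if (c == QUOTE) {
                inQuote = true;
                i++;
            } else if (c == COMMENT) {
                // A comment runs to the next newline; the newline itself is
                // left in place and handled as whitespace on the next pass.
                while (i < input.length() && input.charAt(i) != '\n') {
                    i++;
                }
            } else if (SEPARATORS.indexOf(c) >= 0 || WHITESPACE.indexOf(c) >= 0) {
                // Separators and whitespace end a token and are discarded.
                flush(current, tokens);
                i++;
            } else {
                current.append(c);
                i++;
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending token, if any. Empty tokens are dropped (a simplification).
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String input = "alpha, \"beta gamma\" # note\ndelta";
        System.out.println(tokenize(input)); // prints [alpha, beta gamma, delta]
    }
}
```

The quoted token keeps its embedded space, and everything from '#' to the end of the line is discarded before "delta" is read.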

Version:
1.2 2 June 1998
Author:
Roger Whitney (whitney@cs.sdsu.edu)
See Also:
Stringizer, TokenCharacters

Constructor Summary
SimpleTokenizer(java.io.InputStream tokenSource)
          Create a SimpleTokenizer on tokenSource with default settings
SimpleTokenizer(java.io.InputStream tokenSource, TokenCharacters charTable)
          Create a SimpleTokenizer on tokenSource
SimpleTokenizer(java.io.Reader tokenSource)
          Create a SimpleTokenizer on tokenSource with default settings
SimpleTokenizer(java.io.Reader tokenSource, TokenCharacters charTable)
          Create a SimpleTokenizer on tokenSource
SimpleTokenizer(java.lang.String parsable)
          Create a SimpleTokenizer on string with default settings
SimpleTokenizer(java.lang.String parsable, TokenCharacters charTable)
          Create a SimpleTokenizer on string
 
Method Summary
 boolean hasMoreElements()
          Returns true if not at end of source stream or source string
 boolean hasMoreTokens()
          Returns true if not at end of source stream or source string
 java.lang.String nextToken()
          Returns string containing all characters up to the given separator, unquoted whitespace, or EOF if the separator is not found.
 java.lang.String nextToken(java.lang.String newSeparators)
          Returns string containing all characters up to the given separator, unquoted whitespace, or EOF if the separator is not found.
 char separator()
          Returns the separator found by the last call to nextToken
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleTokenizer

public SimpleTokenizer(java.lang.String parsable)
Create a SimpleTokenizer on string with default settings

SimpleTokenizer

public SimpleTokenizer(java.lang.String parsable,
                       TokenCharacters charTable)
Create a SimpleTokenizer on string
Parameters:
parsable - string to be tokenized
charTable - supplies the comment character (indicates the start of a comment), the quote character (used to quote a string containing special characters), and the whitespace characters (null or an empty string means no whitespace characters)

SimpleTokenizer

public SimpleTokenizer(java.io.InputStream tokenSource)
Create a SimpleTokenizer on tokenSource with default settings

SimpleTokenizer

public SimpleTokenizer(java.io.Reader tokenSource)
Create a SimpleTokenizer on tokenSource with default settings

SimpleTokenizer

public SimpleTokenizer(java.io.InputStream tokenSource,
                       TokenCharacters charTable)
Create a SimpleTokenizer on tokenSource
Parameters:
tokenSource - stream to be tokenized
charTable - supplies the comment character (indicates the start of a comment), the begin and end quote characters (start and end a quote of a string containing special characters), and the whitespace characters (null or an empty string means no whitespace characters)

SimpleTokenizer

public SimpleTokenizer(java.io.Reader tokenSource,
                       TokenCharacters charTable)
Create a SimpleTokenizer on tokenSource
Parameters:
tokenSource - reader to be tokenized
charTable - supplies the comment character (indicates the start of a comment), the begin and end quote characters (start and end a quote of a string containing special characters), and the whitespace characters (null or an empty string means no whitespace characters)

Method Detail

hasMoreTokens

public boolean hasMoreTokens()
Returns true if not at end of source stream or source string

hasMoreElements

public boolean hasMoreElements()
Returns true if not at end of source stream or source string

separator

public char separator()
Returns the separator found by the last call to nextToken

nextToken

public java.lang.String nextToken(java.lang.String newSeparators)
                           throws java.io.IOException
Returns a string containing all characters up to the given separator, unquoted whitespace, or EOF if the separator is not found. The separator is removed from the stream but is not returned as part of the token.
Parameters:
newSeparators - set of characters to be used as separators after the token. Can be any non-null, nonempty string of characters except the current comment or quote character
Throws:
java.io.IOException - If separator or EOF does not follow this token
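
The interplay between nextToken(String) and separator() can be sketched with a small stand-in class. This is an illustration under stated assumptions, not the library code: the class name SeparatorSketch is hypothetical, quoting and comments are omitted, no IOException is thrown, and returning '\0' from separator() when a token ends at whitespace or EOF is a guess made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in; NOT the sdsu.util.SimpleTokenizer implementation.
class SeparatorSketch {
    private final String source;
    private int pos = 0;
    private char lastSeparator = '\0'; // assumed value when no separator was found

    SeparatorSketch(String source) {
        this.source = source;
    }

    // True while unread characters remain in the source string.
    boolean hasMoreTokens() {
        return pos < source.length();
    }

    // Reads up to the first character in separators, whitespace, or end of input.
    String nextToken(String separators) {
        // Skip whitespace left in front of the token.
        while (pos < source.length() && Character.isWhitespace(source.charAt(pos))) {
            pos++;
        }
        int start = pos;
        lastSeparator = '\0';
        while (pos < source.length()) {
            char c = source.charAt(pos);
            if (separators.indexOf(c) >= 0) {
                lastSeparator = c; // remember which separator ended the token
                break;
            }
            if (Character.isWhitespace(c)) {
                break;
            }
            pos++;
        }
        String token = source.substring(start, pos);
        if (pos < source.length()) {
            pos++; // consume the separator or whitespace; it is not part of the token
        }
        return token;
    }

    // Separator that ended the token returned by the last nextToken call.
    char separator() {
        return lastSeparator;
    }

    public static void main(String[] args) {
        SeparatorSketch tokens = new SeparatorSketch("host=example.org;port=8080");
        Map<String, String> pairs = new LinkedHashMap<>();
        while (tokens.hasMoreTokens()) {
            String key = tokens.nextToken("=");   // token ends at '='
            String value = tokens.nextToken(";"); // token ends at ';' or end of input
            pairs.put(key, value);
        }
        System.out.println(pairs); // prints {host=example.org, port=8080}
    }
}
```

Changing the separator set on each call is what lets a single pass read alternating keys and values; the last value ends at the end of the input because no ';' follows it.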

nextToken

public java.lang.String nextToken()
                           throws java.io.IOException
Returns a string containing all characters up to the next separator, unquoted whitespace, or EOF if no separator is found. The separator is removed from the stream but is not returned as part of the token.
Throws:
java.io.IOException - If separator or EOF does not follow this token