sdsu.util
Class TokenCharacters

java.lang.Object
  |
  +--sdsu.util.TokenCharacters

public class TokenCharacters
extends java.lang.Object
implements java.io.Serializable

This class maintains special characters used in parsing strings into tokens. It keeps track of whitespace, characters to indicate the start of a comment, quote characters used to quote tokens that contain special characters, an escape character, and characters that separate tokens.
Version Info
1.0 Added fromString, fromLabeledData, toLabeledData

Version:
1.1 2 June 1998
Author:
Roger Whitney (whitney@cs.sdsu.edu)
See Also:
SimpleTokenizer, Stringizer, Serialized Form

Field Summary
static java.lang.String COMMENT_CHAR
          Default character (#) used to indicate start of comment.
static char ESCAPE_CHAR
          Character (/) used to precede a quote character or escape character in a quoted token.
static char QUOTE_CHAR
          Default character (') used to delineate the start and end of a quoted token.
static java.lang.String WHITESPACE
          Default characters treated as whitespace.
 
Constructor Summary
TokenCharacters()
          Create TokenCharacters with default values.
TokenCharacters(java.lang.String separators)
          Create TokenCharacters with given characters for token separators and default values for the rest of parameters.
TokenCharacters(java.lang.String separators, java.lang.String commentChars, char beginQuoteChar, char endQuoteChar, java.lang.String whitespace)
          Create a TokenCharacters object with given values
 
Method Summary
 void addQuoteChars(char beginQuote, char endQuote)
          Add the quote pair beginQuote-endQuote to the pairs recognized as char pairs to quote a token.
 boolean containsEscapeableChar(java.lang.String token)
          Returns true if token contains a character that needs to be escaped in a quoted token.
 boolean containsTokenTerminator(java.lang.String token)
          Returns true if token contains a character that indicates the end of a token.
 java.lang.String escapeToken(java.lang.String token)
          Places escape character before any quote character or the escape character.
 void fromLabeledData(LabeledData dataMembers)
          Recreates a TokenCharacters object from a LabeledData object
 void fromString(java.lang.String stateData)
          Recreates a TokenCharacters object from a string.
 char getCommentChar()
          Returns a character that indicates start of a comment.
 boolean isBeginQuote(char c)
          Returns true if c indicates the start of a quoted token
 boolean isComment(char c)
          Returns true if c indicates the start of a comment
 boolean isEndQuote(char c)
          Returns true if c indicates the end of a quoted token
 boolean isEOL(char c)
          Returns true if c is Mac, Unix, or PC EOL character
 boolean isEscape(char c)
          Returns true if c is an escape character
 boolean isQuotePair(char beginQuote, char endQuote)
          Returns true if beginQuote and endQuote form a matching pair of begin and end quote characters
 boolean isSeparator(char c)
          Returns true if c is a separator character
 boolean isTokenTerminator(char c)
          Returns true if c indicates the end of an unquoted token, i.e., c is a whitespace, separator, or comment character
 boolean isWhitespace(char c)
          Returns true if c is a whitespace character
 java.lang.String quoteToken(java.lang.String token)
          Surrounds token with a begin-end quote pair. Returns the quoted token.
 boolean requiresEscaping(char c)
          Returns true if c needs to be escaped in a quoted token.
 void setSeparatorChars(java.lang.String newSeparators)
          Set the current set of separators to newSeparators
 LabeledData toLabeledData()
          Returns a LabeledData object representing the object.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

WHITESPACE

public static final java.lang.String WHITESPACE
Default characters treated as whitespace. The default is space, tab, line feed (ASCII 10, \n on Unix), and carriage return (ASCII 13, used as the newline character on PCs and Macs).

ESCAPE_CHAR

public static final char ESCAPE_CHAR
Character (/) used to precede a quote character or escape character in a quoted token.

COMMENT_CHAR

public static final java.lang.String COMMENT_CHAR
Default character (#) used to indicate start of comment.

QUOTE_CHAR

public static final char QUOTE_CHAR
Default character (') used to delineate the start and end of a quoted token. Tokens are quoted when they contain special characters.
Constructor Detail

TokenCharacters

public TokenCharacters()
Create TokenCharacters with default values. You must set separators before using the new object.

TokenCharacters

public TokenCharacters(java.lang.String separators)
Create TokenCharacters with given characters for token separators and default values for the rest of parameters.

TokenCharacters

public TokenCharacters(java.lang.String separators,
                       java.lang.String commentChars,
                       char beginQuoteChar,
                       char endQuoteChar,
                       java.lang.String whitespace)
Create a TokenCharacters object with given values
Parameters:
separators - characters that separate tokens
commentChars - characters used to indicate start of a comment
beginQuoteChar - character used to start a quote of a string containing special characters
endQuoteChar - character used to end a quote of a string containing special characters
whitespace - characters used for whitespace. Use null or empty string for no whitespace characters
Method Detail

toLabeledData

public LabeledData toLabeledData()
Returns a LabeledData object representing the object. The returned data can be used to recreate the object.

fromString

public void fromString(java.lang.String stateData)
                throws java.io.IOException
Recreates a TokenCharacters object from a string. The only required field is separators=avalue

fromLabeledData

public void fromLabeledData(LabeledData dataMembers)
Recreates a TokenCharacters object from a LabeledData object

addQuoteChars

public void addQuoteChars(char beginQuote,
                          char endQuote)
Add the quote pair beginQuote-endQuote to the pairs recognized as char pairs to quote a token.

getCommentChar

public char getCommentChar()
Returns a character that indicates start of a comment.

setSeparatorChars

public void setSeparatorChars(java.lang.String newSeparators)
Set the current set of separators to newSeparators

isEOL

public boolean isEOL(char c)
Returns true if c is Mac, Unix, or PC EOL character

isEscape

public boolean isEscape(char c)
Returns true if c is an escape character

isWhitespace

public boolean isWhitespace(char c)
Returns true if c is a whitespace character

isSeparator

public boolean isSeparator(char c)
Returns true if c is a separator character

isBeginQuote

public boolean isBeginQuote(char c)
Returns true if c indicates the start of a quoted token

isEndQuote

public boolean isEndQuote(char c)
Returns true if c indicates the end of a quoted token

isComment

public boolean isComment(char c)
Returns true if c indicates the start of a comment

isTokenTerminator

public boolean isTokenTerminator(char c)
Returns true if c indicates the end of an unquoted token, i.e., c is a whitespace, separator, or comment character

isQuotePair

public boolean isQuotePair(char beginQuote,
                           char endQuote)
Returns true if beginQuote and endQuote form a matching pair of begin and end quote characters
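
The quote-pair methods (addQuoteChars, isBeginQuote, isEndQuote, isQuotePair) can be pictured with a small lookup table. The following is only a sketch under assumptions: QuotePairSketch is a hypothetical stand-in, not the sdsu.util source, and it assumes each begin quote has exactly one matching end quote.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the sdsu.util implementation.
class QuotePairSketch {
    // Maps each begin-quote character to its matching end-quote character.
    // Assumption: one end quote per begin quote.
    private final Map<Character, Character> pairs = new HashMap<>();

    QuotePairSketch() {
        // Documented default: ' both starts and ends a quoted token.
        addQuoteChars('\'', '\'');
    }

    void addQuoteChars(char beginQuote, char endQuote) {
        pairs.put(beginQuote, endQuote);
    }

    boolean isBeginQuote(char c) {
        return pairs.containsKey(c);
    }

    boolean isEndQuote(char c) {
        return pairs.containsValue(c);
    }

    boolean isQuotePair(char beginQuote, char endQuote) {
        Character expected = pairs.get(beginQuote);
        return expected != null && expected == endQuote;
    }

    public static void main(String[] args) {
        QuotePairSketch quotes = new QuotePairSketch();
        quotes.addQuoteChars('[', ']');
        System.out.println(quotes.isQuotePair('[', ']'));   // prints: true
        System.out.println(quotes.isQuotePair('[', '\'')); // prints: false
    }
}
```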

requiresEscaping

public boolean requiresEscaping(char c)
Returns true if c needs to be escaped in a quoted token, that is, if c is a quote character or the escape character.

containsEscapeableChar

public boolean containsEscapeableChar(java.lang.String token)
Returns true if token contains a character that needs to be escaped in a quoted token, that is, a quote character or the escape character.

escapeToken

public java.lang.String escapeToken(java.lang.String token)
Places the escape character before any quote character or escape character in token. Returns the modified token.
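
As an illustration of the escaping rule described for requiresEscaping, containsEscapeableChar, and escapeToken, here is a minimal sketch. EscapeSketch is a hypothetical stand-in, not the sdsu.util source; it assumes the documented defaults (' as the only quote character and / as the escape character).

```java
// Hypothetical stand-in for illustration; not the sdsu.util implementation.
class EscapeSketch {
    // Values taken from the documented defaults QUOTE_CHAR and ESCAPE_CHAR.
    static final char QUOTE_CHAR = '\'';
    static final char ESCAPE_CHAR = '/';

    // A character needs escaping if it is a quote character or the escape character.
    static boolean requiresEscaping(char c) {
        return c == QUOTE_CHAR || c == ESCAPE_CHAR;
    }

    static boolean containsEscapeableChar(String token) {
        for (int i = 0; i < token.length(); i++) {
            if (requiresEscaping(token.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Inserts the escape character before every character that requires escaping.
    static String escapeToken(String token) {
        StringBuilder escaped = new StringBuilder(token.length() + 4);
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (requiresEscaping(c)) {
                escaped.append(ESCAPE_CHAR);
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeToken("it's a/b")); // prints: it/'s a//b
    }
}
```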

containsTokenTerminator

public boolean containsTokenTerminator(java.lang.String token)
Returns true if token contains a character that indicates the end of a token, that is, a whitespace character, a comment character, or a separator.
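
The terminator test described for isTokenTerminator and containsTokenTerminator might look roughly like the sketch below. TerminatorSketch is a hypothetical stand-in, not the sdsu.util source; it assumes the documented default whitespace and comment characters and takes the separators from the caller.

```java
// Hypothetical stand-in for illustration; not the sdsu.util implementation.
class TerminatorSketch {
    // Documented defaults: space, tab, line feed, carriage return; # for comments.
    static final String WHITESPACE = " \t\n\r";
    static final String COMMENT_CHAR = "#";

    private final String separators;

    TerminatorSketch(String separators) {
        this.separators = separators;
    }

    boolean isWhitespace(char c) {
        return WHITESPACE.indexOf(c) >= 0;
    }

    boolean isSeparator(char c) {
        return separators.indexOf(c) >= 0;
    }

    boolean isComment(char c) {
        return COMMENT_CHAR.indexOf(c) >= 0;
    }

    // An unquoted token ends at whitespace, a separator, or a comment character.
    boolean isTokenTerminator(char c) {
        return isWhitespace(c) || isSeparator(c) || isComment(c);
    }

    boolean containsTokenTerminator(String token) {
        for (int i = 0; i < token.length(); i++) {
            if (isTokenTerminator(token.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TerminatorSketch chars = new TerminatorSketch(",;");
        System.out.println(chars.containsTokenTerminator("a,b")); // prints: true
        System.out.println(chars.containsTokenTerminator("abc")); // prints: false
    }
}
```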

quoteToken

public java.lang.String quoteToken(java.lang.String token)
Surrounds token with a begin-end quote pair. Returns the quoted token.
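
One plausible way to combine quoting with escaping when writing tokens is sketched below. This is an assumption about usage, not documented behavior: QuoteSketch and its writeToken helper are hypothetical, the terminator set uses "," as an example separator, and the page does not say whether quoteToken itself escapes its argument.

```java
// Hypothetical stand-in for illustration; not the sdsu.util implementation.
class QuoteSketch {
    static final char QUOTE_CHAR = '\'';
    static final char ESCAPE_CHAR = '/';
    // Default whitespace, the comment character #, and an example separator ','.
    static final String TERMINATORS = " \t\n\r#,";

    // Surrounds the token with the default begin-end quote pair.
    static String quoteToken(String token) {
        return QUOTE_CHAR + token + QUOTE_CHAR;
    }

    // Puts the escape character before each quote or escape character.
    static String escapeToken(String token) {
        StringBuilder escaped = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c == QUOTE_CHAR || c == ESCAPE_CHAR) {
                escaped.append(ESCAPE_CHAR);
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    // Hypothetical helper: quote only tokens that contain special characters.
    static String writeToken(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (TERMINATORS.indexOf(c) >= 0 || c == QUOTE_CHAR || c == ESCAPE_CHAR) {
                return quoteToken(escapeToken(token));
            }
        }
        return token;
    }

    public static void main(String[] args) {
        System.out.println(writeToken("plain"));     // prints: plain
        System.out.println(writeToken("it's here")); // prints: 'it/'s here'
    }
}
```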