Thea
TextInputStream Class Reference

A simple tokenizer for reading text files.

#include <TextInputStream.hpp>

Inheritance diagram for TextInputStream: NamedObject, Noncopyable, INamedObject

Classes

class  BadMSVCSpecial
 Thrown while parsing a number of the form 1.
 
struct  Settings
 Tokenizer configuration options.
 
class  TokenException
 Thrown when a token cannot be read.
 
class  WrongString
 String read from input did not match expected string.
 
class  WrongSymbol
 Thrown by the read methods if a symbol string does not match the expected string.
 
class  WrongTokenType
 Thrown by the read methods if a token is not of the expected type.
 

Public Types

Public Member Functions

char const * getName () const
 Get the name of the object.
 
std::string getPath () const
 Get the path to the file from which this input is drawn, or the first few characters of the string if created from a string.
 
bool hasMore ()
 Returns true while there are tokens remaining.
 
Token peek ()
 Get a copy of the next token in the input stream, but don't remove it from the input stream.
 
int peekCharacterNumber ()
 Get the character number (relative to the line) for the next token in the input stream.
 
int peekLineNumber ()
 Get the line number for the next token.
 
void push (Token const &t)
 Take a previously read token and push it back at the front of the input stream.
 
Token read ()
 Read the next token (which will be the Type::END token if !hasMore()).
 
bool readBoolean ()
 Read a boolean.
 
std::string readComment ()
 Like readCommentToken(), but returns the token's string.
 
void readComment (std::string const &s)
 Read a specific comment token.
 
Token readCommentToken ()
 Read a comment token and return it.
 
std::string readLine ()
 Read from the beginning of the next token until the following newline and return the result as a string, ignoring all parsing in between.
 
std::string readNewline ()
 Like readNewlineToken(), but returns the token's string.
 
void readNewline (std::string const &s)
 Read a specific newline token.
 
Token readNewlineToken ()
 Read a newline token and return it.
 
double readNumber ()
 Read one token (or possibly two) as a number.
 
Token readSignificant ()
 Calls read() until the result is not a newline or comment.
 
std::string readString ()
 Like readStringToken(), but returns the token's string.
 
void readString (std::string const &s)
 Read a specific string token.
 
Token readStringToken ()
 Read a string token and return it.
 
std::string readSymbol ()
 Like readSymbolToken(), but returns the token's string.
 
void readSymbol (std::string const &symbol)
 Read a specific symbol token.
 
void readSymbols (std::string const &s1, std::string const &s2)
 Read a series of two specific symbols.
 
void readSymbols (std::string const &s1, std::string const &s2, std::string const &s3)
 Read a series of three specific symbols.
 
void readSymbols (std::string const &s1, std::string const &s2, std::string const &s3, std::string const &s4)
 Read a series of four specific symbols.
 
Token readSymbolToken ()
 Read a symbol token and return it.
 
int8 setName (char const *s)
 Set the name of the object from a C-style string.
 
virtual int8 setName (std::string const &s)
 Set the name of the object from a std::string.
 
 TextInputStream (std::string const &path_, Settings const &settings=Settings::defaults())
 Open a file for reading formatted text input.
 
 TextInputStream (FS fs, std::string const &str, Settings const &settings=Settings::defaults())
 Creates input directly from a string.
 

Static Public Member Functions

static bool parseBoolean (std::string const &_string)
 Extract a boolean value from a string.
 
static double parseNumber (std::string const &_string)
 Extract a number from a string.
 

Protected Member Functions

std::string const & getNameStr () const
 Access the name string directly, for efficiency.
 

Detailed Description

A simple tokenizer for reading text files.

TextInputStream handles a superset of C++, Java, Matlab, and Bash code text, including single-line comments, block comments, quoted strings with escape sequences, and operators. It recognizes several categories of tokens, which are separated by whitespace, quotation marks, or the end of a recognized operator.

The special ".." and "..." tokens are always recognized in addition to normal C++ operators. Additional tokens can be made available by changing the Settings.

Negative numbers are handled specially because of the ambiguity between unary minus and negative numbers; see the note for TextInputStream::read().

Inside quoted strings, escape sequences are converted. Thus the string token for ["a\nb"] is 'a', followed by a newline, followed by 'b'. Outside of quoted strings, escape sequences are not converted, so the token sequence for [a\nb] is symbol 'a', symbol '\', symbol 'nb' (this matches what a C++ parser would do). The exception is that a specified TextInputStream::Settings::otherCommentCharacter preceded by a backslash is assumed to be an escaped comment character and is returned as a symbol token instead of being parsed as a comment (this is what a LaTeX or VRML parser would do).
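
The conversion applied inside quoted strings can be pictured with a small standalone sketch. The helper unescapeQuoted() below is hypothetical and is not part of TextInputStream; the set of escapes it handles, and the choice to keep unknown escapes verbatim, are assumptions made only for illustration.

```cpp
#include <string>

// Hypothetical helper (not part of Thea): converts a few common escape
// sequences, roughly as a tokenizer might do inside a quoted string.
std::string unescapeQuoted(std::string const & raw)
{
  std::string out;
  for (std::size_t i = 0; i < raw.size(); ++i)
  {
    if (raw[i] == '\\' && i + 1 < raw.size())
    {
      char c = raw[++i];  // character following the backslash
      switch (c)
      {
        case 'n':  out += '\n'; break;
        case 't':  out += '\t'; break;
        case '\\': out += '\\'; break;
        case '"':  out += '"';  break;
        case '\'': out += '\''; break;
        default:   out += '\\'; out += c; break;  // assumption: keep unknown escapes as written
      }
    }
    else
    {
      out += raw[i];  // ordinary character (or a lone trailing backslash)
    }
  }
  return out;
}
```

Given the four raw characters a, \, n, b, this helper returns the three characters 'a', newline, 'b', which mirrors the quoted-string case described above.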

Assumes that the file is not modified once opened.

Derived from the G3D library: http://g3d.sourceforge.net

Examples

// Example 1: read tokens one at a time and inspect each one.
TextInputStream ti(TextInputStream::FROM_STRING, "name = 'Max', height = 6");
Token t;
t = ti.read();
assert(t.type == Token::Type::SYMBOL);
assert(t.sval == "name");
t = ti.read();
assert(t.type == Token::Type::SYMBOL);
assert(t.sval == "=");
std::string name = ti.read().sval;  // value of the quoted-string token
ti.read();                          // consume the ',' symbol

// Example 2: the same input, read with the typed convenience methods.
TextInputStream ti(TextInputStream::FROM_STRING, "name = 'Max', height = 6");
ti.readSymbols("name", "=");
std::string name = ti.readString();
ti.readSymbols(",", "height", "=");
double height = ti.readNumber();

Definition at line 273 of file TextInputStream.hpp.

Member Enumeration Documentation

enum FS

A flag indicating the source of a stream.

Definition at line 590 of file TextInputStream.hpp.

Constructor & Destructor Documentation

TextInputStream ( std::string const &  path_,
Settings const &  settings = Settings::defaults() 
)
explicit

Open a file for reading formatted text input.

Definition at line 1308 of file TextInputStream.cpp.

TextInputStream ( FS  fs,
std::string const &  str,
Settings const &  settings = Settings::defaults() 
)

Creates input directly from a string.

The first argument must be TextInputStream::FROM_STRING.

Definition at line 1323 of file TextInputStream.cpp.

Member Function Documentation

char const* getName ( ) const
virtual, inherited

Get the name of the object.

Implements INamedObject.

Definition at line 78 of file NamedObject.hpp.

std::string const& getNameStr ( ) const
protected, inherited

Access the name string directly, for efficiency.

Definition at line 98 of file NamedObject.hpp.

std::string getPath ( ) const

Get the path to the file from which this input is drawn, or the first few characters of the string if created from a string.

Definition at line 604 of file TextInputStream.hpp.

bool hasMore ( )

Returns true while there are tokens remaining.

Definition at line 250 of file TextInputStream.cpp.

bool parseBoolean ( std::string const &  _string)
static

Extract a boolean value from a string.

Returns
True if toLower(_string) == "true", else false.

Definition at line 71 of file TextInputStream.cpp.
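
The documented return rule can be restated as a standalone sketch. parseBooleanSketch() is a hypothetical name and is not the library's implementation; it only mirrors the stated rule (lower-case the input, then compare it with "true").

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical re-statement of the documented rule, not Thea's code:
// the result is true only if the lower-cased input equals "true".
bool parseBooleanSketch(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s == "true";
}
```

Under this rule, inputs such as "yes", "1", or the empty string all yield false; no error is signalled for them.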

double parseNumber ( std::string const &  _string)
static

Extract a number from a string.

Includes parsing of MSVC specials.

Definition at line 77 of file TextInputStream.cpp.

Token peek ( )

Get a copy of the next token in the input stream, but don't remove it from the input stream.

Definition at line 138 of file TextInputStream.cpp.

int peekCharacterNumber ( )

Get the character number (relative to the line) for the next token in the input stream.

See also
peek(), peekLineNumber()

Definition at line 156 of file TextInputStream.cpp.

int peekLineNumber ( )

Get the line number for the next token.

See also
peek(), peekCharacterNumber()

Definition at line 150 of file TextInputStream.cpp.

void push ( Token const &  t)

Take a previously read token and push it back at the front of the input stream.

Can be used in the case where more than one token of read-ahead is needed (i.e. when peek doesn't suffice).

Definition at line 244 of file TextInputStream.cpp.
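
Two tokens of read-ahead with a read/peek/push combination might look like the sketch below. ToyTokenStream is a stand-in written for this illustration and is not the Thea class; its tokens are plain strings, and nextTwoAre() is a hypothetical helper.

```cpp
#include <deque>
#include <string>

// Toy stand-in for a token stream (NOT TextInputStream): tokens are plain strings.
struct ToyTokenStream
{
  std::deque<std::string> tokens;

  bool hasMore() const { return !tokens.empty(); }

  // Copy of the next token without removing it ("" if the stream is empty).
  std::string peek() const { return tokens.empty() ? std::string() : tokens.front(); }

  // Remove and return the next token ("" if the stream is empty).
  std::string read()
  {
    if (tokens.empty()) return std::string();
    std::string t = tokens.front();
    tokens.pop_front();
    return t;
  }

  // Put a previously read token back at the front of the stream.
  void push(std::string const & t) { tokens.push_front(t); }
};

// Hypothetical helper: checks the next TWO tokens and leaves the stream unchanged.
bool nextTwoAre(ToyTokenStream & in, std::string const & a, std::string const & b)
{
  if (!in.hasMore()) return false;
  std::string first = in.read();                    // look past the first token...
  bool match = (first == a && in.hasMore() && in.peek() == b);
  in.push(first);                                   // ...then restore it
  return match;
}
```

The first token is consumed only temporarily: pushing it back restores the stream, so a later read() sees the same sequence as before the check.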

Token read ( )

Read the next token (which will be the Type::END token if !hasMore()).

Signed numbers can be handled in one of two modes. If the option TextInputStream::Settings::signedNumbers is true, a '+' or '-' immediately before a number is prepended onto that number; if there is intervening whitespace, the sign is read as a separate symbol. If TextInputStream::Settings::signedNumbers is false, read() does not distinguish between a plus or minus symbol next to a number and a positive/negative number itself. For example, "x - 1" and "x -1" will be parsed the same way by read(). In both cases, readNumber() will contract a leading "-" or "+" onto a number.

Definition at line 162 of file TextInputStream.cpp.
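
The two signed-number modes can be illustrated with a deliberately tiny tokenizer. toyTokenize() is hypothetical and far simpler than TextInputStream: it only knows unsigned or signed integers, alphanumeric words, and single-character symbols, and its signedNumbers parameter merely imitates the setting of the same name.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Toy tokenizer (NOT Thea's): returns the token strings found in `text`.
std::vector<std::string> toyTokenize(std::string const & text, bool signedNumbers)
{
  std::vector<std::string> out;
  auto isDigitAt = [&](std::size_t k) {
    return k < text.size() && std::isdigit(static_cast<unsigned char>(text[k])) != 0;
  };

  std::size_t i = 0;
  while (i < text.size())
  {
    unsigned char c = static_cast<unsigned char>(text[i]);
    if (std::isspace(c)) { ++i; continue; }

    bool sign = (text[i] == '+' || text[i] == '-');
    if (isDigitAt(i) || (signedNumbers && sign && isDigitAt(i + 1)))
    {
      // Number; in signed mode a sign directly before the digits is absorbed.
      std::size_t start = i;
      if (sign) ++i;
      while (isDigitAt(i)) ++i;
      out.push_back(text.substr(start, i - start));
    }
    else if (std::isalpha(c))
    {
      // Word made of letters and digits.
      std::size_t start = i;
      while (i < text.size() && std::isalnum(static_cast<unsigned char>(text[i]))) ++i;
      out.push_back(text.substr(start, i - start));
    }
    else
    {
      // Anything else is a one-character symbol, including a detached sign.
      out.push_back(std::string(1, text[i]));
      ++i;
    }
  }
  return out;
}
```

In this sketch, signed mode splits "x -1" into two tokens and "x - 1" into three, while unsigned mode produces the same three tokens for both inputs.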

bool readBoolean ( )

Read a boolean.

If the next input token is not a boolean, throws WrongTokenType.

Definition at line 1103 of file TextInputStream.cpp.

std::string readComment ( )

Like readCommentToken(), but returns the token's string.

Use this method (rather than readCommentToken) if you want the token's value but don't really care about its location in the input. Use of readCommentToken is encouraged for better error reporting.

Definition at line 1216 of file TextInputStream.cpp.

void readComment ( std::string const &  s)

Read a specific comment token.

If the next token in the input is a comment matching s, it will be consumed. Use this method if you want to match a specific comment from the input. In that case, typically error reporting related to the token is only going to occur because of a mismatch, so no location information is needed by the caller.

WrongTokenType will be thrown if the next token in the input stream is not a comment. WrongString will be thrown if the next token in the input stream is a comment but does not match the s parameter. When an exception is thrown, no tokens are consumed.

Definition at line 1222 of file TextInputStream.cpp.

Token readCommentToken ( )

Read a comment token and return it.

Use this method (rather than readComment) if you want the token's location as well as its value.

WrongTokenType will be thrown if the next token in the input stream is not a comment. When an exception is thrown, no tokens are consumed.

Definition at line 1201 of file TextInputStream.cpp.

std::string readLine ( )

Read from the beginning of the next token until the following newline and return the result as a string, ignoring all parsing in between.

The newline is not returned in the string, and the following token read will be a newline or end of file token (if they are enabled for parsing).

Definition at line 177 of file TextInputStream.cpp.

std::string readNewline ( )

Like readNewlineToken(), but returns the token's string.

Use this method (rather than readNewlineToken) if you want the token's value but don't really care about its location in the input. Use of readNewlineToken() is encouraged for better error reporting.

Definition at line 1252 of file TextInputStream.cpp.

void readNewline ( std::string const &  s)

Read a specific newline token.

If the next token in the input is a newline matching s, it will be consumed. Use this method if you want to match a specific newline from the input. In that case, typically error reporting related to the token is only going to occur because of a mismatch, so no location information is needed by the caller.

WrongTokenType will be thrown if the next token in the input stream is not a newline. WrongString will be thrown if the next token in the input stream is a newline but does not match the s parameter. When an exception is thrown, no tokens are consumed.

Definition at line 1258 of file TextInputStream.cpp.

Token readNewlineToken ( )

Read a newline token and return it.

Use this method (rather than readNewline) if you want the token's location as well as its value.

WrongTokenType will be thrown if the next token in the input stream is not a newline. When an exception is thrown, no tokens are consumed.

Definition at line 1237 of file TextInputStream.cpp.

double readNumber ( )

Read one token (or possibly two) as a number.

If the first token in the input is a number, it is returned directly. If TextInputStream::Settings::signedNumbers is false and the input stream contains a '+' or '-' symbol token immediately followed by a number token, both tokens will be consumed and a single signed number will be returned by this method.

WrongTokenType will be thrown if one of the input conditions described above is not satisfied. When an exception is thrown, no tokens are consumed.

Definition at line 1121 of file TextInputStream.cpp.
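
The "one token, or possibly two" behaviour, together with the guarantee that nothing is consumed on failure, can be sketched over a plain queue of token strings. toyReadNumber() and looksNumeric() are hypothetical names, and std::runtime_error stands in for WrongTokenType; none of this is the library's code.

```cpp
#include <cstdlib>
#include <deque>
#include <stdexcept>
#include <string>

// True if the whole string parses as a number according to std::strtod.
bool looksNumeric(std::string const & s)
{
  if (s.empty()) return false;
  char * end = nullptr;
  std::strtod(s.c_str(), &end);
  return end == s.c_str() + s.size();
}

// Toy version of the documented behaviour (NOT Thea's implementation).
double toyReadNumber(std::deque<std::string> & tokens)
{
  // Case 1: the next token is already a number.
  if (!tokens.empty() && looksNumeric(tokens[0]))
  {
    double v = std::strtod(tokens[0].c_str(), nullptr);
    tokens.pop_front();
    return v;
  }

  // Case 2: a '+' or '-' symbol followed by a number; consume both tokens.
  if (tokens.size() >= 2 && (tokens[0] == "+" || tokens[0] == "-") && looksNumeric(tokens[1]))
  {
    double v = std::strtod(tokens[1].c_str(), nullptr);
    if (tokens[0] == "-") v = -v;
    tokens.pop_front();
    tokens.pop_front();
    return v;
  }

  // Neither case applies: throw before touching the queue, so nothing is consumed.
  throw std::runtime_error("expected a number");  // stands in for WrongTokenType
}
```

All validation happens before any pop_front() call, which is what makes the "no tokens are consumed" guarantee hold in this sketch.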

Token readSignificant ( )

Calls read() until the result is not a newline or comment.

Definition at line 44 of file TextInputStream.cpp.

std::string readString ( )

Like readStringToken, but returns the token's string.

Use this method (rather than readStringToken) if you want the token's value but don't really care about its location in the input. Use of readStringToken is encouraged for better error reporting.

Definition at line 1180 of file TextInputStream.cpp.

void readString ( std::string const &  s)

Read a specific string token.

If the next token in the input is a string matching s, it will be consumed. Use this method if you want to match a specific string from the input. In that case, typically error reporting related to the token is only going to occur because of a mismatch, so no location information is needed by the caller.

WrongTokenType will be thrown if the next token in the input stream is not a string. WrongString will be thrown if the next token in the input stream is a string but does not match the s parameter. When an exception is thrown, no tokens are consumed.

See also
readString(), readStringToken(), readLine()

Definition at line 1186 of file TextInputStream.cpp.

Token readStringToken ( )

Read a string token and return it.

Use this method (rather than readString) if you want the token's location as well as its value.

WrongTokenType will be thrown if the next token in the input stream is not a string. When an exception is thrown, no tokens are consumed.

Definition at line 1165 of file TextInputStream.cpp.

std::string readSymbol ( )

Like readSymbolToken(), but returns the token's string.

Use this method (rather than readSymbolToken) if you want the token's value but don't really care about its location in the input. Use of readSymbolToken() is encouraged for better error reporting.

Definition at line 1288 of file TextInputStream.cpp.

void readSymbol ( std::string const &  symbol)

Read a specific symbol token.

If the next token in the input is a symbol matching symbol, it will be consumed. Use this method if you want to match a specific symbol from the input. In that case, typically error reporting related to the token is only going to occur because of a mismatch, so no location information is needed by the caller.

WrongTokenType will be thrown if the next token in the input stream is not a symbol. WrongSymbol will be thrown if the next token in the input stream is a symbol but does not match the symbol parameter. When an exception is thrown, no tokens are consumed.

Definition at line 1294 of file TextInputStream.cpp.
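
The check-then-consume order implied by the two exceptions can be sketched as follows. ToyToken, ToyWrongTokenType, ToyWrongSymbol, and toyReadSymbol() are all hypothetical stand-ins for the real classes; the sketch only shows how the two failure modes differ and that the queue is left untouched when either one occurs.

```cpp
#include <deque>
#include <stdexcept>
#include <string>

// Toy stand-ins (NOT Thea's types).
enum class ToyType { SYMBOL, NUMBER, STRING };
struct ToyToken { ToyType type; std::string sval; };

struct ToyWrongTokenType : std::runtime_error { using std::runtime_error::runtime_error; };
struct ToyWrongSymbol    : std::runtime_error { using std::runtime_error::runtime_error; };

// Consume the next token only if it is a symbol equal to `expected`.
void toyReadSymbol(std::deque<ToyToken> & tokens, std::string const & expected)
{
  // Wrong kind of token (or no token at all): type error, nothing consumed.
  if (tokens.empty() || tokens.front().type != ToyType::SYMBOL)
    throw ToyWrongTokenType("expected a symbol token");

  // Right kind, wrong text: symbol mismatch, nothing consumed.
  if (tokens.front().sval != expected)
    throw ToyWrongSymbol("expected symbol '" + expected + "'");

  tokens.pop_front();  // both checks passed: consume the token
}
```

Because a failed match leaves the token in place, a caller can catch the exception and try a different expectation against the same token.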

void readSymbols ( std::string const &  s1,
std::string const &  s2 
)

Read a series of two specific symbols.

See readSymbol().

Definition at line 756 of file TextInputStream.hpp.

void readSymbols ( std::string const &  s1,
std::string const &  s2,
std::string const &  s3 
)

Read a series of three specific symbols.

See readSymbol().

Definition at line 763 of file TextInputStream.hpp.

void readSymbols ( std::string const &  s1,
std::string const &  s2,
std::string const &  s3,
std::string const &  s4 
)

Read a series of four specific symbols.

See readSymbol().

Definition at line 774 of file TextInputStream.hpp.

Token readSymbolToken ( )

Read a symbol token and return it.

Use this method (rather than readSymbol) if you want the token's location as well as its value.

WrongTokenType will be thrown if the next token in the input stream is not a symbol. When an exception is thrown, no tokens are consumed.

Definition at line 1273 of file TextInputStream.cpp.

int8 setName ( char const *  s)
virtual, inherited

Set the name of the object from a C-style string.

Returns
True if the name was successfully set, else false (e.g. if the name is read-only). In the default implementation, the function always returns true.

Implements INamedObject.

Definition at line 86 of file NamedObject.hpp.

virtual int8 setName ( std::string const &  s)
virtual, inherited

Set the name of the object from a std::string.

Returns
True if the name was successfully set, else false (e.g. if the name is read-only). In the default implementation, the function always returns true.

Definition at line 94 of file NamedObject.hpp.


The documentation for this class was generated from the following files: