The RegularGrammar interface represents a factory for generating regular expressions, typically for the purpose of constructing RegularTokens. Each newXXX method allocates and returns a new RegularExpression object which implements the XXX interface. These regex Objects are then used to construct more complex RegularExpressions, which are eventually resubmitted to the RegularGrammar object under a name using the newToken() method. In this fashion one builds up a set of named regular expressions, perhaps to be transformed into a DFA which recognizes the tokens implied by the regexes. When the token construction phase is complete, calling compile() returns a 'compiled' version of the language.
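The build-then-compile workflow described above can be sketched with a toy factory. Everything in this block is a hypothetical stand-in: `Expr`, `Token`, and `ToyGrammar` are not the library's types, only a subset of the factory methods is imitated, and `java.util.regex.Pattern` is swapped in for DFA construction purely so that the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Toy illustration only; none of these types come from the documented library. */
class TokenWorkflowSketch {

    /** Hypothetical stand-in for RegularExpression: it just carries pattern text. */
    static final class Expr {
        final String text;
        Expr(String text) { this.text = text; }
    }

    /** Hypothetical stand-in for RegularToken. */
    static final class Token {
        final int id;
        final String name;
        final Expr regex;
        Token(int id, String name, Expr regex) {
            this.id = id;
            this.name = name;
            this.regex = regex;
        }
    }

    /** Hypothetical factory whose method names echo the documented ones. */
    static final class ToyGrammar {
        private final List<Token> tokens = new ArrayList<>();

        Expr newInterval(int lo, int hi) {
            // \x{h..h} escapes avoid having to quote special characters.
            return new Expr(String.format("[\\x{%x}-\\x{%x}]", lo, hi));
        }

        Expr newClosure(Expr re) {
            return new Expr("(?:" + re.text + ")*");
        }

        Expr newConcatenation(Expr left, Expr right) {
            return new Expr("(?:" + left.text + ")(?:" + right.text + ")");
        }

        /** Creates a token and also registers it with this grammar. */
        Token newToken(int tokenID, String name, Expr regex) {
            Token t = new Token(tokenID, name, regex);
            tokens.add(t);
            return t;
        }

        /** "Compiles" the registered tokens; here that only means Pattern.compile. */
        Map<String, Pattern> compile() {
            Map<String, Pattern> compiled = new LinkedHashMap<>();
            for (Token t : tokens) {
                compiled.put(t.name, Pattern.compile(t.regex.text));
            }
            return compiled;
        }
    }

    /** Builds IDENT = letter letter* and NUMBER = digit digit*, then compiles. */
    static Map<String, Pattern> demo() {
        ToyGrammar g = new ToyGrammar();
        Expr letter = g.newInterval('a', 'z');
        Expr digit = g.newInterval('0', '9');
        g.newToken(1, "IDENT", g.newConcatenation(letter, g.newClosure(letter)));
        g.newToken(2, "NUMBER", g.newConcatenation(digit, g.newClosure(digit)));
        return g.compile();
    }

    public static void main(String[] args) {
        Map<String, Pattern> compiled = demo();
        System.out.println(compiled.get("IDENT").matcher("hello").matches());  // true
        System.out.println(compiled.get("NUMBER").matcher("hello").matches()); // false
        System.out.println(compiled.get("NUMBER").matcher("2024").matches());  // true
    }
}
```

The point of the sketch is the order of calls: small expressions are allocated first, composed into larger ones, registered under a name, and only then compiled as a group.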
The compilation process typically involves giving the appropriate objects unique integer IDs so that future set manipulation can be done numerically rather than using full-scale Objects.
In this way, one can think of the RegularGrammar interface as the 'thing' humans assemble and the RegularSet as the 'thing' machines use to do more interesting things, such as building DFAs.
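As a rough illustration of why integer IDs make set manipulation cheaper, the sketch below numbers objects in first-seen order and then represents sets of them as `java.util.BitSet` values. The `IdTable` class and `setOf` helper are hypothetical names invented for this example; how RegularSet actually stores its IDs is not stated here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: dense integer IDs plus BitSet-based set operations. */
class IdNumberingSketch {

    /** Hypothetical table that assigns IDs 0, 1, 2, ... in first-seen order. */
    static final class IdTable<T> {
        private final Map<T, Integer> ids = new HashMap<>();
        private final List<T> objects = new ArrayList<>();

        /** Returns the existing ID for obj, or assigns the next free one. */
        int idOf(T obj) {
            Integer id = ids.get(obj);
            if (id == null) {
                id = objects.size();
                ids.put(obj, id);
                objects.add(obj);
            }
            return id;
        }

        /** Maps an ID back to the object it was assigned to. */
        T objectOf(int id) {
            return objects.get(id);
        }
    }

    /** Builds a numeric set containing the IDs of the given members. */
    static BitSet setOf(IdTable<String> table, String... members) {
        BitSet bits = new BitSet();
        for (String m : members) {
            bits.set(table.idOf(m));
        }
        return bits;
    }

    public static void main(String[] args) {
        IdTable<String> table = new IdTable<>();
        BitSet first = setOf(table, "a", "b");   // a -> 0, b -> 1
        BitSet second = setOf(table, "b", "c");  // b is already 1, c -> 2

        BitSet union = (BitSet) first.clone();
        union.or(second);
        BitSet common = (BitSet) first.clone();
        common.and(second);

        System.out.println(union);               // {0, 1, 2}
        System.out.println(common);              // {1}
        System.out.println(table.objectOf(1));   // b
    }
}
```

Union and intersection become word-wise bit operations, and the table is consulted only when an ID has to be turned back into an object.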
Method Summary
RegularSet | compile() | When token construction is complete, compile() compiles and returns a RegularSet object which can be used to generate DFAs, for example.
Epsilon | getEpsilon() | Returns the Epsilon symbol in the (rare) case one needs it.
CharClass | newCharClass() | Allocates and returns a new CharClass expression ([^-a-z]).
CharString | newCharString(String s) | Allocates and returns a new CharString expression for the given String.
Closure | newClosure(RegularExpression re) | Allocates and returns a new Closure expression ('*') wrapping the given RegularExpression.
Concatenation | newConcatenation(RegularExpression left, RegularExpression right) | Allocates and returns a new Concatenation expression from the given left and right RegularExpressions.
Interval | newInterval(char c) | Allocates and returns a new Interval expression over the given char.
Interval | newInterval(int lo, int hi) | Allocates and returns a new Interval expression over the given character range from lo to hi, inclusive.
Option | newOption(RegularExpression re) | Allocates and returns a new Option expression ('?') wrapping the given RegularExpression.
PositiveClosure | newPositiveClosure(RegularExpression re) | Allocates and returns a new PositiveClosure expression ('+') wrapping the given RegularExpression.
RegularToken | newToken(int tokenID, String name, RegularExpression regex) | Allocates and returns a new RegularToken mapping the given name to the given RegularExpression.
RegularToken | newToken(int tokenID, String name, String regex) | Allocates a new RegularToken in this grammar having the given tokenID number, name, and regex.
Union | newUnion() | Allocates and returns a new Union expression.
Method Detail

public Concatenation newConcatenation(RegularExpression left, RegularExpression right)
Allocates and returns a new Concatenation expression from the given left and right RegularExpressions.

public Closure newClosure(RegularExpression re)
Allocates and returns a new Closure expression ('*') wrapping the given RegularExpression.

public PositiveClosure newPositiveClosure(RegularExpression re)
Allocates and returns a new PositiveClosure expression ('+') wrapping the given RegularExpression. Note that PositiveClosure is a shortcut for concatenation-closure; therefore, a+ expands to aa*.

public Interval newInterval(int lo, int hi)
Allocates and returns a new Interval expression over the given character range from lo to hi, inclusive.

public Interval newInterval(char c)
Allocates and returns a new Interval expression over the given char. Note that newInterval(97, 97) has the same meaning as newInterval('a') under the ASCII or Unicode charset.

public Option newOption(RegularExpression re)
Allocates and returns a new Option expression ('?') wrapping the given RegularExpression. Note that Option is not an atomic RegularExpression. Thus, 'a?' expands to the Union (a|Epsilon).

public CharString newCharString(String s)
Allocates and returns a new CharString expression for the given String. Note that CharString is not a fundamental expression. Thus, 'abc' expands to the concatenation sequence a-b-c.

public Union newUnion()
Allocates and returns a new Union expression. Subsequent modification of the Union is required (i.e. an empty union is invalid).

public CharClass newCharClass()
Allocates and returns a new CharClass expression ([^-a-z]). Subsequent modification of the character class is required (i.e. an empty character class is invalid).

public RegularToken newToken(int tokenID, String name, RegularExpression regex)
Allocates and returns a new RegularToken mapping the given name to the given RegularExpression. This is the 'special' newXXX() method in that it does not return a RegularExpression, but a Token. The RegularToken is returned to facilitate its incorporation into other languages, for example through the ContextFreeLanguage.newTerminal(Token) method. Calling newToken() therefore not only creates a RegularToken object for the regex, but also associates that token with the grammar.

public RegularToken newToken(int tokenID, String name, String regex)
Allocates a new RegularToken in this grammar having the given tokenID number, name, and regex. The RegularGrammar is then responsible for parsing the regex string and generating a RegularExpression.

public Epsilon getEpsilon()
Returns the Epsilon symbol in the (rare) case one needs it.

public RegularSet compile()
When token construction is complete, compile() compiles and returns a RegularSet object which can be used to generate DFAs, for example. The compilation process essentially makes sure that each Interval gets a unique ID and concatenates ExpressionTerminators where appropriate.
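The expansions noted above (a+ to aa*, a? to (a|Epsilon), and 'abc' to a-b-c) can be made concrete with a small toy syntax tree. The node classes and helper methods below are hypothetical stand-ins written for this sketch, not the library's types, and the left-to-right nesting chosen for the string expansion is an assumption, since the nesting order is not specified here.

```java
/** Toy AST for illustration; these are not the library's classes. */
class RegexDesugarSketch {

    interface Node { }

    static final class IntervalNode implements Node {
        final int lo, hi;
        IntervalNode(int lo, int hi) { this.lo = lo; this.hi = hi; }
        @Override public String toString() {
            return lo == hi ? String.valueOf((char) lo)
                            : "[" + (char) lo + "-" + (char) hi + "]";
        }
    }

    static final class EpsilonNode implements Node {
        @Override public String toString() { return "Epsilon"; }
    }

    static final class ClosureNode implements Node {
        final Node inner;
        ClosureNode(Node inner) { this.inner = inner; }
        @Override public String toString() { return "closure(" + inner + ")"; }
    }

    static final class ConcatNode implements Node {
        final Node left, right;
        ConcatNode(Node left, Node right) { this.left = left; this.right = right; }
        @Override public String toString() { return "concat(" + left + ", " + right + ")"; }
    }

    static final class UnionNode implements Node {
        final Node left, right;
        UnionNode(Node left, Node right) { this.left = left; this.right = right; }
        @Override public String toString() { return "union(" + left + ", " + right + ")"; }
    }

    static Node interval(int lo, int hi) { return new IntervalNode(lo, hi); }

    /** A single char is the degenerate range [c, c]. */
    static Node interval(char c) { return new IntervalNode(c, c); }

    /** a+ is sugar for a followed by a*. */
    static Node positiveClosure(Node re) {
        return new ConcatNode(re, new ClosureNode(re));
    }

    /** a? is sugar for (a | Epsilon). */
    static Node option(Node re) {
        return new UnionNode(re, new EpsilonNode());
    }

    /** "abc" is sugar for a chain of concatenations (left-nested by assumption). */
    static Node charString(String s) {
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty string has no expansion here");
        }
        Node acc = interval(s.charAt(0));
        for (int i = 1; i < s.length(); i++) {
            acc = new ConcatNode(acc, interval(s.charAt(i)));
        }
        return acc;
    }

    public static void main(String[] args) {
        Node a = interval('a');
        System.out.println(positiveClosure(a));   // concat(a, closure(a))
        System.out.println(option(a));            // union(a, Epsilon)
        System.out.println(charString("abc"));    // concat(concat(a, b), c)
        // 97 is the code point of 'a' in ASCII and Unicode.
        System.out.println(interval(97, 97).toString().equals(a.toString())); // true
    }
}
```

Only intervals, Epsilon, closure, concatenation, and union appear in the expanded trees, which is what makes the other forms non-fundamental in the sense used above.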