The regular expression syntax in STT is largely standard, with one
exception: whitespace is never significant, so any literal space
character must be written with the space escape `\s'. Any literal
double-quotation mark `"' must also be escaped, as `\"', since regexps
are always enclosed in double quotation marks.
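The whitespace and quoting rules can be illustrated with a small sketch that rewrites the body of an STT regexp as a Python `re` pattern. This is only an illustration and not part of STT: the function name `stt_to_python` is hypothetical, and every escape other than `\s` and `\"` is assumed to mean the same thing in both syntaxes and is passed through unchanged.

```python
import re


def stt_to_python(stt_regex: str) -> str:
    """Rewrite the body of an STT regexp as a Python `re` pattern (sketch)."""
    out = []
    i = 0
    while i < len(stt_regex):
        ch = stt_regex[i]
        if ch == "\\" and i + 1 < len(stt_regex):
            nxt = stt_regex[i + 1]
            if nxt == "s":
                # STT `\s' is one literal space; Python's \s would match
                # any whitespace, so emit an explicit space code instead.
                out.append("\\x20")
            elif nxt == '"':
                # The quote only needs escaping inside STT's "..." wrapper.
                out.append('"')
            else:
                # Assumption: all other escapes carry over unchanged.
                out.append(ch + nxt)
            i += 2
        elif ch.isspace():
            # Unescaped whitespace is never significant in STT.
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


pattern = stt_to_python(r" hello \s world ")
print(pattern)
print(bool(re.fullmatch(pattern, "hello world")))
```

Emitting `\x20` rather than a bare space keeps the translated pattern unambiguous both inside and outside character classes.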
Op      Definition
|       Union: a list of alternative choices that can be matched, like
        `a|b|c'.
none    Concatenation: a list of atoms that must be matched in sequence,
        like `a b c' or `abc'.
[]      Character classes: a syntactic convenience for alternation of
        character intervals: `[\r\n]', `[a-z]'. Negating a character
        class inverts the sense of the inclusion: `[^a-z]'. If the dash
        character `-' is one of the characters in the class, it must be
        the first member of the class: `[-=+]'. Whitespace within the
        brackets is not significant, and most characters that would
        normally have to be escaped need not be. The ones that must
        still be escaped are backslash `\\', close-bracket `\]',
        semicolon `\;', and all the whitespace escapes `\s', `\r', `\n',
        `\t', `\v'. Octal and Unicode escapes can be used as well.
Op      Definition
*       Closure: matches zero or more occurrences.
+       Positive closure: matches one or more occurrences.
?       Optional: matches zero or one occurrence.
Op      Definition
\\      literal backslash
\s      literal space
\n      literal newline
\r      literal carriage return
\t      literal horizontal tab
\v      literal vertical tab
\+      literal plus sign
\*      literal asterisk
\?      literal question mark
\(      literal open-parenthesis
\)      literal close-parenthesis
\[      literal open-bracket
\]      literal close-bracket
\|      literal pipe
\"      literal double-quote (necessary since regexps are enclosed in
        double quotes)
Octal and Unicode escapes match the following regular expressions,
respectively:
OCTAL_ESCAPE matches " \\ [0-3] [0-7] [0-7] ";
UNICODE_ESCAPE matches " \\ u [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] ";
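As a rough illustration, the two escape forms can be recognized and decoded in Python as follows. The helper `decode_escape` is hypothetical, and the sketch assumes that the digits denote the code point of the character the escape stands for, which the text above does not state explicitly.

```python
import re

# Python equivalents of the two STT definitions above.
OCTAL_ESCAPE = re.compile(r"\\[0-3][0-7][0-7]")
UNICODE_ESCAPE = re.compile(r"\\u[0-9a-fA-F]{4}")


def decode_escape(text: str) -> str:
    """Return the character an octal or Unicode escape is assumed to denote."""
    if OCTAL_ESCAPE.fullmatch(text):
        return chr(int(text[1:], 8))    # digits after the backslash, base 8
    if UNICODE_ESCAPE.fullmatch(text):
        return chr(int(text[2:], 16))   # digits after the backslash and 'u', base 16
    raise ValueError(f"not an octal or Unicode escape: {text!r}")


print(decode_escape(r"\101"))
print(decode_escape(r"\u0041"))
```

Because the first octal digit is limited to 0-3, the largest octal escape is `\377`, so an octal escape can only express values up to 255.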
Operator precedence, from lowest to highest: union, concatenation,
quantification, atom (char | escape | char-class), grouping.
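The effect of this ordering can be checked by analogy with Python's `re` module, which ranks union, concatenation, and quantification in the same relative order. The sketch below is an analogy only, not STT itself; the helper `matches` is hypothetical.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# Union binds loosest: "ab|cd" groups as (ab)|(cd), not a(b|c)d.
print(matches("ab|cd", "ab"), matches("ab|cd", "acd"))

# Quantification binds tighter than concatenation:
# "ab*" groups as a(b*), not (ab)*.
print(matches("ab*", "abbb"), matches("ab*", "abab"))

# Grouping overrides both defaults.
print(matches("a(b|c)d", "acd"), matches("(ab)*", "abab"))
```

When the default grouping is not the intended one, parentheses make the intent explicit, as in the last two patterns.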
IDENTIFIER matches " [_a-z] [-_a-zA-Z0-9]* ";
WHITESPACE matches " [\n \r \t \v \s]+ ";
BEVERAGE matches " coffee | tea | cola ";
CAFFEINE matches " caff(ei|ie)ne ";
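For comparison, the WHITESPACE, BEVERAGE, and CAFFEINE definitions above might be written for Python's `re` module as shown below. These are hand translations offered as a sketch: the insignificant spaces are dropped and `\s` becomes a plain space inside the character class.

```python
import re

# Hand-translated counterparts of three of the STT definitions above.
WHITESPACE = re.compile(r"[\n\r\t\v ]+")
BEVERAGE = re.compile(r"coffee|tea|cola")
CAFFEINE = re.compile(r"caff(ei|ie)ne")

for name, regex, sample in [
    ("WHITESPACE", WHITESPACE, " \t\n"),
    ("BEVERAGE", BEVERAGE, "tea"),
    ("CAFFEINE", CAFFEINE, "caffiene"),
]:
    # fullmatch requires the entire sample to match the pattern.
    print(name, bool(regex.fullmatch(sample)))
```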