A regular expression specifies a set of strings to be matched. It contains
text characters and operator characters.
The operator characters are the following:
" \ [ ] ^ - ? . * + | ( ) $ / { } % < >
Everything else is a text character.
An operator character may be turned into a text character by enclosing
it in quotes, or by preceding it with a \ (backslash).
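The effect of escaping can be tried out in Python's re module. This is only an illustration under an assumption: Python is not lex, and it does not support the quoting form, so only the backslash form is shown. The names as_operator and as_text are made up for this sketch.

```python
import re

# Unescaped, "+" is the repetition operator: one or more "a", then "b".
as_operator = re.compile(r"a+b")
# Preceded by a backslash, "+" is an ordinary text character.
as_text = re.compile(r"a\+b")

samples = ("aab", "a+b")
operator_hits = [as_operator.fullmatch(s) is not None for s in samples]
text_hits = [as_text.fullmatch(s) is not None for s in samples]

print(operator_hits)  # the operator form accepts "aab" only
print(text_hits)      # the escaped form accepts "a+b" only
```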
Matching multiple occurrences of characters:
a* matches zero or more occurrences of a
a+ matches one or more occurrences of a
a{m,n} matches m through n occurrences of a
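The three repetition operators behave the same way in Python's re module, which is used here only as a convenient stand-in for lex. The helper matches_fully is a hypothetical name for this sketch, and {2,3} is an arbitrary choice of m and n.

```python
import re

def matches_fully(pattern, text):
    """Return True if the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None

# a* accepts the empty string; a+ does not.
star = [matches_fully(r"a*", s) for s in ("", "a", "aaa")]
plus = [matches_fully(r"a+", s) for s in ("", "a", "aaa")]
# a{2,3} accepts exactly two or three occurrences.
bounded = [matches_fully(r"a{2,3}", s) for s in ("a", "aa", "aaa", "aaaa")]

print(star, plus, bounded)
```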
Checking for context:
a/b matches "a" but only if followed by b (the b is not matched).
a$ matches "a" only if "a" occurs at the end of a line (i.e. right before a newline). The newline is not matched.
^a matches "a" only if "a" occurs at the beginning of a line (i.e. right after a newline).
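The context operators can be approximated in Python's re module, with two assumptions worth stating: Python has no a/b operator, so the lookahead a(?=b) is substituted for it, and "^" and "$" only work line by line when the re.MULTILINE flag is given. The sample text is invented for this sketch.

```python
import re

# Three lines: "ab", "ac" and "ba", each ended by a newline.
text = "ab\nac\nba\n"

# Stand-in for lex a/b: match "a" only when "b" follows; "b" is not consumed.
trailing = [m.span() for m in re.finditer(r"a(?=b)", text)]

# "a" right before a newline; the newline itself is not part of the match.
at_end = [m.span() for m in re.finditer(r"a$", text, re.MULTILINE)]

# "a" at the start of the input or right after a newline.
at_start = [m.start() for m in re.finditer(r"^a", text, re.MULTILINE)]

print(trailing, at_end, at_start)
```

Each reported span is one character long, which shows that neither the following "b" nor the newline is included in the match.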
Sets of characters:
[abc] matches any character that is an "a", "b" or "c" (and nothing else).
[a-zA-Z0-9] matches any letter (uppercase or lowercase) or digit.
[^abc] matches any character except "a", "b" and "c". Note the two
uses of "^": for context, and for forming the complement.
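Character sets can be checked quickly in Python's re module, used here as a stand-in for lex. The input strings are invented for this sketch. Note that in this Python run the complemented set also matches the newline, since a newline is not "a", "b" or "c".

```python
import re

# Only the listed characters are picked out; the hyphens are skipped.
listed = re.findall(r"[abc]", "a-b-c-d")

# Ranges: any letter or digit, but not "_" or "!".
alnum = re.findall(r"[a-zA-Z0-9]", "x_9!Q")

# Complement: everything except "a", "b" and "c", newline included.
complement = re.findall(r"[^abc]", "abcd\n")

print(listed, alnum, complement)
```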
Miscellaneous:
Grouping: use () to group expressions, e.g. (ab)+ matches ab, abab and so on.
Optional elements: a? matches zero or one a, e.g. ab?c matches abc and ac.
Blanks: a blank must be quoted, preceded by a backslash, or placed inside square brackets.
Tabs: either type a literal tab between quotes, or use the escape sequence "\t".
Newlines: use the escape sequence "\n".
Any character: "." matches any single character except a newline.
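Grouping, the optional operator, the escape sequences and "." can be tried in Python's re module. This is a sketch under an assumption: Python is not lex, and in particular a blank in a Python pattern matches itself without quoting, so the rule about blanks is not reproduced here. All variable names are made up for the example.

```python
import re

def matches_fully(pattern, text):
    """Return True if the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None

# ab?c: the "b" may be present once or not at all.
optional = [matches_fully(r"ab?c", s) for s in ("abc", "ac", "abbc")]

# (ab)+: the group as a whole is repeated.
grouped = [matches_fully(r"(ab)+", s) for s in ("ab", "abab", "aba")]

# "." matches the tab but skips the newline.
dot = re.findall(r".", "a\tb\n")

# \t and \n match the tab and the newline themselves.
escapes = re.findall(r"\t|\n", "a\tb\n")

print(optional, grouped, dot, escapes)
```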
Within square brackets most operators lose their special meanings. The exceptions are "\" and "-". The "^" loses its beginning-of-line meaning but, as the first character inside the brackets, forms the complement of the set.
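The behaviour inside brackets can be observed in Python's re module, which follows the same conventions on these points but is only a stand-in for lex. The sample strings are invented for this sketch.

```python
import re

# "*", "+", "." and "?" are plain text characters inside brackets.
literal_ops = re.findall(r"[*+.?]", "a*b+c.d?")

# "-" between two characters forms a range, so the "-" in the input is skipped.
range_hyphen = re.findall(r"[a-c]", "a-d")

# "\" still escapes inside brackets: here "-" is taken literally.
literal_hyphen = re.findall(r"[a\-c]", "a-d")

# "^" as the first character complements the set.
complement = re.findall(r"[^a-c]", "abcd")

print(literal_ops, range_hyphen, literal_hyphen, complement)
```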
"\n" always matches newline, with or without the quotes. If you want to match the character "\" followed by "n", use \\n