gawk-Specific Regexp Operators
GNU software that deals with regular expressions provides a number of
additional regexp operators. These operators are described in this
section and are specific to gawk;
they are not available in other awk implementations.
Most of the additional operators deal with word matching.
For our purposes, a word is a sequence of one or more letters, digits,
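or underscores (_):

\w
     Matches any word-constituent character: any letter, digit, or underscore. Think of it as shorthand for [[:alnum:]_].

\W
     Matches any character that is not word-constituent. Think of it as shorthand for [^[:alnum:]_].

\<
     Matches the empty string at the beginning of a word. For example, /\<away/ matches away but not stowaway.

\>
     Matches the empty string at the end of a word. For example, /stow\>/ matches stow but not stowaway.

\y
     Matches the empty string at either the beginning or the end of a word (i.e., the word boundary). For example, \yballs?\y matches either ball or balls, as a separate word.

\B
     Matches the empty string that occurs between two word-constituent characters. For example, /\Brat\B/ matches crate but it does not match dirty rat. \B is essentially the opposite of \y.
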
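
As a quick illustration of the word operators, the following shell sketch runs gawk over a handful of made-up sample lines. It uses \y (word boundary), \B (not at a word boundary), and \< (start of a word). It assumes that gawk is installed and on the search path; the variable names are chosen only for this illustration.

```sh
# Sample input, one item per line (made up for this illustration).
words='ball
balls
balloon
crate
dirty rat
stowaway
away'

# \y matches at a word boundary, so "balloon" is rejected.
boundary=$(printf '%s\n' "$words" | gawk '/\yballs?\y/')

# \B needs a word-constituent character on each side of "rat".
inside=$(printf '%s\n' "$words" | gawk '/\Brat\B/')

# \< matches only where a word starts, so "stowaway" is rejected.
start=$(printf '%s\n' "$words" | gawk '/\<away/')

printf '%s\n' "$boundary" "$inside" "$start"
```

Under these assumptions the commands print ball, balls, crate, and away, one per line.
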
There are two other operators that work on buffers. In Emacs, a
buffer is, naturally, an Emacs buffer. For other programs,
gawk's regexp library routines consider the entire
string to match as the buffer.
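The operators are:

\`
     Matches the empty string at the beginning of a buffer (string).

\'
     Matches the empty string at the end of a buffer (string).

Because ^ and $ always work in terms of the beginning
and end of strings, these operators don't add any new capabilities
for awk. They are provided for compatibility with other
GNU software.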
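
The equivalence between the buffer operators (\` for the beginning of the string, \' for the end) and the ordinary ^ and $ anchors can be checked with a small sketch like the one below. It assumes that gawk and mktemp are available; the test strings and variable names are invented for the illustration. The program is written to a temporary file so that the backquote and the single quote do not need shell quoting.

```sh
# Write the awk program to a temporary file; the quoted here-document
# keeps the backquote and single quote literal.
prog=$(mktemp)
cat > "$prog" <<'EOF'
BEGIN {
    # The last test string has a newline embedded in it.
    n = split("abc|xabc|abcx|x\nabc", s, "|")
    for (i = 1; i <= n; i++) {
        anchors = (s[i] ~ /^abc$/)       # ordinary anchors
        buffers = (s[i] ~ /\`abc\'/)     # GNU buffer operators
        print i, anchors, buffers
    }
}
EOF
result=$(gawk -f "$prog")
rm -f "$prog"
printf '%s\n' "$result"
```

Each output line shows the index of the test string followed by the two match results. Both columns agree for every string, including the one with an embedded newline.
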
In other GNU software, the word-boundary operator is \b. However,
that conflicts with the awk language's definition of \b
as backspace, so gawk uses a different letter.
An alternative method would have been to require two backslashes in the
GNU operators, but this was deemed too confusing. The current
method of using
\y for the GNU
\b appears to be the
lesser of two evils.
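
The difference between the two letters can be seen in a short sketch, again assuming that gawk is installed; the variable names are only illustrative. In a gawk regexp, \b stands for a backspace character, so it finds nothing in ordinary text, whereas \y acts as the word-boundary operator.

```sh
# \b is a backspace here, so the plain word "cat" does not match.
as_backspace=$(printf 'cat\n' | gawk '/\bcat/')

# \y is the word-boundary operator, so "cat" matches as a whole word.
as_boundary=$(printf 'cat\n' | gawk '/\ycat\y/')

# A line that really starts with a backspace character does match \b.
literal=$(printf '\bcat\n' | gawk '/^\bcat$/ { print "matched" }')

printf 'backspace: [%s]\nboundary: [%s]\nliteral: [%s]\n' \
    "$as_backspace" "$as_boundary" "$literal"
```
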
The various command-line options
(see Command-Line Options) control how
gawk interprets characters in regexps:

No options
     In the default case, gawk provides all the facilities of POSIX regexps and the previously described GNU regexp operators (see Regular Expression Operators). However, interval expressions are not supported.

--posix
     Only POSIX regexps are supported; the GNU operators are not special (e.g., \w matches a literal w). Interval expressions are allowed.

--traditional
     Traditional Unix awk regexps are matched. The GNU operators are not special, interval expressions are not available, nor are the POSIX character classes ([[:alnum:]], etc.). Characters described by octal and hexadecimal escape sequences are treated literally, even if they represent regexp metacharacters.

--re-interval
     Allow interval expressions in regexps, even if --traditional has been provided.
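
One way to compare these modes is to run the same regexp under each option, as in the rough sketch below. It assumes that gawk is installed and accepts the --posix and --traditional options; the two-line input and the variable names are made up. Any diagnostics gawk may write to standard error in the restricted modes are discarded.

```sh
# Two one-character lines: a literal "w" and some other letter.
input='w
x'

# Default mode: \w is the GNU operator, matching any word-constituent character.
default_mode=$(printf '%s\n' "$input" | gawk '/^\w$/')

# POSIX and traditional modes: the GNU operators are not special.
posix_mode=$(printf '%s\n' "$input" | gawk --posix '/^\w$/' 2>/dev/null)
traditional_mode=$(printf '%s\n' "$input" | gawk --traditional '/^\w$/' 2>/dev/null)

printf 'default:\n%s\nposix:\n%s\ntraditional:\n%s\n' \
    "$default_mode" "$posix_mode" "$traditional_mode"
```

In the default run both lines match. Going by the option descriptions, the other two runs should print only the line containing w, though the exact behavior may differ between gawk releases.
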