正则表达式参考手册

摘要:正则表达式参考手册
Abstract: Regular Expression Reference

Character classes

Exp. Match
. Matches any character except line breaks. Equivalent to [^\n\r].
\w Matches any word character (alphanumeric & underscore). Only matches low-ascii characters (no accented or non-roman characters). Equivalent to [A-Za-z0-9_]
\W Matches any character that is not a word character (alphanumeric & underscore). Equivalent to [^A-Za-z0-9_]
\d Matches any digit character (0-9). Equivalent to [0-9].
\D Matches any character that is not a digit character (0-9). Equivalent to [^0-9].
\s Matches any whitespace character (spaces, tabs, line breaks).
\S Matches any character that is not a whitespace character (spaces, tabs, line breaks).
[ABC] Match any character in the set.
[^ABC] Match any character that is not in the set.
a-z Matches a character having a character code between the two specified characters inclusive.

Anchors

Exp. Match
^ Matches the beginning of the string, or the beginning of a line if the multiline flag (m) is enabled. This matches a position, not a character.
$ Matches the end of the string, or the end of a line if the multiline flag (m) is enabled. This matches a position, not a character.
\b Matches a word boundary position such as whitespace, punctuation, or the start/end of the string. This matches a position, not a character.
\B Matches any position that is not a word boundary. This matches a position, not a character.

Escape charactors

Exp. Match
\. \* \\ \+ \* \? \{ \( \[ escaped special characters
\t \v \n \r \f tab, vertical tab, linefeed, carriage return, FORM FEED character (char code 12).
\u00A9 unicode escaped ©
\000 Octal escaped character in the form \000(null). Value must be less than 255 (\377).
\xFF Hexadecimal escaped character in the form \xFF.

Groups and Lookaround

Quantifiers indicate that the preceding token must be matched a certain number of times. By default, quantifiers are greedy, and will match as many characters as possible.
Alternation acts like a boolean OR, matching one sequence or another.

Exp. Match
(abc) capture group
\1 Back reference. Matches the results of a previous capture group. For example \1 matches the results of the first capture group & \3 matches the third.
(?:abc) Non-capturing group. Groups multiple tokens together without creating a capture group.
(?=abc) Positive lookahead. Matches a group after the main expression without including it in the result.
(?!abc) Negative lookahead. Specifies a group that can not match after the main expression (if it matches, the result is discarded).
(?<=ABC) Positive lookbehind. Matches a group before the main expression without including it in the result.
(?<!ABC) Negative lookbhind. Specifies a group that can not match before the main expression (if it matches, the result is discarded).

Quantifiers and Alternation

Exp. Match
+ Matches 1 or more of the preceding token.
* Matches 0 or more of the preceding token.
{1,3} Quantifier.Matches the specified quantity of the previous token. {1,3} will match 1 to 3. {3} will match exactly 3. {3,} will match 3 or more.
? Optional. Matches 0 or 1 of the preceding token, effectively making it optional.
? Makes the preceding quantifier lazy, causing it to match as few characters as possible. By default, quantifiers are greedy, and will match as many characters as possible.

Note:

1
| Acts like a boolean OR. Matches the expression before or after the |.It can operate within a group, or on a whole expression. The patterns will be tested in order.

Reference