Backreferences \1
A backreference matches the same text that a previously captured subexpression matched. Subexpressions are marked with parentheses and numbered from 1 to 9, from left to right. Use \1 to refer to the first (leftmost) subexpression, \2 to refer to the second subexpression, and so on.
| (\d+) \1 | Find two equal numbers with a space between them (e.g., 123 123) |
| \<(\w+) \1\> | Find repeated words (e.g., the the) |
| <a href=("|')http://[^'"]+\1> | Find an external link (in single or double quotes) |
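The three patterns above can be tried out with a minimal sketch in Python. It assumes that Python's re module treats \1 the same way as the engine described on this page. Python's re has no \< and \> word-boundary markers, so \b is substituted for both; the helper first_match and the sample strings are hypothetical and serve only as illustration.

```python
import re

# \1 must repeat exactly the text that group 1 captured.
SAME_NUMBERS = re.compile(r"(\d+) \1")
# Python's re has no \< and \>, so \b (word boundary) stands in for both.
REPEATED_WORD = re.compile(r"\b(\w+) \1\b")
# \1 forces the closing quote to be the same character as the opening one.
EXTERNAL_LINK = re.compile(r"""<a href=("|')http://[^'"]+\1>""")


def first_match(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_match(SAME_NUMBERS, "123 123"))                       # 123 123
print(first_match(SAME_NUMBERS, "123 456"))                       # None
print(first_match(REPEATED_WORD, "Paris in the the spring"))      # the the
print(first_match(EXTERNAL_LINK, '<a href="http://example.com/">'))
print(first_match(EXTERNAL_LINK, "<a href=\"http://example.com/'>"))  # None
```

The last call returns None because the link opens with a double quote and closes with a single quote, so \1 cannot match the closing quote.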
Non-capturing group
Sometimes you need parentheses only to group a part of your regex, not to capture it. In this case, you can use a non-capturing group: type (?: ) instead of the usual parentheses.
| (http|ftp)://[^" >]+ | Find an http:// or ftp:// URL |
| (?:http|ftp)://[^" >]+ | The same, without capturing “http or ftp” |
If you never refer to the “http or ftp” subexpression, then the second example (with the non-capturing group) is more appropriate. With a non-capturing group, you cannot refer to the subexpression by using \1, because the text it matched is not saved.
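The difference between the two forms can be observed with a small sketch in Python. It assumes Python's re module as a stand-in for the engine described here; the sample text and variable names are hypothetical. Both patterns match the same text, but only the first one saves a subexpression. In Python, referring to \1 when no group was captured is rejected when the pattern is compiled; other engines may report this differently.

```python
import re

capturing = re.compile(r'(http|ftp)://[^" >]+')
non_capturing = re.compile(r'(?:http|ftp)://[^" >]+')

text = "See ftp://example.org/file.txt for details"

# Both patterns find the same URL.
url_a = capturing.search(text).group(0)
url_b = non_capturing.search(text).group(0)
print(url_a == url_b)             # True

# Only the capturing version saves a numbered subexpression.
print(capturing.groups)           # 1
print(non_capturing.groups)       # 0
print(capturing.search(text).group(1))   # ftp

# A backreference to a group that was never captured is an error in Python.
try:
    re.compile(r"(?:http|ftp)://\1")
    backreference_allowed = True
except re.error:
    backreference_allowed = False
print(backreference_allowed)      # False
```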
Capturing and non-capturing groups can be mixed in a regex:
| Search for | Replace to | Explanation |
| (?:http|ftp)://([^" >]+) | \1 | Find a URL and remove the protocol from it |
Backreferences refer only to the usual, capturing parentheses; non-capturing groups are skipped in the numbering. So, \1 refers to ([^" >]+), not to (?:http|ftp).
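The search-and-replace pair above can be reproduced as a rough sketch with Python's re.sub, assuming that \1 in the replacement behaves as it does in the replacement field described here. The function name strip_protocol and the sample text are hypothetical.

```python
import re


def strip_protocol(text):
    """Replace each http:// or ftp:// URL with the part after the protocol."""
    # (?:http|ftp) is not numbered, so \1 is the ([^" >]+) group.
    return re.sub(r'(?:http|ftp)://([^" >]+)', r"\1", text)


print(strip_protocol("Visit http://example.com/page or ftp://example.org/file"))
# Visit example.com/page or example.org/file
```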
Non-capturing groups are useful in complex regular expressions: because the number of captured subexpressions is limited to 9 (from \1 to \9), you may want to capture only what is needed.
| (?:abc)+ | Find “abc” repeating one or more times |
In this example, it makes no sense to capture the subexpression, because it's always equal to “abc”. So, the non-capturing group is a better choice here.
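As an illustration, the following sketch in Python (an assumption; the page itself does not name a programming language) compares the two forms on a hypothetical sample string. The capturing version saves only a constant “abc”, and in Python it also changes what re.findall returns, which is one more reason to prefer the non-capturing group here.

```python
import re

text = "xabcabcabcx abc"

# The whole run of repetitions is matched either way.
whole = re.search(r"(?:abc)+", text).group(0)
print(whole)                      # abcabcabc

# With a capturing group, group 1 holds only the last repetition: always "abc".
last = re.search(r"(abc)+", text).group(1)
print(last)                       # abc

# re.findall returns group contents when the pattern has a capturing group.
runs_non_capturing = re.findall(r"(?:abc)+", text)
runs_capturing = re.findall(r"(abc)+", text)
print(runs_non_capturing)         # ['abcabcabc', 'abc']
print(runs_capturing)             # ['abc', 'abc']
```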
This is a page from the Aba Search and Replace help file.