Atomic groups (?>)
Using atomic groups, you can optimize a slow regex that performs excessive backtracking.
(.*?,){20}slithy | A slow regex
(?>.*?,){20}slithy | A faster regex
(?:(?=(.*?,))\1){20}slithy | A faster regex using a lookahead
The first regular expression matches a comma-separated list of 21 values, the last of which is slithy. If the last item is not slithy, the regex engine backtracks to the second-to-last item and tries to include the comma in it. That attempt fails too. All possible combinations of items are tried in this way, which takes time proportional to 2^20 ≈ 1 million combinations.
This work is pointless: slithy is not there, no matter how many combinations you try. Besides, a list with fewer than 21 values must be skipped, and only “slithy” in the 21st position should be matched, so a comma must never be included in the text of an item.
A lookahead can be used to optimize the slow regular expression, but the syntax is cumbersome. Atomic groups are therefore recommended: put a question mark and a “greater than” sign after the opening parenthesis, (?> ). The repetitions inside these parentheses will not backtrack. When the second regular expression cannot find slithy, it fails immediately without trying 2^N combinations.
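The difference between the three expressions can be checked with a short script. The sketch below uses Python's re module rather than Aba itself, so it only illustrates the general behaviour of backtracking engines. The repeat count N = 3, the sample lines, and the names slow, atomic, emulated, and matches are assumptions made for the illustration. Python accepts the (?> ) syntax only from version 3.11 on, so the script falls back to the lookahead form on older interpreters.

```python
import re
import sys

N = 3  # small repeat count for readability; the text above uses 20

# Backtracking version: an item may grow until it swallows a comma.
slow = re.compile(r"(.*?,){%d}slithy" % N)

# Lookahead emulation of an atomic group; works in any Python 3 release.
emulated = re.compile(r"(?:(?=(.*?,))\1){%d}slithy" % N)

# Python's re module accepts (?>...) only from version 3.11 on,
# so fall back to the emulation on older interpreters.
if sys.version_info >= (3, 11):
    atomic = re.compile(r"(?>.*?,){%d}slithy" % N)
else:
    atomic = emulated

samples = [
    "a,b,c,slithy,d",  # slithy is item N+1
    "a,b,c,d,slithy",  # slithy is item N+2
    "a,b,slithy",      # fewer than N commas before slithy
]


def matches(pattern, line):
    """Return True if the pattern matches at the start of the line."""
    return pattern.match(line) is not None


# For each line: (slow, atomic, emulated)
results = {
    line: (matches(slow, line), matches(atomic, line), matches(emulated, line))
    for line in samples
}

for line, flags in results.items():
    print(line, flags)
```

Only the backtracking version accepts the second line, because it lets the third item expand to "c,d". The atomic and lookahead versions reject that line, which is the behaviour the text asks for. The script anchors each attempt at the start of the line with match(); an unanchored search would also retry from later positions in the line.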
This is a page from Aba Search and Replace help file.