
Atomic groups (?>)

Using atomic groups, you can optimize a slow regex that performs excessive backtracking.

(.*?,){20}slithy               A slow regex
(?>.*?,){20}slithy             A faster regex
(?:(?=(.*?,))\1){20}slithy     A faster regex using lookahead

The first regular expression matches a comma-separated list of 21 values, the last of which is slithy. If the last item is not slithy, the regex engine backtracks to the previous item and tries to extend it past its comma. That attempt fails too. All possible combinations of items are tried in this way, which takes time proportional to 2^20 ≈ 1 million combinations.

This work is pointless, because slithy is not there, no matter how many combinations you try. Besides, a list with fewer than 21 values must be skipped; only “slithy” in the 21st position should be matched, so a comma must not be included in an item's text.
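As a small illustration of an item swallowing a comma, the following Python sketch runs the plain (non-atomic) pattern on a short list. The repetition count is reduced from 20 to 3 to keep the example readable, and the leading ^ anchor and the sample string are assumptions added for the demonstration; they are not part of the original regex.

```python
import re

# Reduced from {20} to {3} so the behaviour is easy to follow.
# The "^" anchor is an assumption added here to pin the match
# to the start of the list.
plain = re.compile(r"^(.*?,){3}slithy")

# Five values: slithy sits in position 5, not position 4.
m = plain.search("a,b,c,d,slithy")

# After backtracking, the third repetition has grown past a comma,
# so the last captured "item" contains two values.
last_item = m.group(1) if m else None
print(last_item)  # c,d,
```

The match succeeds only because the third repetition was allowed to stretch over "c,d,", which is exactly the kind of retry that an atomic group rules out.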

Lookahead can be used to optimize the slow regular expression, but the syntax is verbose. It is therefore recommended to use atomic groups: put a question mark and a “greater than” sign after the opening parenthesis: (?> ). The repetitions inside these parentheses will not backtrack. When the second regular expression cannot find slithy, it fails immediately without trying 2^N combinations.
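The effect of disabling backtracking can be checked with a short Python sketch. It is not taken from the Aba help file: the count is reduced from 20 to 3, the ^ anchor and the sample strings are assumptions, and first_match is a hypothetical helper. Python's re module accepts the (?> ) syntax only from version 3.11, so that variant is guarded by a version check; the lookahead form works in older versions as well.

```python
import re
import sys

def first_match(pattern, text):
    """Return the matched text, or None if the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Reduced from {20} to {3}; "^" is an added assumption for the demo.
LOOKAHEAD = r"^(?:(?=(.*?,))\1){3}slithy"
ATOMIC = r"^(?>.*?,){3}slithy"  # requires Python 3.11 or later

text_ok = "a,b,c,slithy"         # slithy is the 4th value
text_shifted = "a,b,c,d,slithy"  # slithy is the 5th value

results = {
    "lookahead": (first_match(LOOKAHEAD, text_ok),
                  first_match(LOOKAHEAD, text_shifted)),
}

# Only try the native atomic group where the re module supports it.
if sys.version_info >= (3, 11):
    results["atomic"] = (first_match(ATOMIC, text_ok),
                         first_match(ATOMIC, text_shifted))

for name, pair in results.items():
    print(name, pair)
```

Both variants accept the list where slithy is in the expected position and reject the shifted list at once, because no item is allowed to grow past its own comma.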

This is a page from the Aba Search and Replace help file.