More concise/efficient regex to match a string within matching parentheses
I want to match strings in which foo appears within select([...]), provided that any parentheses occurring there are properly matched. For example, it should match select(((foo))) or select(x(())(foo(x))()x((y)x)x()), but not select((foo) or select(x(foo)y().
I know that I have to limit the maximum number of nested parentheses, and I came up with the following regular expression to solve the problem for 1 additional pair of parentheses:
select\((?:
    (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
    foo
    (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
|
    (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
    \([^()]*foo[^()]*\)
    (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
)\)
That means: within select([...]), either match foo with zero or one pair of parentheses in front of or behind it, or match foo within one pair of parentheses, again with zero or one pair of parentheses in front of or behind it.
Does anyone have a neater solution for this?
Expanding the regex to solve the problem for 2 additional pairs of parentheses gives this:
select\((?:
    (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)*
    foo
    (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)*
|
    (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)*
    \((?:
        (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
        foo
        (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
    |
        (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
        \([^()]*foo[^()]*\)
        (?:[^()]*|[^()]*\([^()]*\)[^()]*)*
    )\)
    (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)*
)\)
Here the doubly indented part is the previous regex, and the "zero or one pair of parentheses" parts have been expanded to "zero, one or two pairs of parentheses".
I put the last regex on regex101: https://www.regex101.com/r/fj6cr4/1
The problem is that this regex (and even more so any further expanded version) is quite time-consuming, so I'm hoping for better ideas.
There are 2 things that should simplify (and speed up) the regex:
1. (?: [^()]* | [^()]*\([^()]*\)[^()]* )* is an example of catastrophic backtracking. The outer, repeated group should only have 2 alternatives: a sequence of non-parenthesised characters, or such a sequence between parentheses:
   (?: [^()]+ | \([^()]*\) )*
   You are mixing the non-parenthesised characters [^()]* into both alternatives.
2. Instead of doing …foo…|…\(foo\)…, you should do …(?:foo|\(foo\))… so that you don't have to repeat the lengthy … part.
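As a rough illustration of the first point, the Python sketch below checks that both forms of the repeated group accept the same short strings. The second form simply has fewer overlapping ways to do so. The sample list and all names are made up for this example, and Python's re.VERBOSE stands in for the free-spacing x flag.

```python
import re

# Repeated group from the question: [^()]* appears in both alternatives.
ORIGINAL_GROUP = re.compile(
    r"(?: [^()]* | [^()]*\([^()]*\)[^()]* )*", re.VERBOSE)

# Repeated group suggested above: plain characters OR one parenthesised run.
SIMPLER_GROUP = re.compile(
    r"(?: [^()]+ | \([^()]*\) )*", re.VERBOSE)

# Hypothetical sample inputs, kept short on purpose.
SAMPLES = ["", "abc", "a(b)c", "(a)(b)", "a(b", "a)b", "((a))"]


def accepts(pattern, text):
    """Return True if the whole text is made of repetitions of the group."""
    return pattern.fullmatch(text) is not None


for text in SAMPLES:
    print(repr(text), accepts(ORIGINAL_GROUP, text), accepts(SIMPLER_GROUP, text))
```

This only compares what the two groups accept on a handful of inputs. It says nothing about timing, which is best checked against your own data (for example with the step counter on regex101).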
With these two changes, the smaller expression becomes
select\(
    (?: [^()]+ | \([^()]*\) )*
    (?: foo | \([^()]*foo[^()]*\) )
    (?: [^()]+ | \([^()]*\) )*
\)
I'll leave applying these changes to the larger expression to you.
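One possible way to apply the same two ideas to deeper nesting is to generate the pattern instead of writing it by hand. The following Python sketch does that. The helper names (filler, target, select_pattern) are invented for this example and are not part of the original answer, and the generated patterns have not been benchmarked.

```python
import re

NO_PAREN = r"[^()]"  # any single character that is not a parenthesis


def filler(depth):
    """Pattern for balanced text nested at most `depth` levels deep."""
    if depth == 0:
        return NO_PAREN + "*"
    # Either a run of plain characters or one parenthesised, shallower filler.
    return rf"(?:{NO_PAREN}+|\({filler(depth - 1)}\))*"


def target(word, depth):
    """Pattern for `word`, bare or wrapped in up to `depth` balanced pairs."""
    if depth == 0:
        return re.escape(word)
    inner = filler(depth - 1)
    return rf"(?:{re.escape(word)}|\({inner}{target(word, depth - 1)}{inner}\))"


def select_pattern(word, depth):
    """Compile select( ... word ... ) allowing `depth` extra nesting levels."""
    around = filler(depth)
    return re.compile(rf"select\({around}{target(word, depth)}{around}\)")


two_levels = select_pattern("foo", 2)
for text in ["select(((foo)))",
             "select(x(())(foo(x))()x((y)x)x())",
             "select((foo)",
             "select(x(foo)y()"]:
    print(text, bool(two_levels.search(text)))
```

With depth 1 the generated pattern is the smaller expression above with the whitespace removed. With depth 2 it accepts the two positive examples from the question and rejects the two negative ones. The repeated groups are still nested quantifiers, so long non-matching inputs may still backtrack heavily.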