More concise/efficient regex to match a string within matching parentheses


I want to match strings in which foo appears within select([...]), but only if any parentheses that occur inside are correctly paired. E.g. it should match select(((foo))) or select(x(())(foo(x))()x((y)x)x()), but not select((foo) or select(x(foo)y().

I know I have to limit the maximum number of nested parentheses, and I came up with the following regular expression to solve the problem for 1 additional pair of parentheses:

select\((?:     (?:[^()]*|[^()]*\([^()]*\)[^()]*)*     foo     (?:[^()]*|[^()]*\([^()]*\)[^()]*)*     |     (?:[^()]*|[^()]*\([^()]*\)[^()]*)*     \([^()]*foo[^()]*\)     (?:[^()]*|[^()]*\([^()]*\)[^()]*)* )\) 

That means: within select([...]), either match foo with no or 1 pair of parentheses in front of or behind it, or match foo within 1 pair of parentheses, again with no or 1 pair of parentheses in front of or behind it.

Does anyone have a neater solution for this?

Expanding the regex to solve the problem for 2 additional pairs of parentheses looks like this:

select\((?:     (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)*     foo     (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)*     |     (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)*     \((?:         (?:[^()]*|[^()]*\([^()]*\)[^()]*)*         foo         (?:[^()]*|[^()]*\([^()]*\)[^()]*)*         |         (?:[^()]*|[^()]*\([^()]*\)[^()]*)*         \([^()]*foo[^()]*\)         (?:[^()]*|[^()]*\([^()]*\)[^()]*)*     )\)     (?:[^()]*|[^()]*\((?:[^()]*|[^()]*\([^()]*\)[^()]*)*\)[^()]*)* )\) 

Here the indented part is the previous regex, and the "no or 1 pair of parentheses" parts have been expanded to "no, 1 or 2 pairs of parentheses".

I put the last regex on regex101: https://www.regex101.com/r/fj6cr4/1

The problem is that this regex (and, even more so, further expanded versions) is quite time-consuming, so I'm hoping for better ideas.

There are 2 things that should simplify (and speed up) the regex:

  • (?: [^()]* | [^()]*\([^()]*\)[^()]* )* is an example of catastrophic backtracking. The outer, repeated group should have just 2 alternatives: a run of non-parenthesis characters, or such a run enclosed in parentheses:

    (?: [^()]+ | \([^()]*\) )* 

    You are mixing the non-parenthesis characters [^()]* into both alternatives, so the same text can be matched in many different ways.

  • Instead of doing …foo…|…\(foo\)…, you should do …(?:foo|\(foo\))… so that you don't have to repeat the lengthy part.

With these two changes, the smaller expression becomes

select\( (?: [^()]+ | \([^()]*\) )* (?: foo | \([^()]*foo[^()]*\) ) (?: [^()]+ | \([^()]*\) )* \) 
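To try the simplified expression quickly, here is a minimal Python sketch. It assumes the spaces in the pattern are only there for readability, so it compiles the pattern with re.VERBOSE. The names ONE_LEVEL and matches are made up for this example and do not come from the question. Note that this version only handles 1 additional pair of parentheses, so select(((foo))) is not expected to match yet.

```python
import re

# The simplified one-level expression, compiled in verbose mode so that
# the spaces used for readability are ignored by the regex engine.
ONE_LEVEL = re.compile(r"""
    select\(
    (?: [^()]+ | \([^()]*\) )*          # text or one flat (...) group before foo
    (?: foo | \([^()]*foo[^()]*\) )     # foo itself, or foo inside one (...) pair
    (?: [^()]+ | \([^()]*\) )*          # text or one flat (...) group after foo
    \)
""", re.VERBOSE)


def matches(text: str) -> bool:
    """Return True if the whole string is accepted by ONE_LEVEL."""
    return ONE_LEVEL.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "select(foo)",
        "select((foo))",
        "select(x()(foo)y())",
        "select((foo)",
        "select(x(foo)y()",
        "select(((foo)))",   # needs 2 additional pairs, so not covered here
    ]
    for sample in samples:
        print(sample, matches(sample))
```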

I'll leave applying these to the larger expression to you.
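One way to apply both points to deeper nesting without writing the expression out by hand is to generate it. The following Python sketch is only an illustration under that assumption: the helper names balanced, target and select_pattern are hypothetical, and the depth argument is the maximum number of additional parenthesis pairs. For depth 1 it builds the same structure as the smaller expression (without the readability spaces). It says nothing about speed on long non-matching inputs, which would need to be measured separately.

```python
import re


def balanced(depth: int) -> str:
    """Pattern for text whose parentheses are paired and nested at most `depth` deep."""
    if depth == 0:
        return r"[^()]*"
    # Two alternatives only: a run of non-parenthesis characters,
    # or a parenthesised group that is balanced one level shallower.
    return r"(?:[^()]+|\(" + balanced(depth - 1) + r"\))*"


def target(depth: int, word: str = "foo") -> str:
    """Pattern for `word`, optionally wrapped in up to `depth` pairs of parentheses."""
    if depth == 0:
        return re.escape(word)
    inner = balanced(depth - 1)
    # (?:foo|\(...foo...\)) so the surrounding parts are not repeated.
    return (r"(?:" + re.escape(word)
            + r"|\(" + inner + target(depth - 1, word) + inner + r"\))")


def select_pattern(depth: int, word: str = "foo") -> re.Pattern:
    """Compile select\\( ... word ... \\) allowing `depth` additional pairs."""
    outer = balanced(depth)
    return re.compile(r"select\(" + outer + target(depth, word) + outer + r"\)")


if __name__ == "__main__":
    two_levels = select_pattern(2)
    for sample in [
        "select(((foo)))",
        "select(x(())(foo(x))()x((y)x)x())",
        "select((foo)",
        "select(x(foo)y()",
    ]:
        print(sample, two_levels.fullmatch(sample) is not None)
```

Each extra level wraps the previous pattern once more, so the generated expression still grows quickly with depth; the sketch only removes the manual repetition.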

