MATLAB Answers

negative with positive lookbehind regex issue

Sebastian on 17 Jul 2016
Commented: Sebastian on 18 Jul 2016
I'm trying to find the locations of open parentheses, '(', that are preceded by a number, but not when that number is itself preceded by the character 'O' or 'S'.
Example:
str = '12()+F34()+O56()';
should return ind = [3,9], i.e. the open parentheses following 12 and 34, but not the one following O56.
I tried this:
ind = regexp(str,'(?<=((?<![OS])[0-9]+))[\(]');
but it returns all of them (ind = [3,9,15]). It does, however, exclude the cases where only a single digit follows 'O' or 'S' (e.g. str = '12()+F34()+O5()'; gives ind = [3,9]).
Does anyone know the proper regular expression for this?
MATLAB version: 7.13.0.564 (R2011b)

  1 Comment

Stephen Cobeldick on 18 Jul 2016
@Sebastian: You might like to try my FEX submission makeregexp, which lets you interactively develop regular expressions and see regexp's outputs change as you type.


Accepted Answer

Stephen Cobeldick on 18 Jul 2016
Edited: Stephen Cobeldick on 18 Jul 2016
Try this regular expression: (?<=\d+)(?<![OS]\d+)\(
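It relies on the fact that lookaround operations do not consume any characters, so both lookbehinds are tested at the same position, immediately before the parenthesis. The first lookbehind requires one or more digits there; the second rejects the position if those digits are in turn preceded by the letter O or S. Here it is tested: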
>> str = '12()+F34()+O56()';
>> regexp(str,'(?<=\d+)(?<![OS]\d+)\(')
ans =
3 9
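As a rough illustration of why the original pattern still lets O56( through, the sketch below runs both patterns on the same string. The variable names indOld and indNew are made up for this example, and the values in the comments are the ones reported in this thread rather than new measurements.

```matlab
str = '12()+F34()+O56()';   % test string from the question

% Original attempt: the negative lookbehind is only checked at the start
% of whatever [0-9]+ matches. For 'O56(' the engine can take just the '6',
% which is preceded by '5' (not O or S), so the parenthesis still matches.
indOld = regexp(str, '(?<=((?<![OS])[0-9]+))[\(]');   % reported: 3 9 15

% Accepted pattern: two independent lookbehinds at the same position.
% (?<=\d+) requires digits before the parenthesis, and (?<![OS]\d+)
% fails as soon as ANY run of trailing digits reaches back to an O or S.
indNew = regexp(str, '(?<=\d+)(?<![OS]\d+)\(');       % reported: 3 9
```

With a single digit after the letter (e.g. 'O5('), the original pattern has no shorter digit run to fall back on, which would explain why that case was excluded correctly.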
To help develop this regular expression I used my FEX submission makeregexp, which lets you interactively develop regular expressions and see regexp's outputs change as you type.

  1 Comment

Sebastian on 18 Jul 2016
"lookaround operations do not consume any characters"
Of course, how did I forget. That explains everything. Thanks for the nice regex too.


More Answers (1)

Azzi Abdelmalek on 17 Jul 2016
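str = '12()+F34()+O56()';
ii1 = regexp(str,'(?<=[OS]\d+)(\()' )   % parentheses after digits that follow O or S
ii2 = regexp(str,'(?<=\d+)(\()' )       % all parentheses preceded by digits
out = setdiff(ii2,ii1)                  % keep those in ii2 that are not in ii1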

  1 Comment

Stephen Cobeldick on 18 Jul 2016
Sebastian's "Answer" moved here:
Thanks Azzi, I will consider that solution. Do you know what is wrong with my initial expression?
Right now I'm resorting to
ind = [regexp(strFunc,'(?<=[^OS0-9][0-9]+)[\(]'),regexp(strFunc,'^[0-9]+[\(]','end')]
which seems to get the job done, but is not exactly the prettiest.
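One detail of this workaround, sketched here on the test string from the question rather than on strFunc: the start-of-string match is appended last, so the concatenated indices are not necessarily in ascending order. The names indMid and indStart are hypothetical and only used for illustration.

```matlab
str = '12()+F34()+O56()';   % same test string as in the question

% Digits preceded by any character other than O, S or another digit.
indMid = regexp(str, '(?<=[^OS0-9][0-9]+)[\(]');

% Digits at the very start of the string; the 'end' option returns the
% index of the last matched character, i.e. the parenthesis itself.
indStart = regexp(str, '^[0-9]+[\(]', 'end');

% Plain concatenation would put the start-of-string hit last,
% so sort the combined result to get ascending positions.
ind = sort([indMid, indStart]);
```

If the order of the indices does not matter for the later processing, the sort call can simply be dropped.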

