Extract only text between quotes of a string

Question

0 votes

Folks, could use an assist. New to REGEXP but persistent. Desire only the literal text between quotes for this string:

imt = "e";

[subchunk] = regexp(textline,'\".*?\"','match','ignorecase');

imt = subchunk;

When I return the argument from my readtext file function, and print using disp(imt), I get this:

'"e"'

When I only want:

e

I assume the single quotes are associated with the disp() function?

Answer 1

Walter Roberson on 1 Mar 2012

Edited: John Kelly on 26 Feb 2015

1 vote

subchunk = regexp(textline, '(?<=")[^"]+(?=")', 'match');
imt = subchunk{1};

You could also consider just using a basic strfind() for '"', removing everything up to the first match and everything from the second match on.
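As a rough sketch of that strfind() idea (the sample value of textline is assumed from the question, and the fallback to an empty string when fewer than two quotes are found is my own choice, not something stated in the answer):

```matlab
% Sample line, assumed from the question
textline = 'imt = "e";';

% Positions of every double-quote character in the line
q = strfind(textline, '"');

if numel(q) >= 2
    % Keep only what lies strictly between the first two quotes
    imt = textline(q(1)+1 : q(2)-1);
else
    % Assumption: treat a line without a quoted part as empty
    imt = '';
end

disp(imt)   % prints: e
```

This avoids regular expressions entirely, at the cost of handling only the first quoted stretch on the line.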

8 Comments
Walter Roberson on 2 Mar 2012

The {1} is to account for your "I assume the single quotes are associated with the disp() function?"

The ignorecase option is not needed because you have no case-sensitive characters in the pattern that you are matching against.

[^"]+ means to look for one or more characters that are not double-quotes (extending as far as possible). That is the first operation logically executed. Then, the (?<=") before that says that the potential match just located is not to be considered (and another search is to be done) unless there was a double-quote immediately before the stretch of non-double-quotes. Then the (?=") after says that the potential match just located is not to be considered (and another search is to be done) unless there is a double-quote immediately after the stretch of non-double-quotes. The look-behind and look-ahead at the double-quotes do not extend the match at all: the match will not include the double-quotes.

Another way of thinking about the pattern is, "Look for a double-quote, but do not include it in the match. Then look for the longest stretch of non-double-quote characters after that, with at least one character, and those non-double-quotes are to be included in the match. Then look right after the potential match and ensure there is a double-quote after it, but do not include that double-quote in the match."

You can _almost_ simplify the expression to '(?<=")[^"]+' but the difference between that and what I wrote is that what I wrote must have a trailing double-quote whereas the shorter version would be allowed to end at the end of the string even if no double-quote had been found.
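To make that difference concrete, here is a small sketch comparing the two patterns. The input strings are made up for illustration; only the patterns come from the discussion above.

```matlab
fullPat  = '(?<=")[^"]+(?=")';   % requires a closing double-quote
shortPat = '(?<=")[^"]+';        % does not require a closing double-quote

% A line whose quoted part is properly closed
closed = 'imt = "e";';
a = regexp(closed, fullPat,  'match');   % {'e'}
b = regexp(closed, shortPat, 'match');   % {'e', ';'}  (also picks up the tail after the 2nd quote)

% A line whose quoted part is never closed
unclosed = 'name = "abc';
c = regexp(unclosed, fullPat,  'match'); % empty cell: no trailing double-quote
d = regexp(unclosed, shortPat, 'match'); % {'abc'}: runs to the end of the string
```

Note that the shorter pattern also returns the ';' after the second quote on a well-formed line, because that character is likewise preceded by a double-quote.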

Walter Roberson on 2 Mar 2012

(?<=") is always "look behind" (from where you are), so

(?<=")[^"]+(?<=") would mean to look behind for a double-quote, match a bunch of non-double-quote stuff, and then look behind from between the last non-double quote and the next character (or end of string) to see if the previous character was a double-quote. Which it could not be because it was only non-double-quotes in that pattern.

Look-behind from where you are, look ahead from where you are, different operators.
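A quick way to check this, assuming the sample line from the question, is to run both variants side by side:

```matlab
textline = 'imt = "e";';   % sample line, assumed from the question

% Look-behind on both sides: the trailing (?<=") can never succeed,
% because the character just before it was matched by [^"]+
wrong = regexp(textline, '(?<=")[^"]+(?<=")', 'match');

% Look-behind before, look-ahead after: matches the text between the quotes
right = regexp(textline, '(?<=")[^"]+(?=")', 'match');

disp(isempty(wrong))   % prints: 1
disp(right{1})         % prints: e
```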

When you ask for 'match' (without the 'once' option), the output from regexp() is always a cell array, even when only one thing is being returned. You can return that cell array, and that might be appropriate in some cases, but be sure you do not try to switch() on the cell array itself: switch on the _content_ of the cell array. The

imt = subchunk{1};

strips away the cell array layer, leaving imt as a plain string (which _can_ be switch()'d on.)
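A short sketch of the cell-versus-string distinction, again assuming the sample line from the question. The case labels are placeholders, and the 'once' variant at the end is an alternative that was not part of the original answer:

```matlab
textline = 'imt = "e";';   % sample line, assumed from the question
pat = '(?<=")[^"]+(?=")';

subchunk = regexp(textline, pat, 'match');
disp(class(subchunk))      % prints: cell

imt = subchunk{1};         % unwrap the first match
disp(class(imt))           % prints: char

% switch works on the plain character vector
switch imt
    case 'e'
        label = 'matched e';          % placeholder label
    otherwise
        label = 'something else';     % placeholder label
end

% Alternative: 'once' returns the first match directly as a char vector
imtOnce = regexp(textline, pat, 'match', 'once');
```

With 'once', a line that has no match yields an empty character vector instead of an indexing error on subchunk{1}, which may be easier to guard against.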

Kirk woellert on 2 Mar 2012

Awesome. Thanks for your help.

Answer 2

Pierre Harouimi on 8 Mar 2022

0 votes

From R2020b, you can use pattern, which is much easier than the complicated regexp function.

For your problem, the extractBetween function seems to be the best fit:

subchunk = extractBetween("'e'", "'", "'")
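The line in the original question uses double quotes rather than single quotes, so a version closer to that case might look like the sketch below. It assumes the line is held as a string scalar (double-quoted string literals need R2017a or later), with the embedded double quotes written doubled:

```matlab
% Sample line from the question, as a string scalar.
% Inside a double-quoted literal, "" stands for one double-quote character.
textline = "imt = ""e"";";

% Text between the first double-quote and the next one
inner = extractBetween(textline, '"', '"');

disp(inner)   % prints: e
```

Because the input is a string, the result is a string too, so there is no cell array layer to strip before printing or comparing.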
