Field delimiter treats every character as a separate delimiter instead of using the full string

Hi all,
I've been having a problem with the function strread (I also tried textscan, but the problem persists).
Assume I have this line: str = 'Name=conserved hypothetical protein;description=null;locus=gene3654c'
I want to use 'locus=' as the field delimiter.
Then the last field would be gene3654c.
I've been trying the following (and some variations): id = strread(str, '%s', 'delimiter', 'locus=')
However, the output of id becomes: id =
'Name'
''
''
'n'
'erved hyp'
'theti'
'a'
'pr'
'tein;de'
''
'ripti'
'n'
'n'
''
''
';'
''
''
''
''
''
'gene3654'
It also cuts the trailing "c" off gene3654c, which I don't want.
I could work around this by changing the way I parse my string. However, one question remains:
How do I use a full string as the field delimiter? That is, with 'locus=' as the delimiter, it should return: id = 'Name=conserved hypothetical protein;description=null;' 'gene3654c'

Answers (2)

Jiro Doke on 9 Mar 2011
id = regexp(str, '(locus=)', 'split')
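For illustration, here is a minimal sketch of what that split returns for the string in the question. The variable name parts is only chosen for this example, and the capturing parentheses are left out because they are not needed for a plain split.

```matlab
% Sample string taken from the question.
str = 'Name=conserved hypothetical protein;description=null;locus=gene3654c';

% Split on the whole substring 'locus=' rather than on its characters.
parts = regexp(str, 'locus=', 'split');

% parts is a 1x2 cell array:
%   parts{1} is 'Name=conserved hypothetical protein;description=null;'
%   parts{2} is 'gene3654c'
lastField = parts{end};
```

The trailing "c" is preserved because the pattern only matches the complete sequence 'locus='.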

Matt Fig on 9 Mar 2011
Just so you understand what happened when you set the delimiter to 'locus=': MATLAB interprets the delimiter property as a list of individual characters, not as one string. Thus each character in 'locus=' ('l', 'o', 'c', 'u', 's' and '=') was used as a delimiter on its own, which is also why the final "c" of gene3654c was dropped. Jiro showed you how to use REGEXP to specify 'locus=' as a delimiter. You could also get just the string following 'locus=' by using REGEXP:
regexp(str,'(?<=locus\=).*','match')
I am not sure what you mean by, "Then, I would have as last field Rv3654c." Where does the Rv come from?
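To make the per-character behaviour concrete, the sketch below imitates it with a REGEXP character class and compares it with the lookbehind match. This is only an approximation for illustration, not how STRREAD is implemented: the character-class split keeps whitespace and a trailing empty field that STRREAD does not show. The variable names are chosen for this example.

```matlab
% Sample string taken from the question.
str = 'Name=conserved hypothetical protein;description=null;locus=gene3654c';

% A character class [locus=] splits at every single l, o, c, u, s or =,
% which mimics treating each delimiter character separately.
pieces = regexp(str, '[locus=]', 'split');
firstPiece = pieces{1};              % 'Name'
lastNonEmpty = pieces{end-1};        % 'gene3654' (the final 'c' was a delimiter)

% The lookbehind keeps everything after the literal text 'locus='.
% The 'once' option returns a char array instead of a cell array.
locus = regexp(str, '(?<=locus=).*', 'match', 'once');   % 'gene3654c'
```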
  2 Comments
Jiro Doke on 9 Mar 2011
Ah, good explanation about why it didn't work for strread.
I assumed "Rv" was a typo. At the end of the question, he has "gene3654c".
