How can I convert text to a DNA sequence?
hayder al gburi
on 29 Sep 2020
Commented: hayder al gburi
on 11 Nov 2020
How can I convert text to a DNA sequence?
Accepted Answer
Adam Danz
on 13 Oct 2020
Edited: Adam Danz
on 13 Oct 2020
Use regexprep to replace the matches in key(:,1) with the definitions in key(:,2). Since some of the characters to be replaced are special characters in regular expressions (e.g. $, ^), you must prefix them with an escape character, which is done programmatically below.
txt = 'HIdE2$^';
key = {
'H' 'AAGC'
'I' 'CCAA'
'd' 'ACGA'
'E' 'CCCA'
'2' 'CAGA'
'$' 'AAAA'
'^' 'GGGG'
};
% add escape char
needsEscape = ismember(key(:,1),{'$','^'}); % add more if needed
key(needsEscape,1) = strcat('\',key(needsEscape,1));
% Replace each character with its DNA sequence
DNAseq = regexprep(txt,key(:,1),key(:,2));
Result:
DNAseq =
'AAGCCCAAACGACCCACAGAAAAAGGGG'
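If the key grows, keeping the hard-coded list in needsEscape up to date can become error-prone. As a possible variation (a sketch, not part of the original answer), the escaping can be delegated to regexptranslate, which escapes any regular-expression special characters in its input. The variable name pattern is only illustrative.

```matlab
txt = 'HIdE2$^';
key = {
    'H' 'AAGC'
    'I' 'CCAA'
    'd' 'ACGA'
    'E' 'CCCA'
    '2' 'CAGA'
    '$' 'AAAA'
    '^' 'GGGG'
    };
% Escape every key so that none of its characters acts as a regex operator.
% Keys without special characters are returned unchanged.
pattern = cellfun(@(s) regexptranslate('escape', s), key(:,1), 'UniformOutput', false);
% Replace each character with its DNA sequence
DNAseq = regexprep(txt, pattern, key(:,2));
```

For this key, the result should match the output shown above.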
More Answers (1)
hayder al gburi
on 8 Nov 2020
2 Comments
Adam Danz
on 10 Nov 2020
It sounds like step #2 can be solved by using the method described in my answer.
I think step #3 can also be solved by using the same method after you rearrange the table into columns and use a lookup-table (VLOOKUP-style) approach.
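The steps and the table being discussed are not shown here, so the following is only a rough sketch of what a lookup-table approach could look like, reusing the key from the accepted answer. It assumes every entry in the first column is a single character. Each character of the text is looked up once with ismember, so no escaping is needed, and the lookup never sees the letters produced by earlier replacements (regexprep applies a list of expressions one after another, which matters if a key such as 'A' is ever added).

```matlab
txt = 'HIdE2$^';
key = {
    'H' 'AAGC'
    'I' 'CCAA'
    'd' 'ACGA'
    'E' 'CCCA'
    '2' 'CAGA'
    '$' 'AAAA'
    '^' 'GGGG'
    };
% Split the text into a cell array of single characters.
chars = num2cell(txt);
% For each character, find its row in the first column of the key.
[found, row] = ismember(chars, key(:,1));
assert(all(found), 'Some characters have no entry in the key.');
% Pick the matching codes from the second column and join them.
DNAseq = [key{row,2}];
```

The assert stops with an error if the text contains a character that is missing from the key, rather than silently leaving it untranslated.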