Parsing information between two tags in MATLAB
I am trying to parse node and edge information from an XML file. Here is part of the XML file:
<node id="1677" label="O60711">
<att name="shared name" value="O60711" type="string"/>
<att name="name" value="O60711" type="string"/>
<att name="selected" value="0" type="boolean"/>
<att name="Taxonomy ID" value="9606" type="string"/>
<att name="Taxonomy Name" value="human" type="string"/>
<att name="Human Readable Label" value="LPXN" type="string"/>
<att name="uniprotkb_accession" value="O60711" type="string"/>
<att name="uniprot" type="list">
<att name="uniprot" value="O60711" type="string"/>
<att name="uniprot" value="B4DV71" type="string"/>
<att name="uniprot" value="LPXN" type="string"/>
<att name="uniprot" value="Q53FW6" type="string"/>
<att name="uniprot" value="Q6FI07" type="string"/>
<att name="uniprot" value="B2R8B4" type="string"/>
<att name="uniprot" value="LDLP" type="string"/>
</att>
-------
-------
-------
</node>
I used this code to extract information such as the node ID and label, and the edge ID, label, source, and target:
clc
clear
% Read the XGMML file and collect all <node> and <edge> elements
xDoc = xmlread('ans.xgmml');
ansNode = xDoc.getDocumentElement;
nodes = ansNode.getElementsByTagName('node');
edges = ansNode.getElementsByTagName('edge');
% Preallocate the numeric IDs and the cell arrays that hold the labels
node_Matrix = zeros(nodes.getLength,1);
node_Matrix1 = cell(nodes.getLength,1);
edge_Matrix = zeros(edges.getLength,3);
edge_Matrix1 = cell(edges.getLength,1);
% DOM node lists are 0-based; getAttribute returns a Java string,
% so convert it with char before calling str2double
for i = 0 : nodes.getLength-1
    node_IDs = char(nodes.item(i).getAttribute('id'));
    node_Labels = char(nodes.item(i).getAttribute('label'));
    node_Matrix(i+1,1) = str2double(node_IDs);
    node_Matrix1{i+1,1} = node_Labels;
end
t_node = table(node_Matrix1, node_Matrix, ...
    'VariableNames', {'Label','ID'} ...
    );
writetable(t_node,'nodeinfo.txt')
for j = 0 : edges.getLength-1
    edge_IDs = char(edges.item(j).getAttribute('id'));
    edge_Labels = char(edges.item(j).getAttribute('label'));
    edge_sources = char(edges.item(j).getAttribute('source'));
    edge_targets = char(edges.item(j).getAttribute('target'));
    edge_Matrix(j+1,1) = str2double(edge_IDs);
    edge_Matrix(j+1,2) = str2double(edge_sources);
    edge_Matrix(j+1,3) = str2double(edge_targets);
    % Split the edge label on spaces
    edge_Matrix1{j+1,1} = regexp(edge_Labels, '( )', 'split');
end
t_edge = table(edge_Matrix1, edge_Matrix(:,1), edge_Matrix(:,2), edge_Matrix(:,3), ...
    'VariableNames', {'Label','ID', 'Source', 'Target'} ...
    );
writetable(t_edge,'edgeinfo.txt')
Now I want to get the other information in these lines:
<att name="Human Readable Label" value="LPXN" type="string"/>
<att name="uniprot" value="O60711" type="string"/>
<att name="uniprot" value="B4DV71" type="string"/>
<att name="uniprot" value="LPXN" type="string"/>
<att name="uniprot" value="Q53FW6" type="string"/>
<att name="uniprot" value="Q6FI07" type="string"/>
<att name="uniprot" value="B2R8B4" type="string"/>
<att name="uniprot" value="LDLP" type="string"/>
</att>
in this form:
Human Readable Label=LPXN
uniprot=O60711
uniprot=B4DV71
uniprot=LPXN
uniprot=Q53FW6
uniprot=Q6FI07
uniprot=B2R8B4
uniprot=LDLP
How can I get this information from the lines above? Any help would be highly appreciated.
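For reference, one possible way to produce that name=value output is sketched below. It is only a sketch under stated assumptions: it writes a small made-up sample file (a single node built from the excerpt above, wrapped in a hypothetical `<graph>` root) so that it runs on its own, and the variable names `wanted`, `pairs`, and `sampleFile` are illustrative, not part of the original code. It relies on `getElementsByTagName('att')` returning nested `<att>` elements as well, and it skips the `type="list"` wrapper because that element has no `value` attribute.

```matlab
% Write a tiny sample file so the sketch is self-contained.
% The content is made up from the excerpt in the question.
xmlText = [ ...
    '<graph>' ...
    '<node id="1677" label="O60711">' ...
    '<att name="name" value="O60711" type="string"/>' ...
    '<att name="Human Readable Label" value="LPXN" type="string"/>' ...
    '<att name="uniprot" type="list">' ...
    '<att name="uniprot" value="O60711" type="string"/>' ...
    '<att name="uniprot" value="B4DV71" type="string"/>' ...
    '</att>' ...
    '</node>' ...
    '</graph>'];
sampleFile = [tempname '.xgmml'];
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

% Names of the <att> elements to keep (assumption: only these two are wanted)
wanted = {'Human Readable Label', 'uniprot'};

xDoc  = xmlread(sampleFile);
nodes = xDoc.getElementsByTagName('node');
pairs = {};   % one row per match: node id, att name, att value

for i = 0 : nodes.getLength-1
    node   = nodes.item(i);
    nodeId = char(node.getAttribute('id'));
    % Returns every descendant <att>, including those nested in a list
    atts = node.getElementsByTagName('att');
    for k = 0 : atts.getLength-1
        att = atts.item(k);
        % The list wrapper <att ... type="list"> carries no value; skip it
        if ~att.hasAttribute('value')
            continue
        end
        name = char(att.getAttribute('name'));
        if ismember(name, wanted)
            pairs(end+1, :) = {nodeId, name, char(att.getAttribute('value'))};
        end
    end
end

% Print each match in the requested name=value form
for r = 1 : size(pairs, 1)
    fprintf('%s=%s\n', pairs{r, 2}, pairs{r, 3});
end

delete(sampleFile);
```

To use this on the real data, the sample-file part would be dropped and `xmlread('ans.xgmml')` used instead. The first column of `pairs` keeps the node ID, so the values can still be matched to the rows of the node table.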