writestruct does not reproduce the input of readstruct
Adrien Leygue
on 16 Feb 2023
Commented: Jeremy Hughes
on 21 Feb 2023
I find it disturbing that applying writestruct on a structure created with readstruct does not reproduce the original input: see example below.
Is there a way to fix this? readstruct is very convenient for reading and modifying lightweight XML files (in my case an XDMF file to be opened by ParaView), but this issue is in my opinion a major flaw.
Thanks for any advice.
Adrien.
>> type input.xml
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
<Tag2 Name="foo">
1 2 3
</Tag2>
</Tag1>
>> h = readstruct("input.xml","structnodename","Tag1");
>> writestruct(h,"output.xml","structnodename","Tag1")
>> type output.xml
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
<Tag2 Name="foo">
<Text>1 2 3</Text>
</Tag2>
</Tag1>
>>
1 Comment
Stephen23
on 16 Feb 2023
Edited: Stephen23
on 17 Feb 2023
"writestruct does not reproduce the input of readstruct"
Nor is this expected: there is a large set of input files which will produce exactly the same structure once imported. The XML standard specifically states that a lot of file formatting (e.g. different whitespace) and "irrelevant" XML formatting (e.g. attribute order) is not significant and should be considered equivalent. In your specific example, note that XML elements may contain text, attributes, other elements, or any mix of these.
Your "1 2 3" are themselves not elements or attributes, so must be text. MATLAB is semantically correct.
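To see where the extra element comes from, it can help to inspect the struct itself. The following is a minimal sketch, assuming the input.xml from the question and the default attribute suffix of "Attribute"; the field names mentioned in the comments are what I would expect to see, not verified output.

```matlab
% Recreate the file from the question (writelines needs R2022a or later).
xmlLines = [
    "<?xml version=""1.0"" encoding=""UTF-8""?>"
    "<Tag1 Version=""2"">"
    "<Tag2 Name=""foo"">"
    "1 2 3"
    "</Tag2>"
    "</Tag1>"
    ];
writelines(xmlLines, "input.xml");

% Import the document as a struct, starting at the Tag1 node.
h = readstruct("input.xml", "StructNodeName", "Tag1");

% List the fields at each level. Attributes are expected to appear with an
% "Attribute" suffix, and the text of Tag2 as a field named Text.
disp(fieldnames(h))
disp(fieldnames(h.Tag2))
```

Once the text lives in an ordinary field called Text, the struct no longer records whether it came from a text node or from a child element named Text, which is why writestruct has to pick one.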
"...but this issue is in my opinion a major flaw."
Your proposal is impossible: in general there is no way to know which exact XML file generated a particular structure when imported into MATLAB (or any other application). The XML standard specifically states that this should not be possible.
This applies not only to READSTRUCT/WRITESTRUCT, but to every other "pair" of import/export functions, e.g. READMATRIX accepts a practically unlimited set of input files, which WRITEMATRIX cannot reproduce from the matrix alone. This is a necessary corollary of applying Postel's law (be liberal in what you accept, conservative in what you produce).
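The READMATRIX/WRITEMATRIX case can be sketched in a few lines. This is only an illustration: the file names a.csv, b.csv and b_out.csv are made up for the example, and writelines needs R2022a or later.

```matlab
% Two differently formatted files that hold the same numbers.
writelines(["1,2,3"; "4,5,6"], "a.csv");
writelines(["1.0,2.00,3.000"; "4,5,6e0"], "b.csv");

A = readmatrix("a.csv");
B = readmatrix("b.csv");

% Both imports should give the same 2-by-3 matrix, so the matrix alone
% cannot tell us which file it came from.
disp(isequal(A, B))

% Exporting B uses writematrix's own number formatting, not the
% formatting that was present in b.csv.
writematrix(B, "b_out.csv");
type b_out.csv
```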
Accepted Answer
Jeremy Hughes
on 16 Feb 2023
Unfortunately, there's no way to get that to happen 100% of the time, and readstruct/writestruct are not meant to do that. In fact, converting a data source to another representation and back again is seldom lossless unless the two systems were designed together for round-tripping.
In this case, in order to be able to do document transformation, you have to know far more than what a MATLAB struct is capable of storing. E.g. whether the data was in an attribute, or part of the text node as in your case, or in a node named "Text" to begin with.... or this thing:
<?xml version="1.0" encoding="UTF-8"?>
<Tag1 Version="2">
<Tag2 Name="foo">
<Text>1</Text>
<Text>2</Text>
<Text>3</Text>
</Tag2>
</Tag1>
To get round-trip fidelity across all valid XML files, you necessarily lose some simplicity. Essentially, there's no 1:1 mapping between MATLAB structs and XML. To get that and still have some semblance of usability, you have to move to objects.
Good(ish) news, there is a MATLAB XML DOM: https://www.mathworks.com/help/matlab/import_export/importing-xml-documents.html, and it (and similarly xmlread) will allow you to do anything, but at a cost of both a steeper learning curve, and more code complexity. Note that the XML DOM APIs are not designed by MathWorks; here's the spec: https://www.w3.org/TR/WD-DOM/
... but I don't recommend the read unless you're having trouble sleeping.
Basically, it's not a simple problem. XML is a very complex format, and working with it can be painful. readstruct/writestruct are really about getting some MATLAB data into that format and back again, and not the other way around, though readstruct should allow you to get data into MATLAB and work with it for general XML files.
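For the use case in the question (change a value but keep the document layout), a DOM-based edit might look like the sketch below. It uses xmlread/xmlwrite rather than readstruct/writestruct; the replacement text '4 5 6' is just a placeholder, and xmlwrite may still change indentation or whitespace, so this is not a byte-for-byte round trip either.

```matlab
% Recreate the file from the question (writelines needs R2022a or later).
xmlLines = [
    "<?xml version=""1.0"" encoding=""UTF-8""?>"
    "<Tag1 Version=""2"">"
    "<Tag2 Name=""foo"">"
    "1 2 3"
    "</Tag2>"
    "</Tag1>"
    ];
writelines(xmlLines, "input.xml");

% Parse the file into a DOM document (a Java object in MATLAB).
doc = xmlread('input.xml');

% Get the first <Tag2> element; DOM node lists are zero-based.
tag2 = doc.getElementsByTagName('Tag2').item(0);

% Replace the text content directly, so no <Text> child element is created.
tag2.setTextContent('4 5 6');

% Serialize the modified document.
xmlwrite('output.xml', doc);
type output.xml
```

The trade-off is the one described above: the DOM keeps the distinction between attributes, text nodes and child elements, but you have to navigate the tree yourself instead of indexing into a struct.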
3 Comments
Jeremy Hughes
on 21 Feb 2023
Thanks for the info Adrien,
I'll put in an enhancement request for an option to handle "Text" fields on writing.