UTF-16 encoding in writestruct/readstruct or other xml2struct functions
I have a large XML file that is used to store information and settings for variables used in another program. It holds each variable's name, units, description, display name, and so on.
Due to some weird error with another program, someone tried to import this variable set and instead of overwriting the existing variables on that system it merged them. So I now have this xml file with over 400 duplicate entries and over 2000 entries in total.
I've already worked out a way of finding and removing these, but the problem has come with encoding. The original XML file is in UTF-16, and it needs this presumably because of the Japanese characters that are used for the variable descriptions etc.
Presumably then I need to have MATLAB not convert to UTF-8 on reading and also save as UTF-16 on saving. Is this possible?
I've been using the community functions xml2struct and struct2xml, but I see there are also the native MATLAB options readstruct and writestruct. It's not clear whether they are capable of handling UTF-16, or whether the encoding is a selectable option.
3 Comments
Walter Roberson
on 22 Aug 2024
You would call struct2xml() so that it returns the generated XML as text, which would be generated as UTF-8. You would then "fix up" the header so that it says encoding UTF-16 instead of UTF-8. Next, uint8() the text to convert from characters to bytes, and native2unicode() the bytes to convert them into Unicode code points. Then unicode2native() that char stream asking for UTF-16, which generates a byte stream, and fwrite() the byte stream to the file.
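The re-encoding steps in this comment could be sketched roughly as follows. This is only an illustration under assumptions: `xmlText` is a hypothetical stand-in for whatever struct2xml returns, it is assumed to already hold Unicode characters (if it instead holds raw UTF-8 bytes stored as chars, decode it first with `native2unicode(uint8(xmlText), 'UTF-8')`), and the output path is a scratch file. The byte order is fixed to little-endian with an explicit byte order mark so that nothing depends on how a plain 'UTF-16' request is interpreted.

```matlab
% Hypothetical stand-in for the text returned by struct2xml.
% char([28201 24230]) is a two-character Japanese sample description.
xmlText = ['<?xml version="1.0" encoding="utf-8"?>' newline ...
           '<variable description="' char([28201 24230]) '"/>'];

% "Fix up" the declaration so it matches the encoding written to disk.
xmlText = regexprep(xmlText, 'encoding="[^"]*"', 'encoding="UTF-16"', 'once');

% Convert the Unicode characters to UTF-16 little-endian bytes.
bytes = unicode2native(xmlText, 'UTF-16LE');

% Byte order mark for UTF-16LE (FF FE), written explicitly.
bom = uint8([255 254]);

% Write the raw bytes; no encoding is given to fopen because the
% data is already encoded.
outFile = [tempname '.xml'];   % hypothetical scratch location
fid = fopen(outFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', outFile);
fwrite(fid, [bom bytes], 'uint8');
fclose(fid);
```

Writing the bytes with fwrite, rather than text with fprintf, keeps MATLAB from applying any further character conversion on output.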
Answers (1)
Harsh
on 26 Aug 2024
Hi,
Based on my understanding, you've been utilizing community functions to handle the reading and writing of UTF-16 XML files. Now, you're seeking a MATLAB-native solution to achieve the same task.
MATLAB's built-in low-level file I/O can handle the writing part: fopen accepts an encoding argument, so text written through fprintf is stored in that encoding. Here's how you can use it to write a UTF-16 XML file:
% Define the file name
filename = 'example_utf16.xml';
% Open the file for writing with UTF-16 little-endian encoding
fileID = fopen(filename, 'w', 'n', 'UTF-16LE');
if fileID == -1
    error('Failed to open file "%s" for writing.', filename);
end
% Write a byte order mark (U+FEFF) first so that readers can detect the encoding
fprintf(fileID, '%s', char(65279));
% Write some text; passing it through %s keeps any % or \ in the data literal
fprintf(fileID, '%s\n', '<note>It can contain special characters like: ä, ö, ü, ñ, ç, 𤭢.</note>');
% Close the file
fclose(fileID);
disp(['File "', filename, '" has been created with UTF-16 encoding.'])
The encoding of the created file can also be confirmed in Notepad.
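The encoding can also be checked from MATLAB itself by reading the raw bytes back. The following is a minimal sketch, not a definitive method: the scratch file name and the sample text are made up for illustration, and it assumes the file was written as UTF-16 little-endian with a leading byte order mark.

```matlab
% Hypothetical scratch file and sample text, for illustration only.
filename = [tempname '.xml'];
original = ['<note>' char([28201 24230]) '</note>'];

% Write the text as UTF-16LE, preceded by a byte order mark (U+FEFF).
fid = fopen(filename, 'w', 'n', 'UTF-16LE');
assert(fid ~= -1, 'Could not open %s for writing.', filename);
fprintf(fid, '%s', [char(65279) original]);
fclose(fid);

% Read the file back as raw bytes, without any character conversion.
fid = fopen(filename, 'r');
assert(fid ~= -1, 'Could not open %s for reading.', filename);
raw = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);

% A UTF-16LE file with a byte order mark starts with the bytes FF FE.
hasLittleEndianBom = numel(raw) >= 2 && isequal(raw(1:2), uint8([255 254]));

% Decode the bytes and drop the byte order mark character if present.
decoded = native2unicode(raw, 'UTF-16LE');
if ~isempty(decoded) && decoded(1) == char(65279)
    decoded = decoded(2:end);
end

disp(hasLittleEndianBom)
disp(strcmp(decoded, original))
```

Reading bytes and decoding them explicitly with native2unicode makes the check independent of whatever default encoding the MATLAB session uses.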
I hope this helps, thanks!