How do I read in a binary file that has a very unique data structure?

I have a binary file of which I know the structure (i.e. A x uint16, B x 12-bit unsigned, C x uint16 etc.). I suspect I have to use "memmapfile" in some capacity to map and then read the file into a usable form but I haven't had any success after some time trying. If anyone has any ideas how one can "Read this file where bits 1 to A are uint16, A+1 to B are 12-bit unsigned, etc." that'd be a huge help. Many thanks!
  3 Comments
Scott on 2 Apr 2016
Walter, I just saw your comment as I was about to post an update. That's exactly what I did: I used fopen and fread to read in all the data as uint8, converted that to binary, and used my outline of the file structure to convert the desired number of bits into the appropriate integers. Thanks for everyone's replies and support. Much appreciated!
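For later readers, the approach described here can be sketched roughly as follows. This is only an illustration, not Scott's actual code: the byte values, the field widths in `widths`, and the MSB-first bit order are all assumptions made for the demo. With a real file, `bytes` would come from something like `fread(fid, Inf, 'uint8=>uint8').'`.

```matlab
% Demo bytes standing in for the file contents (hypothetical data):
% 0x12 0x34 0xAB 0xCD 0xEF
bytes = uint8([18 52 171 205 239]);

% Expand every byte into 8 bits, most significant bit first (assumed order),
% and lay them out as one long row vector of 0s and 1s.
bits = reshape(dec2bin(double(bytes), 8).' - '0', 1, []);

% Helper: turn a row of bits (MSB first) into an unsigned integer value.
bitsToUint = @(b) sum(b .* 2.^(numel(b)-1:-1:0));

% Hypothetical layout: one 16-bit field followed by two packed 12-bit fields.
widths = [16 12 12];
values = zeros(size(widths));
pos = 1;
for k = 1:numel(widths)
    values(k) = bitsToUint(bits(pos:pos+widths(k)-1));
    pos = pos + widths(k);   % advance by bits, not bytes
end
disp(values)   % 4660 2748 3567, i.e. 0x1234, 0xABC, 0xDEF
```

Working on an explicit bit vector is slow for large files, but it makes odd field widths easy to check against the format description before optimizing.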


Answers (3)

dpb on 30 Mar 2016
There's no facility in memmapfile to handle arbitrary bit lengths; you'll have to map the 12-bit data as an array of uint8 (three bytes for each pair of 12-bit values, if they are packed) and then combine those bytes yourself.
You also need to know the endianness of the underlying data to interpret it correctly; that should be given in the documentation of the application that created the file.
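A minimal sketch of that idea, assuming the 12-bit values are packed two per three bytes with the first value's high 8 bits stored in the first byte. That packing order is an assumption; a file written the other way round needs the shifts rearranged. The temporary file and its contents are made up for the demo.

```matlab
% Write six demo bytes (0xAB 0xCD 0xEF 0x01 0x23 0x45) to a temporary file.
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, uint8([171 205 239 1 35 69]), 'uint8');
fclose(fid);

% Map the file as plain bytes; memmapfile has no 12-bit format.
m = memmapfile(fname, 'Format', 'uint8');
b = uint16(reshape(m.Data, 3, []));   % one column per 3-byte group

% Assumed packing: [hhhhhhhh][llllHHHH][LLLLLLLL] -> two 12-bit values.
first  = bitor(bitshift(b(1,:), 4), bitshift(b(2,:), -4));
second = bitor(bitshift(bitand(b(2,:), uint16(15)), 8), b(3,:));

% Interleave back into file order.
values = reshape([first; second], 1, []);
disp(values)   % 2748 3567 18 837, i.e. 0xABC 0xDEF 0x012 0x345

clear m        % release the mapping before deleting the file
delete(fname);
```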

Walter Roberson on 30 Mar 2016
You need to know whether the "12 bit unsigned" values are 12 bits out of every 16 with the other 4 ignored, or whether the 12-bit values are packed together, 2 of them for every 3 8-bit bytes. If they are packed together, then the format also has to define what happens when the data ends in the middle of a byte because B is not a multiple of 2.
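To make the difference concrete, here is a small sketch of the arithmetic for both layouts. The count `B = 5` and the sample words are hypothetical, and the padded case assumes the 12 data bits sit in the low end of each 16-bit word, which the file format would have to confirm.

```matlab
B = 5;                                      % hypothetical number of 12-bit values

bytesIfPadded = 2 * B;                      % one 16-bit word per value
bytesIfPacked = ceil(12 * B / 8);           % 3 bytes per pair, rounded up
spareBits     = 8 * bytesIfPacked - 12 * B; % unused bits in the final byte

fprintf('padded: %d bytes, packed: %d bytes, %d spare bits\n', ...
        bytesIfPadded, bytesIfPacked, spareBits);

% Padded layout: mask off the 4 ignored bits (assumed to be the high ones).
words = uint16([4660 61455]);               % 0x1234 and 0xF00F
low12 = bitand(words, uint16(4095));        % keep the low 12 bits
disp(low12)                                 % 564 15, i.e. 0x234 and 0x00F
```

With an odd B the packed layout leaves 4 spare bits, which is exactly the case where the format has to say whether the next field starts mid-byte or on the next byte boundary.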

Teja Muppirala on 1 Apr 2016
Would it be possible to use FREAD with the precision specified?
For example:
%% Just making some test data... (16bit,12bit,16bit // 16bit,12bit,16bit // ...)
digitsOfPi = [3 1 4 1 5 9 2 6 5 3 5 9];
fileID = fopen('pi.dat','w');
for n = 1:numel(digitsOfPi)
    switch mod(n,3)
        case 1
            fwrite(fileID,digitsOfPi(n),'ubit16');
        case 2
            fwrite(fileID,digitsOfPi(n),'ubit12');
        case 0
            fwrite(fileID,digitsOfPi(n),'ubit16');
    end
end
fclose(fileID);
%% Now read the file back in:
fileID = fopen('pi.dat','r');
n = 0;
% feof only becomes true after a read hits end-of-file,
% so the final pass through this loop returns an empty result.
while ~feof(fileID)
    n = n+1;
    switch mod(n,3)
        case 1
            data = fread(fileID,1,'ubit16') % no semicolon: display each value
        case 2
            data = fread(fileID,1,'ubit12')
        case 0
            data = fread(fileID,1,'ubit16')
    end
end
fclose(fileID);
This gives me back
data =
3
data =
1
data =
4
data =
1
data =
5
data =
9
data =
2
data =
6
data =
5
data =
3
data =
5
data =
9
data =
[]
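The trailing empty result appears because feof only becomes true once a read has actually hit end-of-file. One possible variant of the read loop, sketched below, stops as soon as fread returns nothing and collects the values in a vector. The temporary file name and the `precisions` cell array are introduced here for illustration only; the round trip itself is the same one shown above, so it inherits the same byte-boundary behavior.

```matlab
digitsOfPi = [3 1 4 1 5 9 2 6 5 3 5 9];
precisions = {'ubit16', 'ubit12', 'ubit16'};   % repeating field pattern

% Write the test data, one fwrite call per value.
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
for n = 1:numel(digitsOfPi)
    fwrite(fid, digitsOfPi(n), precisions{mod(n-1, 3) + 1});
end
fclose(fid);

% Read it back, stopping on the first empty read instead of testing feof.
fid = fopen(fname, 'r');
values = [];
n = 0;
while true
    n = n + 1;
    data = fread(fid, 1, precisions{mod(n-1, 3) + 1});
    if isempty(data)
        break;              % nothing left to read
    end
    values(end+1) = data;   %#ok<SAGROW> small demo, growth is fine
end
fclose(fid);
delete(fname);

disp(values)
```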
  2 Comments
Walter Roberson on 1 Apr 2016
Note that each fread call starts at a byte boundary, and that once the count is exhausted, any unread part of the last byte is discarded. It is therefore not suitable for advancing through a file as if the file were a bit array.
dpb on 1 Apr 2016
Interesting! I wasn't even aware that there was such a thing as 'ubit12' that MATLAB would recognize. Of course, as you note, the above "works" for the OP as it writes 16 bits to store 12...
>> fid=fopen('test.dat','w');
>> fwrite(fid,3,'ubit12');
>> fid=fclose(fid);
Open file in a binary mode viewer...
C:\ML_R2012b\work> ty /x test.dat
0000 0000 03 00 ♥.
shows that two bytes were written, so by symmetry fread read the same two bytes back when 'ubit12' was used there as well. The demonstration shows that the underlying C I/O library is consistent but, as you note, it doesn't do anything for the OP if the file is actually a packed bitstream, as it would seem to be, probably from some lab instrument.
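The same check can be done without a hex viewer by asking for the file size. This is a small sketch using a temporary file name chosen for the demo; the dump above suggests the reported size is 2 bytes for a single 'ubit12' value.

```matlab
% Write a single value with 12-bit precision to a temporary file.
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fwrite(fid, 3, 'ubit12');
fclose(fid);

% The size on disk shows whether the 12 bits were padded to whole bytes.
info = dir(fname);
fprintf('file size: %d bytes\n', info.bytes);

% Reading it back with the same precision should recover the value.
fid = fopen(fname, 'r');
value = fread(fid, 1, 'ubit12');
fclose(fid);
delete(fname);
disp(value)
```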
