Does MATLAB support Parquet partitions?

I have a large dataset written using Parquet partitioning. The partition variable is called 'mdRun', and I have 10 Parquet files in 10 directories, as follows:
.../events/mdRun=0/events-0.parquet
.../events/mdRun=1/events-0.parquet
and so on. I created these files using pyarrow with Hive partitioning.
Using pyarrow, I can read the Parquet file for a single partition by using the filter argument, which reads only the file stored in the matching directory. As a nice side effect, the mdRun column is not stored in the Parquet file itself, but it is automatically included when I read the partition file(s).
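To make the layout concrete, here is a minimal Python sketch of the Hive convention described above, using only the standard library. It is not how pyarrow is implemented; it only illustrates that the partition value lives in the directory name and that a filter can select files without opening them. The helper names `partition_values` and `select_files` and the relative `events/...` paths are hypothetical, chosen for illustration.

```python
from pathlib import PurePosixPath


def partition_values(path):
    """Return the key=value pairs encoded in a Hive-style path as a dict."""
    values = {}
    for part in PurePosixPath(path).parts[:-1]:  # skip the file name itself
        key, sep, value = part.partition("=")
        if sep and key:  # only directory names of the form key=value
            values[key] = value
    return values


def select_files(paths, key, wanted):
    """Keep only the files whose partition directory matches key=wanted."""
    return [p for p in paths if partition_values(p).get(key) == str(wanted)]


# Hypothetical relative paths mirroring the layout in the question.
files = [f"events/mdRun={i}/events-0.parquet" for i in range(10)]

print(partition_values(files[3]))       # {'mdRun': '3'}
print(select_files(files, "mdRun", 3))  # ['events/mdRun=3/events-0.parquet']
```

A reader that follows this convention can attach the parsed value as an extra column after loading each file, which is why mdRun appears in the result even though it is not stored in the file.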
Is it possible to read a partitioned Parquet dataset in MATLAB in the same way?
Thank you!

Answers (1)

Hi Jerry,
As far as I know, MATLAB R2022b does not support reading partitioned Parquet datasets in this way. This request has already been forwarded to the relevant team.
However, MATLAB R2022b does support reading and writing Parquet files. The MATLAB documentation describes the available Parquet functions, the data type mappings between Parquet and MATLAB, and how to read Parquet files; those pages may help in your case.
I hope that this helps!

Release: R2022b

Asked: 30 Dec 2022

Answered: 2 Jan 2023
