Package smile.io
Class Parquet
java.lang.Object
smile.io.Parquet
Apache Parquet is a columnar storage format that supports
nested data structures. It uses the record shredding and
assembly algorithm described in the Dremel paper.
-
Method Summary
Modifier and Type   Method                                                  Description
static DataFrame    read                                                    Reads an HDFS parquet file.
static DataFrame    read                                                    Reads an HDFS parquet file.
static DataFrame    read                                                    Reads a local parquet file.
static DataFrame    read                                                    Reads a local parquet file.
static DataFrame    read(org.apache.parquet.io.InputFile file)              Reads a parquet file.
static DataFrame    read(org.apache.parquet.io.InputFile file, int limit)   Reads a limited number of records from a parquet file.
-
Method Details
-
read
Reads a local parquet file.
Parameters:
path - the input file path.
Returns:
the data frame.
Throws:
IOException - if the file cannot be read.
-
read
Reads a limited number of records from a local parquet file.
Parameters:
path - the input file path.
limit - the maximum number of records to read.
Returns:
the data frame.
Throws:
IOException - if the file cannot be read.
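
The following is a minimal usage sketch of the local-file overload with a record limit, useful for previewing a large file. The parameter types are not shown in the listing above, so the sketch assumes that the local overload accepts a java.nio.file.Path. The class name ParquetPreview, the helper parseLimit, and the default limit of 10 are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

import smile.data.DataFrame;
import smile.io.Parquet;

public class ParquetPreview {
    /** Parses the optional second argument as the record limit; defaults to 10. */
    public static int parseLimit(String[] args) {
        return args.length > 1 ? Integer.parseInt(args[1]) : 10;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: ParquetPreview <file.parquet> [limit]");
            return;
        }
        // Assumption: the local-file overload takes a java.nio.file.Path.
        Path path = Paths.get(args[0]);
        // Read at most `limit` records instead of loading the whole file.
        DataFrame df = Parquet.read(path, parseLimit(args));
        // Print the inferred schema, then the data frame itself.
        System.out.println(df.schema());
        System.out.println(df);
    }
}
```

Running the sketch requires the Smile, Parquet, and Hadoop libraries on the classpath. An IOException propagates to the caller if the file cannot be read.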
-
read
Reads an HDFS parquet file.
Parameters:
path - the input file path.
Returns:
the data frame.
Throws:
IOException - if the file cannot be read.
URISyntaxException - if the file path syntax is invalid.
-
read
Reads a limited number of records from an HDFS parquet file.
Parameters:
path - the input file path.
limit - the maximum number of records to read.
Returns:
the data frame.
Throws:
IOException - if the file cannot be read.
URISyntaxException - if the file path syntax is invalid.
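
The HDFS overloads report a malformed path through URISyntaxException. As a rough illustration of which strings fail, the sketch below checks a path with the JDK's java.net.URI before any read is attempted. It assumes the library parses the path the same way, which the listing above does not confirm. The class HdfsPathCheck, the method isHdfsUri, and the host name namenode are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HdfsPathCheck {
    /** Returns true if the string parses as a URI whose scheme is "hdfs". */
    public static boolean isHdfsUri(String path) {
        try {
            URI uri = new URI(path);
            // getScheme() is null for plain paths such as "/tmp/x.parquet".
            return "hdfs".equalsIgnoreCase(uri.getScheme());
        } catch (URISyntaxException e) {
            // Raised for illegal characters, e.g. an unescaped space.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isHdfsUri("hdfs://namenode:8020/data/events.parquet"));  // true
        System.out.println(isHdfsUri("hdfs://namenode:8020/data/my file.parquet")); // false
        System.out.println(isHdfsUri("/tmp/events.parquet"));                       // false
    }
}
```

A check like this only validates syntax. A well-formed path can still fail with IOException if the file is missing or unreadable.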
-
read
Reads a parquet file.
Parameters:
file - an interface with the methods Parquet needs to read data files. See HadoopInputFile for an example.
Returns:
the data frame.
Throws:
IOException - if the file cannot be read.
-
read
Reads a limited number of records from a parquet file.
Parameters:
file - an interface with the methods Parquet needs to read data files. See HadoopInputFile for an example.
limit - the maximum number of records to read.
Returns:
the data frame.
Throws:
IOException - if the file cannot be read.
-