Tabular File Formats

CSV Files

ReadOptions([use_threads, block_size, …])

Options for reading CSV files.

ParseOptions([delimiter, quote_char, …])

Options for parsing CSV files.

ConvertOptions([check_utf8, column_types, …])

Options for converting CSV data.

read_csv(input_file[, read_options, …])

Read a Table from a stream of CSV data.

open_csv(input_file[, read_options, …])

Open a streaming reader of CSV data.

CSVStreamingReader()

An object that reads record batches incrementally from a CSV file.

Feather Files

read_feather(source[, columns, use_threads, …])

Read a pandas.DataFrame from Feather format.

read_table(source[, columns, memory_map])

Read a pyarrow.Table from Feather format.

write_feather(df, dest[, compression, …])

Write a pandas.DataFrame to Feather format.

JSON Files

ReadOptions([use_threads, block_size])

Options for reading JSON files.

ParseOptions([explicit_schema, …])

Options for parsing JSON files.

read_json(input_file[, read_options, …])

Read a Table from a stream of JSON data.

Parquet Files

ParquetDataset([path_or_paths, filesystem, …])

Encapsulates details of reading a complete Parquet dataset, possibly consisting of multiple files and partitions in subdirectories.

ParquetFile(source[, metadata, …])

Reader interface for a single Parquet file.

ParquetWriter(where, schema[, filesystem, …])

Class for incrementally building a Parquet file for Arrow tables.

read_table(source[, columns, use_threads, …])

Read a Table from Parquet format.

read_metadata(where[, memory_map])

Read FileMetadata from footer of a single Parquet file.

read_pandas(source[, columns, use_threads, …])

Read a Table from Parquet format, also reading DataFrame index values if they are known in the file metadata.

read_schema(where[, memory_map])

Read effective Arrow schema from Parquet file metadata.

write_metadata(schema, where[, …])

Write metadata-only Parquet file from schema.

write_table(table, where[, row_group_size, …])

Write a Table to Parquet format.

write_to_dataset(table, root_path[, …])

Wrapper around parquet.write_table for writing a Table to Parquet format by partitions.

ORC Files

ORCFile(source)

Reader interface for a single ORC file.