These functions use the Arrow C++ CSV reader to read into a data.frame.
Arrow C++ options have been mapped to argument names that follow those of
readr::read_delim(), and col_select was inspired by vroom::vroom().
read_delim_arrow(
  file,
  delim = ",",
  quote = "\"",
  escape_double = TRUE,
  escape_backslash = FALSE,
  col_names = TRUE,
  col_select = NULL,
  na = c("", "NA"),
  quoted_na = TRUE,
  skip_empty_rows = TRUE,
  skip = 0L,
  parse_options = NULL,
  convert_options = NULL,
  read_options = NULL,
  as_data_frame = TRUE
)

read_csv_arrow(
  file,
  quote = "\"",
  escape_double = TRUE,
  escape_backslash = FALSE,
  col_names = TRUE,
  col_select = NULL,
  na = c("", "NA"),
  quoted_na = TRUE,
  skip_empty_rows = TRUE,
  skip = 0L,
  parse_options = NULL,
  convert_options = NULL,
  read_options = NULL,
  as_data_frame = TRUE
)

read_tsv_arrow(
  file,
  quote = "\"",
  escape_double = TRUE,
  escape_backslash = FALSE,
  col_names = TRUE,
  col_select = NULL,
  na = c("", "NA"),
  quoted_na = TRUE,
  skip_empty_rows = TRUE,
  skip = 0L,
  parse_options = NULL,
  convert_options = NULL,
  read_options = NULL,
  as_data_frame = TRUE
)
Argument | Description |
---|---|
file | A character file name, raw vector, or an Arrow input stream |
delim | Single character used to separate fields within a record. |
quote | Single character used to quote strings. |
escape_double | Does the file escape quotes by doubling them? i.e. If this option is |
escape_backslash | Does the file use backslashes to escape special characters? This is more general than |
col_names | If |
col_select | A character vector of column names to keep, as in the "select" argument to |
na | A character vector of strings to interpret as missing values. |
quoted_na | Should missing values inside quotes be treated as missing values (the default) or strings? (Note that this is different from the Arrow C++ default for the corresponding convert option, |
skip_empty_rows | Should blank rows be ignored altogether? If |
skip | Number of lines to skip before reading data. |
parse_options | see file reader options. If given, this overrides any parsing options provided in other arguments (e.g. |
convert_options | |
read_options | |
as_data_frame | Should the function return a |
A data.frame, or a Table if as_data_frame = FALSE.
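As an illustration of the col_select and as_data_frame arguments described above, the following is a minimal sketch rather than a definitive recipe. It assumes the arrow package is installed, uses the built-in iris data set, and relies on the returned Table exposing a num_columns field.

```r
library(arrow)

# Write a small CSV to a temporary file (row names omitted so the
# file contains only the five iris columns).
tf <- tempfile(fileext = ".csv")
write.csv(iris, file = tf, row.names = FALSE)

# Keep two columns and ask for an Arrow Table rather than a data.frame.
tab <- read_csv_arrow(
  tf,
  col_select = c("Sepal.Length", "Species"),
  as_data_frame = FALSE
)

inherits(tab, "Table")  # is the result an Arrow Table?
tab$num_columns         # number of columns that were kept

# Convert to a data.frame only when one is actually needed.
df <- as.data.frame(tab)

unlink(tf)
```

Keeping the result as a Table can be useful when the data will be passed on to other Arrow functions without first being copied into R vectors.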
read_csv_arrow() and read_tsv_arrow() are wrappers around
read_delim_arrow() that specify a delimiter.
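The relationship between the wrappers and read_delim_arrow() can be sketched as follows. This is an illustrative example that assumes the arrow package is installed; the tab-separated file is generated with base R's write.table(), and the variable names are arbitrary.

```r
library(arrow)

# Write a tab-separated file to a temporary location.
tf <- tempfile(fileext = ".tsv")
write.table(iris, file = tf, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it once with an explicit delimiter and once with the wrapper.
via_delim <- read_delim_arrow(tf, delim = "\t")
via_tsv <- read_tsv_arrow(tf)

# Both calls should yield the same shape and column names.
identical(dim(via_delim), dim(via_tsv))
identical(names(via_delim), names(via_tsv))

unlink(tf)
```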
Note that not all readr options are currently implemented here. Please file
an issue if you encounter one that arrow should support.
If you need to control Arrow-specific reader parameters that don't have an
equivalent in readr::read_csv(), you can either provide them in the
parse_options, convert_options, or read_options arguments, or you can
use CsvTableReader directly for lower-level access.
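A sketch of passing Arrow-specific options is shown below. It assumes the installed arrow version exposes CsvParseOptions$create() and CsvReadOptions$create() with the delimiter and block_size arguments used here; check the file reader options documentation for the version in use, since the available arguments may differ.

```r
library(arrow)

# Write a semicolon-separated file to a temporary location.
tf <- tempfile(fileext = ".csv")
write.table(iris, file = tf, sep = ";", row.names = FALSE, quote = FALSE)

# Assumption: these constructors and argument names match the installed
# arrow version. The parse options supply the delimiter directly, and the
# read options set the size (in bytes) of the blocks the reader processes.
parse_opts <- CsvParseOptions$create(delimiter = ";")
read_opts <- CsvReadOptions$create(block_size = 1048576L)

df <- read_delim_arrow(
  tf,
  parse_options = parse_opts,
  read_options = read_opts
)

dim(df)

unlink(tf)
```

Because parse_options overrides the parsing arguments given elsewhere, the delim argument is left at its default here and the delimiter comes from parse_opts.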
tf <- tempfile()
on.exit(unlink(tf))
write.csv(iris, file = tf)
df <- read_csv_arrow(tf)
dim(df)
#> [1] 150 6