These functions use the Arrow C++ CSV reader to read into a data.frame.
Arrow C++ options have been mapped to argument names that follow those of
readr::read_delim(), and col_select was inspired by vroom::vroom().
read_delim_arrow(
  file,
  delim = ",",
  quote = "\"",
  escape_double = TRUE,
  escape_backslash = FALSE,
  col_names = TRUE,
  col_select = NULL,
  na = c("", "NA"),
  quoted_na = TRUE,
  skip_empty_rows = TRUE,
  skip = 0L,
  parse_options = NULL,
  convert_options = NULL,
  read_options = NULL,
  as_data_frame = TRUE
)

read_csv_arrow(
  file,
  quote = "\"",
  escape_double = TRUE,
  escape_backslash = FALSE,
  col_names = TRUE,
  col_select = NULL,
  na = c("", "NA"),
  quoted_na = TRUE,
  skip_empty_rows = TRUE,
  skip = 0L,
  parse_options = NULL,
  convert_options = NULL,
  read_options = NULL,
  as_data_frame = TRUE
)

read_tsv_arrow(
  file,
  quote = "\"",
  escape_double = TRUE,
  escape_backslash = FALSE,
  col_names = TRUE,
  col_select = NULL,
  na = c("", "NA"),
  quoted_na = TRUE,
  skip_empty_rows = TRUE,
  skip = 0L,
  parse_options = NULL,
  convert_options = NULL,
  read_options = NULL,
  as_data_frame = TRUE
)
| Argument | Description |
|---|---|
| file | A character file name, raw vector, or an Arrow input stream |
| delim | Single character used to separate fields within a record. |
| quote | Single character used to quote strings. |
| escape_double | Does the file escape quotes by doubling them? i.e. If this option is |
| escape_backslash | Does the file use backslashes to escape special characters? This is more general than |
| col_names | If |
| col_select | A character vector of column names to keep, as in the "select" argument to |
| na | A character vector of strings to interpret as missing values. |
| quoted_na | Should missing values inside quotes be treated as missing values (the default) or strings. (Note that this is different from the Arrow C++ default for the corresponding convert option, |
| skip_empty_rows | Should blank rows be ignored altogether? If |
| skip | Number of lines to skip before reading data. |
| parse_options | See file reader options. If given, this overrides any parsing options provided in other arguments (e.g. |
| convert_options | |
| read_options | |
| as_data_frame | Should the function return a |
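As a rough illustration of how several of these arguments combine, the sketch below reads a small file while skipping a leading line, keeping two columns, and treating an extra marker as missing. The file contents, the "-" marker, and the column names are made up for the example.

```r
library(arrow)

# Hypothetical input: one junk line, then a header and three records.
tf <- tempfile(fileext = ".csv")
writeLines(c(
  "exported by a hypothetical tool",
  "id,name,score",
  "1,a,10",
  "2,b,-",
  "3,c,30"
), tf)

df <- read_csv_arrow(
  tf,
  skip = 1L,                      # skip the junk line before the header
  col_select = c("id", "score"),  # keep only these columns
  na = c("", "NA", "-")           # also treat "-" as a missing value
)

names(df)
is.na(df$score)

unlink(tf)
```

Because col_select is applied when the file is read, the "name" column should not appear in the result at all.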
A data.frame, or an Arrow Table if as_data_frame = FALSE.
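A minimal sketch of the two return types, using the built-in iris data set written to a temporary file. Converting the Table with as.data.frame() is one way to get R-native data later; whether deferring the conversion helps depends on the workload.

```r
library(arrow)

tf <- tempfile(fileext = ".csv")
write.csv(iris, file = tf, row.names = FALSE)

# Default: the result is converted to an R data.frame.
df <- read_csv_arrow(tf)

# as_data_frame = FALSE keeps the data in Arrow memory as a Table.
tab <- read_csv_arrow(tf, as_data_frame = FALSE)

inherits(df, "data.frame")
inherits(tab, "Table")

# Convert later, once R-native data is actually needed.
df2 <- as.data.frame(tab)
dim(df2)

unlink(tf)
```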
read_csv_arrow() and read_tsv_arrow() are wrappers around
read_delim_arrow() that specify a delimiter.
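The relationship between the wrappers and read_delim_arrow() can be sketched as follows. The example assumes that read_tsv_arrow() corresponds to delim = "\t"; the file names and the use of mtcars are only for illustration.

```r
library(arrow)

csv_file <- tempfile(fileext = ".csv")
tsv_file <- tempfile(fileext = ".tsv")
write.table(mtcars, csv_file, sep = ",", row.names = FALSE, quote = FALSE)
write.table(mtcars, tsv_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Each wrapper should give the same result as read_delim_arrow()
# called with the matching delimiter.
csv_wrapper <- as.data.frame(read_csv_arrow(csv_file))
csv_delim   <- as.data.frame(read_delim_arrow(csv_file, delim = ","))

tsv_wrapper <- as.data.frame(read_tsv_arrow(tsv_file))
tsv_delim   <- as.data.frame(read_delim_arrow(tsv_file, delim = "\t"))

all.equal(csv_wrapper, csv_delim)
all.equal(tsv_wrapper, tsv_delim)

unlink(c(csv_file, tsv_file))
```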
Note that not all readr options are currently implemented here. Please file
an issue if you encounter one that arrow should support.
If you need to control Arrow-specific reader parameters that don't have an
equivalent in readr::read_csv(), you can either provide them in the
parse_options, convert_options, or read_options arguments, or you can
use CsvTableReader directly for lower-level access.
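As one possible example of an Arrow-specific setting, the sketch below passes a parse options object that allows quoted values to contain line breaks. It assumes the installed arrow version provides the CsvParseOptions$create() constructor with a newlines_in_values argument; constructor names have varied between releases, so check the file reader options help page for your version. The file contents are made up.

```r
library(arrow)

# Hypothetical input: the first record has a quoted value spanning two lines.
tf <- tempfile(fileext = ".csv")
writeLines(c(
  "id,note",
  "1,\"line one",
  "line two\"",
  "2,plain"
), tf)

# Assumption: CsvParseOptions$create() exists in this arrow version.
# When parse_options is given, it overrides delim, quote, and related arguments,
# so the remaining parse settings fall back to the constructor's defaults.
opts <- CsvParseOptions$create(newlines_in_values = TRUE)

df <- read_delim_arrow(tf, parse_options = opts)
nrow(df)
df$note

unlink(tf)
```

Allowing line breaks inside values can slow down multi-threaded reading, so it is probably best enabled only for files that need it.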
tf <- tempfile()
on.exit(unlink(tf))
write.csv(iris, file = tf)
df <- read_csv_arrow(tf)
dim(df)
#> [1] 150 6