Compute Functions

Aggregations

count(array, *[, options, memory_pool])

Count the number of null / non-null values.

mean(array, *[, memory_pool])

Compute the mean of a numeric array.

min_max(array, *[, options, memory_pool])

Compute the minimum and maximum values of a numeric array.

mode(array[, n])

Return the top-n most common values, and the number of times they occur, in a numerical (chunked) array, in descending order of occurrence.

stddev(array, *[, options, memory_pool])

Calculate the standard deviation of a numeric array.

sum(array)

Sum the values in a numerical (chunked) array.

variance(array, *[, options, memory_pool])

Calculate the variance of a numeric array.

Arithmetic Functions

By default, these functions do not detect overflow. Each function is also available in an overflow-checking variant, suffixed _checked, which raises an ArrowInvalid exception when overflow is detected.

add(x, y, *[, memory_pool])

Add the arguments element-wise.

add_checked(x, y, *[, memory_pool])

Add the arguments element-wise, raising an error on overflow.

divide(dividend, divisor, *[, memory_pool])

Divide the arguments element-wise.

divide_checked(dividend, divisor, *[, …])

Divide the arguments element-wise, raising an error on overflow.

multiply(x, y, *[, memory_pool])

Multiply the arguments element-wise.

multiply_checked(x, y, *[, memory_pool])

Multiply the arguments element-wise, raising an error on overflow.

subtract(x, y, *[, memory_pool])

Subtract the arguments element-wise.

subtract_checked(x, y, *[, memory_pool])

Subtract the arguments element-wise, raising an error on overflow.

Comparisons

These functions expect two inputs of the same type. If either input element is null, the corresponding output is null.

equal(x, y, *[, memory_pool])

Compare values for equality (x == y).

greater(x, y, *[, memory_pool])

Compare values for ordered inequality (x > y).

greater_equal(x, y, *[, memory_pool])

Compare values for ordered inequality (x >= y).

less(x, y, *[, memory_pool])

Compare values for ordered inequality (x < y).

less_equal(x, y, *[, memory_pool])

Compare values for ordered inequality (x <= y).

not_equal(x, y, *[, memory_pool])

Compare values for inequality (x != y).

Logical Functions

These functions normally emit a null when one of the inputs is null. However, Kleene logic variants are provided (suffixed _kleene). See User Guide for details.

and_(x, y, *[, memory_pool])

Logical ‘and’ boolean values.

and_kleene(x, y, *[, memory_pool])

Logical ‘and’ boolean values (Kleene logic).

all(array, *[, memory_pool])

Test whether all elements in a boolean array evaluate to true.

any(array, *[, memory_pool])

Test whether any element in a boolean array evaluates to true.

invert(values, *[, memory_pool])

Invert boolean values.

or_(x, y, *[, memory_pool])

Logical ‘or’ boolean values.

or_kleene(x, y, *[, memory_pool])

Logical ‘or’ boolean values (Kleene logic).

xor(x, y, *[, memory_pool])

Logical ‘xor’ boolean values.

String Predicates

In these functions an empty string emits false in the output. For ASCII variants (prefixed ascii_) a string element with non-ASCII characters emits false in the output.

The first set of functions emits true if the input contains only characters of a given class.

ascii_is_alnum(strings, *[, memory_pool])

Classify strings as ASCII alphanumeric.

ascii_is_alpha(strings, *[, memory_pool])

Classify strings as ASCII alphabetic.

ascii_is_decimal(strings, *[, memory_pool])

Classify strings as ASCII decimal.

ascii_is_lower(strings, *[, memory_pool])

Classify strings as ASCII lowercase.

ascii_is_printable(strings, *[, memory_pool])

Classify strings as ASCII printable.

ascii_is_space(strings, *[, memory_pool])

Classify strings as ASCII whitespace.

ascii_is_upper(strings, *[, memory_pool])

Classify strings as ASCII uppercase.

utf8_is_alnum(strings, *[, memory_pool])

Classify strings as alphanumeric.

utf8_is_alpha(strings, *[, memory_pool])

Classify strings as alphabetic.

utf8_is_decimal(strings, *[, memory_pool])

Classify strings as decimal.

utf8_is_digit(strings, *[, memory_pool])

Classify strings as digits.

utf8_is_lower(strings, *[, memory_pool])

Classify strings as lowercase.

utf8_is_numeric(strings, *[, memory_pool])

Classify strings as numeric.

utf8_is_printable(strings, *[, memory_pool])

Classify strings as printable.

utf8_is_space(strings, *[, memory_pool])

Classify strings as whitespace.

utf8_is_upper(strings, *[, memory_pool])

Classify strings as uppercase.

The second set of functions also considers the order of characters in the string element.

ascii_is_title(strings, *[, memory_pool])

Classify strings as ASCII titlecase.

utf8_is_title(strings, *[, memory_pool])

Classify strings as titlecase.

The third set of functions examines string elements on a byte-by-byte basis.

string_is_ascii(strings, *[, memory_pool])

Classify strings as ASCII.

String Transforms

ascii_lower(strings, *[, memory_pool])

Transform ASCII input to lowercase.

ascii_upper(strings, *[, memory_pool])

Transform ASCII input to uppercase.

utf8_lower(strings, *[, memory_pool])

Transform input to lowercase.

utf8_upper(strings, *[, memory_pool])

Transform input to uppercase.

Containment tests

index_in(values, *[, options, memory_pool])

Return index of each element in a set of values.

is_in(values, *[, options, memory_pool])

Find each element in a set of values.

match_substring(array, pattern)

Test if substring pattern is contained within a value of a string array.

Conversions

cast(arr, target_type[, safe])

Cast array values to another data type.

strptime(strings, *[, options, memory_pool])

Parse timestamps.

Selections

filter(data, mask[, null_selection_behavior])

Select values (or records) from array- or table-like data given boolean filter, where true values are selected.

take(data, indices, *[, boundscheck, …])

Select values (or records) from array- or table-like data given integer selection indices.

Associative transforms

dictionary_encode(array, *[, memory_pool])

Dictionary-encode array.

unique(array, *[, memory_pool])

Compute unique elements.

value_counts(array, *[, memory_pool])

Compute counts of unique elements.

Sorts and partitions

partition_nth_indices(array, *[, options, …])

Return the indices that would partition an array around a pivot.

sort_indices(input, *[, options, memory_pool])

Return the indices that would sort an array, record batch or table.

Structural Transforms

binary_length(strings, *[, memory_pool])

Compute string lengths in bytes.

fill_null(values, fill_value)

Replace each null element in values with fill_value.

is_null(values, *[, memory_pool])

Return true if null.

is_valid(values, *[, memory_pool])

Return true if non-null.

list_value_length(lists, *[, memory_pool])

Compute list lengths.

list_flatten(lists, *[, memory_pool])

Flatten list values.

list_parent_indices(lists, *[, memory_pool])

Compute parent indices of nested list values.