| any(expr) |
Returns true if at least one value of `expr` is true. |
| approx_count_distinct(expr[, relativeSD]) |
Returns the estimated cardinality by HyperLogLog++.
`relativeSD` defines the maximum relative standard deviation allowed. |
| approx_percentile(col, percentage [, accuracy]) |
Returns the approximate percentile of the numeric
column `col`: the smallest value in the ordered `col` values (sorted from least to
greatest) such that no more than `percentage` of `col` values are less than or equal
to that value. The value of `percentage` must be between 0.0 and 1.0. The `accuracy`
parameter (default: 10000) is a positive numeric literal that controls approximation accuracy
at the cost of memory. A higher value of `accuracy` yields better accuracy; `1.0/accuracy` is
the relative error of the approximation.
When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0.
In this case, returns the approximate percentile array of column `col` at the given
percentage array. |
| avg(expr) |
Returns the mean calculated from values of a group. |
| bit_and(expr) |
Returns the bitwise AND of all non-null input values, or null if none. |
| bit_or(expr) |
Returns the bitwise OR of all non-null input values, or null if none. |
| bit_xor(expr) |
Returns the bitwise XOR of all non-null input values, or null if none. |
| bool_and(expr) |
Returns true if all values of `expr` are true. |
| bool_or(expr) |
Returns true if at least one value of `expr` is true. |
| collect_list(expr) |
Collects and returns a list of non-unique elements. |
| collect_set(expr) |
Collects and returns a set of unique elements. |
| corr(expr1, expr2) |
Returns Pearson coefficient of correlation between a set of number pairs. |
| count(*) |
Returns the total number of retrieved rows, including rows containing null. |
| count(expr[, expr...]) |
Returns the number of rows for which the supplied expression(s) are all non-null. |
| count(DISTINCT expr[, expr...]) |
Returns the number of rows for which the supplied expression(s) are unique and non-null. |
| count_if(expr) |
Returns the number of `TRUE` values for the expression. |
| count_min_sketch(col, eps, confidence, seed) |
Returns a count-min sketch of a column with the given `eps`,
`confidence` and `seed`. The result is an array of bytes, which can be deserialized to a
`CountMinSketch` before usage. Count-min sketch is a probabilistic data structure used for
frequency estimation using sub-linear space. |
| covar_pop(expr1, expr2) |
Returns the population covariance of a set of number pairs. |
| covar_samp(expr1, expr2) |
Returns the sample covariance of a set of number pairs. |
| every(expr) |
Returns true if all values of `expr` are true. |
| first(expr[, isIgnoreNull]) |
Returns the first value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values. |
| first_value(expr[, isIgnoreNull]) |
Returns the first value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values. |
| grouping(col) |
Indicates whether a specified column in a GROUP BY is aggregated or
not; returns 1 for aggregated or 0 for not aggregated in the result set. |
| grouping_id([col1[, col2 ..]]) |
Returns the level of grouping, equal to
`(grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn)` |
| kurtosis(expr) |
Returns the kurtosis value calculated from values of a group. |
| last(expr[, isIgnoreNull]) |
Returns the last value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values. |
| last_value(expr[, isIgnoreNull]) |
Returns the last value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values. |
| max(expr) |
Returns the maximum value of `expr`. |
| max_by(x, y) |
Returns the value of `x` associated with the maximum value of `y`. |
| mean(expr) |
Returns the mean calculated from values of a group. |
| min(expr) |
Returns the minimum value of `expr`. |
| min_by(x, y) |
Returns the value of `x` associated with the minimum value of `y`. |
| percentile(col, percentage [, frequency]) |
Returns the exact percentile value of numeric column
`col` at the given percentage. The value of `percentage` must be between 0.0 and 1.0. The
value of `frequency` should be a positive integer. |
| percentile(col, array(percentage1 [, percentage2]...) [, frequency]) |
Returns the exact
percentile value array of numeric column `col` at the given percentage(s). Each value
of the percentage array must be between 0.0 and 1.0. The value of `frequency` should be a
positive integer. |
| percentile_approx(col, percentage [, accuracy]) |
Returns the approximate percentile of the numeric
column `col`: the smallest value in the ordered `col` values (sorted from least to
greatest) such that no more than `percentage` of `col` values are less than or equal
to that value. The value of `percentage` must be between 0.0 and 1.0. The `accuracy`
parameter (default: 10000) is a positive numeric literal that controls approximation accuracy
at the cost of memory. A higher value of `accuracy` yields better accuracy; `1.0/accuracy` is
the relative error of the approximation.
When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0.
In this case, returns the approximate percentile array of column `col` at the given
percentage array. |
| skewness(expr) |
Returns the skewness value calculated from values of a group. |
| some(expr) |
Returns true if at least one value of `expr` is true. |
| std(expr) |
Returns the sample standard deviation calculated from values of a group. |
| stddev(expr) |
Returns the sample standard deviation calculated from values of a group. |
| stddev_pop(expr) |
Returns the population standard deviation calculated from values of a group. |
| stddev_samp(expr) |
Returns the sample standard deviation calculated from values of a group. |
| sum(expr) |
Returns the sum calculated from values of a group. |
| var_pop(expr) |
Returns the population variance calculated from values of a group. |
| var_samp(expr) |
Returns the sample variance calculated from values of a group. |
| variance(expr) |
Returns the sample variance calculated from values of a group. |
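
Two of the definitions in the table are easier to check with a small worked example: the bit-packing formula given for `grouping_id`, and the difference between the population and sample variants (`var_pop` versus `var_samp`/`variance`). The following is a minimal Python sketch of those definitions only. It is not how the engine implements them, and the helper names simply mirror the SQL function names for illustration. Null handling and single-row groups are not modeled.

```python
def grouping_id(bits):
    """Pack per-column grouping() results into one integer.

    `bits` holds grouping(c1) .. grouping(cn), each 1 (aggregated) or 0 (not),
    following the formula in the table: c1 is the most significant bit.
    """
    n = len(bits)
    return sum(bit << (n - 1 - i) for i, bit in enumerate(bits))


def var_pop(values):
    """Population variance: squared deviations divided by N."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def var_samp(values):
    """Sample variance: squared deviations divided by N - 1.

    This sketch assumes at least two values.
    """
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    # Two grouping columns: only the first one is aggregated away.
    print(grouping_id([1, 0]))
    # Same data, different divisors.
    print(var_pop([1, 2, 3, 4]), var_samp([1, 2, 3, 4]))
```

With two grouping columns, `grouping_id` therefore ranges from 0 (neither column aggregated) to 3 (both aggregated). The standard deviation functions follow the same population/sample split, since each is the square root of the corresponding variance.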