Table.aggregate
aggregate group_by columns error_on_missing_columns on_problems
Group: Calculations
Aliases: average, count, count distinct, first, group by, last, longest, maximum, mean, median, minimum, mode, percentile, shortest, standard deviation, sum, summarize, variance
Documentation
Aggregates the rows in a table using the group_by columns. The columns argument specifies which additional aggregations to perform and return.
Arguments
group_by: A list of columns to group by. These will be included at the start of the resulting table. If no columns are specified, a single row will be returned with the aggregate columns.
columns: The aggregate operations to perform, which define the columns of the aggregated table. Expressions can be used within an aggregate column to perform more complicated calculations.
error_on_missing_columns: Specifies whether a missing column in the aggregates should result in an error regardless of the on_problems setting. Defaults to False, meaning that problematic aggregations will not be included in the result and a problem is reported.
on_problems: Specifies how to handle problems if they occur, reporting them as warnings by default.
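The effect of group_by on the shape of the result can be modelled outside Enso. The following is a minimal Python sketch, not Enso code and not the library's implementation: the function name aggregate_count is hypothetical, it only models a row count, and the row order (first appearance of each group) is an assumption of the sketch.

```python
def aggregate_count(rows, header, group_by=()):
    """Model a grouped row count: group_by columns first, then Count.

    Hypothetical helper for illustration only; it is not part of Enso.
    """
    if not group_by:
        # No grouping columns: a single row holding only the aggregate.
        return ["Count"], [[len(rows)]]

    # Positions of the grouping columns in the input table.
    positions = [header.index(name) for name in group_by]

    # Count rows per group key; dicts keep first-appearance order.
    counts = {}
    for row in rows:
        key = tuple(row[i] for i in positions)
        counts[key] = counts.get(key, 0) + 1

    # Grouping columns come first, followed by the aggregate column.
    out_header = list(group_by) + ["Count"]
    out_rows = [list(key) + [count] for key, count in counts.items()]
    return out_header, out_rows


header = ["Name", "Location"]
rows = [["John", "Massachusetts"], ["Paul", "London"]]
print(aggregate_count(rows, header))
print(aggregate_count(rows, header, ["Name"]))
```

Called without group_by, the sketch returns one row with the count of all rows; called with ["Name"], it returns one row per distinct name with the grouping column placed first.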
Examples
Count all the rows
table = Table.from_rows ["Name","Location"] [["John", "Massachusetts"],["Paul","London"]]
grouped = table.aggregate columns=[Aggregate_Column.Count]
Returns a Table
Count |
---|
2 |
Group by the Name column, count the rows
table = Table.from_rows ["Name","Location"] [["John", "Massachusetts"],["Paul","London"]]
grouped = table.aggregate ["Name"] [Aggregate_Column.Count]
Returns a Table
Name | Count |
---|---|
John | 1 |
Paul | 1 |
Errors
- If there are no columns in the output table, a No_Output_Columns is raised as an error regardless of the problem behavior, because it is not possible to create a table without any columns.
- If a column index is out of range, a Missing_Input_Columns is reported according to the on_problems setting, unless error_on_missing_columns is set to True, in which case it is raised as an error. Problems resolving group_by columns are reported as dataflow errors regardless of these settings, as a missing grouping will completely change the semantics of the query.
- If a column selector is given as a Text and it neither matches any column in the input table nor is a valid expression, an Invalid_Aggregate_Column problem is reported according to the on_problems setting (unless error_on_missing_columns is set to True, in which case it will always be an error). Here too, problems resolving group_by columns are reported as dataflow errors regardless of these settings.
- If an aggregation fails, an Invalid_Aggregation dataflow error is raised.
- Additionally, the following problems may be reported according to the on_problems setting:
  - If there are invalid column names in the output table, an Invalid_Column_Names.
  - If there are duplicate column names in the output table, a Duplicate_Output_Column_Names.
  - If grouping on or computing the Mode on a floating point number, a Floating_Point_Equality.
  - If, when concatenating values, there is an unquoted delimiter, an Unquoted_Delimiter.
  - If there are more than 10 issues with a single column, an Additional_Warnings.
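How error_on_missing_columns interacts with missing aggregate and group_by columns can be sketched as follows. This is a Python model under stated assumptions, not Enso code: resolve_aggregates and both exception classes are hypothetical stand-ins for the problems named above, columns are identified by name only, and warnings are returned as plain strings rather than attached to a table.

```python
class MissingInputColumns(Exception):
    """Stand-in for the Missing_Input_Columns problem (hypothetical)."""


class NoOutputColumns(Exception):
    """Stand-in for the No_Output_Columns error (hypothetical)."""


def resolve_aggregates(header, requested, group_by=(),
                       error_on_missing_columns=False):
    """Return (kept aggregate columns, warnings) or raise an error."""
    # A missing grouping column is always an error, whatever the settings.
    missing_groups = [name for name in group_by if name not in header]
    if missing_groups:
        raise MissingInputColumns(missing_groups)

    kept = [name for name in requested if name in header]
    missing = [name for name in requested if name not in header]

    # With the flag set, a missing aggregate column is raised as an error.
    if missing and error_on_missing_columns:
        raise MissingInputColumns(missing)

    # A table without any columns cannot be created.
    if not kept and not group_by:
        raise NoOutputColumns()

    # Default: drop the problematic aggregates and report a warning.
    warnings = [f"Missing input columns: {missing}"] if missing else []
    return kept, warnings


header = ["Name", "Location"]
print(resolve_aggregates(header, ["Name", "Age"]))
```

With the default settings the sketch drops the missing Age column and returns a warning; setting error_on_missing_columns=True turns the same input into an error.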
Returns
- A new table with the group_by columns as well as any aggregate columns.