
DB_Table.distinct

distinct columns case_sensitivity on_problems

Group: Selections
Aliases: deduplicate, unique

Documentation

Returns the rows of the input table that are distinct with respect to the specified columns. When several rows share the same values in those columns, the first row of each such group is returned where possible. In database backends, any row from each group may be returned (for example, if the row ordering is unspecified). For the in-memory table, the unique rows are returned in the order in which they occurred in the input; this ordering is not guaranteed for database operations.
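
The in-memory behaviour described above can be modelled with a short Python sketch. This is not the library's implementation; the function name `distinct_rows` and the list-of-dicts table representation are assumptions made purely for illustration.

```python
def distinct_rows(rows, columns):
    """Keep the first row for each distinct combination of values in `columns`.

    `rows` is a list of dicts (a hypothetical stand-in for an in-memory table).
    """
    seen = set()
    result = []
    for row in rows:
        # Only the selected columns take part in the comparison.
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(row)  # first occurrence wins, input order is kept
    return result


rows = [
    {"name": "Alice", "city": "Leeds", "score": 1},
    {"name": "Bob", "city": "York", "score": 2},
    {"name": "Alice", "city": "Leeds", "score": 3},
]
unique = distinct_rows(rows, ["name", "city"])
```

Here the third row is dropped because it matches the first on `name` and `city`, even though `score` differs. A database backend gives no such guarantee about which of the two rows survives.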

Arguments

  • columns: The columns of the table to use for distinguishing the rows.
  • case_sensitivity: Specifies whether text values are compared case-sensitively.
  • on_problems: Specifies how problems are handled; by default they are reported as warnings.
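
To illustrate the effect of case_sensitivity, the following Python sketch compares text keys either exactly or after case folding. The helper `distinct_by_text`, its `case_sensitive` flag, and the use of `str.casefold` are assumptions for this example only; the actual comparison rules (for instance, locale handling) may differ.

```python
def distinct_by_text(rows, columns, case_sensitive=True):
    """Hypothetical model of distinct with optional case-insensitive text keys."""

    def normalise(value):
        # Fold case only for text values, and only when asked to.
        if isinstance(value, str) and not case_sensitive:
            return value.casefold()
        return value

    seen = set()
    result = []
    for row in rows:
        key = tuple(normalise(row[c]) for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


rows = [{"code": "ab"}, {"code": "AB"}, {"code": "cd"}]
sensitive = distinct_by_text(rows, ["code"])
insensitive = distinct_by_text(rows, ["code"], case_sensitive=False)
```

With case-sensitive comparison all three rows are kept; with case-insensitive comparison "ab" and "AB" fall into the same group and only the first is retained.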

Errors

  • If there are no columns in the output table, a No_Output_Columns is raised as an error regardless of the problem behavior, because it is not possible to create a table without any columns.
  • If a column in columns is not in the input table, a Missing_Input_Columns is raised as an error.
  • If no valid columns are selected, a No_Input_Columns_Selected is reported as a dataflow error regardless of the on_problems setting.
  • If floating point values are present in the distinct columns, a Floating_Point_Equality is reported according to the on_problems setting.
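
The reason floating point columns are flagged can be seen in a small Python sketch: two values that are mathematically equal may differ in their binary representation, so exact-equality deduplication treats them as distinct. This is a general illustration of floating point behaviour, not a description of how the library detects or reports the problem.

```python
import math

# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
values = [0.3, 0.1 + 0.2]

# Exact-equality deduplication (order-preserving) keeps both values.
exact_distinct = list(dict.fromkeys(values))

# A tolerance-based comparison considers them the same.
nearly_equal = math.isclose(values[0], values[1])
```

Deduplicating on such a column may therefore return more rows than expected, which is why the situation is surfaced as a problem rather than silently accepted.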