Skip to main content

Table.from_union

Table.from_uniontables columns_to_keep match_columns on_problems

Group: Calculations

Documentation

Appends records from the vector of tables into a single table.

Arguments

  • tables: A vector of tables to union together. The tables are concatenated in the order they are specified.
  • columns_to_keep: Specifies which columns to keep. Defaults to keeping columns that are present in any of the tables, reporting a warning for columns that are not present in all tables and adding Nothing values for them.
  • match_columns: Specifies how to match the columns. - If Match_Columns.By_Name - the columns are matched by name across all provided tables. - If Match_Columns.By_Position - the columns are mapped by position. The names of each column come from the first table in which the given column appears in. The List option is not applicable when mapping columns by position. Column names are taken from the first table if In_All and from the first table that has the maximum number of columns if In_Any.
  • on_problems: Specifies how to handle problems if they occur, reporting them as warnings by default.

Errors

  • If no common type is found and the text conversion fallback is used, the No_Common_Type problem is reported.
  • The Float type may not be able to exactly represent larger integers, thus if such large integers are mixed with floats, the resulting conversion to Float may cause a loss of precision. In that case, a Loss_Of_Integer_Precision problem is reported. This warning is only reported in the in-memory backend. Currently, the Database backend proceeds without a warning about precision loss.
  • If a column of dates is unified with a column of date-times, since the assumption of using the midnight time-of-day is arbitrary, a Implicit_Date_As_Date_Time_Conversion problem is reported.
  • If an empty vector of tables is provided, an Illegal_Argument error is raised.
  • If columns_to_keep is set to In_All or List and an expected column is missing in some of the tables, a Unmatched_Columns problem is reported. If this causes the output to contain no columns, a No_Output_Columns error is raised.

Remarks

Unifying Column Types

Numeric columns are unified by finding the smallest type that can fit all of the columns. The biggest integer type will be chosen and if integers and decimals are mixed, the decimal type will be chosen. If boolean columns are mixed with numeric columns, they will be coerced to the numeric type (and converted to 0 and 1).

Text types will are also unified by finding the smallest type that can fit all the values. If constant-length texts of different lengths are mixed, they will be coerced to a varying-length type.

If date and date-time columns are unified, this yields a date-time column. In-memory, the date is promoted by adding a time of 00:00 and the system time-zone. In other backends that behaviour may differ.

If one of the matched columns has Mixed type, that type will be used regardless of types of other columns. Note that the Mixed type may not be supported by most Database backends.

Finally, if no common type is found using the rules above, everything is converted to text.

Ordering of Columns in the result

When matching columns by name, it is possible that the ordering of columns may vary between input tables. The ordering is determined as following: columns that are kept from the first table are in the order they appear in that table. If there are columns that do not appear in the first table, they are appended to the end of the resulting table in the order they appear in the input.