
Text.cleanse

cleanse remove

Group: Text

Documentation

Applies the specified cleansings to the text.

Arguments

  • remove: A vector of the named patterns to cleanse from the text. The named patterns are applied in the order they are provided. The same named pattern can be used multiple times. The named patterns are:
      - ..Leading_Whitespace: Removes all whitespace from the start of the string.
      - ..Trailing_Whitespace: Removes all whitespace from the end of the string.
      - ..Duplicate_Whitespace: Collapses each block of consecutive whitespace into the first whitespace character of that block.
      - ..All_Whitespace: Removes all whitespace from the string.
      - ..Newlines: Removes all newline characters from the string. Line Feed and Carriage Return characters are considered newlines.
      - ..Leading_Numbers: Removes all numbers from the start of the string.
      - ..Trailing_Numbers: Removes all numbers from the end of the string.
      - ..Non_ASCII: Removes all non-ASCII characters from the string.
      - ..Tabs: Removes all tab characters from the string.
      - ..Letters: Removes all letters from the string.
      - ..Numbers: Removes all numeric characters from the string.
      - ..Punctuation: Removes all characters in the set ,.!?():;'" from the string.
      - ..Symbols: Removes anything that isn't a letter, number or whitespace from the string.

Examples

Remove leading and trailing whitespace from text.

      text.cleanse [..Leading_Whitespace, ..Trailing_Whitespace]
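
Because the named patterns are applied in the order given, the same set of patterns can produce different results depending on how they are listed. The sketch below models that behaviour in Python for a handful of the patterns. It is an illustration only, not the library's implementation: the `cleanse` function, the `PATTERNS` table, and the regular expressions are all assumptions chosen to match the descriptions under Arguments, and Python's `\s` and `\d` classes may not match exactly the same characters as the real method.

```python
import re

# Hypothetical stand-ins for some of the named patterns.
# Each name maps to (regular expression, replacement).
PATTERNS = {
    "Leading_Whitespace": (r"\A\s+", ""),
    "Trailing_Whitespace": (r"\s+\Z", ""),
    # Keep the first whitespace character of each block, drop the rest.
    "Duplicate_Whitespace": (r"(\s)\s+", r"\1"),
    "All_Whitespace": (r"\s+", ""),
    "Newlines": (r"[\r\n]+", ""),
    "Leading_Numbers": (r"\A\d+", ""),
    "Trailing_Numbers": (r"\d+\Z", ""),
    "Tabs": (r"\t+", ""),
}


def cleanse(text, remove):
    """Apply each named pattern in `remove`, in order, to `text`."""
    for name in remove:
        regex, replacement = PATTERNS[name]
        text = re.sub(regex, replacement, text)
    return text


# Same two patterns, different order, different result.
print(repr(cleanse("  42 apples  ", ["Leading_Numbers", "Leading_Whitespace"])))
# '42 apples  '  (no digits at the very start, so only whitespace is removed)
print(repr(cleanse("  42 apples  ", ["Leading_Whitespace", "Leading_Numbers"])))
# ' apples  '    (whitespace is removed first, exposing the leading digits)

# A pattern may be repeated to clean up what an earlier step exposed.
print(repr(cleanse("  42 apples  ",
                   ["Leading_Whitespace", "Leading_Numbers", "Leading_Whitespace"])))
# 'apples  '
```

In this model, listing `Leading_Whitespace` a second time is what removes the space left behind after the leading digits are stripped, which is why repeating a pattern in the vector can be useful.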