Reading a CSV
Introduction
This post shows how to use Enso to read and process a CSV file. Enso is designed and built to make it easier to catalog, process, blend and analyse both structured and unstructured data.
Reading in a File
The simplest way to bring data into Enso is to drag a file onto the graph editor background. A component referencing the file will be added to the graph:
Let's have a look at what was created in the component:
Data.read "/Users/sylwia/Downloads/Sample - Superstore.csv"
The first part, Data.read, is the method that allows reading different data sources. It will attempt to deduce the format automatically based on the file extension. It is also possible to choose the format from the drop-down on the right.
The second part, "/Users/sylwia/Downloads/Sample - Superstore.csv", represents the path of the file we dragged into the project. We can change it by choosing another file from the file browser or by editing the text field on the component.
Now, take a quick look at the component itself:
To the left of the component is a menu with three icons. The first icon controls whether output nodes are "live" or disabled. This post will ignore this functionality. The second icon switches to edit mode so you can edit the whole component as text. The third icon opens the visualisation of the data. In Enso, components are evaluated as you add or edit them, and you can use the visualisation to see the results as you work.
To the right of the component is the name of the variable (node1 in this case) in which the result of the component is stored; you can use this name to refer to the component anywhere later in the graph.
Finally, within the component, you can see the function used to create it, the required argument, and placeholders for the two optional parameters of the read method (format and on_problems). If you hover your mouse over these (or over "Sample - Superstore.csv"), you will see a drop-down arrow appear at the bottom. Clicking it lets you choose a value.
Specifying how to read the CSV file
If the format of the file was not deduced correctly from the extension, you can always specify it in the read component, as mentioned in the previous section. From the format drop-down, choose the Delimited option. You will then see additional parameters that can be set manually (delimiter, encoding, skip_rows, row_limit, quote_style, headers, value_formatter, keep_invalid_rows, line_endings and comment_character).
Then you can specify the delimiter for the file. You can either choose one of the available options from the drop-down or enter custom text in the text field.
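To make the effect of the delimiter concrete, here is a minimal sketch in Python rather than Enso. It is only an analogy: the helper name parse_delimited and the sample data are made up for illustration, and it does not reproduce how Enso parses files.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Split delimited text into rows of fields.

    Hypothetical helper for illustration only. Python's csv module
    accepts a single-character delimiter.
    """
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Made-up sample that uses semicolons instead of commas.
sample = "Order ID;Region;Sales\nCA-1;West;261.96\n"

# With the right delimiter, each row is split into three fields.
rows = parse_delimited(sample, delimiter=";")

# With the wrong delimiter, each row stays as one unsplit field.
unsplit = parse_delimited(sample, delimiter=",")
```

If the delimiter does not match the file, every row ends up as a single column, which is usually the first sign that the setting needs changing.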
Next, you can choose the encoding of your file. By default, Enso uses UTF-8. If this doesn't work, it will try Windows-1252. However, you can also choose another encoding from the available options.
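The fallback idea described above (try UTF-8 first, then Windows-1252) can be sketched in Python as follows. This is an illustration of the general technique under stated assumptions, not Enso's actual implementation; the function name decode_with_fallback is hypothetical, and cp1252 is Python's name for Windows-1252.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in order and return (text, encoding_used).

    Hypothetical helper for illustration only.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Bytes that are valid UTF-8 are decoded on the first attempt.
utf8_result = decode_with_fallback("café".encode("utf-8"))

# The byte 0xE9 on its own is not valid UTF-8, so the fallback is used.
fallback_result = decode_with_fallback(b"caf\xe9")
```

Choosing the encoding explicitly is still the safer option when you know it, since a fallback can decode the bytes successfully and yet produce the wrong characters.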
skip_rows allows you to define how many rows at the beginning of the file to skip when reading a CSV file.
With the row_limit option, you can choose whether to read all rows or only a given number of rows from the beginning of the file.
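As a rough illustration of how skipping rows and limiting rows interact, here is a small Python sketch. It is an analogy only: the function read_rows and its parameter names mirror the option names above but are hypothetical, and the sketch treats every row the same way (it does not model header handling or Enso's exact semantics).

```python
import csv
import io
from itertools import islice


def read_rows(text, skip_rows=0, row_limit=None):
    """Return parsed rows after skipping `skip_rows` and keeping at most `row_limit`.

    Hypothetical helper for illustration only. row_limit=None means
    "read all remaining rows".
    """
    reader = csv.reader(io.StringIO(text))
    stop = None if row_limit is None else skip_rows + row_limit
    return list(islice(reader, skip_rows, stop))


# Made-up sample with four rows.
sample = "a\nb\nc\nd\n"

all_rows = read_rows(sample)                          # no skipping, no limit
middle = read_rows(sample, skip_rows=1, row_limit=2)  # skip one row, keep two
tail = read_rows(sample, skip_rows=3, row_limit=5)    # limit larger than what is left
```

A limit that exceeds the number of remaining rows simply returns whatever is left, which makes a small row_limit a convenient way to preview a large file.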
Wrapping Up
In this post, we worked through reading a CSV file in Enso. To learn more, please see our Getting Started and Enso 101 Learning Paths on the Enso Community.