
Reading a CSV

Introduction

This post shows how to use Enso to read and process a CSV file. Enso is designed and built to make it easier to catalog, process, blend and analyse both structured and unstructured data.

Reading in a File

The simplest way to bring data into Enso is to drag a file onto the IDE. A component referencing the file is then added to the graph:

Import a file

Let's have a look at the code the node created: Data.read "/Users/sylwia/Downloads/Sample - Superstore.csv".

The first part, Data.read, is the method for reading different data sources. It attempts to deduce the format automatically from the file extension. It is also possible to choose the format yourself from the drop-down on the right.
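
To make the idea of deducing a format from the extension concrete, here is a minimal Python sketch. It is not Enso code: the table of extensions and the format names are hypothetical, since this post does not list what Data.read actually supports, and the override argument only stands in for picking a format from the drop-down.

```python
from pathlib import Path

# Hypothetical extension-to-format table, for illustration only.
# Enso's real list of supported formats is not given in this post.
FORMAT_BY_EXTENSION = {
    ".csv": "Delimited (comma)",
    ".tsv": "Delimited (tab)",
    ".txt": "Plain text",
}


def deduce_format(path, override=None):
    """Return the explicit override if given, else guess from the extension."""
    if override is not None:
        return override
    suffix = Path(path).suffix.lower()
    return FORMAT_BY_EXTENSION.get(suffix, "Unknown")


detected = deduce_format("/Users/sylwia/Downloads/Sample - Superstore.csv")
```

When the guess is wrong or the extension is unfamiliar, passing an explicit override plays the same role as choosing the format manually.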

"/Users/sylwia/Downloads/Sample - Superstore.csv" is the path of the file we dragged into the project. We can change it by choosing another file from the file browser or by editing the text field on the component.

Choose a file

Now, take a quick look at the component itself:

Read Node

To the left of the component is a menu with three icons. The first icon controls whether output nodes are "live" or disabled; this post does not cover that functionality. The second icon enters edit mode, so you can edit the whole component as text. The third icon opens a visualisation of the data. In Enso, components are evaluated as you add or edit them, so you can use the visualisation to see the results as you work.

To the component’s right is the name of the variable in which the result of the component is stored (node1 in this case); you can use this name to refer to the component anywhere later in the graph.

Finally, within the component, you can see the function used to create it, the required argument, and placeholders for the two optional parameters of the read method (format and on_problems). If you hover your mouse over these (or over "Sample - Superstore.csv"), a dropdown arrow appears at the bottom. Clicking it lets you choose a value.

Format Dropdown

Specifying how to read the CSV file

If the format of the file was not deduced properly from the extension, you can always specify it in the read component, as mentioned in the previous section. From the format drop-down, choose the Delimited option. You will then see additional parameters that can be set manually (delimiter, encoding, skip_rows, row_limit, quote_style, headers, value_formatter, keep_invalid_rows, line_endings and comment_character).

Custom Format Node

Then you can specify the delimiter for the file. You can either choose one of the available options from the drop-down or enter custom text in the text field.

Delimiter Dropdown
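
To see why the delimiter setting matters, here is a short sketch using Python's standard csv module rather than Enso. The sample rows and column names are made up for illustration; the point is only that the same text parses into different columns depending on the delimiter chosen.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical semicolon-separated sample, not taken from the Superstore file.
sample = "Order ID;Category;Sales\nCA-1;Furniture;261.96\n"

rows_semicolon = read_delimited(sample, delimiter=";")
rows_comma = read_delimited(sample)  # wrong delimiter: one column per row
```

With the wrong delimiter each line collapses into a single column, which is usually the first sign that the setting needs changing.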

Next, you can choose the encoding of your file. By default, Enso uses UTF-8; if that does not work, it tries Windows-1252. You can also choose another encoding from the available options.

Encoding Dropdown
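
The fallback described above (UTF-8 first, then Windows-1252) can be sketched in Python as follows. This illustrates the general technique, not Enso's implementation; the function name is hypothetical, and cp1252 is Python's name for Windows-1252.

```python
def decode_with_fallback(raw):
    """Decode bytes as UTF-8, falling back to Windows-1252 on failure.

    Returns the decoded text together with the encoding that was used.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # A few byte values are undefined in cp1252, so replace those
        # instead of raising a second error.
        return raw.decode("cp1252", errors="replace"), "cp1252"


utf8_text, utf8_used = decode_with_fallback("café".encode("utf-8"))
legacy_text, legacy_used = decode_with_fallback("café".encode("cp1252"))
```

Both calls recover the same text, but only because each byte sequence was decoded with the encoding it was written in. This is why an explicit choice helps when neither default fits the file.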

skip_rows defines how many rows at the beginning of the file to skip when reading a CSV file. The row_limit option lets you choose whether to read all rows or only a set number of rows from the beginning.

Row limit
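
As a rough illustration of how skipping rows and limiting rows combine, here is a Python sketch. The parameter names mirror skip_rows and row_limit, but the behaviour is an assumption: in particular, this sketch does not treat a header row specially, and Enso may count rows differently.

```python
import csv
import io
from itertools import islice


def read_rows(text, skip_rows=0, row_limit=None):
    """Skip the first skip_rows rows, then return at most row_limit rows."""
    reader = csv.reader(io.StringIO(text))
    # islice takes an absolute stop index, so offset the limit by the skip.
    stop = None if row_limit is None else skip_rows + row_limit
    return list(islice(reader, skip_rows, stop))


# Hypothetical sample data.
sample = "a,b\n1,2\n3,4\n5,6\n"

all_rows = read_rows(sample)
window = read_rows(sample, skip_rows=1, row_limit=2)
```

Leaving row_limit unset reads everything after the skipped rows, which corresponds to the "all rows" choice.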

Wrapping Up

In this post, we worked through reading a CSV file in Enso. To learn more, please see our Getting Started and Enso 101 Learning Paths on the Enso Community.