Pretty CSV viewing on the Command Line - Stefaan Lippens

CSV (comma separated values) files are to data formats what FAT32 is to file systems: everybody loves to hate them, but you can't find a more widely supported alternative.

For example, viewing CSV files in a command line environment is typically pretty annoying. You can't make much of this, right?

[Screenshot: raw CSV file on the command line]

However, if you weld a couple of common command line tools together, you can create a handy viewer that makes the data look like this:

[Screenshot: the same CSV file, aligned in columns]

The basics

There is this well-hidden command line tool called "column" that aligns the data nicely in properly sized columns. Combine this with a pager like less and we have a nice prototype already:

cat data.csv | column -t -s, | less -S
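To get a feel for what column does on its own, here is a minimal sketch that feeds it a few made-up rows and skips the pager. The sample data and the helper name align_csv are only illustrative assumptions, not part of the original setup.

```shell
# Hypothetical helper: align comma-separated input from stdin or from files.
align_csv() {
    # -t: build a table, -s,: treat the comma as the column separator
    column -t -s, "$@"
}

# Made-up sample rows, just to have something to look at.
printf 'id,name,city\n1,Alice,Ghent\n2,Bob,Antwerp\n' | align_csv
```

Each field ends up padded to the width of the widest value in its column. Piping the result into less -S then keeps long rows on a single line instead of wrapping them.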

One problem with this is that column ignores/merges empty cells in your data, which defeats the whole point of aligning everything. On Debian/Ubuntu, column provides an option -n to disable this behavior, but for other platforms (like with the BSD flavor of column on the Mac), we need some additional trickery. A simple solution is just adding a space before each comma:

cat data.csv | sed 's/,/ ,/g' | column -t -s, | less -S

Or, if you want to avoid wasting too much horizontal space, you can add a space only to the empty cells as follows:

cat data.csv | perl -pe 's/((?<=,)|(?<=^)),/ ,/g;' | column -t -s, | less -S

Shortcuts

Time to create some shortcuts and put this in, for example, your .bashrc, .bash_aliases or whatever other customization options your favorite shell provides. I'll just cover bash here, because that's the shell I currently use most.

In the end we'll have a tool pretty_csv which can be used in different ways:

pretty_csv data.csv
pretty_csv < data.csv
sort data.csv | pretty_csv (to illustrate that the input doesn't necessarily have to be a file: you can also pipe the output of another process to it)

For Debian/Ubuntu

On Debian/Ubuntu systems I just put this in my .bashrc (note some additional less options, roughly based on how git log works):

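function pretty_csv {
    # column: -t builds a table, -s, splits on commas, -n keeps empty cells.
    # less: -F quits if the output fits on one screen, -S chops long lines,
    # -X leaves the screen uncleared on exit, -K quits on Ctrl-C.
    column -t -s, -n "$@" | less -F -S -X -K
}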

For other platforms

For non-Debian systems, where column has no such -n option, we have to add preprocessing of empty cells:

function pretty_csv {
    perl -pe 's/((?<=,)|(?<=^)),/ ,/g;' "$@" | column -t -s, | less -F -S -X -K
}

Conflict with iTerm2 on macOS

On my Mac I use iTerm2 and I noticed that its shell integration conflicts in some weird ways with less in the above bash function if I apply it through a pipe (cat data.csv | pretty_csv).

As a workaround I use a bash script instead of a bash function. For example, create a file ~/.bash.d/pretty_csv.sh, containing:

#!/bin/bash
perl -pe 's/((?<=,)|(?<=^)),/ ,/g;' "$@" | column -t -s, | exec less -F -S -X -K

Make it executable (chmod u+x ~/.bash.d/pretty_csv.sh) and create a bash alias for it (e.g. in .bashrc or .bash_aliases):

alias pretty_csv='~/.bash.d/pretty_csv.sh'

TSV: tab separated values

I regularly also have to work with TSV files, where the columns are separated by the tab character. The tricky part here is passing this special character correctly to the parts of the pipeline.

For Debian/Ubuntu:

function pretty_tsv {
    column -t -s $'\t' -n "$@" | less -F -S -X -K
}

For non-Debian systems:

function pretty_tsv {
    perl -pe 's/((?<=\t)|(?<=^))\t/ \t/g;' "$@" | column -t -s $'\t' | less -F -S -X -K
}

As a bash script (pretty_tsv.sh):

#!/bin/bash
perl -pe 's/((?<=\t)|(?<=^))\t/ \t/g;' "$@" | column -t -s $'\t' | exec less -F -S -X -K

Bye

Now you can enjoy the warm cosy feeling of browsing pretty, shiny CSV and TSV files in your terminal.

I've put the code and scripts on GitHub too.