Episode #36 - CLI Monday: column and tr

About Episode - Duration: 4 minutes, Published: 2014-09-01

In this episode, we are going to review the column and translate (tr) commands. These two commands are great at manipulating text, so as an example, we will look at converting log data into an easily read format.

Links, Code, and Transcript


Let's start off by looking at some example log data.

$ cat data.log 
a b c d e f g h i j k l m n o p
4 0 0 6298384 40428 47587356 0 0 1209 172 0 0 17 3 71 9
4 0 0 6298448 40428 47587356 0 0 0 0 2163 5260 25 1 74 0
14 0 0 6299024 40428 47587416 0 0 0 0 2053 4773 25 1 75 0
15 0 0 6291776 40428 47595272 0 0 0 0 3276 5176 25 1 74 0
4 0 0 6283868 40432 47602948 0 0 0 296 2736 5471 25 1 74 0
6 0 0 6282352 40432 47602780 0 0 0 0 2692 5204 24 1 75 0
9 0 0 6282352 40432 47602780 0 0 0 0 2269 4939 25 1 74 0
9 0 0 6282168 40440 47602776 0 0 4 108 2681 5570 25 1 74 0
14 0 0 6282132 40440 47602784 0 0 0 12344 4930 5534 25 1 74 0
6 0 0 6282144 40440 47602784 0 0 0 0 2520 5194 25 1 74 0
15 0 0 6282096 40444 47602784 0 0 0 616 2426 4930 25 1 74 0
14 0 0 6274292 40444 47610464 0 0 0 0 2301 5057 25 1 74 0
14 0 0 6274200 40448 47610460 0 0 0 4 2337 4466 25 1 75 0
6 0 0 6274316 40456 47610468 0 0 0 396 2254 4580 25 1 75 0
6 0 0 6274292 40456 47610472 0 0 4 0 2440 4729 25 1 74 0
5 0 0 6266660 40456 47618152 0 0 0 0 3403 5208 25 1 74 0
4 0 0 6258848 40456 47625832 0 0 0 2020 4108 5752 25 1 73 1
6 0 0 6258980 40464 47625824 0 0 0 508 2200 4414 25 1 74 0
4 0 0 6258972 40464 47625836 0 0 0 0 2336 4789 25 1 74 0

So, I recently ran into a situation where I quickly wanted to review monitoring data. The data was in a similar format, with the values separated, or delimited, by spaces. It would be nice if these lined up into a table with columns because, as you can see over here, it is hard to tell how the values line up. Well, it turns out that we can use the column command for just this task.

Let's rerun our command and pipe our data.log file into column -t.

$ cat data.log | column -t
a   b  c  d        e      f         g  h  i     j      k     l     m   n  o   p
4   0  0  6298384  40428  47587356  0  0  1209  172    0     0     17  3  71  9
4   0  0  6298448  40428  47587356  0  0  0     0      2163  5260  25  1  74  0
14  0  0  6299024  40428  47587416  0  0  0     0      2053  4773  25  1  75  0
15  0  0  6291776  40428  47595272  0  0  0     0      3276  5176  25  1  74  0
4   0  0  6283868  40432  47602948  0  0  0     296    2736  5471  25  1  74  0
6   0  0  6282352  40432  47602780  0  0  0     0      2692  5204  24  1  75  0
9   0  0  6282352  40432  47602780  0  0  0     0      2269  4939  25  1  74  0
9   0  0  6282168  40440  47602776  0  0  4     108    2681  5570  25  1  74  0
14  0  0  6282132  40440  47602784  0  0  0     12344  4930  5534  25  1  74  0
6   0  0  6282144  40440  47602784  0  0  0     0      2520  5194  25  1  74  0
15  0  0  6282096  40444  47602784  0  0  0     616    2426  4930  25  1  74  0
14  0  0  6274292  40444  47610464  0  0  0     0      2301  5057  25  1  74  0
14  0  0  6274200  40448  47610460  0  0  0     4      2337  4466  25  1  75  0
6   0  0  6274316  40456  47610468  0  0  0     396    2254  4580  25  1  75  0
6   0  0  6274292  40456  47610472  0  0  4     0      2440  4729  25  1  74  0
5   0  0  6266660  40456  47618152  0  0  0     0      3403  5208  25  1  74  0
4   0  0  6258848  40456  47625832  0  0  0     2020   4108  5752  25  1  73  1
6   0  0  6258980  40464  47625824  0  0  0     508    2200  4414  25  1  74  0
4   0  0  6258972  40464  47625836  0  0  0     0      2336  4789  25  1  74  0

With this simple command, it is now easy to see how these column values relate to one another, whereas before, it was not easy to tell which values belonged together in a column.

To tell you the truth, I maybe use this command every three or four months, but when I run into a situation like this, it is nice to know that I can just convert the text. I think you will find something similar with most CLI Monday episodes, in that I want to highlight commands which you might not use all the time, but when you run into a particular problem, you will have a sense of what can be done to fix it. I should also mention that you could do something similar to the column command by importing this data file into a spreadsheet, but that is not always practical.

You can also check out the man page for the column command. I always use the -t option to align columns into a table. You can also use the -s option to specify the delimiter; by default it is whitespace. This could be useful on CSV data, key-value configuration files, or something similar.

$ man column
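As a small sketch of the -s option mentioned above, the following feeds comma-separated text to column. The sample values (host names and numbers) are made up for illustration and are not taken from data.log; only the -s and -t options come from the man page.

```shell
# Hypothetical CSV sample; the values are invented for this example.
csv='host,cpu,mem
web01,25,512
db01,7,2048'

# -s ',' splits each line on commas; -t lines the fields up as a table.
table=$(printf '%s\n' "$csv" | column -s ',' -t)
printf '%s\n' "$table"
```

Depending on the version of column installed, empty fields (two delimiters in a row) may be merged, so it is worth checking the man page on your system before relying on this for sparse data.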

The next command that I wanted to show you is the tr command. Let's start off by looking at the man page.

$ man tr

The tr command gives us an easy way to translate or delete characters from text. You can do all sorts of cool things, like replace characters, delete characters, or squeeze repeated sequences of a character down to one.

$ cat data.log
a b c d e f g h i j k l m n o p
4 0 0 6298384 40428 47587356 0 0 1209 172 0 0 17 3 71 9
4 0 0 6298448 40428 47587356 0 0 0 0 2163 5260 25 1 74 0
14 0 0 6299024 40428 47587416 0 0 0 0 2053 4773 25 1 75 0
15 0 0 6291776 40428 47595272 0 0 0 0 3276 5176 25 1 74 0
4 0 0 6283868 40432 47602948 0 0 0 296 2736 5471 25 1 74 0
6 0 0 6282352 40432 47602780 0 0 0 0 2692 5204 24 1 75 0
9 0 0 6282352 40432 47602780 0 0 0 0 2269 4939 25 1 74 0
9 0 0 6282168 40440 47602776 0 0 4 108 2681 5570 25 1 74 0
14 0 0 6282132 40440 47602784 0 0 0 12344 4930 5534 25 1 74 0
6 0 0 6282144 40440 47602784 0 0 0 0 2520 5194 25 1 74 0
15 0 0 6282096 40444 47602784 0 0 0 616 2426 4930 25 1 74 0
14 0 0 6274292 40444 47610464 0 0 0 0 2301 5057 25 1 74 0
14 0 0 6274200 40448 47610460 0 0 0 4 2337 4466 25 1 75 0
6 0 0 6274316 40456 47610468 0 0 0 396 2254 4580 25 1 75 0
6 0 0 6274292 40456 47610472 0 0 4 0 2440 4729 25 1 74 0
5 0 0 6266660 40456 47618152 0 0 0 0 3403 5208 25 1 74 0
4 0 0 6258848 40456 47625832 0 0 0 2020 4108 5752 25 1 73 1
6 0 0 6258980 40464 47625824 0 0 0 508 2200 4414 25 1 74 0
4 0 0 6258972 40464 47625836 0 0 0 0 2336 4789 25 1 74 0

Let's look at some examples. Say we wanted to convert all of the spaces to commas. So, let's rerun our cat data.log command, and then pipe that as input into our tr command. The first argument is the set of characters we want to match against, and the second argument is what we want to replace them with, in our case a comma. So when this type of problem appears, you know that, hey, I can probably use the tr command to manipulate this text into the format I want.

$ cat data.log | tr ' ' ','
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
4,0,0,6298384,40428,47587356,0,0,1209,172,0,0,17,3,71,9
4,0,0,6298448,40428,47587356,0,0,0,0,2163,5260,25,1,74,0
14,0,0,6299024,40428,47587416,0,0,0,0,2053,4773,25,1,75,0
15,0,0,6291776,40428,47595272,0,0,0,0,3276,5176,25,1,74,0
4,0,0,6283868,40432,47602948,0,0,0,296,2736,5471,25,1,74,0
6,0,0,6282352,40432,47602780,0,0,0,0,2692,5204,24,1,75,0
9,0,0,6282352,40432,47602780,0,0,0,0,2269,4939,25,1,74,0
9,0,0,6282168,40440,47602776,0,0,4,108,2681,5570,25,1,74,0
14,0,0,6282132,40440,47602784,0,0,0,12344,4930,5534,25,1,74,0
6,0,0,6282144,40440,47602784,0,0,0,0,2520,5194,25,1,74,0
15,0,0,6282096,40444,47602784,0,0,0,616,2426,4930,25,1,74,0
14,0,0,6274292,40444,47610464,0,0,0,0,2301,5057,25,1,74,0
14,0,0,6274200,40448,47610460,0,0,0,4,2337,4466,25,1,75,0
6,0,0,6274316,40456,47610468,0,0,0,396,2254,4580,25,1,75,0
6,0,0,6274292,40456,47610472,0,0,4,0,2440,4729,25,1,74,0
5,0,0,6266660,40456,47618152,0,0,0,0,3403,5208,25,1,74,0
4,0,0,6258848,40456,47625832,0,0,0,2020,4108,5752,25,1,73,1
6,0,0,6258980,40464,47625824,0,0,0,508,2200,4414,25,1,74,0
4,0,0,6258972,40464,47625836,0,0,0,0,2336,4789,25,1,74,0
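One detail worth keeping in mind is that tr maps characters one for one, so it is not a search and replace on words or strings. The sketch below uses a made-up sample line rather than data.log, and shows translating a range of characters, deleting characters with the -d option, and what happens when two spaces sit next to each other.

```shell
# Made-up sample line, used only for illustration.
line='cpu=25 mem=512'

# Each character in the first set is replaced by the character at the
# same position in the second set (here: lowercase to uppercase).
upper=$(printf '%s\n' "$line" | tr 'a-z' 'A-Z')
printf '%s\n' "$upper"      # CPU=25 MEM=512

# -d deletes every character listed in the set (here: all digits).
no_digits=$(printf '%s\n' "$line" | tr -d '0-9')
printf '%s\n' "$no_digits"  # cpu= mem=

# Two spaces in a row become two commas, because each space is
# translated on its own.
commas=$(printf 'a  b\n' | tr ' ' ',')
printf '%s\n' "$commas"     # a,,b
```

The last case is why a plain tr ' ' ',' only gives clean output when the fields are separated by exactly one space, as they are in data.log.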

I just wanted to show you one more example before I end this episode. Say you have some column data and want to squeeze it down to a delimited format. Well, you can actually use the tr command for this. Let's get some example data from the top command, and then just grab the tail end of it using the tail command.

$ top -n1 | tail -n 10
   15 root      20   0     0    0    0 S    0  0.0   0:01.81 ksoftirqd/2                                  
   16 root      RT   0     0    0    0 S    0  0.0   0:01.75 watchdog/2                                   
   17 root      RT   0     0    0    0 S    0  0.0   0:00.15 migration/3                                  
   18 root      20   0     0    0    0 S    0  0.0   0:53.69 kworker/3:0                                  
   19 root      20   0     0    0    0 S    0  0.0   0:02.60 ksoftirqd/3                                  
   20 root      RT   0     0    0    0 S    0  0.0   0:01.82 watchdog/3                                   
   21 root       0 -20     0    0    0 S    0  0.0   0:00.00 cpuset                                       
   22 root       0 -20     0    0    0 S    0  0.0   0:00.00 khelper                                      
   23 root      20   0     0    0    0 S    0  0.0   0:00.00 kdevtmpfs                                    
   24 root       0 -20     0    0    0 S    0  0.0   0:00.00 netns        

Say we wanted to use this as input to some program, and it expected whitespace-delimited input. Well, we can use the tr command to squeeze all repeated occurrences of a character into a single instance. Let's try it out by running the following.

$ top -n1 | tail -n 10 | tr -s ' '
 13 root RT 0 0 0 0 S 0 0.0 0:00.40 migration/2 
 15 root 20 0 0 0 0 S 0 0.0 0:01.81 ksoftirqd/2 
 16 root RT 0 0 0 0 S 0 0.0 0:01.75 watchdog/2 
 17 root RT 0 0 0 0 S 0 0.0 0:00.15 migration/3 
 18 root 20 0 0 0 0 S 0 0.0 0:53.70 kworker/3:0 
 19 root 20 0 0 0 0 S 0 0.0 0:02.60 ksoftirqd/3 
 20 root RT 0 0 0 0 S 0 0.0 0:01.82 watchdog/3 
 21 root 0 -20 0 0 0 S 0 0.0 0:00.00 cpuset 
 22 root 0 -20 0 0 0 S 0 0.0 0:00.00 khelper 
 23 root 20 0 0 0 0 S 0 0.0 0:00.00 kdevtmpfs 

So, we used the tr command with the -s option to squeeze all repeating spaces into a single occurrence. You will probably notice that all of this can be done through scripting languages too, but you will often find these types of commands useful for writing scripts, or if you quickly need to convert text without much overhead.
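The -s option can also be combined with a translation, in which case the squeeze is applied to the translated characters. The sketch below takes one row in the style of the top output above, strips the leading spaces with sed (an extra step that is not part of the episode), and turns each run of spaces into a single comma.

```shell
# One row in the style of the top output above, with its leading spaces.
row='   15 root      20   0     0    0    0 S    0  0.0   0:01.81 ksoftirqd/2'

# sed removes the leading spaces; tr -s ' ' ',' translates spaces to
# commas and squeezes each run of commas into one.
csv=$(printf '%s\n' "$row" | sed 's/^ *//' | tr -s ' ' ',')
printf '%s\n' "$csv"   # 15,root,20,0,0,0,0,S,0,0.0,0:01.81,ksoftirqd/2
```

Without the sed step, the leading spaces would turn into a leading comma, which most programs would read as an empty first field.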

So hopefully in the future, when you find yourself up against this type of problem, you might turn to the column or translate commands to help you out.