Use of the ternary operator, an awk example.

Post first published in nixtip

Ok, suppose this source file:

$ cat infile
21/tcp   closed ftp
22/tcp   open   ssh
23/tcp   closed telnet
80/tcp   closed http
90/tcp   closed dnsix
95/tcp   closed supdup
100/tcp  closed newacct
162/tcp  closed snmptrap
205/tcp  closed at-5
335/tcp  closed unknown
435/tcp  closed mobilip-mn
555/tcp  closed dsf
8080/tcp closed http-proxy
8081/tcp closed blackice-icecap

Our mission will be to get a formatted output port:state:service like this

21:1:ftp
22:0:ssh
23:1:telnet
80:1:http

… and so on …

All closed ports should be marked as 1; the rest will be 0.

As always on a *nix system, we have plenty of tools (and approaches) to get the expected result. Let’s try the awk way…

At first sight we can identify three fields in our input file and three tasks to be solved:

  1. Get rid of the slash + tcp string of the first field.
  2. Change the value of the second field to 1 or 0.
  3. Field separator should be :

A simple text replacement is a straightforward way to get the expected result:

$ awk '{sub(/\/.*closed +/,":1:");sub(/\/.*open +/,":0:")}1' infile

Here’s the internals:

  • We look for a string that starts with a slash (note the escape character: \/), followed by any number of any characters (dot + star: .*), followed by the string closed, and ending with one or more spaces ( +). The match is replaced with :1: . For the first line, 21/tcp   closed ftp, the part from the slash up to the last space before ftp is replaced with :1: and the line becomes 21:1:ftp

  • Same thing for “open”, where “:0:” is the substitution string. Example: 22/tcp   open   ssh becomes 22:0:ssh
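To see exactly which part of a line the regex consumes, a quick check with match() can help. This is only a minimal sketch on one sample line; the shell variable hit and the output format are made up for illustration.

```shell
# Show the substring that /\/.*closed +/ matches on one sample line.
hit=$(printf '21/tcp   closed ftp\n' |
  awk '{
    # match() sets RSTART and RLENGTH to the position and length of the match
    if (match($0, /\/.*closed +/))
      printf "[%s] start=%d length=%d", substr($0, RSTART, RLENGTH), RSTART, RLENGTH
  }')
echo "$hit"
# prints: [/tcp   closed ] start=3 length=14
```

Everything inside the brackets is what sub() swaps for :1: .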

Our initial tasks are solved, but we can refine our efforts.

Let’s use the conditional operator.

expr ? action1 : action2

It’s pretty straightforward: if expr is true, action1 is evaluated; if not, action2.

For our example, field two must change to 1 if its value is closed; if not, it should be 0.

The needed conditional operator:

$2=="closed" ? "1" : "0"

Depending on the value of the second field, the expression returns a different string: 1 or 0.

At this point, a variable is needed to store it:

n= $2=="closed" ? "1" : "0"
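As a quick sanity check, the assignment can be run on its own against one closed and one open port. A minimal sketch: the two sample lines come from infile, and the shell variable states is only there for illustration.

```shell
states=$(printf '21/tcp   closed ftp\n22/tcp   open   ssh\n' |
  awk '{
    n = ($2 == "closed") ? "1" : "0"   # parentheses added for readability only
    print $1, $2, n
  }')
echo "$states"
# prints:
# 21/tcp closed 1
# 22/tcp open 0
```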

Finally we perform the text substitution:

awk '{n= $2=="closed" ? "1" : "0";sub(/\/.*(open|closed) +/,":"n":")}1' infile

Note that we reduce the calls to the sub function to just one.

A final (and totally different) approach: field substitution instead of text replacement.

Remember our tasks:

  a) Get rid of the slash + tcp string of the first field.
  b) Change the value of the second field to 1 or 0.
  c) Field separator should be :

Our input file has naturally three fields (by the default awk FS ):

21/tcp   closed ftp
22/tcp   open   ssh
23/tcp   closed telnet

Clearly, we can treat each line as having four fields if we add the slash / to our field separators, using a regex as FS: FS='( *)|(/)', where ( *) stands for a run of spaces and (/) for the slash:

So:

awk '{print $1,$2,$3,$4}' OFS='>' FS='( *)|(/)'  infile|head -3
21>tcp>closed>ftp
22>tcp>open>ssh
23>tcp>closed>telnet

Note that the Output Field Separator OFS is changed to > for clarity.
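One caveat: ( *) can also match zero spaces, and it is not obvious that every awk implementation treats an empty match in FS the same way. As a hedged alternative (a suggestion, not part of the original recipe), the separator ' +|/' splits on one or more spaces or on a slash and never matches the empty string. The shell variable fields is only for illustration.

```shell
fields=$(printf '21/tcp   closed ftp\n' |
  awk -F' +|/' '{
    printf "NF=%d", NF
    for (i = 1; i <= NF; i++) printf " f%d=%s", i, $i   # f1..f4 label the fields
    print ""
  }')
echo "$fields"
# prints: NF=4 f1=21 f2=tcp f3=closed f4=ftp
```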

Now we want to get rid of the second field. Technically that is not possible, but we can assign the null value (empty string) to it:

awk '{$2=""}1' OFS='>' FS='( *)|(/)'  infile|head -4
21>>closed>ftp
22>>open>ssh
23>>closed>telnet
80>>closed>http

Note that the print statement is not needed: when a pattern has no action, awk prints the current line whenever that pattern evaluates to true.

The assignment $2="" inside the braces prints nothing by itself, so we place 1 (an always-true pattern) at the end of the program to force the print.
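The three variants below make the role of the trailing 1 visible. This is a small sketch: it assumes ' +|/' (one or more spaces, or a slash) as field separator for portability, and the shell variable names are purely illustrative.

```shell
line='21/tcp   closed ftp'
with_one=$(printf '%s\n' "$line"   | awk -F' +|/' -v OFS='>' '{ $2 = "" } 1')
with_print=$(printf '%s\n' "$line" | awk -F' +|/' -v OFS='>' '{ $2 = ""; print }')
neither=$(printf '%s\n' "$line"    | awk -F' +|/' -v OFS='>' '{ $2 = "" }')
printf 'with 1:     [%s]\n' "$with_one"     # [21>>closed>ftp]
printf 'with print: [%s]\n' "$with_print"   # [21>>closed>ftp]
printf 'neither:    [%s]\n' "$neither"      # []
```

The trailing 1 and the explicit print give the same line; with neither, the field is cleared but nothing is written.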

If we set the OFS to null value:

awk '{$2=""}1' OFS= FS='( *)|(/)'  infile|head -4
21closedftp
22openssh
23closedtelnet
80closedhttp

We’re close to our goal. The last step is to process the third field:

$3=="closed" ? ":1:" : ":0:"

As we saw before, we need to assign the result to a variable… look at the trick:

$3= $3=="closed" ? ":1:" : ":0:"

We say: hey, change $3 depending on its previous value. So:

awk '{$2="";$3=$3=="closed" ? ":1:" : ":0:"}1' OFS= FS='( *)|(/)'  infile|head -4
21:1:ftp
22:0:ssh
23:1:telnet
80:1:http

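A final optimization: an assignment can be used directly as a pattern. The conditional assignment to $3 always yields a non-empty string, which awk treats as true, so the print is implied; $2="" yields an empty string (false), so it only clears the field. So: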

awk '{$2="";$3=$3=="closed" ? ":1:" : ":0:"}1' OFS= FS='( *)|(/)' infile

Is equivalent to:

awk '$2="";$3=$3=="closed" ? ":1:" : ":0:"' OFS= FS='( *)|(/)'  infile
21:1:ftp
22:0:ssh
23:1:telnet
80:1:http
90:1:dnsix
95:1:supdup
100:1:newacct
162:1:snmptrap
205:1:at-5
335:1:unknown
435:1:mobilip-mn
555:1:dsf
8080:1:http-proxy
8081:1:blackice-icecap
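To make the mechanics explicit, the two pattern-only rules can be written on separate lines with a comment each. A short sketch, assuming the separator ' +|/' for portability and an illustrative shell variable result:

```shell
result=$(printf '21/tcp   closed ftp\n22/tcp   open   ssh\n' |
  awk -F' +|/' -v OFS= '
    $2 = ""                                 # value is "": false, nothing printed, but $2 is emptied
    $3 = ($3 == "closed") ? ":1:" : ":0:"   # value is a non-empty string: true, so the line is printed
  ')
echo "$result"
# prints:
# 21:1:ftp
# 22:0:ssh
```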

We’re done.

Using sed + xargs to rename multiple files

Post first published in nixtip

Let’s say that we have a bunch of txt files and we need to rename them to sql.

$ touch a.txt  b.txt  c.txt  d.txt  e.txt  f.txt
$ ls
a.txt  b.txt  c.txt  d.txt  e.txt  f.txt

We can use ls combined with sed and xargs to achieve our goal.

$ ls | sed -e "p;s/\.txt$/\.sql/"|xargs -n2 mv
$ ls
a.sql  b.sql  c.sql  d.sql  e.sql  f.sql

How it works:

$ ls | sed -e "p;s/\.txt$/\.sql/"
a.txt
a.sql
b.txt
b.sql
c.txt
c.sql
d.txt
d.sql
e.txt
e.sql
f.txt
f.sql

The ls output is piped to sed, then we use the p command to print the line without modifications, in other words, the original name of the file.

The next step is to use the substitute command to change the file extension.

NOTE: The dot is a regex metacharacter, so it is escaped with a backslash to match a literal dot. The command is enclosed in double quotes here; single quotes are the safer way to enclose literal strings, since the shell leaves their content untouched.

The result is a combined output that consists of a sequence of old_file_name and new_file_name pairs.

Finally we pipe the resulting feed through xargs to get the effective rename of the files.

$ ls | sed -e "p;s/\.txt$/\.sql/"|xargs -n2 mv
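Before renaming anything for real, it can be reassuring to preview the commands. A minimal sketch, assuming a scratch directory created with mktemp and an echo placed in front of mv so that nothing is actually renamed; the variable names tmp and preview are illustrative.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
touch a.txt b.txt c.txt
# echo turns each mv call into a printed line: a dry run
preview=$(ls | sed -e 'p;s/\.txt$/.sql/' | xargs -n2 echo mv)
echo "$preview"
# prints:
# mv a.txt a.sql
# mv b.txt b.sql
# mv c.txt c.sql
cd / && rm -rf "$tmp"
```

Once the preview looks right, dropping the echo performs the rename.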

PS: An alternative to take care of spaces in the file names:

$ touch "a a d.txt.txt" "b b b.txt" "c c.txt" d.txt e.txt f.txt
$ ls
a a d.txt.txt  b b b.txt      c c.txt        d.txt          e.txt          f.txt

Here’s the command:

$ ls | awk '{gsub(/^|$/,"\"");print;gsub(/\.txt\"$/,".sql\"")}1' |xargs -n2 mv

Result:

$ ls
a a d.txt.sql  b b b.sql      c c.sql        d.sql          e.sql          f.sql
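A variant of the same idea builds the quoted names by plain string concatenation instead of gsub on the line boundaries. This is a sketch under assumptions: a scratch directory from mktemp, and file names that contain spaces but no double quotes or newlines (those would still confuse xargs).

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
touch "b b b.txt" "c c.txt" d.txt
ls | awk '{
  print "\"" $0 "\""           # old name, wrapped in double quotes
  sub(/\.txt$/, ".sql")        # change the extension on the unquoted name
  print "\"" $0 "\""           # new name, wrapped in double quotes
}' | xargs -n2 mv
renamed=$(ls)
echo "$renamed"
# prints:
# b b b.sql
# c c.sql
# d.sql
cd / && rm -rf "$tmp"
```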

From the man page:

DESCRIPTION

xargs combines the fixed initial-arguments with arguments read from standard input to execute the specified command one or more times. The number of arguments read for each command invocation and the manner in which they are combined are determined by the options specified.

The -n parameter

-n number Execute command using as many standard input arguments as possible, up to number arguments maximum. Fewer arguments are used if their total size is greater than size bytes, and for the last invocation if there are fewer than number arguments remaining. If option -x is also coded, each number arguments must fit in the size

The -n2 flag forces xargs to take 2 arguments from the piped output each time and pass them to the mv command to get the job done.
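The grouping is easy to observe with echo in place of mv. A tiny sketch with three made-up arguments, so the last invocation receives fewer than two:

```shell
pairs=$(printf '%s\n' one two three | xargs -n2 echo args:)
echo "$pairs"
# prints:
# args: one two
# args: three
```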

Print lines between two patterns, the awk way…

Post first published in nixtip

Example input file:

test -3
test -2
test -1
OUTPUT
top 2
bottom 1
left 0
right 0
page 66
END
test 1
test 2
test 3

The standard way:

awk '/OUTPUT/ {flag=1;next} /END/{flag=0} flag {print}' infile
top 2
bottom 1
left 0
right 0
page 66

Self-explanatory indented code:

awk '
/OUTPUT/ {flag=1;next} # Initial pattern found --> turn on the flag and read the next line
/END/    {flag=0}      # Final pattern found   --> turn off the flag
flag     {print}       # Flag on --> print the current line
' infile

The first optimization is to get rid of the print: in awk, print is the default action when a pattern is true, so when the flag is true the line is going to be echoed.

To drop the next statement and still avoid printing the OUTPUT tag line, we need to turn the flag on after the flag evaluation, so the OUTPUT rule moves to the end of the program.

A slight variation of the program flow and we’re done:

awk '/END/{flag=0}flag;/OUTPUT/{flag=1}' infile
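A quick way to convince ourselves that the short form behaves like the standard one is to run both on the same input and compare. A minimal sketch with a shortened sample input; the variable names input, long and short are illustrative.

```shell
input='test -1
OUTPUT
top 2
bottom 1
END
test 1'
long=$(printf '%s\n' "$input" | awk '/OUTPUT/ {flag=1; next} /END/ {flag=0} flag {print}')
short=$(printf '%s\n' "$input" | awk '/END/{flag=0} flag; /OUTPUT/{flag=1}')
echo "$short"
# prints:
# top 2
# bottom 1
[ "$long" = "$short" ] && echo "both versions agree"
```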

PS: What if we only want to print the lines enclosed between the OUTPUT && END tags?

© Juan Diego Godoy Robles