Sunday, May 22, 2005

Learning awk

The other day I had to kill some UNIX processes that were running javac (the Java compiler). One approach was to identify all the processes running javac with

root@localhost# ps -ef | grep javac

and then to pick out each process ID and kill the processes individually:

root@localhost# kill -9 pid

But I had seen people write one-line shell commands that combine awk, ps and so on to kill every process running a particular program or binary, so I decided to build such a one-liner myself. I knew nothing about awk except that it is named after the initials of its authors (Aho, Weinberger and Kernighan). I didn't even know that it was a comprehensive language for text processing.

I opened the awk manual at the shell prompt and started reading. It took me a while to see where things were heading.

I worked through the manual but did not pick up much beyond {print} and a few simple facts: an awk program can be given inline or stored in a file; awk is data driven rather than procedural; $0 refers to the current line of the input; and there are commands such as print and printf. I still could not get hold of the syntax. I did not understand the grammar or how a program needs to be written. I just managed to execute

awk {print}
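
To make the "data driven" idea concrete, here is a minimal sketch. The two input lines and the variable name out are made up for illustration. An action with no pattern in front of it runs once for every input line, and $0 holds that line.

```shell
# Feed two made-up lines to awk on standard input.
# With no pattern, the action runs for every line; $0 is the whole line.
out=$(printf 'first line\nsecond line\n' | awk '{ print "got: " $0 }')
echo "$out"
```

Quoting the program in single quotes keeps the shell from touching the braces, the $ and the double quotes inside it.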

After this I tried a few more:

root@localhost# awk {print Hello}
awk: cmd. line:2: (END OF FILE)
awk: cmd. line:2: parse error
root@localhost# awk {print 'Hello'}
awk: cmd. line:2: (END OF FILE)
awk: cmd. line:2: parse error
root@localhost# awk "{print 'Hello'}"
awk: cmd. line:1: {print 'Hello'}
awk: cmd. line:1: ^ invalid char ''' in expression
root@localhost# awk '{print "Hello"}'
Finally, I got hold of the tutorial GAWK: Effective AWK Programming, which gave me a head start in understanding the language.

root@localhost# awk 'BEGIN {print "Hello"}' was easy to comprehend.
Things started clicking.

root@localhost# awk 'BEGIN {print substr("Hello World",2,2)}' and so on.
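
As a small sketch of what these two pieces do (the variable name out is only for illustration): a BEGIN rule runs before any input is read, so no input file is needed, and substr(string, start, length) counts positions from 1.

```shell
# BEGIN runs before any input is read, so awk does not wait for stdin here.
# substr starts at position 2 ("e") and takes 2 characters.
out=$(awk 'BEGIN { print substr("Hello World", 2, 2) }')
echo "$out"   # prints: el
```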

Now that I was starting to understand awk and how it works, I went back to the problem that had started the whole effort: killing all the processes running javac.

ps -ef | grep javac | awk {print}
then it was
ps -ef | grep javac | awk '{print substr($0,10,6)}'
ps -ef | grep javac | awk '{printf substr($0,10,6)}'
to put all the process IDs on one line,
and finally
kill -9 `ps -ef | grep javac | awk '{printf substr($0,10,6)}'`

This tries to kill every process that has the pattern “javac” in its command line. That includes the “grep javac” process itself, which has already exited by the time kill runs, so you will get the error message
bash: kill: (pid) - No such pid
But the above command will still kill all the processes running javac.
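
One common workaround, which is not part of the original one-liner, is to skip any line that mentions grep. The sketch below runs against a made-up ps -ef style listing (the PIDs, times and file names are invented) so that nothing is actually killed.

```shell
# Made-up `ps -ef` style listing: UID PID PPID C STIME TTY TIME CMD.
sample='root      4312     1  0 10:01 pts/0    00:00:05 javac Foo.java
root      4377     1  0 10:02 pts/0    00:00:03 javac Bar.java
root      4401  4200  0 10:03 pts/1    00:00:00 grep javac'

# Keep lines that mention javac but not grep; with no action given,
# awk prints each matching line unchanged.
kept=$(printf '%s\n' "$sample" | awk '/javac/ && !/grep/')
echo "$kept"
```

Only the two javac compiler lines survive, so the stale grep PID never reaches kill.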

The one-liner was then modified to get rid of grep:
kill -9 `ps -ef | awk '/javac/{printf substr($0,10,6)}'`

Somehow substr seemed a bit fishy. The PID might not start at the 10th position, and it might be more than 6 characters long. The one-liner was further modified to

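kill -9 `ps -ef | awk '/javac/{print $2}'`
where $2 refers to the second whitespace-separated column of the ps output. Note print rather than printf here: $2 has no padding around it, so printf would run the PIDs together, while the shell splits the newline-separated output of print into separate arguments. So finally I was able to get through.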
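
A rough sketch of why the field is safer than fixed columns, again on a made-up listing rather than real ps output. The user name builduser and the 7-digit PID are invented to push the PID out of columns 10 to 15.

```shell
# Made-up listing: the second row has a longer user name and a 7-digit PID.
sample='root      4312     1  0 10:01 pts/0    00:00:05 javac Foo.java
builduser 1234567     1  0 10:02 pts/0    00:00:03 javac Bar.java'

# Fixed columns: a 6-character slice cannot hold the 7-digit PID.
by_column=$(printf '%s\n' "$sample" | awk '/javac/ { print substr($0, 10, 6) }')

# Whitespace-separated field: picks up the whole PID on both rows.
by_field=$(printf '%s\n' "$sample" | awk '/javac/ { print $2 }')

# Left unquoted on purpose so the shell splits the values into words,
# the same way it would when building the argument list for kill.
echo "by column:" $by_column
echo "by field: " $by_field
```

The by_field result lists both full PIDs, one per word, which is the form kill expects.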


Jadu Kumar Saikia said...

Some of my posts on different uses of awk

// Jadu

Nikhil said...

For the particular use case of killing all running instances of a particular binary, you can also use the killall command.

killall javac (should kill all the javac instances.)


Leonardo said...

Nikhil, this is OK if you're running on a Linux box, but Unix doesn't have a killall command.