Feb 28 2012

Sometimes, for no discernible reason, a command fails.  In the absence of a log file, a decent debug level, or a helpful blog discovered via Google, you’re floundering, and in theory could be stuck on a simple blocker all day.  Often all you have is a cryptic error message such as “File not found”, and you smash your keyboard impotently, wondering which file exactly the process is looking for.  In these cases, a debugger can help.

But for the non-kernel-hacking, fashionably dressed Unix user, a dive into the stack trace and a peek behind the curtain of system calls just increases the confusion, and so this line of investigation is often neglected.  Yet a large majority of these problems can be solved with a simple couple of options to the “strace” command.

Take the “File not found” error: if you knew which file the process expected, you’d be fine. Or suppose you wanted to know which environment scripts a command was implicitly loading, and in what order.  The “strace” output can be filtered to show only file-open system calls, like so:

# strace -f -e open -o /tmp/strace.log <command>

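Here “-f” follows any child processes the command starts, “-e open” restricts the trace to the open system call, and “-o” writes the trace to a file instead of standard error.  What you’ll get in the /tmp/strace.log file is a list of all attempts by the process, successful or otherwise, at opening files.  This can be helpful in locating a log file being written to, or a library file being read from.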
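Once the log exists, the failed attempts are usually the interesting ones. The following is a minimal sketch, not a definitive tool: failed_opens is a hypothetical helper name, and the sample log is made up for the demonstration, written in the shape strace produces with “-f” and “-o” (PID, call, result). It pulls out the first quoted argument of every line that ended in ENOENT, which should also cope with openat lines, since the path is the first quoted string there too.

```shell
# Hypothetical helper: list the paths whose open attempt failed with ENOENT.
failed_opens() {
    # $1: path to a log written by "strace -o"
    # Keep lines that report ENOENT, extract the first double-quoted string
    # (the path), and drop duplicates.
    grep 'ENOENT' "$1" | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' | sort -u
}

# Illustrative sample only: the paths and PIDs below are invented.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
4242  open("/etc/ld.so.cache", O_RDONLY) = 3
4242  open("/etc/myapp/myapp.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
4242  open("/home/user/.myapprc", O_RDONLY) = -1 ENOENT (No such file or directory)
4243  open("/var/log/myapp.log", O_WRONLY|O_CREAT|O_APPEND, 0644) = 4
EOF

failed_opens "$sample_log"
rm -f "$sample_log"
```

Run against the sample, this prints the two missing paths and skips the two opens that succeeded. Against a real trace, expect some noise: many programs probe several locations for a library or config file, so a failed open is only a candidate for the culprit, not proof.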

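For even more information, try “-e open,close,read,write”.  But “open” tends to be just the ticket for the majority of these kinds of problems.  On newer Linux systems, where the C library typically opens files with openat rather than open, use “-e open,openat” instead.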
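With several system calls traced, the log grows quickly, and a rough tally can show where the process spends its effort before you read it line by line. This is a sketch under the same assumptions as before: syscall_counts is a hypothetical helper, the sample log is invented, and the lines are assumed to carry the PID prefix that “-f” with “-o” adds.

```shell
# Hypothetical helper: count how often each system call appears in a log.
syscall_counts() {
    # $1: path to a log written by "strace -f -o"
    # Strip the PID column, keep the text before the first "(", and tally it.
    # Lines that do not start with a call name (exit notices, signals,
    # "<... resumed>" continuations) are skipped.
    awk '{
             sub(/^[0-9]+ +/, "")
             name = $0
             sub(/\(.*/, "", name)
             if (name ~ /^[a-z_0-9]+$/) count[name]++
         }
         END { for (n in count) print n, count[n] }' "$1" | sort
}

# Illustrative sample only: the path and PID below are invented.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
4242  open("/etc/myapp/myapp.conf", O_RDONLY) = 3
4242  read(3, "debug=1\n", 4096) = 8
4242  read(3, "", 4096) = 0
4242  close(3) = 0
4242  write(1, "ok\n", 3) = 3
EOF

syscall_counts "$sample_log"
rm -f "$sample_log"
```

The output is one line per call name followed by its count, sorted by name. A process that opens far more files than it closes, or reads in a tight loop, stands out at a glance.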

Matt Parsons is a freelance Linux specialist who has designed, built and supported Unix and Linux systems in the finance, telecommunications and media industries.

He lives and works in London.