LINUX GAZETTE
...making Linux just a little more fun!
Windows Defectors: Wisdom Will 'find' You If You Seek to Find It
By Petar Marinov

In the latest installment of its regular weekly Linux column, Arstechnica.com writes, among other noteworthy things, about the fine ways of 'find'. (Unfortunately I can't find a direct link to that article.) You have no idea how much I had underestimated 'find'. I belong to the group of shallow Windows users with limited command-line training who thought that 4DOS/4NT was the cream of the crop. Unlearning "dir" was the hardest, and I know why it's in there -- to betray your unwholesome Windows background every time habit tugs you back to the old ways in the gutter.

What is there to find on the subject of 'find'?

One usually encounters 'find' used by colleagues in a simple form:

"find | grep block.c"
just to discover whether a file block.c exists somewhere underneath the current directory. From this I learned that there is such a thing as 'find' in Unix. Hastily, I assume that 'find' accepts any file mask, and I disregard the 'grep' part entirely.
"find *.c"
And I'm answered: "find: *.c: No such file or directory". Huh! Where is the Unix power? How come 'find' finds nothing? Well, forget it, some other time. And 'find' languishes in peace.

But then I encounter "find -name *.c", and I'm quick to experiment. Although "-name" is a long option to type, the results are encouraging: I'm rewarded with the complete list of *.c files from the current directory and all its subdirectories. (You may need to put a backslash before the * or quotes around the pattern; otherwise the shell expands *.c itself whenever the current directory contains matching files.)
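The quoting caveat is easy to see in a scratch directory. The sketch below is only an illustration; the file names (main.c, sub/block.c) are made up for the demonstration and do not come from the article.

```shell
# Hypothetical scratch tree: one C file at the top, one in a subdirectory.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/main.c" "$demo/sub/block.c"
cd "$demo" || exit 1

# Unquoted: the shell rewrites *.c to main.c before find starts,
# so find only looks for files named exactly main.c.
unquoted=$(find . -name *.c | LC_ALL=C sort)

# Quoted: find receives the pattern itself and applies it in every
# directory it visits.
quoted=$(find . -name '*.c' | LC_ALL=C sort)

printf 'unquoted:\n%s\n\nquoted:\n%s\n' "$unquoted" "$quoted"
```

With the unquoted form, sub/block.c is silently missed, which is arguably worse than an error message.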

Quite prematurely I try:

"find -name src/*.c",
to be smacked in the face with: "find: paths must precede expression". Paths?! Expression?! "find --help" is so unhelpful. Once again I abandon 'find'.
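In hindsight the message can be decoded. The arguments that come before the first option are starting directories (the "paths"), and everything from the first option on is the "expression". When src holds more than one .c file, the shell expands src/*.c into several words, and all but the first end up as stray paths after the expression. A minimal sketch of the working forms, assuming a made-up tree with src/ and doc/ directories:

```shell
# Hypothetical scratch tree used only for this illustration.
demo=$(mktemp -d)
mkdir -p "$demo/src/deep" "$demo/doc"
touch "$demo/src/block.c" "$demo/src/deep/util.c" "$demo/doc/notes.c"
cd "$demo" || exit 1

# Starting point first, expression second: search only below src/.
under_src=$(find src -name '*.c' | LC_ALL=C sort)
echo "$under_src"

# -name looks at the last path component only, so a pattern containing
# a slash can never match. To test the whole path, find offers -path;
# note that its * also matches across slashes.
by_path=$(find . -path './src/*.c' | LC_ALL=C sort)
echo "$by_path"
```

Both forms report block.c and deep/util.c and leave doc/notes.c alone.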

Here and there I read about 'find': people discover it and cheerfully announce it to the world. Mailing lists, sites, blogs, innumerable places on the web serve as altars to 'find'.

I wandered in ungodly lands for too long.

I experienced divine intervention while reading the Linux column in Arstechnica. It was nothing stunning on the surface; the author just shows one use of 'find'. For me, though, it was a revelation. Everything fell into place -- paths, expressions; ooh, great, expressions in 'find' are a very fine idea. Perhaps the article grabbed my attention because it combined 'find' and 'grep'.

'grep' I know from the old days (in the gutter). Borland supplemented their compiler package with a set of tools including 'grep' (do you remember README.COM? Excellent viewer!). 'Grep' was fun. A friend explained to me about regular expressions; regular expressions are fun. Borland's 'grep' was somewhat compatible with the Unix 'grep' in command line options. What I missed most in GNU 'grep' is the recursive find-in-files feature.

What is important to understand is that 'find' evaluates an expression for every file it visits. 'find' provides tests like '-name \*.c', meaning "true when the file name matches *.c". Logical operators combine tests to handle more complicated requests: -a is AND, -o is OR, and \( \) groups sub-expressions to control precedence.

Now,

"find -name \*.c -o -name \*.h"
will show you all files whose names end in .c or .h.
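The parentheses start to matter as soon as an explicit action such as '-print' is added, because the implied '-a' binds more tightly than '-o'. A small sketch, using three made-up files:

```shell
# Hypothetical scratch directory with one file of each kind.
demo=$(mktemp -d)
touch "$demo/a.c" "$demo/a.h" "$demo/a.txt"
cd "$demo" || exit 1

# Parsed as: -name '*.c'  -o  ( -name '*.h' -a -print )
# so only the .h file reaches -print.
ungrouped=$(find . -name '*.c' -o -name '*.h' -print | LC_ALL=C sort)

# Grouping makes -print apply to either kind of file.
grouped=$(find . \( -name '*.c' -o -name '*.h' \) -print | LC_ALL=C sort)

printf 'ungrouped:\n%s\n\ngrouped:\n%s\n' "$ungrouped" "$grouped"
```

When no action is given at all, 'find' prints whatever the whole expression accepts, which is why the shorter command works without parentheses.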

By default 'find' prints the file names, but a more powerful operation is '-exec'. '-exec' runs a command, substituting the current matched file for '{}', and is true when the command exits with status zero. So we can combine a sequence of '-exec' commands with '-a' and '-o' to serve a more elaborate purpose.
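Because '-exec' yields a truth value, it can act as a filter in front of other actions. The sketch below assumes two made-up files, one containing the word blockbegin and one without it:

```shell
# Hypothetical scratch directory: hit.c contains the word, miss.c does not.
demo=$(mktemp -d)
printf 'int blockbegin;\n' > "$demo/hit.c"
printf 'int other;\n'      > "$demo/miss.c"
cd "$demo" || exit 1

# grep -q prints nothing and exits with 0 only when it finds a match,
# so -print (joined by the implied -a) runs only for matching files.
matching=$(find . -name '*.c' -exec grep -q blockbegin {} \; -print)

# With -o the right-hand side runs only when the left-hand side is
# false, so this lists the files in which grep found nothing.
nonmatching=$(find . -name '*.c' \( -exec grep -q blockbegin {} \; -o -print \))

echo "with a match:    $matching"
echo "without a match: $nonmatching"
```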

The article does exactly this: each matched file is sent to 'grep', and 'sed' is used to show the pattern in bold within the text. As I found when I subsequently investigated the man page of 'grep', its '--color' option does the highlighting, so no 'sed' is needed.

Screenshot 1

This shows what the aforementioned article provides as an example of using 'find':

"find \( -name \*.c -o -name \*.h \) -exec grep -i --color 'blockbegin' {} \;"

'\;' demarcates the end of one '-exec' component.

I always wanted to have something similar to the output of Borland's 'grep' -- show me the file names, the line numbers and the contents of the lines that match. I now felt that I had the right mixture of tools to achieve it.

'grep' will show line numbers, but when it is given a single file, as it is here, it prints only line numbers and no file name. So if I use '-exec grep -n pattern {} \;', it will show lines from various files with no way to figure out which line came from which file. 'find' itself can print file names with '-printf "%p"', but I don't need all the names; I need only the names of the files that match the pattern at least once. Enjoy the line below.

Screenshot 2

export STR="blockbegin"; find \( -name \*.c -o -name \*.h \) \
-exec grep -q -i "$STR" {} \; -a -printf "%p\n" \
-exec grep -i --color -n "$STR" {} \; -printf "\n"

Bear with me for one last analysis. First I export the pattern in an environment variable, just to reduce the typing later. The whole 'find' line is divided into a few distinct sub-expressions, joined by '-a' and '-o' operators. From Perl I picked up one simple technique: AND can be used not only to compute boolean expressions but also to execute a sequence of commands in order, each one running only if the previous one succeeded. Conversely, OR stops at the first command that succeeds. A similar approach can be seen in what I'm offering you here (and it is widely used in bash too -- the '&&' operator).

\( -name \*.c -o -name \*.h \)
Match any of *.c and *.h.
-exec grep -q -i "$STR" {} \;
For every matched file, check whether the pattern occurs anywhere in it. With '-q', 'grep' prints nothing and stops at the first match. '-exec' is true when 'grep' exits with success.
-a
If 'grep' returns success (there is a match), execution continues with the right-hand side of '-a'.
-printf "%p\n"
Print the name of the file in which we have one or more matches. (I then omit an optional '-a'; '-printf' is always TRUE.)
-exec grep -i --color -n "$STR" {} \;
This executes 'grep' again, but this time it shows line numbers and colors the pattern wherever it matches. (I then omit an optional '-a'; 'grep' is guaranteed to succeed, as we already know there is at least one match in this particular file.)
-printf "\n"
An empty line to separate the output for two files.
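For comparison, a shorter route to a similar report exists if your 'grep' supports '-H', as GNU and BSD 'grep' do; the option is not in POSIX, so treat it as an assumption about your system. The output differs from the line above: every matching line carries its own file name prefix instead of being grouped under a heading. The file names in this sketch are made up.

```shell
# Hypothetical scratch directory with one matching and one non-matching file.
demo=$(mktemp -d)
printf 'int x;\nint BlockBegin;\n' > "$demo/a.c"
printf 'void f(void);\n'           > "$demo/a.h"
cd "$demo" || exit 1

# -H prefixes each hit with the file name, -n adds the line number.
# Ending -exec with + instead of \; hands many files to a single grep run.
report=$(find . \( -name '*.c' -o -name '*.h' \) \
    -exec grep -H -n -i 'blockbegin' {} +)
echo "$report"
```

Which layout reads better is a matter of taste; the grouped form keeps long path names out of every line.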

That's all, folks. This could perhaps be reworked into a nice script that accepts file masks and a set of patterns as parameters. If anyone is overwhelmed with enthusiasm as a result of my humble writings, feel welcome to share any further developments on the subject.
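One possible starting point for such a script is sketched below. The function name findgrep and its argument order are invented here, and it takes a single pattern rather than a set. It uses '-print' and '-exec echo' in place of '-printf' so that it does not depend on GNU 'find', and it leaves out '--color', which can be added back where GNU 'grep' is available.

```shell
# findgrep PATTERN MASK...   (hypothetical helper, not from the article)
# Prints each matching file name, then its matching lines with numbers.
findgrep() {
    if [ "$#" -lt 2 ]; then
        echo "usage: findgrep PATTERN MASK..." >&2
        return 2
    fi
    pattern=$1
    shift

    # Turn MASK1 MASK2 ... into: -name MASK1 -o -name MASK2 ...
    # The word list of the for loop is expanded once, up front, so it
    # is safe to rebuild the positional parameters inside the loop.
    first=yes
    for mask in "$@"; do
        if [ "$first" = yes ]; then
            set -- -name "$mask"
            first=no
        else
            set -- "$@" -o -name "$mask"
        fi
    done

    find . -type f \( "$@" \) \
        -exec grep -q -i -e "$pattern" {} \; \
        -print \
        -exec grep -n -i -e "$pattern" {} \; \
        -exec echo \;
}

# Usage on a made-up scratch directory. c.txt contains the word but is
# not covered by the masks, and b.h is covered but does not contain it.
demo=$(mktemp -d)
printf 'int x;\nint BlockBegin;\n' > "$demo/a.c"
printf 'void f(void);\n'           > "$demo/b.h"
printf 'blockbegin\n'              > "$demo/c.txt"
cd "$demo" || exit 1

out=$(findgrep blockbegin '*.c' '*.h')
echo "$out"
```

The masks must be quoted on the command line so that they reach the function unexpanded.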

What I discovered for myself, after years of reading scripts in journals and other venues, is that I forget everything I read unless I try to experiment with the newly gained knowledge. I encourage you to experiment and share, my dear reader.

 


Copyright © 2003, Petar Marinov. Copying license http://linuxgazette.net/copying.html
Published in Issue 97 of Linux Gazette, December 2003
