Team LiB

#21 Digging Around in the Man Page Database

The Unix man command has a tremendously useful option that produces a list of man pages whose descriptions include the specified word. Usually this functionality is accessible as man -k word, but it can also be invoked using the apropos command. (The related whatis command, equivalent to man -f, matches complete page names only.)

Searching for a word with the man command is helpful, but it's really only half the story, because once you have a set of matches, you still might find yourself performing a brute-force search for the specific command you want, going one man page at a time.

As a smarter alternative, this script generates a list of possible man page matches for a particular pattern and then searches each of those matching pages for a second search pattern. To constrain the output a bit more, it also allows the user to specify which section of the man pages to search.

Note 

As a reminder, the man pages are organized by number: 1 = user commands, 3 = library functions, 8 = administrative tools, and so on. You can use man intro to find out your system's organizational scheme.

The Code

#!/bin/sh

# findman -- Given a specified pattern and man section, shows all the matches
#   for that pattern from within all relevant man pages.

match1="/tmp/$(basename "$0").1.$$"     # basename: $0 may include a path
matches="/tmp/$(basename "$0").$$"
manpagelist=""

trap "rm -f $match1 $matches" EXIT

case $#
in
  3 ) section="$1"  cmdpat="$2"  manpagepat="$3"            ;;
  2 ) section=""    cmdpat="$1"  manpagepat="$2"            ;;
  * ) echo "Usage: $0 [section] cmdpattern manpagepattern" >&2
      exit 1
esac

if ! man -k "$cmdpat" | grep "($section" > $match1 ; then
  echo "No matches to pattern \"$cmdpat\". Try something broader?" >&2; exit 1
fi

cut -d\( -f1 < $match1 > $matches       # command names only
cat /dev/null > $match1                 # clear the file...

for manpage in $(cat $matches)
do
  manpagelist="$manpagelist $manpage"
  man $manpage | col -b | grep -i "$manpagepat" | \
    sed "s/^/${manpage}: /" | tee -a $match1
done

if [ ! -s $match1 ] ; then
cat << EOF
Command pattern "$cmdpat" had matches, but within those there were no
matches to your man page pattern "$manpagepat".
Man pages checked:$manpagelist
EOF
fi

exit 0

How It Works

This script isn't quite as simple as it may seem at first glance. It uses the fact that commands issue a return code depending on the result of their execution to ascertain whether there are any matches to the cmdpat value. The return code of the grep command in the following line of code is what's important:

if ! man -k "$cmdpat" | grep "($section" > $match1 ; then

If grep fails to find any matches, it returns a nonzero return code, and because the return code of a pipeline is that of its last command, this is the value the if statement tests. The script can therefore tell whether anything matched without separately checking whether $match1 is empty, which saves an extra test.
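The return-code idiom is easy to try in isolation. What follows is a minimal sketch rather than part of findman: the function name has_section_match and the two sample lines are made up for illustration, and printf stands in for man -k so that the example runs even on a system without a man page database.

```shell
#!/bin/sh

# Hypothetical helper: succeeds only if some input line contains "(<section>".
# An if statement tests the return code of the last command in a pipeline,
# which here is grep: 0 when a line matched, 1 when none did.
has_section_match()
{
  grep "($1" > /dev/null
}

# Made-up stand-in for the output of man -k
sample='httpd (8) - Apache hypertext transfer protocol server
printf (3) - formatted output conversion'

if printf '%s\n' "$sample" | has_section_match 8 ; then
  echo "section 8: match"
fi

if ! printf '%s\n' "$sample" | has_section_match 5 ; then
  echo "section 5: no match"
fi
```

Both echo statements run: the first because grep found a line containing (8, the second because the ! inverts grep's failure to find (5.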

Each resultant line of output in $match1 has a format shared with the following line:

httpd               (8)  - Apache hypertext transfer protocol server

The cut -d\( -f1 sequence grabs from each line of output the text that precedes the open parenthesis, which is the command name, and discards the rest of the output. (The padding left after the name disappears when the for loop splits the list into words.) Once the list of matching command names has been produced, the man page for each command is searched for the manpagepat. To search man pages, however, the embedded display formatting (which otherwise would produce boldface text) must be stripped, which is the job of col -b.
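To see what these two filters do, it helps to run them on known input. The following sketch assumes nothing beyond the sample line shown earlier; the variable names line, name, and word are illustrative only, and the col demonstration is skipped if col isn't installed.

```shell
#!/bin/sh

line='httpd               (8)  - Apache hypertext transfer protocol server'

# With "(" as the delimiter, field 1 is everything before the first "(".
name=$(printf '%s\n' "$line" | cut -d\( -f1)
echo "[$name]"          # the padding after the name is still there

# An unquoted expansion in a for loop splits on whitespace, dropping the padding.
for word in $name
do
  echo "[$word]"        # prints [httpd]
done

# Boldface in formatted man output is each character, a backspace, and the
# same character again. col -b keeps only the last character in each column.
if command -v col > /dev/null 2>&1 ; then
  printf 'N\bNA\bAM\bME\bE\n' | col -b      # prints NAME
fi
```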

To ensure that a meaningful error message is generated in the case where there are man pages for commands that match the cmdpat specified, but manpagepat does not occur within those man pages, the following line of code copies the output into a temp file ($match1) as it's streamed to standard output:

sed "s/^/${manpage}: /" | tee -a $match1

Then if the ! -s test shows that the $match1 output file is empty (the -s test succeeds only for a file larger than zero bytes), the error message is displayed.
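The tee-and-test pattern can be sketched on its own as well. In this example, page_text is a hypothetical stand-in for man $manpage | col -b, search_page and log are illustrative names, and mktemp (not used by findman itself) is assumed to be available for creating the scratch file.

```shell
#!/bin/sh

log=$(mktemp) || exit 1         # scratch file, playing the role of $match1
trap 'rm -f "$log"' EXIT

# Hypothetical stand-in for "man $manpage | col -b": two lines of page text.
page_text()
{
  printf 'ServerRoot. The default is conf/httpd.conf.\nSee also apxs.\n'
}

# Usage: search_page name pattern
# Shows each matching line, tagged with the page name, and appends it to $log.
search_page()
{
  page_text | grep -i "$2" | sed "s/^/$1: /" | tee -a "$log"
}

search_page httpd nomatch       # prints nothing, appends nothing
if [ ! -s "$log" ] ; then
  echo "log is empty"
fi

search_page httpd httpd.conf    # prints the tagged line and appends it
if [ -s "$log" ] ; then
  echo "log has content"
fi
```

Because tee -a appends rather than overwrites, the file accumulates matches from every page in the loop, so a single -s test after the loop covers all of them.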

Running the Script

To search within a subset of man pages for a specific pattern, first specify the keyword or pattern to determine which man pages should be searched, and then specify the pattern to search for within the resulting man page entries. To further narrow the search to a specific section of man pages, specify the section number as the first parameter.

The Results

Finding references to the httpd.conf file in the man page database is problematic with the standard Unix toolset. On systems with Perl installed, you'll find a reference to a Perl module:

$ man -k httpd.conf
Apache::httpd_conf(3)    - Generate an httpd.conf file

But almost all Unixes without Perl return either "nothing appropriate" or nothing at all. Yet httpd.conf is definitely referenced within the man page database. The problem is that man -k checks only the one-line summaries of the commands, not the entire man pages (it's not a full-text indexing system).

This limitation of man -k is exactly where the findman script proves useful, because it handles just this sort of needle-in-a-haystack search. To search all man pages in section 8 (Administration) that have something to do with Apache and that also mention httpd.conf specifically, you would use the following command. The results show the exact matches in both relevant man pages, apxs and httpd:

$ findman 8 apache httpd.conf
apxs:    [activating module `foo' in /path/to/apache/etc/httpd.conf]
apxs:              Apache's httpd.conf configuration file, or by
apxs:              httpd.conf configuration file without attempt-
apxs:        the httpd.conf     file accordingly. This can be achieved by
apxs:    [activating module `foo' in /path/to/apache/etc/httpd.conf]
apxs:    [activating module `foo' in /path/to/apache/etc/httpd.conf]
httpd:             ServerRoot. The default is conf/httpd.conf.
httpd:        /usr/local/apache/conf/httpd.conf

Searching just within section 8 quickly identified two man pages worth exploring for information about the httpd.conf file. Yet searching across all man pages in the system is just as easy:

$ findman apache .htaccess
mod_perl:        In an httpd.conf <Location /foo> or .htaccess you need:
mod_perl:        dlers are not allowed in .htaccess files.
