Team LiB

#33 Wrapping Only Long Lines

One limitation of the fmt command and its shell script equivalent, Script #14, is that they wrap and fill everything they encounter, whether it makes sense to do so or not. This can mess up email (wrapping your .signature is not good, for example) and many other input file formats.

What if you have a document in which you want to wrap just the long lines but leave everything else intact? With the default set of commands available to a Unix user, there's only one possible way to accomplish this: Explicitly step through each line in an editor, feeding the long ones to fmt one by one (for example, in vi you could move the cursor onto the line in question and then use !$fmt to accomplish this).

Yet Unix has plenty of tools that can be combined to accomplish just what we seek. For example, to quickly scan a file to see if any lines are too long:

awk '{ if (length($0) > 72) { print $0 } }'
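
The same scan can also report where the long lines are. The following is a minimal sketch, not part of the original script: the function name longlines and its optional second argument (the maximum width, assumed to default to 72) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "line number: length" for each over-long line.
#   $1 = file to scan, $2 = maximum width (assumed default: 72)
longlines() {
  awk -v max="${2:-72}" 'length($0) > max { print NR ": " length($0) }' "$1"
}
```

Calling longlines ragged.txt would then list only the offending lines, and an empty result means nothing needs wrapping.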

A more interesting path to travel, however, is to use the ${#varname} construct in the shell, which returns the length of the contents of the variable named varname.
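
As a quick illustration of the length expansion, which is written with braces as ${#varname}, the sketch below stores a string in a variable and tests its length. The variable names line and len are chosen for this example only.

```shell
#!/bin/sh
# Illustrative only: "line" and "len" are names made up for this sketch.
line="Wonderland"
len=${#line}              # number of characters stored in $line
echo "$len"

if [ ${#line} -gt 5 ] ; then
  echo "longer than 5 characters"
fi
```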

The Code

#!/bin/sh
# toolong - Feeds the fmt command only those lines in the input stream that are
#     longer than the specified length.

width=72

if [ ! -r "$1" ] ; then
  echo "Usage: $0 filename" >&2; exit 1
fi

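# IFS= and -r keep leading whitespace and backslashes in each line intact;
# printf avoids echo's shell-dependent handling of backslash sequences.
while IFS= read -r input
  do
     if [ ${#input} -gt $width ] ; then
       printf '%s\n' "$input" | fmt
     else
       printf '%s\n' "$input"
     fi
  done < "$1"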

exit 0

How It Works

The method of processing the input file in this script is interesting. Notice that the file is fed to the while loop by the simple input redirection that follows done, and that the read command then picks up one line per iteration, assigning each line of the file in turn to the input variable.
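
The same loop shape works for other per-line jobs. As a minimal sketch, the hypothetical function below (the name longest and its variables are not from the original script) uses the redirect-into-while pattern to find the length of the longest line in a file.

```shell
#!/bin/sh
# Hypothetical helper: print the length of the longest line in file $1.
longest() {
  max=0
  # The redirection after "done" feeds the whole file to the loop;
  # read then hands back one line per iteration.
  while IFS= read -r line
  do
    if [ ${#line} -gt $max ] ; then
      max=${#line}
    fi
  done < "$1"
  echo "$max"
}
```

Because the redirection is attached to the loop itself, the file is opened once and each call to read continues where the previous one stopped.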

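If your shell doesn't have the ${#var} notation, you can emulate its behavior with wc. Feed it the value with printf rather than echo, because echo appends a newline that wc -c would count as an extra character:

varlength="$(printf '%s' "$var" | wc -c)"

However, some versions of wc have a very annoying habit of prefacing their output with spaces to get values to align nicely in the output listing. To sidestep that pesky problem, a slight modification, which lets only digits through the final pipe step, is necessary:

varlength="$(printf '%s' "$var" | wc -c | sed 's/[^[:digit:]]//g')"

Keep in mind that wc -c counts bytes, so the result matches ${#var} only when every character occupies a single byte.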
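
If the length is needed in more than one place, the wc pipeline can be wrapped in a small function. This is a sketch under stated assumptions: the name strlen is hypothetical, printf is used so that no trailing newline is counted, and tr -cd stands in for sed as the step that discards everything except digits.

```shell
#!/bin/sh
# Hypothetical helper: emulate ${#var} using wc.
strlen() {
  # printf '%s' emits the value without a trailing newline, so wc -c
  # counts only the bytes of the string; tr -cd '0-9' then removes any
  # padding that wc adds around the number.
  printf '%s' "$1" | wc -c | tr -cd '0-9'
}

varlength="$(strlen "sneeze")"
echo "$varlength"
```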

Running the Script

This script accepts exactly one filename as its input.

The Results

$ toolong ragged.txt
So she sat on, with closed eyes, and half believed herself in
Wonderland, though she knew she had but to open them again, and
all would change to dull reality--the grass would be only rustling
in the wind, and the pool rippling to the waving of the reeds--the
rattling teacups would change to tinkling sheep-bells, and the
Queen's shrill cries to the voice of the shepherd boy--and the
sneeze
of the baby, the shriek of the Gryphon, and all the other queer
noises, would change (she knew) to the confused clamour of the busy
farm-yard--while the lowing of the cattle in the distance would
take the place of the Mock Turtle's heavy sobs.

Notice that, unlike a standard invocation of fmt, toolong has retained line breaks where possible, so the word "sneeze," which is on a line by itself in the input file, is also on a line by itself in the output.
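
To see the difference on a small scale, the sketch below runs a three-line sample through plain fmt and through a stdin-based variant of the loop. The function name wrap_long and its optional width argument are hypothetical and exist only for this comparison. It assumes an fmt that fills paragraphs in the usual way.

```shell
#!/bin/sh
# Hypothetical stdin-based variant of the toolong loop, for comparison only.
#   $1 = maximum width (assumed default: 72)
wrap_long() {
  while IFS= read -r line
  do
    if [ ${#line} -gt "${1:-72}" ] ; then
      printf '%s\n' "$line" | fmt
    else
      printf '%s\n' "$line"
    fi
  done
}

sample='of the shepherd boy--and the
sneeze
of the baby'

printf '%s\n' "$sample" | fmt         # fmt refills the whole paragraph
printf '%s\n' "$sample" | wrap_long   # short lines pass through untouched
```

Since none of the sample lines exceeds the width, wrap_long reproduces the input exactly, while fmt is free to move "sneeze" onto a neighboring line.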

