The Command-Line Really Does Matter, Part 3



In Part 2 of this series, we had a look at looping strategies to help us determine which strategy might be best for us under particular circumstances.

One of the other options that we didn’t touch on was using the find command to help us operate on a group of files. We’ll look into find here, and also the set builtin: both tools that have many options, are a source of confusion, and can also be very useful.

Stop fearing find

The find command always seems to bother people. When using find, think, “I’m looking somewhere for something.”

Right now, I have three files in the directory called “stuff” within my home directory: fobar, foobar, and fooxbar. Let’s say I want to get a list of all files matching “foo” in there:

~$ find ~/stuff -name 'foo*' -exec basename '{}' \;
foobar
fooxbar

(I threw that -exec in for you, Terry.)

In Bourne-style shells such as bash and ksh, the tilde character is shorthand for the full path to your home directory. The $HOME variable works too, but it takes more effort to type, so I only reach for it where the tilde won’t expand (inside quotes, for example).

Here, we use the mnemonic look-somewhere-for-something to make sense of things. We’re looking in my stuff folder for things that start with “foo.”

We also use the -exec option to the find command to execute a command on each item found. Since I’m asking find to look in the absolute path to my home directory, all the results that find returns will be absolute paths that include my home directory. I don’t care to share the absolute path to my home directory with you, so I’ll use the basename command to just give me the names of the files in the stuff folder. The {} sequence is special to find; it means “substitute each result here.” The semicolon tells find where the embedded command ends. The backslash escapes it so that the shell passes it through to find instead of treating it as the end of the whole find command line.

Find has a number of expressions/tests that you can chain together to match very specific criteria, such as -type to match a specific file type (e.g., -type d to match only directories), or -mtime to match files by how many days ago they were last modified. You can also combine expressions with operators like -o and -a (“or” and “and,” respectively).
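To make those tests a little more concrete, here is a minimal sketch that chains a few of them together. It builds its own scratch directory with mktemp so it doesn’t depend on my stuff folder; the foodir directory and the variable names are made up for illustration.

```shell
# Scratch directory so the example does not touch any real files.
dir=$(mktemp -d)
mkdir "$dir/foodir"
touch "$dir/fobar" "$dir/foobar" "$dir/fooxbar"

# Tests written side by side are joined with an implicit -a ("and").
# -type f keeps regular files only, so foodir is skipped.
files=$(find "$dir" -type f -name 'foo*' -exec basename '{}' \; | sort)
echo "$files"

# -o means "or". The escaped parentheses group the two -name tests,
# because -a binds more tightly than -o.
either=$(find "$dir" -type f \( -name 'fobar' -o -name 'fooxbar' \) \
  -exec basename '{}' \; | sort)
echo "$either"

# -mtime -1 matches files modified less than one day ago.
recent=$(find "$dir" -type f -mtime -1 -exec basename '{}' \; | sort)
echo "$recent"

# Clean up the scratch directory.
rm -r "$dir"
```

The parentheses are escaped for the same reason as the semicolon: the shell would otherwise try to interpret them itself before find ever saw them.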

Confused yet? Run man find to view the manual page. When sufficiently frustrated, press “q” to exit the manual.

Elements of the set builtin

The GNU Bash manual has a page dedicated to the set builtin command because:

This builtin is so complicated that it deserves its own section.

I’ve found three practical uses for set:

  1. Saving the state of the shell session.
  2. Experimenting with arguments for a yet-to-be function.
  3. Changing editing mode to “vi.”

Saving state

For the first, consider a bunch of shell scripts that use the same environment variables to perform related tasks. Perhaps they need some way to share data, and maybe they don’t all run in the same process. An easy way to do this is to simply dump the environment at the termination of one script and read it on startup of the next.

As an example, let’s make a simple script that saves its state and does something different each time it’s loaded, depending on that saved state:

~/stuff$ cat << 'END' > stateful.ksh
#!/bin/ksh

[ -f state.ksh ] && source ./state.ksh

case $STATE in
step2)
  echo "Hi. This is step 2."
  export STATE=step3
  ;;
step3)
  echo "Hi. This is step 3."
  unset STATE
  ;;
*)
  echo "Hi. This is step 1."
  export STATE=step2
  ;;
esac

set > state.ksh
END
~/stuff$ chmod +x stateful.ksh
~/stuff$ ./stateful.ksh
Hi. This is step 1.
~/stuff$ ./stateful.ksh
Hi. This is step 2.
~/stuff$ ./stateful.ksh
Hi. This is step 3.
~/stuff$ ./stateful.ksh
Hi. This is step 1.

We create a case block that just changes one variable in each clause. On termination, the script dumps every shell variable to a file, which is read back in on startup.

Each time we execute the script, we match a different clause, depending on the variable set in the previous execution.

Notably, we use ksh-93 here instead of bash. This will work in bash, too, but sourcing the saved file produces errors about read-only variables such as UID and PPID, which bash includes in its set output.
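One way to sidestep the read-only variable errors is to save only the variable the script cares about rather than the whole set dump. What follows is a rough sketch in plain POSIX sh, not a drop-in replacement for the script above: run_step and the temporary state file are hypothetical names, and it assumes the value of STATE never contains a single quote.

```shell
# Hypothetical location for the saved state.
state_file=$(mktemp)

run_step() {
  STATE=
  # Read the previous state back in, if anything has been saved yet.
  [ -s "$state_file" ] && . "$state_file"

  case $STATE in
    step2) echo "Hi. This is step 2."; STATE=step3 ;;
    step3) echo "Hi. This is step 3."; STATE= ;;
    *)     echo "Hi. This is step 1."; STATE=step2 ;;
  esac

  # Save only the one variable we care about, single-quoted.
  printf "STATE='%s'\n" "$STATE" > "$state_file"
}

# Four runs walk through steps 1, 2, 3 and back to 1.
run_step
run_step
run_step
run_step
```

The trade-off is that every variable worth keeping has to be listed by hand, where the set dump picks up everything for free.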

Prototyping functions

For the second “practical” use case, set can be handy when prototyping a function that accepts arguments.

Consider a simple function, maybe a one-liner, even. What if we just want to create an “echo”-like function that reverses the order of the first three arguments? This is a silly function, but it’s also a simple example.

We could write the function first and then try it out, or we could test it before we “officially” make it, as a twisted interpretation of Test-Driven Development.

~/stuff$ set one two three
~/stuff$ echo $3 $2 $1
three two one
~/stuff$ # This is what I want, so I'll make a function out of it.
~/stuff$ function silly {
> echo $3 $2 $1
> }
~/stuff$ silly one two three
three two one

Here, we use the set builtin to assign the positional parameters ($1, $2, and so on) in-line. We review what happens when we use them in a certain way, and then we make a function that performs that action. The function can then be saved in a script for later use.
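A slightly more defensive take on the same experiment, as a sketch: set -- stops set from treating an argument that begins with a dash as one of its own options, and double quotes keep an argument containing spaces in one piece. The sample arguments and the reversed variable are made up for illustration, and the function uses the portable name() form rather than the function keyword.

```shell
# "--" marks the end of options, so everything after it is an argument.
set -- one "two words" three

# $# holds the number of positional parameters; prints 3.
echo "$#"

# Try out the behavior before wrapping it in a function.
reversed="$3 $2 $1"
echo "$reversed"

# Same logic as a function, with its own positional parameters.
silly() {
  echo "$3 $2 $1"
}

silly one "two words" three
```

Inside the function, $1, $2, and $3 refer to the function’s own arguments, not to whatever set assigned at the prompt.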

Changing editing mode

As much as I love shortcuts like the bang (!) for repeating history, or the caret (^) for history substitution, sometimes it’s nice to just be able to efficiently move around on a line.

By default, bash uses “emacs” mode. If you’re an emacs fan, good for you. I like that set allows me to change the way I can move around on the command-line using keystrokes familiar to me from the “vi” text-editing world. (If you are a fan of Bourne shells because of their availability in every Unix-like environment, then you ought to be a fan of vi as well.)

With a simple set -o vi I can now move backwards by pressing Escape and typing b, append to the end of the line with A, trim off extra junk with D, etc. I can even edit commands in my text editor of choice (vi, no surprise) using v.

Even more: Stop when there’s an error

My colleague Nat also points out another great use for set:

It seems odd to discuss set without mentioning set -e. This tells bash to exit if any command returns a non-zero code, and is the first line you should add to any bash script. It’s an important part of making bash behave like a real programming language.
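To see the effect without touching your own session, here is a small sketch that runs each case in a throwaway child shell via sh -c. The variable names are made up. It also shows one caveat worth knowing: a failure that is handled by ||, &&, or an if test does not trigger the exit.

```shell
# Without set -e, the shell carries on after a failing command.
without=$(sh -c 'false; echo "still running"')

# With set -e, the shell exits at the first failing command,
# so the echo never runs. "|| true" swallows the child's exit code.
with=$(sh -c 'set -e; false; echo "still running"' || true)

# A failure handled by || does not count, so the shell keeps going.
guarded=$(sh -c 'set -e; false || echo "handled"; echo "still running"')

echo "without set -e: $without"
echo "with set -e:    $with"
echo "guarded:        $guarded"
```

The second line of output shows nothing after the label, because the child shell quit before it reached its echo.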

Go ahead and prank your co-worker by issuing a:

echo "set -e" >> ~/.profile

when they’ve stepped away from their desk.

Check out Part 4 for a dive into file permissions.



Ian Melnick
Senior Software Engineer