The Bourne Ambiguity
I have a great ambivalence about the Bourne shell. It is universally available
on any Unix system, and thus is one of the only truly portable, universal ways
of expressing behaviour in a Turing-complete language, such that it is
executable on UNIX without regard to architecture, binary executable formats,
ABIs, what other shells or scripting languages are available, etc. (This
property sees it used somewhat byzantinely in tools such as the makeself
self-extracting archive generator, where binary data is appended to a Bourne
shell script. Not a lot of options.)
It is somewhat tragic, then, that the Bourne shell has such fundamental issues
as not supporting lists. It relies on the horrendous $IFS hack to try and
understand lists expressed via its everything-is-a-string type unsystem.
Particularly troublesome are the potential security implications, or the
tendency of scripts to spontaneously break when, years after they are written,
someone decides to put a space in a filename.
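To make the failure concrete, here is a minimal sketch, assuming a made-up filename `my file.txt` and a hypothetical `count_args` helper (neither is from any real script). It only shows how an unquoted expansion gets split on $IFS while a quoted one does not.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

f='my file.txt'   # hypothetical filename containing a space

# Unquoted: the value is split on $IFS (space, tab and newline by default).
unquoted=$(count_args $f)

# Quoted: the value is passed through as a single argument.
quoted=$(count_args "$f")

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

With the default $IFS, the unquoted call sees two arguments and the quoted call sees one, which is exactly the kind of silent breakage a space in a filename causes.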
The Bourne shell essentially admits this failing by applying special cases
without which it would be unusable. Namely, the quoted string "$@" expands to
a series of quoted strings, a complete divergence from the normal behaviour
applied only when quoting $@. This makes it practical to execute programs
with the verbatim arguments passed to a shell script.
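A small sketch of that special case, assuming two invented arguments set with `set --` to stand in for a script's real command line; `count_args` is again a hypothetical helper.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Simulate a script invoked with two arguments, one containing a space.
set -- "John Smith" "Jane"

quoted_at=$(count_args "$@")     # each argument preserved as its own word
quoted_star=$(count_args "$*")   # all arguments joined into a single string
bare_at=$(count_args $@)         # arguments re-split on $IFS

printf '"$@" gives %s, "$*" gives %s, bare $@ gives %s\n' \
    "$quoted_at" "$quoted_star" "$bare_at"
```

Only "$@" hands on the original two arguments; "$*" collapses them to one and the bare form splits them into three. This is why a wrapper script would typically end with something like `exec some_program "$@"`, where `some_program` is a placeholder.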
I've written a lot of Bourne shell scripts, and although I am downright
obsessive about quoting every variable expansion, there are still severe
ambiguities. For example, if you glob into a variable and then want to iterate
through the items, you're screwed; spaces in filenames strike again. I am
resigned to assuming any shell script I write contains unknown security
disasters, especially when you consider that in the UNIX filesystem, the only
illegal characters for a filename are / and the NUL byte. Filenames can
contain spaces, asterisks, newlines, ASCII control codes, Unicode control
codes...
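The glob-into-a-variable problem can be sketched as follows. This assumes `mktemp -d` is available (it is common but not guaranteed by POSIX) and uses two invented filenames in a throwaway directory. The second half shows one possible workaround: the positional parameters are the only list-like structure plain sh offers, so the glob is expanded into them instead of into a string.

```shell
# Work in a throwaway directory with one filename that contains a space.
dir=$(mktemp -d)
touch "$dir/a b.txt" "$dir/c.txt"

# Fragile: flattening the glob result into one string loses the boundaries
# between filenames, so the loop sees "a" and "b.txt" as separate items.
files=$(echo "$dir"/*.txt)
fragile=0
for f in $files; do fragile=$((fragile + 1)); done

# Safer: expand the glob straight into the positional parameters and
# iterate over them with "$@".
set -- "$dir"/*.txt
safe=0
for f in "$@"; do safe=$((safe + 1)); done

echo "fragile loop saw $fragile items; safe loop saw $safe items"
rm -rf "$dir"
```

The fragile loop counts three items for two files. The workaround is limited: it clobbers the script's own arguments, there is only one such list per function or script, and a glob that matches nothing is left as the literal pattern.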
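This page lists various common errors when writing sh or bash. Perhaps most hilariously, it correctly notes that the following is wrong for an interactive shell (in an interactive bash with history expansion enabled, the ! may be treated as a history reference, even inside double quotes):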
echo "Hello, world!"
Alternatives. As for alternatives to the Bourne shell, few are as widely available: perhaps awk, expect (sometimes useful, but often not installed by default), perl (less and less often installed by default), and bash.
bash is commonly available, and supports some manner of array, which is perhaps safer. A cursory investigation suggests that iterating through a list requires the following syntax:
a=("John Smith" "Jane Doe" "Mary Jones")
for x in "${a[@]}"; do echo "$x"; done
The syntax "${a[@]}" is here special cased again, expanding to a series of
quoted items. bash doesn't escape being a Bourne shell with arrays; it still
comes down to idiosyncratic special casing. Still, the ability to handle
filenames with spaces safely in circumstances other than argument processing is
at least welcome.
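As a rough sketch of what that buys you, the following requires bash (arrays are not part of POSIX sh), assumes `mktemp -d` is available, and uses invented filenames. The glob expands directly into array elements, so no word splitting is involved at any point.

```shell
# Requires bash: arrays are not available in plain POSIX sh.
dir=$(mktemp -d)
touch "$dir/a b.txt" "$dir/c.txt"

# Each matching path becomes exactly one array element.
files=("$dir"/*.txt)
count=${#files[@]}

# "${files[@]}" expands to one quoted word per element.
for f in "${files[@]}"; do
    printf 'found: %s\n' "${f##*/}"
done

rm -rf "$dir"
```

Here `count` is 2 even though one name contains a space, and unlike the positional-parameter trick in plain sh, the script's own arguments are left untouched.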
Obscure shells. If the availability requirement is discarded, there are some interesting possibilities. rc, the Plan 9 shell, does, I understand, support lists, hallelujah. Some manner of Lisp could probably be conceived that feels “native” to the Unix filesystem.