What is the reason for the weird syntax of the “ca

Posted 2020-02-04 06:49

Question:

From a programmer's point of view, shell script is just another programming language, where one has to learn and conform to the rules of the language. However, I have to admit that this syntax is the weirdest style I have ever seen in a rather commonly used language. Did the shell take this syntax from an older language that it descends from? Is there a special implication or meaning in the syntax?

As an example, here is a little snippet that I took from another post on SO:

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    status)
        check_status
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac

Looking at this, firstly I can see that case ends with esac, which is its reversed form (like if ending in fi). Secondly I understand that each case is followed by a ). Fair enough, but why on earth do I need two ; at the end of each statement? I would also say that the ) without an accompanying ( is ugly.

I'm looking for more information about the historical aspect of the language, but I'm open to technical reasons as well.

Answer 1:

Per request:

  • So can you guess why a loop is 'for ...; do ...; done' and not 'for ...; do ...; od'? There was a sound reason for it - but the Algol-like reversed keyword to mark the end was used elsewhere.

Answer:

  • The syntax came from Bourne (of Bourne shell fame). He had worked on Algol, and liked it enough to model some of the shell syntax on Algol. Algol uses reversed keywords to mark the ends of constructs, so 'case ... esac' was appropriate. The reason that loops do not end with 'od' is that there was already a command 'od' in Unix - octal dump. So, 'done' is used instead.
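To make the 'od' point concrete, here is a minimal sketch. The function name join_words is made up for illustration and is not from the original post. The loop closes with 'done', and 'od' is then free to be run as an ordinary command on the loop's output:

```shell
# Hypothetical helper (not from the original post): concatenate the arguments
# using a for ... do ... done loop.
join_words() {
    for word in "$@"; do
        printf '%s' "$word"
    done
}

# od is an ordinary command (octal dump), so it could not double as a keyword.
# Here it dumps the characters produced by the loop; exact spacing of the
# output depends on the od implementation.
join_words a b c | od -An -c
```

If 'od' had been a reserved word closing the loop, a pipeline like the last line would have been ambiguous at best.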

By reputation, the Bourne shell source code was written in idiosyncratic C with macros to make it look like Algol. This made it hard to maintain.

With respect to the main question - about why no opening bracket (parenthesis) around the alternatives in the case statement - I have a couple of related theories.

First of all, back when the Bourne shell was written (late 1970s), much editing was done with 'ed', the standard text editor. It has no concept of skipping to a balanced parenthesis or other such notations, so there was no requirement for a leading parenthesis. Also, if you are writing a document, you might well marshal your arguments with:

a) ...blah...
b) ...more...
c) ...again...

The opening parenthesis is often omitted - and the case statement would fit into that model quite happily.

Of course, since then, we have grown used to editors that mark the matching open parenthesis when you type a close parenthesis, so the old Bourne shell notation is a nuisance. The POSIX standard makes the leading parenthesis optional; most modern POSIX-like shells (Korn, Bash, Zsh) support it, and I generally use it when I don't have to worry about portability to machines like Solaris 10, where /bin/sh is still a faithful Bourne shell that does not allow the leading parenthesis. (I usually deal with that by using #!/bin/ksh as the shebang.)



Answer 2:

The reason for using ;; is that a single ; is already used to separate multiple statements on one line, like:

restart)
   stop; start;;
...
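A fuller sketch of the same idea, with every branch written on one line. The function name dispatch and the echoed strings are assumptions made up for this example; only the use of ; between commands and ;; to end a branch comes from the discussion above:

```shell
# Hypothetical dispatcher (names and messages are illustrative only).
dispatch() {
    case "$1" in
        start)   echo "starting" ;;
        stop)    echo "stopping" ;;
        # A single ; separates two commands inside one branch;
        # the double ;; is what actually ends the branch.
        restart) echo "stopping"; echo "starting" ;;
        *)       echo "usage: dispatch {start|stop|restart}" >&2; return 1 ;;
    esac
}

dispatch restart
```

If a single ; ended a branch, the restart line could not hold two commands without some other separator.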


Answer 3:

Bash can accept matching parentheses:

case "$1" in
    (start)
        start
        ;;
    (stop)
        stop
        ;;

    etc.


Answer 4:

The closing parenthesis is sometimes used in lists in natural language, like

1) do this
2) do that

The reversed keywords were taken from some form of Algol but are in fact a very good idea for interactive use. They clearly demarcate the end of a construct, including if/else.

For example, with a C-like syntax, after this has been parsed:

if (condition)
    command here;

is there an else coming or not? rc, a shell from Plan 9 with a more C-like syntax, solves this by providing if not instead of else but it is not pretty.

With Bourne shell syntax, you'll have either else or fi and there is no need to read additional input.
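For comparison, a minimal sketch of the Bourne-style form. The function name sign_of is hypothetical and used only for illustration. Once the shell has read the first branch, the next keyword must be elif, else, or fi, so it always knows whether the construct is finished:

```shell
# Hypothetical function, used only to illustrate if/else/fi.
sign_of() {
    if [ "$1" -gt 0 ]; then
        echo "positive"
    else
        echo "not positive"
    fi    # fi closes the construct; nothing further needs to be read
}

sign_of 3
sign_of -2
```

Typed at an interactive prompt, the same if statement runs as soon as the line containing fi is entered.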