
Turning the bad and ugly into the good

COBOL has been with us for a long time. Early versions of the language contained "features" that forced programmers to write code that could only be described as "bad and ugly". So much so that COBOL is the only programming language specifically mentioned in the Tao of Programming...

Each language has its purpose, however humble. Each language expresses the Yin and Yang of software. Each language has its place within the Tao.

But do not program in COBOL if you can avoid it

The COBOL language has undergone significant renovation since those words were written. But tradition dies hard in the world of COBOL, and enough "bad and ugly" code had already been written that many programmers were left with the impression that this is how COBOL should be written - or worse yet, must be written. This impression has led to much of the hate and loathing for the language found among less informed programmers.

The following sections illustrate the worst of COBOL. Some of these nasty "features" were corrected with the COBOL-85 standard - but not eliminated.

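Things to avoid:

  • IF without END
  • One misplaced period can ruin your life
  • PERFORM THRU
  • Structured Procedure names
  • NEXT SENTENCE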

Eliminate these few bad and ugly legacy constructs and COBOL takes on a whole new character - a good character making it very pleasant to work with.

Jerome Garfunkel provides a brief history of COBOL in his essay: Modern COBOL: Its time to take the gloves off. Although this essay is somewhat dated it still makes for a good read - particularly with respect to the positive impact of the COBOL-85 standard.


IF without END

The single biggest improvement to COBOL occurred with the COBOL-85 standard, when explicit scope terminators were added to the language. This happened over 25 years ago.

Explicit scope terminators, such as END-IF, END-PERFORM, etc. (IBM Enterprise COBOL contains 20 explicit END-something terminators), enable the programmer to transform what once had to be a conditional statement into an imperative statement. Big deal... Yes, it is a big deal. Prior to COBOL-85 one could not structure IF statements in the following way:

             statements
             IF some-condition-a
                statement-group-1
                IF some-condition-b
                   statement-group-2
                END-IF
                statement-group-3
             END-IF
             more-statements
             .
The best one could do was:
             statements
             IF some-condition-a
                statement-group-1
                IF some-condition-b
                   statement-group-2
                   .
             IF some-condition-a
                statement-group-3
                .
             more-statements
             .
In the example above, the same IF condition had to be coded, and executed, twice. An alternative coding might also have been:
             statements
             IF some-condition-a
                statement-group-1
                PERFORM TEST-FOR-AND-DO-CONDITION-B
                statement-group-3
                .
             more-statements
             .
        TEST-FOR-AND-DO-CONDITION-B.
             IF some-condition-b
                statement-group-2
             .
Here the nested IF was replaced by an imperative PERFORM. The PERFORMed paragraph isolated the scope of its contained conditional IF statement.

The only way to terminate the scope of a conditional statement prior to COBOL-85 was to end the sentence containing it. Unfortunately, ending a sentence terminated all not yet terminated statements. This "feature" prevented selective statement scoping within nested conditional statements. In actual fact, it prevented all conditional statement nesting, except for the IF statement itself. IF statements, as illustrated above, could only be nested in very limited ways. The result was both bad and ugly.

Always use explicit scope terminators with the following statements:

  • IF ... END-IF
  • in-line PERFORM ... END-PERFORM
  • EVALUATE ... END-EVALUATE
Several COBOL verbs contain optional conditional clauses. Always terminate statements containing conditional clauses with the appropriate explicit scope terminator. A few examples of statements that may contain conditional clauses are:
  • ADD ... END-ADD
  • CALL ... END-CALL
  • COMPUTE ... END-COMPUTE
  • READ ... END-READ
  • STRING ... END-STRING
Explicit scope terminators bring structure to COBOL - use them.
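
As a minimal sketch of the advice above, the following program nests an ADD carrying a conditional ON SIZE ERROR clause inside an IF, all within a single sentence. The program name, field names and values are hypothetical and were chosen only so that the ADD overflows. Without END-ADD, the second ADD would become part of the ON SIZE ERROR clause.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SCOPEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical fields, sized so that the first ADD overflows.
       01  WS-TOTAL           PIC 9(2) VALUE 95.
       01  WS-AMOUNT          PIC 9(2) VALUE 10.
       01  WS-COUNT           PIC 9(2) VALUE ZERO.
       PROCEDURE DIVISION.
       MAINLINE.
           IF WS-AMOUNT > ZERO
              ADD WS-AMOUNT TO WS-TOTAL
                 ON SIZE ERROR
                    DISPLAY 'TOTAL OVERFLOWED'
              END-ADD
      *       END-ADD closed the ON SIZE ERROR clause, so this ADD
      *       runs whether or not the overflow occurred.
              ADD 1 TO WS-COUNT
           END-IF
           DISPLAY 'COUNT IS: ' WS-COUNT
           GOBACK
           .
```

With these values the program should display TOTAL OVERFLOWED followed by COUNT IS: 01.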

One misplaced period can ruin your life

The separator period (a period followed by at least one space) serves multiple functions in COBOL. Within the PROCEDURE DIVISION it is required to delimit SECTION and PARAGRAPH names. It is also used to terminate sentences. This last usage has led to a lot of grief over the years.

Ending a sentence (a sentence is a sequence of one or more statements that ends with a separator period) was the only mechanism available to terminate the scope of a conditional statement prior to the introduction of explicit scope terminators. The problem with the separator period is that it does not stand out particularly well. It is easy to miss and just about as easy to assume when reading code. Misplacing a period, or forgetting to put one in, changes the entire semantics of a program. Adding insult to injury is the separator comma. You guessed it, a separator comma is a comma followed by at least one space. A comma can look a lot like a period but it does not have the same semantics.

Consider the following code fragment:

            IF some-condition
               statement-a
               statement-b
               statement-c
            statement-d.
            statement-e.
            .
One might assume that the scope of the IF statement ends with statement-c. That would imply statement-d and statement-e are unconditionally executed. Not so: the sentence ending (scope terminating) period is after statement-d, which is firmly under the scope of the IF statement.

The following is no better. Notice that the "terminator" after statement-c is a comma. Commas do not terminate anything - they are semantically equivalent to spaces.

            IF some-condition
               statement-a
               statement-b
               statement-c,
            statement-d.
            statement-e.
            .
The solution to all of this ugliness is to use explicit scope terminators. When explicit scope terminators are used, only a single sentence should ever be required within a SECTION or PARAGRAPH. Periods are no longer required to delimit statement scope. Avoid separator periods except where language syntax explicitly requires them.
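
A sketch of the fragment above recoded in this style might look as follows. WS-SWITCH is a hypothetical stand-in for some-condition and the DISPLAY statements stand in for statement-a through statement-e. The paragraph contains a single sentence, and END-IF leaves no doubt about where the conditional scope ends.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ONEPERIOD.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical switch standing in for some-condition.
       01  WS-SWITCH          PIC X VALUE 'N'.
       PROCEDURE DIVISION.
       MAINLINE.
           IF WS-SWITCH = 'Y'
              DISPLAY 'STATEMENT-A'
              DISPLAY 'STATEMENT-B'
              DISPLAY 'STATEMENT-C'
           END-IF
      *    The two statements below are outside the IF and always run.
           DISPLAY 'STATEMENT-D'
           DISPLAY 'STATEMENT-E'
           GOBACK
           .
```

Because the switch is 'N', only STATEMENT-D and STATEMENT-E should be displayed.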

PERFORM THRU

(A procedure can be either a section or a paragraph. They are semantically equivalent in COBOL. Do not confuse PERFORMing a procedure with CALLing a subprogram (subroutine) or function.)

The idea of executing a contiguous sequence of procedures using a construct like:
            PERFORM procedure-1 THRU procedure-2
is a bit odd to a non-COBOL programmer. The above PERFORM statement will execute all of the code starting from procedure-1 through to the end of procedure-2 regardless of however many intervening procedures occur between the two. What is going on here?

Once upon a time, long long ago, before the advent of END-IF, the following coding style was very common. In fact it was pretty much the standard way of doing things.

       P-211-BUILD-INVOICE.
           statements...
           PERFORM P-2111-BUILD-INVOICE-LINE
              THRU P-2111-BUILD-INVOICE-LINE-EXIT.
           PERFORM P-2112-ACCUMULATE-TOTAL
              THRU P-2112-ACCUMULATE-TOTAL-EXIT.
           statements...
           .
       P-2111-BUILD-INVOICE-LINE.
           statements...
           IF WS-SKIP-THIS-ITEM-FLAG = "YES"
      * Escape!
              GO TO P-2111-BUILD-INVOICE-LINE-EXIT.
           statements...
           .
       P-2111-BUILD-INVOICE-LINE-EXIT.
           EXIT
           .
       P-2112-ACCUMULATE-TOTAL.
           ADD WS-INVOICE-LINE-ITEM-AMT TO WS-INVOICE-TOT-AMT
            ON SIZE ERROR PERFORM E-INVOICE-OVERFLOW-ROUTINE.
           ADD +1 TO WS-INVOICE-ITEM-COUNT
           .
       P-2112-ACCUMULATE-TOTAL-EXIT.
           EXIT
           .
Procedures were coded in pairs, one as an entry point, one as an exit point. PERFORM THRU was then used to execute them as if they constituted a single unified procedure. The procedure serving as the exit point only ever contained a single statement: EXIT. A programmer could then "escape" from nested conditional statements using a "structured" GO TO statement. (The "structured" GO TO statement was a practical response to the lack of explicit scope terminators; see: IF without END.) This technique is illustrated above at the line marked "Escape!".

The EXIT statement does nothing. Do not be confused - it does not exit anything. The return from a PERFORMed procedure is a result of passing through the last statement of the last named procedure of the invoking PERFORM statement.

The no-op nature of EXIT can be demonstrated by recoding the P-211-BUILD-INVOICE paragraph as:

       P-211-BUILD-INVOICE.
           statements...
           PERFORM P-2111-BUILD-INVOICE-LINE
              THRU P-2112-ACCUMULATE-TOTAL-EXIT.
           statements...
           .
This version of the P-211-BUILD-INVOICE procedure has exactly the same semantics as the original. Beautiful... Not really! Many programmers make use of this "feature" to perform groups of procedures that are not obviously related to each other.

Maintaining a program written in this manner is difficult. One must be very careful where new procedures are inserted. If a new procedure were required to do something like calculate reward points, the programmer must take care to insert it where it is not accidentally contained in some performed range. (One could argue that a programmer would not insert a new procedure in the wrong place because structured procedure names as used here would prevent it. See the next section, where I take a shot at that line of reasoning too...) For example, inserting the new procedure P-2113-CALC-REWARD-POINTS after P-2111-BUILD-INVOICE-LINE-EXIT would work well with:

       P-211-BUILD-INVOICE.
           statements...
           PERFORM P-2111-BUILD-INVOICE-LINE
              THRU P-2111-BUILD-INVOICE-LINE-EXIT.
           PERFORM P-2112-ACCUMULATE-TOTAL
              THRU P-2112-ACCUMULATE-TOTAL-EXIT.
           PERFORM P-2113-CALC-REWARD-POINTS
              THRU P-2113-CALC-REWARD-POINTS-EXIT.
           statements...
           .
but would result in double calculation of reward points with:
       P-211-BUILD-INVOICE.
           statements...
           PERFORM P-2111-BUILD-INVOICE-LINE
              THRU P-2112-ACCUMULATE-TOTAL-EXIT.
           PERFORM P-2113-CALC-REWARD-POINTS
              THRU P-2113-CALC-REWARD-POINTS-EXIT.
           statements...
           .
This type of bug can go undetected for a long time.
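
The double calculation can be reproduced with a small sketch. The paragraph names are taken from the examples above; the WS-REWARD-POINTS counter, the value added to it and the CONTINUE placeholder bodies are assumptions made so that the program can run on its own.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. THRUBUG.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical counter; the original names no such field.
       01  WS-REWARD-POINTS   PIC 9(3) VALUE ZERO.
       PROCEDURE DIVISION.
       P-211-BUILD-INVOICE.
           PERFORM P-2111-BUILD-INVOICE-LINE
              THRU P-2112-ACCUMULATE-TOTAL-EXIT
           PERFORM P-2113-CALC-REWARD-POINTS
              THRU P-2113-CALC-REWARD-POINTS-EXIT
           DISPLAY 'REWARD POINTS: ' WS-REWARD-POINTS
           GOBACK
           .
       P-2111-BUILD-INVOICE-LINE.
           CONTINUE
           .
       P-2111-BUILD-INVOICE-LINE-EXIT.
           EXIT
           .
      * Inserted here, the new pair sits inside the first range.
       P-2113-CALC-REWARD-POINTS.
           ADD 10 TO WS-REWARD-POINTS
           .
       P-2113-CALC-REWARD-POINTS-EXIT.
           EXIT
           .
       P-2112-ACCUMULATE-TOTAL.
           CONTINUE
           .
       P-2112-ACCUMULATE-TOTAL-EXIT.
           EXIT
           .
```

The first PERFORM falls through the new paragraphs on its way to P-2112-ACCUMULATE-TOTAL-EXIT and the second PERFORM runs them again, so the program should display REWARD POINTS: 020 rather than 010.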

Maintaining programs containing PERFORM THRU can become quite difficult when the same procedure is contained in multiple ranges or when several (more than two) procedures are included in a performed range.

So, why would anyone want to write programs in this manner? The short answer is that before COBOL-85 and the introduction of explicit scope terminators it was a very practical way of managing conditional scope termination. Today this coding practice is just an anachronism and a somewhat dangerous one at that.

There is absolutely no reason why a modern COBOL program should ever need to use the THRU form of the PERFORM verb. Similarly, the EXIT statement is just as pointless. There is no valid reason to propagate their usage.
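
One possible recoding of the invoice example without THRU, GO TO or EXIT is sketched below. The field names follow the earlier listing; the picture clauses, initial values, shortened paragraph names and DISPLAY statements are assumptions added so that the sketch is complete enough to compile.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INVOICE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Field names follow the example above; sizes and values are
      * assumptions made so the sketch can run.
       01  WS-SKIP-THIS-ITEM-FLAG     PIC X(3)  VALUE 'NO'.
       01  WS-INVOICE-LINE-ITEM-AMT   PIC 9(5)  VALUE 250.
       01  WS-INVOICE-TOT-AMT         PIC 9(5)  VALUE ZERO.
       01  WS-INVOICE-ITEM-COUNT      PIC 9(3)  VALUE ZERO.
       PROCEDURE DIVISION.
       BUILD-INVOICE.
           PERFORM BUILD-INVOICE-LINE
           PERFORM ACCUMULATE-TOTAL
           DISPLAY 'TOTAL: ' WS-INVOICE-TOT-AMT
           DISPLAY 'ITEMS: ' WS-INVOICE-ITEM-COUNT
           GOBACK
           .
       BUILD-INVOICE-LINE.
      *    The guarded block replaces the GO TO to an exit paragraph.
           IF WS-SKIP-THIS-ITEM-FLAG NOT = 'YES'
              DISPLAY 'BUILDING INVOICE LINE'
           END-IF
           .
       ACCUMULATE-TOTAL.
           ADD WS-INVOICE-LINE-ITEM-AMT TO WS-INVOICE-TOT-AMT
              ON SIZE ERROR
                 PERFORM INVOICE-OVERFLOW-ROUTINE
           END-ADD
           ADD 1 TO WS-INVOICE-ITEM-COUNT
           .
       INVOICE-OVERFLOW-ROUTINE.
           DISPLAY 'INVOICE TOTAL OVERFLOW'
           .
```

Each paragraph is now performed by name alone, so a new paragraph can be inserted anywhere after BUILD-INVOICE without landing inside some other performed range.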

If this section has not convinced you to abandon the usage of PERFORM THRU then have a look at the essay on Transfer of Control. This essay is a bit of a tough read but it concludes with some very strong reasons to abandon this coding style.


Structured Procedure names

Naming procedures to reflect program structure seems to be a matter of religion among some programmers. Try a Google search for COBOL naming/coding standards. Most promote or at least illustrate structured procedure names. Run away! They do it; they have all kinds of numbering rules; they feel strongly about doing it; they can't provide a truly rational reason why it should be done. Basically, it is just dogma.

I should probably stop here. But I won't...

It is surprising how many procedures are PERFORMed from only one place in a COBOL program. From this observation it is not difficult to see why there is some temptation to encode the program PERFORM hierarchy using a structured naming system along the lines of:

        MAINLINE.
           PERFORM P-1000-GET-SOME-DATA
           PERFORM P-2000-DO-SOME-EDITS
           PERFORM P-3000-DO-SOME-CALCULATIONS
           PERFORM P-4000-PUT-RESULTS
           .
        P-1000-GET-SOME-DATA.
           PERFORM P-1100-CHECK-FOR-INPUT
           PERFORM P-1200-GET-INPUT
           PERFORM P-1300-CHECK-FOR-INPUT-ERROR
           IF WS-ERROR-CONDITION
              PERFORM E-1000-ERROR-ROUTINE
           END-IF
           .
        P-1100-CHECK-FOR-INPUT.
           PERFORM P-1110-SETUP-CHANNEL
           PERFORM P-1120-CHECK-CHANNEL
           IF WS-ERROR-CONDITION
              PERFORM E-2000-ERROR-ROUTINE
           END-IF
           .
        P-1110-SETUP-CHANNEL.

       etc...
The programmer is using a "functional" procedure numbering scheme where "normal" procedures are prefixed with the letter "P" and error procedures with the letter "E". The idea being that procedure function may be determined by looking at the first letter of the procedure name. Gosh, I wonder what the significance of the words "ERROR-ROUTINE" might have been in the name: E-2000-ERROR-ROUTINE?

Next, procedure nesting is immediately obvious from the embedded number. One can deduce that a procedure named P-2144-something is performed from only one place - somewhere inside a procedure called P-2140-something. Interesting. Of course, a procedure performed from multiple places would require a different naming convention to reflect this. Introduce more structured naming rules... What if a procedure performs more than 9 other procedures? More naming rules... What if procedure nesting goes beyond 4 levels? More rules... Get the picture?

What is the purpose of all this "structure" information? It is supposed to promote better understanding and improve program reading. I know where I am because of the name. I know where I came from because of the name. I know... whatever, because of the name. This might have been useful when paper listings and punch cards were the medium of program display. Today we have language aware programmable editors! Any number of COBOL aware editors are capable of providing all the structural information needed to navigate through a program source.

Program maintenance and refactoring generally involve creation of new procedures, shifting around or re-using existing procedures and deletion of obsolete ones. A maintenance nightmare is sure to follow when a procedure name dictates where it can be performed from and where it fits into the program execution structure. Global name changes may be easy enough to do using an editor, but this is just adding "busy work" to an already challenging task. Furthermore, not all programmers are up to it. Over the years I have witnessed far too much "creative" coding in COBOL applications by programmers who would rather have their teeth pulled than maintain structured names properly. The net result has been some truly horrible and misleading code. Set your imagination free to test the depths of depravity that one could possibly reach when determined to avoid creating a new structured level number; to avoid renumbering existing procedures; or to reorder performed sequences. You will, I assure you, come up very short relative to what previous generations of COBOL programmers have actually done!

By comparison, the misguided morphing of Apps Hungarian variable naming into Systems Hungarian within the C/C++ Windows programming community caused a lot of frustration but probably has not done much real harm to the maintainability of the resulting code. Structured procedure naming and subsequent misguided program maintenance has done significant harm to a lot of legacy COBOL code.

Rational naming standards as they are commonly applied to data elements may be worthwhile and defensible. Structured procedure naming in COBOL is not.


NEXT SENTENCE

Trick question: What does the following program display?
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EXAMPLE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  I                  PIC S9(4) BINARY VALUE ZERO.
       PROCEDURE DIVISION.
           PERFORM VARYING I FROM 1 BY 1 UNTIL I > 5
               IF I = 3
                  NEXT SENTENCE
               ELSE
                  DISPLAY 'I IS NOW: ' I
               END-IF
           END-PERFORM
           DISPLAY 'ALL DONE'.
           DISPLAY 'PROGRAM TERMINATING...'
           GOBACK.
The answer is:
I IS NOW: 0001
I IS NOW: 0002
PROGRAM TERMINATING...
Why didn't the program count to 5? Why didn't it display ALL DONE before PROGRAM TERMINATING? Pre and post COBOL-85 constructs are intermixed within this code. Some pre/post COBOL-85 coding constructs do not work well with each other. NEXT SENTENCE is a stellar example. COBOL-85 marks a coding style dividing line. Live on one side of it or the other; crossing this line is both bad and ugly.

With pre COBOL-85, NEXT SENTENCE was the only "structured" way to branch out of nested conditional statements, specifically, nested IF statements. NEXT SENTENCE unconditionally transfers control to the end of the current sentence. COBOL sentences are delimited by separator periods and this implies that control flows to the statement following the next period. This was not so bad in the pre COBOL-85 universe because an end-of-sentence period also marked end-of-statement for all open conditional statements (corresponding to the end of the conditional IF containing NEXT SENTENCE). Basically NEXT SENTENCE unambiguously meant: Get me to the next statement following the conditional statements I am currently in. It doesn't matter how deep the current conditional nesting is, get me out of all of it. This was clear enough.

Along comes COBOL-85 with both imperative (scope terminated) and conditional (only terminated by period) versions of several statements, IF and PERFORM being the most notable. (See Transfer of Control for a discussion of imperative vs conditional statement types in COBOL.) Code reading may become confusing, as in the example above, where NEXT SENTENCE is coded within the scope of an imperative (scope terminated) statement. There the next separator period follows DISPLAY 'ALL DONE', so when I reaches 3 control leaves the loop and skips that DISPLAY as well. Some compilers can flag this type of silliness as either a warning or an outright error. Unfortunately the IBM Enterprise COBOL compiler is not one of them.

And this brings me to: CONTINUE. CONTINUE is not a synonym for NEXT SENTENCE. Replace NEXT SENTENCE with CONTINUE in the program example above and it would display:

I IS NOW: 0001
I IS NOW: 0002
I IS NOW: 0004
I IS NOW: 0005
ALL DONE
PROGRAM TERMINATING...
which might be closer to what a casual reader would have expected the original program to have produced. The CONTINUE verb is a true no-op statement. It does exactly nothing! Do not confuse it with NEXT SENTENCE, which is a disguised GO TO statement.

Bottom line is that NEXT SENTENCE is an obsolete statement. Understand what it does, but don't use it. Better yet, remove it from legacy code when and where you can.
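
When removing NEXT SENTENCE from legacy code, it is often possible to invert the condition so that no placeholder branch is needed at all. The following is a sketch of the trick-question program rewritten that way. As an assumption made to keep the displayed value simple, I is declared PIC 9 here instead of PIC S9(4) BINARY, so each value prints as a single digit.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SKIP3.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * PIC 9 is an assumption; the original used PIC S9(4) BINARY.
       01  I                  PIC 9 VALUE ZERO.
       PROCEDURE DIVISION.
       MAINLINE.
           PERFORM VARYING I FROM 1 BY 1 UNTIL I > 5
      *        Inverting the test removes the empty branch entirely.
               IF I NOT = 3
                  DISPLAY 'I IS NOW: ' I
               END-IF
           END-PERFORM
           DISPLAY 'ALL DONE'
           DISPLAY 'PROGRAM TERMINATING...'
           GOBACK
           .
```

This version should display the values 1, 2, 4 and 5, followed by ALL DONE and PROGRAM TERMINATING..., and it contains a single sentence with no hidden branch.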

N.Bredin
Informatics
