Different results using OCCURS with different compilers

Posted 2019-07-26 09:04

Question:

I'm attempting to output the following row using DISPLAY and am getting the correct result in Micro Focus COBOL in Visual Studio and the Tutorialspoint COBOL compiler, but something strange when running it on a z/OS Mainframe using IBM's Enterprise COBOL:

01 W05-OUTPUT-ROW.
   05 W05-OFFICE-NAME PIC X(13).
   05 W05-BENEFIT-ROW OCCURS 5 TIMES.
       10 PIC X(2) VALUE SPACES.
       10 W05-B-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.
   05 PIC X(2) VALUE SPACES.
   05 W05-OFFICE-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.

In Enterprise COBOL the spaces appear to be ignored and an extra zero-filled column is added, even though the PERFORM VARYING and DISPLAY code is exactly the same in both versions:

PERFORM VARYING W02-O-IDX FROM 1 BY 1
   UNTIL W02-O-IDX > W12-OFFICE-COUNT

   MOVE W02-OFFICE-NAME(W02-O-IDX) TO W05-OFFICE-NAME

   PERFORM 310-CALC-TOTALS VARYING W02-B-IDX FROM 1 BY 1
       UNTIL W02-B-IDX > W13-BENEFIT-COUNT

   MOVE W02-O-TOTAL(W02-O-IDX) TO W05-OFFICE-TOTAL
   DISPLAY W05-OUTPUT-ROW
END-PERFORM

W13-BENEFIT-COUNT is 5 and never changes in the program, so the 6th column is a mystery to me.

Correct output: [screenshot not preserved]

Strange output: [screenshot not preserved]

Edit: as requested, here is W02-OFFICE-TABLE:

01 W02-OFFICE-TABLE.
    05 W02-OFFICE-ROW OCCURS 11 TIMES
    ASCENDING KEY IS W02-OFFICE-NAME
    INDEXED BY W02-O-IDX.
        10 W02-OFFICE-CODE PIC X(6).
        10 W02-OFFICE-NAME PIC X(13).
        10 W02-BENEFIT-ROW OCCURS 5 TIMES
        INDEXED BY W02-B-IDX.
            15 W02-B-CODE PIC 9(1).
            15 W02-B-TOTAL PIC 9(5)V99 VALUE ZERO.
        10 W02-O-TOTAL PIC 9(5)V99 VALUE ZERO.

and W12-OFFICE-COUNT is always 11, never changes:

01 W12-OFFICE-COUNT PIC 99 VALUE 11.

Answer 1:

I'd be very hesitant about mixing VALUE with OCCURS, and would recode the WORKING-STORAGE as:

01 W05-OUTPUT-ROW.
   05 W05-OFFICE-NAME  PIC X(13).
   05 W05-BENEFITS     PIC X(55) VALUE SPACES.
   05 FILLER REDEFINES W05-BENEFITS.
     07 W05-BENEFIT-ROW OCCURS 5 TIMES.
       10 FILLER       PIC X(02).
       10 W05-B-TOTAL  PIC ZZ,ZZ9.99.
   05 FILLER           PIC X(02) VALUE SPACES.
   05 W05-OFFICE-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.

Perhaps it has something to do with the missing fieldname?

Ah! evil INDEXED. I'd make both ***-IDX variables simple 99s.
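
To show what I mean by simple 99s, here is a minimal sketch. All the WS- names are made up for the illustration and the two tables are cut down to one dimension; the element layouts are copied from the question. A numeric subscript holds a plain occurrence number, so the same field can address tables whose elements have different lengths.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * WS-B-SUB is a hypothetical name. It holds an occurrence
      * number (1, 2, 3...), so it is valid for either table.
       01  WS-B-SUB             PIC 99.
       01  WS-SOURCE.
           05  WS-SRC-ROW OCCURS 5 TIMES.
               10  WS-SRC-CODE   PIC 9(1).
               10  WS-SRC-TOTAL  PIC 9(5)V99.
       01  WS-TARGET.
           05  WS-TGT-ROW OCCURS 5 TIMES.
               10  FILLER        PIC X(2).
               10  WS-TGT-TOTAL  PIC ZZ,ZZ9.99.
       PROCEDURE DIVISION.
       000-MAIN.
      * Blank the whole row once instead of relying on VALUE.
           MOVE SPACES TO WS-TARGET
           PERFORM VARYING WS-B-SUB FROM 1 BY 1 UNTIL WS-B-SUB > 5
      * Sample data only, so the sketch has something to show.
               MOVE 250.5 TO WS-SRC-TOTAL (WS-B-SUB)
               MOVE WS-SRC-TOTAL (WS-B-SUB)
                 TO WS-TGT-TOTAL (WS-B-SUB)
           END-PERFORM
           DISPLAY WS-TARGET
           GOBACK.
```

One caveat: SEARCH and SEARCH ALL need a table that has INDEXED BY, so if the office table is searched, its index has to stay.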



Answer 2:

The question is not so much "why does Enterprise COBOL do that?", because it is documented, as "why do those other two compilers generate programs that do what I want?", which is probably also documented.

Here's a quote from the draft of what became the 2014 COBOL Standard (the actual Standard costs money):

C.3.4.1 Subscripting using index-names

In order to facilitate such operations as table searching and manipulating specific items, a technique called indexing is available. To use this technique, the programmer assigns one or more index-names to an item whose data description entry contains an OCCURS clause. An index associated with an index-name acts as a subscript, and its value corresponds to an occurrence number for the item to which the index-name is associated.

The INDEXED BY phrase, by which the index-name is identified and associated with its table, is an optional part of the OCCURS clause. There is no separate entry to describe the index associated with index-name since its definition is completely hardware oriented. At runtime the contents of the index correspond to an occurrence number for that specific dimension of the table with which the index is associated; however, the manner of correspondence is determined by the implementor. The initial value of an index at runtime is undefined, and the index shall be initialized before use. The initial value of an index is assigned with the PERFORM statement with the VARYING phrase, the SEARCH statement with the ALL phrase, or the SET statement.

[...]

An index-name may be used to reference only the table to which it is associated via the INDEXED BY phrase.

From the second paragraph, it is clear that how an index is implemented is down to the implementor of the compiler. Which means that what an index actually contains, and how it is manipulated internally, can vary from compiler to compiler, as long as the results are the same.

The last paragraph quoted indicates that, by the Standard, a specific index can only be used for the table which defines that specific index.

You have some code equivalent to this in 310-CALC-TOTALS: take a source data-item using the index from its table, and use that index from the "wrong" table to store a value derived from that in a different table.

This breaks the "An index-name may be used to reference only the table to which it is associated via the INDEXED BY phrase."

So you changed your code in 310-CALC-TOTALS to: take a source data-item using the index from its table, and use a data-name or index defined on the destination table to store a value derived from that in a different table.

So your code now works, and will give you the same result with each compiler.
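
Since 310-CALC-TOTALS itself was never posted, the following is only a sketch of the corrected shape, not your actual paragraph. It is reduced to one dimension, the sample value is invented, and W05-B-IDX is a name I am assuming for an index defined on the output table. The point it shows is that SET converts one index to the other by occurrence number, so each table is only ever referenced through its own index.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Source table: each element is 1 + 7 = 8 bytes.
       01  W02-BENEFITS.
           05  W02-BENEFIT-ROW OCCURS 5 TIMES
                   INDEXED BY W02-B-IDX.
               10  W02-B-CODE   PIC 9(1).
               10  W02-B-TOTAL  PIC 9(5)V99.
      * Output table: each element is 2 + 9 = 11 bytes.
      * W05-B-IDX is an assumed name, not from the question.
       01  W05-OUTPUT-ROW.
           05  W05-BENEFIT-ROW OCCURS 5 TIMES
                   INDEXED BY W05-B-IDX.
               10  FILLER       PIC X(2).
               10  W05-B-TOTAL  PIC ZZ,ZZ9.99.
       PROCEDURE DIVISION.
       000-MAIN.
           MOVE SPACES TO W05-OUTPUT-ROW
           PERFORM VARYING W02-B-IDX FROM 1 BY 1
                   UNTIL W02-B-IDX > 5
      * Sample data only.
               MOVE 123.45 TO W02-B-TOTAL (W02-B-IDX)
      * Give the output index the same occurrence number.
               SET W05-B-IDX TO W02-B-IDX
               MOVE W02-B-TOTAL (W02-B-IDX)
                 TO W05-B-TOTAL (W05-B-IDX)
           END-PERFORM
           DISPLAY W05-OUTPUT-ROW
           GOBACK.
```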

Why did the Enterprise COBOL code compile, if the Standard (and this was the same for prior Standards) forbids that use?

IBM has a Language Extension. In fact two Extensions, which are applicable to your case (quoted from the Enterprise COBOL Language Reference in Appendix A):

Indexing and subscripting ... Referencing a table with an index-name defined for a different table

and

OCCURS ... Reference to a table through indexing when no INDEXED BY phrase is specified

Thus you get no compile error, as using an index from a different table and using an index when no index is defined on the table are both OK.

So, what does it do when you use another index? Again from the Language Reference, this time on Subscripting using index-names (indexing):

An index-name can be used to reference any table. However, the element length of the table being referenced and of the table that the index-name is associated with should match. Otherwise, the reference will not be to the same table element in each table, and you might get runtime errors.

Which is exactly what happened to you. The difference in lengths of the items in the OCCURS is down to the "insertion editing" symbols in your PICture for the table you DISPLAY from. If the items in the two tables were the same length, you'd not have noticed a problem.

You gave a VALUE clause for your table items (unnecessary, as you would always put something in them before they are output), and that is what left your "sixth" column visible: the five totals were stored as if the elements were shorter, so the tail of the table kept its initial value. Note the confusion caused when the editing is done to one length and the storing is done with a different implicit length: you even overwrite the second decimal place of a preceding total.

IBM's implementation of INDEXED BY means that the length of the item(s) being indexed is intrinsic. Hence the unexpected results when the fields referenced are actually different lengths.
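
The arithmetic can be sketched as follows. This assumes, as I understand IBM's documentation, that an index holds a byte displacement of (occurrence number - 1) multiplied by the element length of the table that defines it. The element lengths come from your two definitions: 1 + 7 = 8 bytes for W02-BENEFIT-ROW and 2 + 9 = 11 bytes for W05-BENEFIT-ROW. The WS- names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXMATH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Element lengths taken from the two table definitions.
       01  WS-W02-ELEM-LEN      PIC 9(2) VALUE 8.
       01  WS-W05-ELEM-LEN      PIC 9(2) VALUE 11.
       01  WS-OCC               PIC 9(2).
       01  WS-INTENDED          PIC 9(3).
       01  WS-ACTUAL            PIC 9(3).
       PROCEDURE DIVISION.
       000-MAIN.
           PERFORM VARYING WS-OCC FROM 1 BY 1 UNTIL WS-OCC > 5
      * Where the total should go in the 11-byte output table.
               COMPUTE WS-INTENDED = (WS-OCC - 1) * WS-W05-ELEM-LEN
      * Where it goes when the 8-byte table's index is used.
               COMPUTE WS-ACTUAL   = (WS-OCC - 1) * WS-W02-ELEM-LEN
               DISPLAY "OCCURRENCE " WS-OCC
                       " INTENDED OFFSET " WS-INTENDED
                       " ACTUAL OFFSET " WS-ACTUAL
           END-PERFORM
           GOBACK.
```

Measured from W05-B-TOTAL(1), the intended displacements are 0, 11, 22, 33 and 44, but the borrowed index gives 0, 8, 16, 24 and 32. Each edited total is 9 bytes, so with a stride of 8 every total lands on the last byte of the one before it, and the final bytes of the table are never written at all.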

What about the other two compilers? You'd need to consult their documentation to be certain of what is happening. Something as simple as the index being represented by an entry number (plain 1, 2, 3, etc.), together with allowing an index to reference another table, would be enough. There should be two extensions: one to allow an index to be used on a table which did not define that index, and one to allow an index to be used on a table where no index is defined. The two logically come as a pair, and both need to be stated explicitly (the first alone would otherwise do) only because they specifically go against the Standard.

Micro Focus do have a Language Extension whereby an index from one table may be used to reference data from another table. It is not explicit that this includes referencing a table with no indexes defined, but this is obviously so.

Tutorialspoint uses OpenCOBOL 1.1. OpenCOBOL is now GnuCOBOL. GnuCOBOL 1.1 is the current release, which is different from and more up-to-date than OpenCOBOL 1.1. GnuCOBOL 2.0 is coming soon. I contribute to the discussion area for GnuCOBOL at SourceForge.Net and have raised the issue there. Simon Sobisch of the GnuCOBOL project has previously approached Ideone and Tutorialspoint about their use of the outdated OpenCOBOL 1.1. Ideone have provided positive feedback; Tutorialspoint, whom Simon has contacted again today, have not yet replied.

As a side-issue, it looks like you are using SEARCH ALL to do a binary-search of your table. For "small" tables, it is likely that the overhead of the mechanics of the generalised binary-search provided by SEARCH ALL outweighs any expected savings in machine resources. If you were to be processing large amounts of data, it is likely that a plain SEARCH would be more efficient than the SEARCH ALL.

How small is "small" depends on your data. Five is likely to be small close to 100% of the time.
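
For comparison, a plain serial SEARCH over a five-entry table might look like this. It is a sketch with invented WS- names and sample data, not your table. The main thing to remember is that a serial SEARCH starts from wherever the index currently points, so you SET it first.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEQSRCH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-WANTED-CODE       PIC 9(1) VALUE 4.
       01  WS-BENEFIT-TABLE.
           05  WS-BENEFIT-ROW OCCURS 5 TIMES
                   INDEXED BY WS-B-IDX.
               10  WS-B-CODE    PIC 9(1).
       PROCEDURE DIVISION.
       000-MAIN.
      * Load codes 1 to 5, one byte per element (sample data).
           MOVE "12345" TO WS-BENEFIT-TABLE
      * A serial SEARCH starts from the current index value,
      * so the index has to be set before the search.
           SET WS-B-IDX TO 1
           SEARCH WS-BENEFIT-ROW
               AT END
                   DISPLAY "NOT FOUND"
               WHEN WS-B-CODE (WS-B-IDX) = WS-WANTED-CODE
                   DISPLAY "FOUND CODE " WS-B-CODE (WS-B-IDX)
           END-SEARCH
           GOBACK.
```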

Better performance than SEARCH and SEARCH ALL functionality can be achieved by coding, but remember that SEARCH and SEARCH ALL don't make mistakes.

However, especially with SEARCH ALL, mistakes by the programmer are easy. If the data is out of sequence, SEARCH ALL will not operate correctly. Defining more data than is populated gets a table quickly out of sequence as well. If using SEARCH ALL with a variable number of items, consider using OCCURS DEPENDING ON for the table, or "padding" unused trailing entries with a value beyond the maximum key-value that can exist.
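
A minimal sketch of the OCCURS DEPENDING ON approach, with hypothetical WS- names and made-up office names: only the first WS-USED-COUNT entries take part in the binary search, so unpopulated trailing entries cannot put the keys out of sequence.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Number of entries actually populated (sample value).
       01  WS-USED-COUNT        PIC 9(2) VALUE 3.
       01  WS-WANTED-NAME       PIC X(13) VALUE "LEEDS".
       01  WS-OFFICE-TABLE.
           05  WS-OFFICE-ROW OCCURS 1 TO 11 TIMES
                   DEPENDING ON WS-USED-COUNT
                   ASCENDING KEY IS WS-OFFICE-NAME
                   INDEXED BY WS-O-IDX.
               10  WS-OFFICE-NAME   PIC X(13).
       PROCEDURE DIVISION.
       000-MAIN.
      * Keys must be loaded in ascending sequence.
           MOVE "BRISTOL" TO WS-OFFICE-NAME (1)
           MOVE "LEEDS"   TO WS-OFFICE-NAME (2)
           MOVE "YORK"    TO WS-OFFICE-NAME (3)
      * SEARCH ALL sets the index itself; no SET is needed.
           SEARCH ALL WS-OFFICE-ROW
               AT END
                   DISPLAY "NOT FOUND"
               WHEN WS-OFFICE-NAME (WS-O-IDX) = WS-WANTED-NAME
                   DISPLAY "FOUND " WS-OFFICE-NAME (WS-O-IDX)
           END-SEARCH
           GOBACK.
```

The padding alternative keeps a fixed OCCURS 11 TIMES and moves HIGH-VALUES to the key of every unused trailing entry, so those entries still sort after any real key.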