I have a series of ASCII flat files coming in from a mainframe to be processed by a C# application. A new feed has been introduced with a Packed Decimal (COMP-3) field, which needs to be converted to a numerical value.
The files are being transferred via FTP, using ASCII transfer mode. I am concerned that the binary field may contain what will be interpreted as very-low ASCII codes or control characters instead of a value - Or worse, may be lost in the FTP process.
What's more, the fields are being read as strings. I may have the flexibility to work around this part (i.e. a stream of some sort), but the business will give me pushback.
The requirement read "Convert from HEX to ASCII", but clearly that didn't yield the correct values. Any help would be appreciated; it need not be language-specific as long as you can explain the logic of the conversion process.
Some useful links for EBCDIC translation:
Translation table - useful to do check some of the values in the packed decimal fields: http://www.simotime.com/asc2ebc1.htm
List of code pages in msdn:
http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx
And a piece of code to convert the byte array fields in C#:
First of all you must eliminate the end of line (EOL) translation problems that will be caused by ASCII transfer mode. You are absolutely right to be concerned about data corruption when the BCD values happen to correspond to EOL characters. The worst aspect of this problem is that it will occur rarely and unexpectedly.
The best solution is to change the transfer mode to BIN. This is appropriate since the data you are transferring is binary. If it is not possible to use the correct FTP transfer mode, you can undo the ASCII mode damage in code. All you have to do is convert \r\n pairs back to \n. If I were you I would make sure this is well tested.
Once you've dealt with the EOL problem, the COMP-3 conversion is pretty straigtforward. I was able to find this article in the MS knowledgebase with sample code in BASIC. See below for a VB.NET port of this code.
Since you're dealing with COMP-3 values, the file format you're reading almost surely has fixed record sizes with fixed field lengths. If I were you, I would get my hands of a file format specification before you go any further with this. You should be using a BinaryReader to work with this data. If someone is pushing back on this point, I would walk away. Let them find someone else to indulge their folly.
Here's a VB.NET port of the BASIC sample code. I haven't tested this because I don't have access to a COMP-3 file. If this doesn't work, I would refer back to the original MS sample code for guidance, or to references in the other answers to this question.
I have been watching the posts on numerous boards concerning converting Comp-3 BCD data from "legacy" mainframe files to something useable in C#. First, I would like to say that I am less than enamoured by the responses that some of these posts have received - especially those that have said essentially "why are you bothering us with these non-C#/C++ related posts" and also "If you need an answer about some sort of COBOL convention, why don't you go visit a COBOL oriented site". This, to me, is complete BS as there is going to be a need for probably many years to come, (unfortunately), for software developers to understand how to deal with some of these legacy issues that exist in THE REAL WORLD. So, even if I get slammed on this post for the following code, I am going to share with you a REAL WORLD experience that I had to deal with regarding COMP-3/EBCDIC conversion (and yes, I am he who talks of "floppy disks, paper-tape, Disc Packs etc... - I have been a software engineer since 1979").
First - understand that any file that you read from a legacy main-frame system like IBM is going to present the data to you in EBCDIC format and in order to convert any of that data to a C#/C++ string you can deal with you are going to have to use the proper code page translation to get the data into ASCII format. A good example of how to handle this would be:
StreamReader readFile = new StreamReader(path, Encoding.GetEncoding(037); // 037 = EBCDIC to ASCII translation.
This will ensure that anything that you read from this stream will then be converted to ASCII and can be used in a string format. This includes "Zoned Decimal" (Pic 9) and "Text" (Pic X) fields as declared by COBOL. However, this does not necessarily convert COMP-3 fields to the correct "binary" equivelant when read into a char[] or byte[] array. To do this, the only way that you are ever going to get this translated properly (even using UTF-8, UTF-16, Default or whatever) code pages, you are going to want to open the file like this:
FileStream fileStream = new FileStream(path, FIleMode.Open, FIleAccess.Read, FileShare.Read);
Of course, the "FileShare.Read" option is "optional".
When you have isolated the field that you want to convert to a decimal value (and then subsequently to an ASCII string if need be), you can use the following code - and this has been basically stolen from the MicroSoft "UnpackDecimal" posting that you can get at:
http://www.microsoft.com/downloads/details.aspx?familyid=0e4bba52-cc52-4d89-8590-cda297ff7fbd&displaylang=en
I have isolated (I think) what are the most important parts of this logic and consolidated it into two a method that you can do with what you want. For my purposes, I chose to leave this as returning a Decimal value which I could then do with what I wanted. Basically, the method is called "unpack" and you pass it a byte[] array (no longer than 12 bytes) and the scale as an int, which is the number of decimal places you want to have returned in the Decimal value. I hope this works for you as well as it did for me.
If you have any questions, post them on here - because I suspect that I am going to get "flamed" like everyone else who has chosen to post questions that are pertinent to todays issues...
Thanks, John - The Elder.
If the original data was in EBCDIC your COMP-3 field has been garbled. The FTP process has done an EBCDIC to ASCII translation of the byte values in the COMP-3 field which isn't what you want. To correct this you can:
1) Use BINARY mode for the transfer so you get the raw EBCDIC data. Then you convert the COMP-3 field to a number and translate any other EBCDIC text on the record to ASCII. A packed field stores each digit in a half byte with the lower half byte as a sign (F is positive and other values, usually D or E, are negative). Storing 123.4 in a PIC 999.99 USAGE COMP-3 would be X'01234F' (three bytes) and -123 in the same field is X'01230D'.
2) Have the sender convert the field into a USAGE IS DISPLAY SIGN IS LEADING(or TRAILING) numeric field. This stores the number as a string of EBCDIC numeric digits with the sign as a separate negative(-) or blank character. All digits and the sign translate correctly to their ASCII equivalent on the FTP transfer.
The packed fields are the same in EBCDIC or ASCII. Do not run the EBCDIC to ASCII conversion on them. In .Net dump them into a byte[].
You use bitwise masks and shifts to pack/unpack. -- But bitwise ops only apply to integer types in .Net so you need to jump through some hoops!
A good COBOL or C artist can point you in the right direction.
Find one of the old guys and pay your dues (about three beers should do it).
I apologize if I am way off base here, but perhaps this code sample I'll paste here could help you. This came from VBRocks...