Java comparator for String in EBCDIC encoding

Posted 2019-01-26 18:30

Question:

I have come across a requirement where I need to convert a string to EBCDIC encoding and then sort it. We need to sort in EBCDIC order because the string has to go to a mainframe. The strings I will sort contain only uppercase letters and digits.

I googled it and came across a link from IBM that lists the characters in order.

What I realized was that EBCDIC sorting is exactly the opposite of normal Java lexicographic sorting (at least for the type of data I am going to process).

My question: is my realization right? If not, what am I missing? Or is there a Java comparator available for EBCDIC encoding?

Answer 1:

Since the char type in Java is implicitly UTF-16, the strings must first be encoded to EBCDIC and then compared as byte sequences.

Example:

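    import java.nio.charset.Charset;
    import java.util.Comparator;

    // Encode both strings to EBCDIC (code page IBM1047) and compare the encoded bytes.
    Charset encoding = Charset.forName("IBM1047");
    Comparator<String> encComparator = (s1, s2) ->
            encoding.encode(s1)
                    .compareTo(encoding.encode(s2));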
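
To check the claim in the question, here is a minimal sketch that sorts a few sample strings both ways. For uppercase letters and digits, EBCDIC order is not the exact reverse of Java's natural order: letters (0xC1 to 0xE9) sort before digits (0xF0 to 0xF9), but the order within each group is unchanged. The class name `EbcdicOrderDemo`, the helper names, and the sample data are made up for illustration. The sketch also swaps in `Arrays.compareUnsigned` (Java 9+) for `ByteBuffer.compareTo`, because the latter is documented to compare bytes as signed values. That makes no difference for letters and digits, which all sit above 0x7F, but it would place characters below 0x80, such as space (0x40), after them.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class EbcdicOrderDemo {
    // Assumption: IBM1047, as in the answer above; the right code page depends on the target system.
    static final Charset EBCDIC = Charset.forName("IBM1047");

    // Compares the EBCDIC encodings byte by byte, treating each byte as 0..255.
    static final Comparator<String> EBCDIC_ORDER =
            (a, b) -> Arrays.compareUnsigned(a.getBytes(EBCDIC), b.getBytes(EBCDIC));

    // Returns a sorted copy so the caller's list is left untouched.
    static List<String> sortEbcdic(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(EBCDIC_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("A1", "1A", "B", "9", "AB");

        List<String> natural = new ArrayList<>(data);
        natural.sort(Comparator.naturalOrder());

        System.out.println(natural);          // [1A, 9, A1, AB, B]
        System.out.println(sortEbcdic(data)); // [AB, A1, B, 1A, 9]
    }
}
```

Reversing the natural order would give `[B, AB, A1, 9, 1A]`, which differs from the EBCDIC result, so a simple reversed comparator is not a substitute.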


Answer 2:

You should not spend much time figuring out the many peculiarities of EBCDIC. Given the limited scope of your problem, a simple approach to implementing your requirements is as follows:

  • Implement a helper method that reads EBCDIC bytes and produces a java.lang.String in Java's native encoding (UTF-16)
  • Implement a helper method that takes a java.lang.String in Java's native encoding (UTF-16) and produces EBCDIC-encoded bytes
  • Use the first method to read the data. Sort and do other processing as needed. Use the second method to write the data to the mainframe.

This approach has the advantage that only two pieces of your code need to understand EBCDIC: the one that converts in and the one that converts out. All other code can use the Java system libraries and any libraries that you have for sorting, filtering, searching, and all other processing, without thinking about the EBCDIC encoding at all.
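
The two helper methods described above could look roughly like this. It is a sketch under assumptions: the class name `EbcdicCodec`, the method names, and the choice of code page IBM1047 are illustrative and not part of the answer. One caveat: sorting the decoded strings with Java's natural ordering gives Unicode order (digits before letters), so if the output must be in EBCDIC collating order, an EBCDIC-aware comparator is still needed for the sort step.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

final class EbcdicCodec {
    // Assumption: IBM1047; use whichever EBCDIC code page the mainframe side expects.
    private static final Charset EBCDIC = Charset.forName("IBM1047");

    // Converts in: EBCDIC bytes -> Java String (UTF-16 internally).
    static String fromEbcdic(byte[] bytes) {
        return new String(bytes, EBCDIC);
    }

    // Converts out: Java String -> EBCDIC bytes.
    static byte[] toEbcdic(String text) {
        return text.getBytes(EBCDIC);
    }

    public static void main(String[] args) {
        // 0xC1 0xC2 0xF1 is "AB1" in EBCDIC.
        byte[] record = {(byte) 0xC1, (byte) 0xC2, (byte) 0xF1};
        String text = fromEbcdic(record);
        System.out.println(text);                                  // AB1
        System.out.println(Arrays.equals(record, toEbcdic(text))); // true
    }
}
```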



Answer 3:

Yes, a comparator for EBCDIC encoding can be written. Here is the code for it.

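    import java.nio.charset.Charset;
    import java.util.Comparator;

    // MyEntity is a placeholder: substitute the name of your entity class.
    Comparator<MyEntity> EBCDIC = new Comparator<MyEntity>() {
        // cp500 is an alias for the EBCDIC code page IBM500.
        private final Charset encoding = Charset.forName("cp500");

        @Override
        public int compare(MyEntity jc1, MyEntity jc2) {
            // Compare the EBCDIC-encoded string forms of the two entities.
            return encoding.encode(jc1.toString())
                    .compareTo(encoding.encode(jc2.toString()));
        }
    };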
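
The comparator above relies on `toString()` returning the sort key. As a possible variation, the sketch below extracts the key explicitly with `Comparator.comparing`. The `Account` class, its `getId` accessor, and the sample IDs are hypothetical and only stand in for a real entity. The bytes are compared as unsigned values via `Arrays.compareUnsigned` (Java 9+).

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class EntitySortDemo {
    // Hypothetical entity with a single string key.
    static final class Account {
        private final String id;
        Account(String id) { this.id = id; }
        String getId() { return id; }
    }

    static final Charset CP500 = Charset.forName("cp500");

    // Orders strings by their EBCDIC byte values (unsigned, 0..255).
    static final Comparator<String> EBCDIC_BYTES =
            (a, b) -> Arrays.compareUnsigned(a.getBytes(CP500), b.getBytes(CP500));

    // Orders accounts by ID in EBCDIC collating order.
    static final Comparator<Account> BY_ID_EBCDIC =
            Comparator.comparing(Account::getId, EBCDIC_BYTES);

    // Returns the IDs of the given accounts, sorted in EBCDIC order.
    static List<String> sortedIds(List<Account> accounts) {
        List<Account> copy = new ArrayList<>(accounts);
        copy.sort(BY_ID_EBCDIC);
        List<String> ids = new ArrayList<>();
        for (Account a : copy) {
            ids.add(a.getId());
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Account> accounts = Arrays.asList(
                new Account("10"), new Account("Z9"), new Account("A0"));
        System.out.println(sortedIds(accounts)); // [A0, Z9, 10]
    }
}
```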