Change order of columns in a txt file

Posted 2019-08-16 15:24

Question:

I have a txt file in which some columns do not appear in every row. In the rows where they do appear, they shift the columns that follow, so the column order is no longer consistent:

35=d|5799=00000000|980=A|779=20190721173046000465|1180=310|1300=64|462=5|207=XCME|1151=ES|6937=ES|55=ESM0|48=163235|22=8|167=FUT|461=FFIXSX|200=202006|15=USD|1142=F|562=1|1140=3000|969=25.000000000|9787=0.010000000|996=IPNT|1147=50.000000000|1150=302775.000000000|731=00000110|5796=20190724|1149=315600.000000000|1148=285500.000000000|1143=600.000000000|1146=12.500000000|9779=N|864=2|865=5|1145=20190315133000000000|865=7|1145=20200619133000000000|1141=1|1022=GBX|264=10|870=1|871=24|872=00000000000001000010000000001111|1234=0|5791=279|5792=10121|

35=d|5799=00000000|980=A|779=20190721173046000465|1180=310|1300=64|462=5|207=XCME|1151=ES|6937=ES|55=ESU9|48=191262|22=8|167=FUT|461=FFIXSX|200=201909|15=USD|1142=F|562=1|1140=3000|969=25.000000000|9787=0.010000000|996=IPNT|1147=50.000000000|1150=302150.000000000|731=00000110|5796=20190724|1149=315700.000000000|1148=285600.000000000|1143=600.000000000|1146=12.500000000|9779=N|864=2|865=5|1145=20180615133000000000|865=7|1145=20190920133000000000|1141=1|1022=GBX|264=10|870=1|871=24|872=00000000000001000010000000001111|1234=0|5791=250519|5792=452402|

35=d|5799=00000000|980=A|779=20190721173046000465|1180=310|1300=64|462=5|207=XCME|1151=$E|6937=0ES|55=0ESQ9|48=229588|22=8|167=FUT|461=FFIXSX|200=201908|15=USD|1142=F|562=1|1140=3000|969=25.000000000|9787=0.010000000|996=IPNT|1147=50.000000000|1150=25.000000000|731=00000011|5796=20190607|1143=0.000000000|1146=12.500000000|9779=N|864=2|865=5|1145=20190621133000000000|865=7|1145=20190816133000000000|1141=1|1022=GBX|264=10|870=1|871=24|872=00000000000001000010000000001111|1234=0|

35=d|5799=00000000|980=A|779=20190721173114000729|1180=441|1300=56|462=16|207=DUMX|1151=1O|6937=OQE|55=OQEH4 C6100|48=1546|22=8|167=OOF|461=OCEFPS|201=1|200=202403|15=USD|202=6100.000000000|947=USD|9850=0.100000000|1142=F|562=1|1140=999|969=1.000000000|1146=10.000000000|9787=0.010000000|996=BBL|1147=1000.000000000|731=00000001|1148=0.100000000|9779=N|5796=20190718|864=2|865=5|1145=20181031213000000000|865=7|1145=20240126193000000000|1141=1|1022=GBX|264=3|870=1|871=24|872=00000000000001000000000100000101|1234=1|1093=4|1231=1.0000|711=1|309=211120|305=8|311=OQDH4|1647=0|

35=d|5799=00000000|980=A|779=20190721173115000229|1180=441|1300=56|462=16|207=DUMX|1151=1O|6937=OQE|55=OQEM4 C5700|48=2053|22=8|167=OOF|461=OCEFPS|201=1|200=202406|15=USD|202=5700.000000000|947=USD|9850=0.100000000|1142=F|562=1|1140=999|969=1.000000000|1146=10.000000000|9787=0.010000000|996=BBL|1147=1000.000000000|731=00000001|1148=0.100000000|9779=N|5796=20190718|864=2|865=5|1145=20181031213000000000|865=7|1145=20240425183000000000|1141=1|1022=GBX|264=3|870=1|871=24|872=00000000000001000000000100000101|1234=1|1093=4|1231=1.0000|711=1|309=329748|305=8|311=OQDM4|1647=0|

For example, in the first three rows 461=… is always followed directly by 200=…, while from the 4th row on 201=… sits between 461=… and 200=…

My idea was to move every column that appears in a later row but not in the first row to the end of its row, so that it becomes the last column, but I do not know how to do this. Here is what I have tried:

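    // Imports needed at the top of the file:
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Inside the class. Assumed declaration: the original snippet uses
    // "numbers" without showing where it is defined.
    // Holds every tag in the order it is first seen across all rows.
    private static final List<String> numbers = new ArrayList<>();

    private static void ladeDatei(String datName) {

        File file = new File(datName);

        if (!file.canRead() || !file.isFile()) {
            System.exit(0);
        }

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            String row;
            while ((row = in.readLine()) != null) {
                System.out.println("Gelesene Zeile: " + row);

                // Each field looks like "tag=value". split() drops the empty
                // string after the trailing "|", and the first tag (35) is
                // no longer skipped.
                for (String field : row.split("\\|")) {
                    int eq = field.indexOf('=');
                    if (eq <= 0) {
                        continue; // skip blank or malformed fields
                    }
                    String tag = field.substring(0, eq);
                    if (!numbers.contains(tag)) {
                        numbers.add(tag);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }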

I thought about splitting every row by | and saving the parts in the textArr list, but then I wouldn't know which entries belong to the same row. My main problem is that I don't know a good way to check whether a column already exists in an earlier row, or how to move it to the end of the row.

EDIT: I now save every new entry in the numbers ArrayList (see my edit in the code above), but I am stuck because I don't know how to shift those columns, and everything that comes after them, to the end of each row.

Answer 1:

That's a hell of a job. What I would do is:
(1) Split each line at |.
(2) Build a List of the numbers that stand between | and =, appending each new number at the end.
(3) For each line, build a Map from the numbers in (2) to the line parts.
(4) Build a second Map from the numbers in (2) to the maximum width (in characters) of their line parts.
(5) Walk through the List from (2) and join the associated line parts with |, padding each one to its maximum width. If a line has no part for a given number, emit the padding alone.
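The steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name ColumnAligner and the methods parseRow and align are hypothetical, and the sample rows in main are shortened versions of the rows in the question. One assumption goes beyond the steps: the question's data repeats some numbers within a row (865 and 1145 each appear twice), so this sketch joins repeated parts with ";" to avoid losing them in the Map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnAligner {

    /** Steps (1) and (3): split one line at '|' and map each number to its line part. */
    static Map<String, String> parseRow(String row) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : row.split("\\|")) {
            int eq = part.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed parts
            }
            // Assumption: repeated numbers in one row are joined with ';'.
            fields.merge(part.substring(0, eq), part, (a, b) -> a + ";" + b);
        }
        return fields;
    }

    static List<String> align(List<String> rows) {
        List<String> tagOrder = new ArrayList<>();              // step (2)
        List<Map<String, String>> parsed = new ArrayList<>();   // step (3)
        Map<String, Integer> maxWidth = new LinkedHashMap<>();  // step (4)

        for (String row : rows) {
            Map<String, String> fields = parseRow(row);
            parsed.add(fields);
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (!maxWidth.containsKey(e.getKey())) {
                    tagOrder.add(e.getKey()); // first time this number is seen
                }
                maxWidth.merge(e.getKey(), e.getValue().length(), Math::max);
            }
        }

        // Step (5): emit every row in the global order, padded to the column width.
        List<String> out = new ArrayList<>();
        for (Map<String, String> fields : parsed) {
            StringBuilder sb = new StringBuilder();
            for (String tag : tagOrder) {
                String cell = fields.getOrDefault(tag, "");
                sb.append(String.format("%-" + maxWidth.get(tag) + "s", cell)).append('|');
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "35=d|461=FFIXSX|200=202006|15=USD|",
                "35=d|461=OCEFPS|201=1|200=202403|15=USD|");
        align(rows).forEach(System.out::println);
    }
}
```

Because a number is appended to the order only when it is first seen, 201 ends up after 15 in this sample, which matches the "move new columns to the end" idea from the question. The whole file is held in memory, which should be fine for moderately sized files but may need a two-pass variant for very large ones.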
Whenever possible, I would prefer to lay the line parts out in an HTML table.
Changing the column order alone will not solve the problem of columns being wider or narrower from row to row.
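For the HTML-table suggestion, a rough sketch could look like the one below. The class name HtmlTableWriter and its methods are hypothetical, the header row of numbers is an addition for readability, and repeated numbers within a row are again joined with ";" as an assumption. The browser then takes care of column widths, so no manual padding is needed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class HtmlTableWriter {

    /** Escapes the characters that would otherwise be read as HTML markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String toHtmlTable(List<String> rows) {
        Set<String> tags = new LinkedHashSet<>();             // numbers in first-seen order
        List<Map<String, String>> parsed = new ArrayList<>(); // one map per input row

        for (String row : rows) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (String part : row.split("\\|")) {
                int eq = part.indexOf('=');
                if (eq <= 0) {
                    continue; // skip empty or malformed parts
                }
                // Assumption: repeated numbers in one row are joined with ';'.
                fields.merge(part.substring(0, eq), part.substring(eq + 1),
                        (a, b) -> a + ";" + b);
            }
            tags.addAll(fields.keySet());
            parsed.add(fields);
        }

        StringBuilder html = new StringBuilder("<table>\n<tr>");
        for (String tag : tags) {
            html.append("<th>").append(escape(tag)).append("</th>");
        }
        html.append("</tr>\n");
        for (Map<String, String> fields : parsed) {
            html.append("<tr>");
            for (String tag : tags) {
                // A missing number becomes an empty cell.
                html.append("<td>").append(escape(fields.getOrDefault(tag, ""))).append("</td>");
            }
            html.append("</tr>\n");
        }
        return html.append("</table>\n").toString();
    }
}
```

Calling toHtmlTable with the lines read from the file and writing the returned string to an .html file would give one table column per number.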