How to find a specific byte in many bytes?

Posted 2019-06-14 17:11

I read a file using Java and print its contents as a hex dump. The first two lines of the data look like this:

one: 31 30 30 31 30 30 30 31 31 30 30 31 30 31 31 31
two: 30 31 31 30 30 31 31 30 31 31 30 30 31 31 30 31

I want to print the data between the first "31 30 30 31" and the second "31 30 30 31". My ideal output is 31 30 30 31 30 30 30 31 31 30 30 31 30 31 31 31 30 31. But the real output is wrong. I think my code cannot find "31 30 30 31" in data1. How can I fix it?

I use JDK 1.7 and my IDE is IntelliJ IDEA.

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.File;
public class TestDemo{

  public static void main(String[] args) {


        try {
            File file = new File("/0testData/1.bin");
            DataInputStream isr = new DataInputStream(new FileInputStream(file));

            int bytesPerLine = 16;

            int byteCount = 0;
            int data;
            while ((data = isr.read()) != -1) {
                if (byteCount == 0)
                    System.out.println();
                else if (byteCount % bytesPerLine == 0)
                    System.out.printf("\n",byteCount );
                else
                    System.out.print(" ");


                String data1 = String.format("%02X",data & 0xFF);
                System.out.printf(data1);


                byteCount += 1;
                if(data1.contains("31 30 30 31")) {
                    int i=data1.indexOf("31 30 30 31",12);

                    System.out.println("find it!");
                    String strEFG=data1.substring(i,i+53);
                    System.out.println("str="+strEFG);
                }else {
                    System.out.println("cannot find it");
                }

            }

        } catch (Exception e) {
            System.out.println("Exception: " + e);
        }

    }
}


My ideal output is 31 30 30 31 30 30 30 31 31 30 30 31 30 31 31 31 30 31. But the real output is:

31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 31cannot find it

30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 31cannot find it

31cannot find it 31cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 30cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 31cannot find it

31cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 31cannot find it 31cannot find it

30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 31cannot find it 30cannot find it 30cannot find it 31cannot find it 30cannot find it 30cannot find it

1 answer
骚年 ilove
Reply #2 · 2019-06-14 17:46

I feel that your input data is a bit confusing. Nevertheless, this probably answers your question.

It doesn't give quite the same output that you asked for, but you should be able to adapt it: the flag "inPattern" can switch the output on and off. If inPattern is true, print the data read from the file; if it is false, do not print it.

This is probably not the best coding style, since it consists entirely of static methods, but it does what you ask for.

The problem with your code (I think) is that data1 is always a 2-character string: the hex form of a single byte. It cannot contain an 11-character string such as "31 30 30 31". If you reversed the test (i.e. "31 30 30 31".contains(data1)), it would match only a single byte, not the 4 bytes you intend to match.
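To illustrate that point, here is a minimal sketch of one way to make a string search work: build the hex text for all the bytes first, then search the complete string. The class and method names (HexStringSearch, toHex, fromFirstToSecond) are made up for this example, and the sample bytes are just the first line of data from the question.

```java
import java.nio.charset.StandardCharsets;

final class HexStringSearch {

    /** Converts the bytes to space-separated, upper-case hex pairs, e.g. "31 30 30 31". */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /**
     * Returns the text from the first occurrence of marker up to (not including)
     * the second occurrence, or null if the marker does not occur twice.
     */
    static String fromFirstToSecond(String hex, String marker) {
        int first = hex.indexOf(marker);
        if (first < 0) {
            return null;
        }
        // Start looking for the second occurrence after the end of the first one.
        int second = hex.indexOf(marker, first + marker.length());
        if (second < 0) {
            return null;
        }
        return hex.substring(first, second).trim();
    }

    public static void main(String[] args) {
        // Assumption: sample data taken from the first line in the question.
        // For a real file, java.nio.file.Files.readAllBytes could supply the array.
        byte[] sample = "1001000110010111".getBytes(StandardCharsets.US_ASCII);
        String hex = toHex(sample);
        System.out.println(fromFirstToSecond(hex, "31 30 30 31"));
        // prints: 31 30 30 31 30 30 30 31
    }
}
```

Because every byte is written as exactly two hex digits followed by a space, a marker in the same format can only match on a byte boundary. This approach holds the whole file in memory, so it is only reasonable for small files.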

package hexdump;

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.LinkedList;

public class HexDumpWithFilter {
//    private static final int beginPattern [] = { 0x47, 0x0d, 0x0a, 0x1a };
    private static final int beginPattern [] = { 0x00, 0x83, 0x7d, 0x8a };
    private static final int endPattern [] = { 0x23, 0x01, 0x78, 0xa5 };
    private static LinkedList<Integer> bytesRead = new LinkedList<>();

    public static void main(String[] args) {
        try {
            InputStream isr = new DataInputStream(new FileInputStream("C:\\Temp\\resistor.png"));
            int bytesPerLine = 16;
            int byteCount = 0;
            int data;
            boolean inPattern = false;
            while ((data = isr.read()) != -1) {
                // Capture the data just read into an input buffer.
                bytesRead.add(data);
                // If we have too much data in the input buffer to compare to our
                // pattern, peel off the first byte.
                // Note: This assumes that the begin pattern and end Pattern are the same lengths.
                if (bytesRead.size() > beginPattern.length) {
                    bytesRead.removeFirst();
                }

                // Output a byte count at the start of each new line of output.
                if (byteCount % bytesPerLine == 0)
                    System.out.printf("\n%04x:", byteCount);

                // Output the spacing - if we have found our pattern, then also output an asterisk
                System.out.printf(inPattern ? " *%02x" : "  %02x", data);

                // Finally check to see if we have found our pattern if we have enough bytes
                // in our bytesRead buffer.
                if (bytesRead.size() == beginPattern.length) {
                    // If we are not currently in a pattern, then check for the begin pattern
                    if (!inPattern && checkPattern(beginPattern, bytesRead)) {
                        inPattern = true;
                    }
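                    // Otherwise, if we are currently in a pattern, check for the end pattern.
                    // The "else" keeps a single match from switching the flag on and
                    // straight off again when the begin and end patterns are identical.
                    else if (inPattern && checkPattern(endPattern, bytesRead)) {
                        inPattern = false;
                    }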
                }

                byteCount += 1;
            }
            System.out.println();
        } catch (Exception e) {
            System.out.println("Exception: " + e);
        }
    }

    /**
     * Function to check whether our input buffer read from the file matches
     * the supplied pattern.
     * @param pattern the pattern to look for in the buffer.
     * @param bytesRead the buffer of bytes read from the file.
     * @return true if pattern and bytesRead have the same content.
     */
    private static boolean checkPattern (int [] pattern, LinkedList<Integer> bytesRead) {
        int ptr = 0;
        boolean patternMatch = true;
        for (int br : bytesRead) {
            if (br != pattern[ptr++]) {
                patternMatch = false;
                break;
            }
        }
        return patternMatch;
    }
}

There is a small problem with this code: it does not mark the begin pattern, but it does mark the end pattern. Hopefully this is not a problem for you. If you need to mark the beginning correctly, or to leave the ending unmarked, there is another level of complexity: you have to read ahead in the file and write the data out 4 bytes behind what you have read. This could be achieved by printing the value that comes off the buffer at the line which reads:

    bytesRead.removeFirst();

rather than printing the value read from the file (i.e. the value in the "data" variable).
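The delayed-output idea just described could be sketched roughly as follows. This is not the code above, only an illustration under some assumptions: the input is already in a byte array, the begin and end patterns have the same length, and the names DelayedMarkSketch, mark, matches and append are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

final class DelayedMarkSketch {

    /** Returns true if the window holds exactly the values of pattern, in order. */
    static boolean matches(Deque<Integer> window, int[] pattern) {
        if (window.size() != pattern.length) {
            return false;
        }
        int i = 0;
        for (int b : window) {
            if (b != pattern[i++]) {
                return false;
            }
        }
        return true;
    }

    /** Appends one byte as two hex digits, prefixed with '*' when marked. */
    static void append(StringBuilder out, int b, boolean marked) {
        if (out.length() > 0) {
            out.append(' ');
        }
        out.append(String.format(marked ? "*%02x" : "%02x", b));
    }

    /** Marks the begin pattern and everything after it, up to (not including) the end pattern. */
    static String mark(byte[] input, int[] begin, int[] end) {
        Deque<Integer> window = new ArrayDeque<Integer>();
        StringBuilder out = new StringBuilder();
        boolean inPattern = false;
        for (byte raw : input) {
            window.addLast(raw & 0xFF);
            if (window.size() > begin.length) {
                // Print the byte leaving the window, not the byte just read, so the
                // output runs begin.length bytes behind the input.
                append(out, window.removeFirst(), inPattern);
            }
            if (!inPattern && matches(window, begin)) {
                inPattern = true;   // the begin bytes are still in the window, so they get marked
            } else if (inPattern && matches(window, end)) {
                inPattern = false;  // the end bytes are still in the window, so they stay unmarked
            }
        }
        // End of input: flush whatever is still held back in the window.
        while (!window.isEmpty()) {
            append(out, window.removeFirst(), inPattern);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int[] pattern = {0x31, 0x30, 0x30, 0x31};
        byte[] sample = "1001000110010111".getBytes(StandardCharsets.US_ASCII);
        System.out.println(mark(sample, pattern, pattern));
        // prints: *31 *30 *30 *31 *30 *30 *30 *31 31 30 30 31 30 31 31 31
    }
}
```

The flag is changed only after the oldest byte has been printed, which is what moves the marking from the end pattern to the begin pattern. Overlapping occurrences of the two patterns are not treated specially in this sketch.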

Following is an example of the data produced when run against a PNG file of an image of a resistor.

0000:  89  50  4e  47  0d  0a  1a  0a  00  00  00  0d  49  48  44  52
0010:  00  00  00  60  00  00  00  1b  08  06  00  00  00  83  7d  8a
0020: *3a *00 *00 *00 *09 *70 *48 *59 *73 *00 *00 *2e *23 *00 *00 *2e
0030: *23 *01 *78 *a5  3f  76  00  00  00  07  74  49  4d  45  07  e3
0040:  03  0e  17  1a  0f  c2  80  9c  d0  00  00  01  09  49  44  41
0050:  54  68  de  ed  9a  31  0b  82  40  18  86  cf  52  d4  a1  7e
0060:  45  4e  81  5b  a3  9b  10  ae  ae  4d  4d  61  7f  a1  21  1b
0070:  fa  0b  45  53  53  ab  ab  04  6e  42  4b  9b  d0  64  bf  a2
0080:  06  15  a9  6b  ef  14  82  ea  ec  e8  7d  c6  f7  0e  f1  be
0090:  e7  3b  0f  0e  25  4a  29  25  a0  31  5a  28  01  04  fc  35
00a0:  f2  73  e0  af  af  b5  93  fd  c9  8c  cd  36  cb  da  f9  ae
00b0:  ad  11  d3  50  84  2e  50  92  96  24  88  f2  ca  b1  41  7b
00c0:  cc  64  c7  db  b6  be  7e  5e  87  ef  0e  08  e3  82  64  85
00d0:  b8  47  4c  56  50  12  c6  85  b8  9f  20  1e  0b  10  bd  81
00e0:  64  1e  5b  38  49  cb  ca  31  e3  7c  67  b2  b4  c7  f6  c4
00f0:  62  da  65  b2  f9  ea  c2  64  a7  dd  90  c9  fa  a3  3d  0e
0100:  61  00  01  10  00  20  00  02  00  04  40  00  80  00  08  00
0110:  10  00  01  00  02  7e  82  af  5f  c6  99  86  42  5c  5b  7b
0120:  eb  19  be  f7  e2  8d  a4  77  f8  e8  bb  07  51  5e  7b  91
0130:  28  c4  0e  d0  55  89  38  96  2a  6c  77  3a  96  4a  74  55
0140:  12  57  00  8f  05  88  de  40  12  fe  8a  c0  21  0c  01  00
0150:  02  20  00  34  c3  03  f7  3f  46  9a  04  49  f8  9d  00  00
0160:  00  00  49  45  4e  44  ae  42  60  82

Note that some of the bytes have an asterisk in front of them. These are the bytes that follow the beginPattern, up to and including the endPattern.

Also note that I used a beginPattern and an endPattern. You do not need to do this; I only did it to make it easier to find a pattern in my resistor.png file for testing the pattern matching. If you want a single pattern (e.g. "0x31, 0x30, 0x30, 0x31") for both start and finish, you can use one variable for both, set the same value for both, or simply assign endPattern = beginPattern. In that case the end check must not run on the same byte that completed the begin match, otherwise inPattern is switched off again immediately.
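For the single-pattern case, a byte-level search over an in-memory array is another option. The following is only a sketch with invented names (SinglePatternSlice, indexOf, fromFirstToSecond); it assumes the file is small enough to hold in a byte array and that the two occurrences do not overlap.

```java
import java.util.Arrays;

final class SinglePatternSlice {

    /** Index of the first occurrence of pattern in data at or after from, or -1 if none. */
    static int indexOf(byte[] data, byte[] pattern, int from) {
        for (int i = Math.max(from, 0); i + pattern.length <= data.length; i++) {
            int j = 0;
            while (j < pattern.length && data[i + j] == pattern[j]) {
                j++;
            }
            if (j == pattern.length) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns the bytes from the start of the first occurrence of pattern up to
     * (not including) the second occurrence, or an empty array if the pattern
     * does not occur twice.
     */
    static byte[] fromFirstToSecond(byte[] data, byte[] pattern) {
        int first = indexOf(data, pattern, 0);
        if (first < 0) {
            return new byte[0];
        }
        // Search for the second occurrence after the end of the first one.
        int second = indexOf(data, pattern, first + pattern.length);
        if (second < 0) {
            return new byte[0];
        }
        return Arrays.copyOfRange(data, first, second);
    }
}
```

With the first line of data from the question and the pattern 31 30 30 31, the first occurrence starts at offset 0 and the second at offset 8, so the returned slice is the 8 bytes 31 30 30 31 30 30 30 31.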
