Find HEX value in file and grep the following valu

I have a 2GB file in raw format. I want to search for all appearance of a specific HEX value "355A3C2F74696D653E" AND collect the following 28 characters.

Example: 355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135

In this case I want the output: "323031312D30342D32365431343A34373A30322D31343A34373A3135" or better: 2011-04-26T14:47:02-14:47:15

I have tried with

xxd -u InputFile | grep '355A3C2F74696D653E' | cut -c 1-28 > OutputFile.txt

and

xxd -u -ps -c 4000000 InputFile | grep '355A3C2F74696D653E' | cut -b 1-28 > OutputFile.txt

But I can't get it working.

Can anybody give me a hint?

标签： linux bash grep hex cut

3条回答

Viruses.

2楼-- · 2019-08-14 07:04

As you are using xxd it seems to me that you want to search the file as if it were binary data. I'd recommend using a more powerful programming language for this; the Unix shell tools assume there are line endings and that the text is mostly 7-bit ASCII. Consider using Python:

#!/usr/bin/python
import mmap
fd = open("file_to_search", "rb")
needle = "\x35\x5A\x3C\x2F\x74\x69\x6D\x65\x3E"
haystack = mmap.mmap(fd.fileno(), length = 0, access = mmap.ACCESS_READ)
i = haystack.find(needle)
while i >= 0:
    i += len(needle)
    print (haystack[i : i + 28])
    i = haystack.find(needle, i)

0人赞添加讨论(0) 举报

▲ chillily

3楼-- · 2019-08-14 07:10

Why convert to hex first? See if this awk script works for you. It looks for the string you want to match on, then prints the next 28 characters. Special characters are escaped with a backslash in the pattern.

Adapted from this post: Grep characters before and after match?

I added some blank lines for readability.

VirtualBox:~$ cat data.dat

Thisis a test of somerandom characters before thestringI want5Z</time>2011-04-26T14:47:02-14:47:15plus somemoredata

VirtualBox:~$ cat test.sh

awk '/5Z\<\/time\>/ {
  match($0, /5Z\<\/time\>/); print substr($0, RSTART + 9, 28);
}' data.dat

VirtualBox:~$ ./test.sh

2011-04-26T14:47:02-14:47:15

VirtualBox:~$

EDIT: I just realized something. The regular expression will need to be tweaked to be non-greedy, etc and between that and awk need to be tweaked to handle multiple occurrences as you need them. Perhaps some of the folks more up on awk can chime in with improvements as I am real rusty. An approach to consider anyway.

0人赞添加讨论(0) 举报

Anthone

4楼-- · 2019-08-14 07:16

If your grep supports -P parameter then you could simply use the below command.

$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{28}'
323031312D30342D32365431343A

For 56 chars,

$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{56}'
323031312D30342D32365431343A34373A30322D31343A34373A3135

0人赞添加讨论(0) 举报

Find HEX value in file and grep the following valu

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间