For some reason I made simple program in C to output binary representation of given input:
int main()
{
char c;
while(read(0,&c,1) > 0)
{
unsigned char cmp = 128;
while(cmp)
{
if(c & cmp)
write(1,"1",1);
else
write(1,"0",1);
cmp >>= 1;
}
}
return 0;
}
After compilation:
$ gcc bindump.c -o bindump
I made simple test to check if program is able to print binary:
$ cat bindump | ./bindump | fold -b100 | nl
Output is following: http://pastebin.com/u7SasKDJ
I suspected the output to look like random series of ones and zeroes. However, output partially seems to be quite more interesting. For example take a look at the output between line 171 and 357. I wonder why there are lots of zeros in compare to other sections of executable ?
My architecture is:
$ lscpu
Architecture: i686
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 28
Stepping: 10
CPU MHz: 1000.000
BogoMIPS: 3325.21
Virtualization: VT-x
L1d cache: 24K
L1i cache: 32K
L2 cache: 512K
When you compile a program into an executable on Linux (and a number of other unix systems), it is written in the ELF format. The ELF format has a number of sections, which you can examine with readelf or objdump:
For example, section
.text
contains CPU instructions,.data
global variables,.bss
uninitialized global variables (it is actually empty in the ELF file itself, but is created in the main memory when the program is executed),.plt
and.got
which are jump tables, debugging information, etc.Btw. it is much more convenient to examine the binary content of files with hexdump:
There you can see that starting with offset 0x850 (approx. line 171 in your dump) there is a lot of zeros, and you can also see the ASCII representation on the right.
Let us look at which sections correspond to the block of your interest between 0x850 and 0x1160 (the field
Off
– offset in the file is important here):You can examine the content of an individual section with -x:
You would see that there are many zeros. The section is composed of 18-byte values (= one line in the -x output) defining symbols. From
readelf -a
you can see that it has 68 entries, and first 27 of them (excl. the very first one) are of type SECTION:According to the specification (page 1-18), each entry has the following format:
Without going into too much detail here, I think what matters here is that st_name and st_size are both zeros for these SECTION entries. Both are 32-bit numbers, which means lots of zeros in this particular section.
This is not really a programming question, but however...
A binary normally consists of different sections: code, data, debugging info, etc. Since these sections contents differ by type, I would pretty much expect them to look different.
I.e. the symbol table consists of address offsets in your binary. If I read your
lspci
correctly, you are on a 32-bit system. That means Each offset has four bytes, and given the size of your program, in most cases two of those bytes will be zero. And there are more effects like this.You didn't
strip
your program, that means there's still lots of information (symbol table etc.) present in the binary. Try stripping the binary and have a look at it again.