In the 8086 architecture, the memory space is 1 Mbyte
in size and divided into logical segments of up to 64 Kbytes
each.
The processor has 20 address lines, so the following method is used:
the data segment register is shifted left by 4 bits and then added to the offset register.
My question is: how is the shift done when all the registers are only 16 bits wide?
Address translation is done internally by a dedicated unit that does not use any of the registers available to user code to hold intermediate results. It just fetches the 16-bit values and produces the 20-bit address inside the unit, so the shifted value never appears anywhere user code could observe it.
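To make the arithmetic concrete, here is a minimal software sketch in C of what that unit computes. It is only a model: the function name `physical_address` is made up for illustration, and the real chip does this with dedicated adder hardware rather than with instructions. The point is that the 16-bit segment value is widened before the shift, so no bits are lost.

```c
#include <stdint.h>

/* Hypothetical helper (not part of any real API): models the 8086
 * real-mode calculation physical = segment * 16 + offset. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    /* Widen to 32 bits first, so the shift keeps the top 4 bits
     * that would not fit in a 16-bit register. */
    uint32_t base = (uint32_t)segment << 4;

    /* The 8086 has only 20 address lines, so the sum wraps at 1 Mbyte. */
    return (base + offset) & 0xFFFFFu;
}
```

For example, segment 0x1234 with offset 0x0010 gives 0x12340 + 0x0010 = 0x12350, while segment 0xFFFF with offset 0x0010 wraps around to 0x00000.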
In hardware, a register is a group of flip-flops that stores bits of information.
A chip may contain a huge number of registers like that to hold the current instruction, current state, intermediate values and so on. Only a small number of them are exposed to programs for storing values. That's the idea. The specifics of each implementation are usually proprietary to the manufacturer, so detailed public documentation about them is rare.
This is a simple hardware address calculator in Verilog. The real implementation may be much more complicated.