I understand that when sending ip messages around, each hop in the network path between be and my packet's destination will check if the next hop's MTU is bigger than the size of the packet I sent. If so, the packet will be fragmented and the two packets will be separately sent to the next hop, only to be reassembled at destination (or, in some cases, at the first NAT router encountered).
As far as I understand, this thing can be pretty bad, but I don't really understand why.
- I understand that if the connection tends to drop a lot of packets, losing a single fragment means I have to resend the whole packet (this is actually the only thing I figured out myself)
- Is there a chance that instead of being fragmented my packet will just be dropped?
- How are packet fragments identified? Can I be 100% sure that they will be reassembled correctly? On example, if I send two ip packets of the same length nearly simultaneously to the same destination, how likely it is that fragments of the two will be swaped, like AAA, BBB reassembled into ABA, BAB?
In principle, if packets aren't dropped and fragments are reassembled correctly, actually using packet fragmentation seems like a good idea to save on local bandwidth and avoid having to send more and more headers instead of just one big packet.
Thank you
To my knowledge, the only case where packets will be dropped rather than fragmented (barring cases where it would be dropped anyway), is packets which are marked "don't fragment". These packets are to be discarded rather than being fragmented.
Fragmented packets have identifier, fragment offset, and more fragments fields in their headers that, when combined, allow the destination host to reliably reassemble the packet upon receipt of all the fragments. The first fragment's offset is zero, and the last fragment has the more fragments flag set to zero. It is still possible (although very unlikely) to reassemble an incorrect packet if two packets' headers are mutated so their fragment offsets are exchanged, but their checksums are still valid. The probability of this happening is essentially zero. Bear in mind that IP does not provide any mechanism for ensuring the integrity of the data payload, only the integrity of the control information in the header.
Packet fragmentation necessarily wastes bandwidth because each fragment has a copy of [most of] the original datagram's header. Packets can be fragmented down to only 8 bytes per fragment, so we could have a maximum-sized packet at 60 + 65536 bytes fragmented into 60 * 8192 + 65536 bytes, yielding a payload increase of about 750% in the worst case. The only example I can come up with where you would come out ahead is if you fragmented a packet in order to send its fragments in parallel using some kind of Frequency Division Multiplexing scheme with the knowledge that the other channels are free. At that point, it still seems like it would require more work than would be saved to detect that circumstance and divide the packet rather than just sending it.
All the basic details about the mechanics of packet fragmentation in IP can be found in IETF RFC 791, if you're hungry for more information.
IP fragmentation can cause several problems:
1) Application layer loss is increased
As you mentioned, if a single fragment is dropped, the entire layer 4 packet will be lost. Thus, for a network with a small random packet loss rate, the application layer loss rate is increased by a factor approximately equal to the number of fragments for each layer 4 packet.
2) Not all networks handle fragmented packets
Some systems, such as Google's Compute Engine, do not reassemble fragmented packets.
3) Fragmentation can cause re-ordering
When routers split traffic down parallel paths, they may try to keep packets from the same flow on a single path. Because only the first fragment has layer 4 information like UDP/TCP port number, subsequent fragments may be routed down a different path, delaying assembly of the layer 4 packet and causing re-ordering.
4) Fragmentation can cause confusing behavior that is hard to debug
For example, if you send two UDP streams, A and B, from one source to a destination running Linux, the destination may discard packets from one of the streams. This is because by default, Linux "times out" fragment queues if more than 64 other fragments have been received from the same source. If stream A has a much higher data rate than stream B, 64 fragments from stream A may arrive in between the fragments from stream B, causing the B fragment to be dropped.
Thus, while IP fragmentation can reduce overhead by minimizing user headers, it may cause more trouble than it is worth.