Very strange behavior...
I have 3 machines:
-----------     ------------     -----------
| A (x86) |-----| B (x86)  |-----| C (arm) |
| sender  |     | receiver |     | sender  |
-----------     ------------     -----------
- A and B are Linux (Ubuntu 12.04) machines, kernel 3.2;
- C is an Android (ICS) machine, kernel 3.0.8;
- All are connected via RJ45 cables;
- Connections are OK, network is set up correctly;
The issue is: when machine C (ARM, Android) sends a UDP datagram whose payload is larger than 1472 bytes (the maximum payload before the packet gets fragmented), the server application on machine B never receives it, even though:
- Source/destination IP addresses are correct: I receive every datagram I send if the payload size is less than or equal to 1472 bytes;
- On machine B (receiver), if I dump network traffic with Wireshark, I can see each fragment, and then the reassembled message => from Wireshark's point of view, it's all good!
- Comparing each fragment header, as well as the reassembled message, with what I capture when the same message is sent from machine A (which is always received OK), everything seems perfect (the only differences are the IP addresses and the checksum, since the UDP checksum takes the IP address fields into account).
- There is no MTU issue, packets are fragmented as expected.
- There is no router/switch between the machines
- ifconfig shows no dropped packets, no overruns, nor any other classic error!
- ... this is so weird!!
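For reference, the 1472-byte limit mentioned above follows from simple arithmetic, which can be sketched as below. This is only an illustration, assuming a 1500-byte Ethernet MTU and a 20-byte IPv4 header without options; the function names are made up for the example.

```python
IP_HEADER_LEN = 20   # IPv4 header without options (assumption)
UDP_HEADER_LEN = 8   # fixed UDP header size


def max_unfragmented_payload(mtu=1500):
    """Largest UDP payload that still fits in a single IP packet."""
    return mtu - IP_HEADER_LEN - UDP_HEADER_LEN


def fragment_sizes(udp_payload_len, mtu=1500):
    """Bytes of IP payload carried by each fragment of one UDP datagram."""
    # The UDP header is part of the IP payload and travels in the first fragment.
    remaining = udp_payload_len + UDP_HEADER_LEN
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER_LEN) // 8 * 8
    sizes = []
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


if __name__ == "__main__":
    print(max_unfragmented_payload())  # 1472
    print(fragment_sizes(1472))        # [1480] -> one packet, no fragmentation
    print(fragment_sizes(1473))        # [1480, 1] -> two fragments
```

So a 1473-byte payload is the smallest one that produces two fragments on such a link, which matches the threshold described in the question.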
I've spent some time searching the Internet, but never found a topic like this one. Each time people have trouble with UDP, either their MTU discovery was not correct, or they made a mistake in the testing procedure, or they could not capture the message on the receiving host, ... this is not the case here!!
I'm fairly sure the issue is on the sender end (machine C), but maybe it would be easier to enable some logs (at kernel level?) on the receiver end to understand why the UDP datagram disappears!? Any advice? Are there specific files I could check in /proc/sys/net, or kernel options I should enable?
Thanks a lot.
If your machines are indeed connected as depicted, i.e. not through a switch or hub, then B must have two NICs. Those NICs will have different addresses, so the address you use to send to B from A will not be the same as the one you use to send to B from C. Call B's address on the link to A "BA" and its address on the link to C "BC".
Could the address you are sending to be wrong? Though that would not explain how smaller datagrams get through - are you sure they are?
Note: These addresses would have to be assigned manually, as you are not connected to a DHCP server; furthermore, they need to be in the same subnet as A and C. Are all your addresses (A, BA, BC and C) in the same subnet? What address:port is the socket on B bound to and listening on? Does B continue to receive after receiving a datagram? Please provide some code.
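To make the bind-address question concrete, here is a minimal sketch of a receiver and a sender; it is not the code from the question, and the helper names `make_receiver` and `send_datagram` are hypothetical. It assumes the receiver binds to the wildcard address `0.0.0.0`, so datagrams arriving on either NIC are accepted; binding to one specific address instead would restrict it to datagrams addressed to that address.

```python
import socket


def make_receiver(bind_addr="0.0.0.0", port=0, timeout=2.0):
    """Create a UDP socket bound to bind_addr:port (port 0 = let the OS pick)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    sock.settimeout(timeout)  # fail with socket.timeout instead of blocking forever
    return sock


def send_datagram(dest, payload):
    """Send one UDP datagram to dest = (host, port); returns bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, dest)


if __name__ == "__main__":
    rx = make_receiver()
    port = rx.getsockname()[1]
    # 2000 bytes is above the 1472-byte threshold from the question.
    # Over loopback this is normally not fragmented; on the real setup,
    # replace 127.0.0.1 with B's address on the relevant link.
    send_datagram(("127.0.0.1", port), b"x" * 2000)
    data, peer = rx.recvfrom(65535)
    print(len(data), "bytes from", peer)
    rx.close()
```

Running the receiver part on B and the sender part on A and then on C, with the same payload size, would show whether the problem depends on the bound address.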
OR, even if your machines are connected to a switch/hub, is the "Don't fragment" bit set on datagrams sent from C? That would explain why larger ones are dropped but smaller ones are not.
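The "Don't fragment" bit can be read straight from the IPv4 header bytes in a capture. The sketch below is a hypothetical helper (`ipv4_fragment_info` is not from any library) and assumes you pass it the raw IPv4 header, i.e. the bytes that follow the 14-byte Ethernet header in a captured frame.

```python
import struct


def ipv4_fragment_info(header):
    """Decode the flags and fragment offset from a raw IPv4 header."""
    if len(header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    if header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Bytes 6-7: 3 flag bits followed by a 13-bit fragment offset.
    (flags_and_offset,) = struct.unpack("!H", header[6:8])
    return {
        "dont_fragment": bool(flags_and_offset & 0x4000),
        "more_fragments": bool(flags_and_offset & 0x2000),
        # The offset field counts units of 8 bytes.
        "fragment_offset": (flags_and_offset & 0x1FFF) * 8,
    }
```

Wireshark shows the same information in the IP layer's "Flags" and "Fragment offset" fields, so comparing those fields between a fragment sent by A and one sent by C is an equivalent check.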