For some reason, every multicast example I run fails (the computer runs OpenSUSE Linux). The clients all just sit silently. How do I figure out why the multicast is being blocked or ignored?
Some of the examples:
Example 1
http://www.roseindia.net/java/example/java/net/udp/UDPMulticastServer.java
Example 2
http://docs.oracle.com/javase/tutorial/networking/datagrams/broadcasting.html
(uses these files:)
http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/MulticastServer.java
http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/MulticastServerThread.java
http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/MulticastClient.java
http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/one-liners.txt
When troubleshooting IP multicast, there are some big-picture things you can do to isolate whether this is a host issue, software issue, or network issue:
- Step 1: Ensure the receiver is sending IGMP group joins on the correct interface. Look for the multicast source's traffic on the receiver's interface.
- Step 2: Ensure the server is sending traffic on the proper multicast group out the correct interface.
- Step 3: Perform something like a ping test for IP multicast (using Linux's socat tool).
The details for each step are outlined below...
Step 1
First, ensure that the Linux multicast receivers are correctly sending their group membership reports. Keep in mind that a lot of things in multicast work backwards from unicast. For instance, a receiver only gets multicast traffic after it sends an IGMP join (membership report) that names the multicast group it wants to receive.
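As a rough illustration of what "sending a join" means from the application side, here is a minimal Python sketch. It assumes IPv4 and a Linux-style sockets API; the helper names `make_mreq` and `open_receiver` are made up for this example and are not part of any of the linked Java examples.

```python
import socket
import struct


def make_mreq(group: str, iface_ip: str = "0.0.0.0") -> bytes:
    """Pack a struct ip_mreq: the group address followed by the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface_ip))


def open_receiver(group: str, port: int, iface_ip: str = "0.0.0.0") -> socket.socket:
    """Bind a UDP socket to `port` and join `group` on the interface that owns `iface_ip`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # This join is what makes the kernel send the IGMP membership report.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group, iface_ip))
    return sock
```

Something like `open_receiver("239.255.0.1", 11111, "192.168.12.238").recvfrom(1500)` would then block until a datagram arrives. Leaving `iface_ip` at 0.0.0.0 lets the kernel choose the interface from its routing table, which on a multi-homed host may not be the interface you are sniffing.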
Use tcpdump or tshark to examine the interface in question. In the example below, I have a machine on 192.168.12.238 that is announcing (via IGMP) that it wants to receive multicast traffic for the group 239.255.0.1:
[mpenning@Finger ~]$ sudo tshark -n -V -i eth0 igmp
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
Frame 1 (54 bytes on wire, 54 bytes captured)
Arrival Time: Dec 6, 2011 09:08:45.156782000
... >snip< ...
Internet Protocol, Src: 192.168.12.238 (192.168.12.238), Dst: 224.0.0.22 (224.0.0.22)
Version: 4
Header length: 24 bytes
Differentiated Services Field: 0xc0 (DSCP 0x30: Class Selector 6; ECN: 0x00)
1100 00.. = Differentiated Services Codepoint: Class Selector 6 (0x30)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 40
Identification: 0x0000 (0)
Flags: 0x02 (Don't Fragment)
0.. = Reserved bit: Not Set
.1. = Don't fragment: Set
..0 = More fragments: Not Set
Fragment offset: 0
Time to live: 1
Protocol: IGMP (0x02)
Header checksum: 0x3663 [correct]
[Good: True]
[Bad : False]
Source: 192.168.12.238 (192.168.12.238)
Destination: 224.0.0.22 (224.0.0.22)
Options: (4 bytes)
Router Alert: Every router examines packet
Internet Group Management Protocol
[IGMP Version: 3]
Type: Membership Report (0x22)
Header checksum: 0xe9fd [correct]
Num Group Records: 1
Group Record : 239.255.0.1 Change To Exclude Mode
Record Type: Change To Exclude Mode (4)
Aux Data Len: 0
Num Src: 0
Multicast Address: 239.255.0.1 (239.255.0.1)
^C1 packet captured
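To make the capture above easier to read, the following Python sketch rebuilds the 16-byte IGMP payload (IP total length 40 minus the 24-byte IP header) for a single "Change To Exclude Mode" record and recomputes its checksum. The function names are made up for this example; the field layout follows the IGMPv3 membership report as tshark decodes it above.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def igmpv3_report(group: str, record_type: int = 4) -> bytes:
    """Build a one-record IGMPv3 Membership Report with no source addresses."""
    # Group record: record type, aux data len, number of sources, multicast address.
    record = struct.pack("!BBH4s", record_type, 0, 0, socket.inet_aton(group))
    # Header: type 0x22, reserved, checksum (zero for now), reserved, number of records.
    header = struct.pack("!BBHHH", 0x22, 0, 0, 0, 1)
    csum = internet_checksum(header + record)
    return struct.pack("!BBHHH", 0x22, 0, csum, 0, 1) + record


report = igmpv3_report("239.255.0.1")
print(len(report), hex(struct.unpack("!H", report[2:4])[0]))  # prints: 16 0xe9fd
```

The recomputed checksum matches the 0xe9fd shown in the capture, which is a quick sanity check that the report really carries only the group 239.255.0.1 with record type 4 and no sources.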
Now check and see whether the multicast source's traffic is getting to this interface (I'm assuming it was eth0, below):
sudo tshark -n -i eth0 ip and host 239.255.0.1
If you see traffic sent to the proper multicast group, then proceed directly to Step 3; otherwise go to Step 2.
Step 2
Next, ensure that your multicast server is sending the traffic to the correct group. In the example below, I run a command to sniff eth0 for traffic sent to 239.255.0.1.
[mpenning@hotcoffee Models]$ sudo tshark -n -i eth0 ip and host 239.255.0.1
1.466991 192.168.12.236 -> 239.255.0.1 UDP Source port: 11111 Destination port: 11111
If the multicast source is sending traffic to the right group here in Step 2, and in Step 1 you saw IGMP group joins but no traffic at the multicast receiver's interface, then the traffic is being lost in the network between them. Contact your network administrators about this problem.
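The branching across Steps 1 and 2 can be summarized as a small function. This is only a sketch of the decision logic described above; the function name, its parameters, and the returned strings are invented for illustration.

```python
def next_action(join_seen: bool, traffic_at_receiver: bool, traffic_at_sender: bool) -> str:
    """Map the Step 1 and Step 2 observations to the next thing to check."""
    if traffic_at_receiver:
        # Step 1 saw the group's traffic on the receiver's interface.
        return "go to step 3: test the receiving application with socat"
    if not traffic_at_sender:
        # Step 2 saw nothing leaving the sender for this group.
        return "fix the sender: wrong group, wrong interface, or not sending"
    if not join_seen:
        # The sender is fine, but the receiver never asked for the group.
        return "fix the receiver host: no IGMP join on this interface"
    # Sender sends, receiver joins, yet nothing arrives.
    return "contact your network administrators"


print(next_action(join_seen=True, traffic_at_receiver=False, traffic_at_sender=True))
```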
Step 3
Assuming all that works, and you still want an acid test in case your multicast receiver software is somehow discarding multicast it receives from the IP stack, make sure you have socat installed on your machine and do the following.
On the multicast sender (server), use this command to send test multicast packets to 239.255.0.1:
perl -e '$ii=0; while (1) { print "hi number $ii\n"; $ii++; }' | socat - UDP-SENDTO:239.255.0.1:11111,sp=11111
On the multicast receiver (client), use this command to listen to test multicast packets sent to 239.255.0.1 on eth0:
socat - UDP-DATAGRAM:239.255.0.1:11111,bind=:11111,ip-add-membership=239.255.0.1:eth0
Assuming your network administrators are allowing multicast on 239.255.0.1, you will see a lot of traffic like this in the multicast receiver's terminal window:
hi number 212289
hi number 212290
hi number 212291
hi number 212292
hi number 212293
hi number 212294
hi number 212295
hi number 212296
hi number 212297
hi number 212298
NOTE: do not try this with a multicast group address that is already in production use on your network.
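If perl or socat is not available on the sender, a rough Python stand-in for the sending side might look like the sketch below. It assumes IPv4 and reuses the group, port, and source port from the socat command; the function names, the explicit TTL, and the pause between packets are additions for this example and not something the socat command does.

```python
import itertools
import socket
import time


def payload(n: int) -> bytes:
    """Same text the perl one-liner prints, one line per datagram."""
    return f"hi number {n}\n".encode("ascii")


def open_sender(source_port: int = 11111, ttl: int = 1) -> socket.socket:
    """UDP socket bound to the source port, with an explicit multicast TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", source_port))
    # A TTL of 1 keeps the packets on the local subnet.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def send_forever(group: str = "239.255.0.1", port: int = 11111, interval: float = 0.01) -> None:
    """Send numbered test datagrams until interrupted with Ctrl-C."""
    sock = open_sender()
    for n in itertools.count():
        sock.sendto(payload(n), (group, port))
        time.sleep(interval)
```

Calling `send_forever()` on the sender should produce the same kind of output on the socat receiver. If the receiver sits behind a router, a TTL of 1 will not reach it, so `open_sender` would need a larger `ttl` in that case.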
Step 4
If steps 1, 2, and 3 reveal that multicast traffic is being sent and received through your network, then call up the software developer and tell them you think there is a problem with the application and explain the steps you have taken so far.
If any of steps 1, 2, or 3 does not work, reconfigure your software / hosts / network until it does. Be warned: multicast in IP networks is 3x harder to implement correctly than IP unicast.
Best of luck to you...
How to check multicasting?
When clustering breaks, it could be due to a number of reasons. One of them is multicasting (where applications subscribe to a certain IP address and listen for messages). If users find themselves intermittently logged out, this might indicate such a problem.
A multicast IP address will be in the range 224.0.0.0 to 239.255.255.255.
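A quick way to check whether a configured address falls in that range is Python's standard `ipaddress` module. The sketch below is limited to IPv4, and the `classify` helper and its labels are made up for this example. The two sub-ranges it singles out are 224.0.0.0/24, which routers do not forward, and 239.0.0.0/8, which is reserved for administratively scoped (site-local) use.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("224.0.0.0/24")
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")


def classify(addr: str) -> str:
    """Label an IPv4 address by the multicast range it falls in, if any."""
    ip = ipaddress.IPv4Address(addr)
    if not ip.is_multicast:  # outside 224.0.0.0 to 239.255.255.255
        return "not multicast"
    if ip in LINK_LOCAL:
        return "link-local multicast"
    if ip in ADMIN_SCOPED:
        return "administratively scoped multicast"
    return "multicast"


for a in ("239.128.4.0", "231.12.21.132", "224.0.0.22", "192.168.1.5"):
    print(a, "->", classify(a))
```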
This post is just a reminder to myself of the checks to run at the Linux level:
Run netstat -g to get the multicast addresses this host subscribes to.
[root@bruatwls001 ~]$ netstat -g
IPv6/IPv4 Group Memberships
Interface RefCnt Group
--------------- ------ ---------------------
lo 1 all-systems.mcast.net
eth0 2 239.128.4.0
eth0 1 all-systems.mcast.net
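Note that the RefCnt column shows 2 for the group 239.128.4.0 on eth0. This counts the joins made on this host (for example, two sockets that subscribed to the group), not the number of hosts in the group.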
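When checking many hosts, it can help to turn that output into data. The sketch below parses text shaped like the `netstat -g` output above; it assumes three whitespace-separated columns with a numeric RefCnt, and the function name is made up for this example.

```python
def parse_netstat_g(text: str) -> list:
    """Return (interface, refcnt, group) tuples from `netstat -g` style output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three columns with a numeric reference count;
        # the title, header, and separator lines do not.
        if len(fields) == 3 and fields[1].isdigit():
            rows.append((fields[0], int(fields[1]), fields[2]))
    return rows


sample = """IPv6/IPv4 Group Memberships
Interface RefCnt Group
--------------- ------ ---------------------
lo 1 all-systems.mcast.net
eth0 2 239.128.4.0
eth0 1 all-systems.mcast.net
"""
for iface, refcnt, group in parse_netstat_g(sample):
    print(iface, refcnt, group)
```

Filtering the result for the cluster's group on the expected interface gives a simple yes/no check that this host has joined.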
Pinging this multicast address reveals which members subscribe to the group (or cluster):
[root@bruatwls001 ~]$ ping 239.128.4.0
PING 239.128.4.0 (239.128.4.0) 56(84) bytes of data.
64 bytes from 10.35.8.12: icmp_seq=0 ttl=64 time=0.032 ms
64 bytes from 10.35.8.13: icmp_seq=0 ttl=64 time=0.207 ms (DUP!)
64 bytes from 10.35.8.12: icmp_seq=1 ttl=64 time=0.029 ms
64 bytes from 10.35.8.13: icmp_seq=1 ttl=64 time=0.193 ms (DUP!)
64 bytes from 10.35.8.12: icmp_seq=2 ttl=64 time=0.028 ms
64 bytes from 10.35.8.13: icmp_seq=2 ttl=64 time=0.241 ms (DUP!)
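To list the responders without reading through the (DUP!) lines by eye, a small parser over the ping output is enough. This sketch assumes the reply line format shown above; the function name is made up for this example.

```python
import re

# Matches reply lines such as "64 bytes from 10.35.8.12: icmp_seq=0 ttl=64 time=0.032 ms".
REPLY = re.compile(r"bytes from (\d+\.\d+\.\d+\.\d+): icmp_seq=(\d+)")


def responders(ping_output: str) -> list:
    """Return the distinct replying addresses, in order of first appearance."""
    seen = []
    for ip, _seq in REPLY.findall(ping_output):
        if ip not in seen:
            seen.append(ip)
    return seen


sample = """PING 239.128.4.0 (239.128.4.0) 56(84) bytes of data.
64 bytes from 10.35.8.12: icmp_seq=0 ttl=64 time=0.032 ms
64 bytes from 10.35.8.13: icmp_seq=0 ttl=64 time=0.207 ms (DUP!)
64 bytes from 10.35.8.12: icmp_seq=1 ttl=64 time=0.029 ms
64 bytes from 10.35.8.13: icmp_seq=1 ttl=64 time=0.193 ms (DUP!)
"""
print(responders(sample))  # prints: ['10.35.8.12', '10.35.8.13']
```

A host that is missing from this list is not necessarily outside the group: Linux can be configured to ignore ICMP echo requests sent to multicast and broadcast addresses (the `net.ipv4.icmp_echo_ignore_broadcasts` sysctl), in which case a member stays silent.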
jgroups
A good way to test multicasting is using jgroups. See http://www.jgroups.org/manual/html/ch02.html#ItDoesntWork
- Download jgroups-3.3.3.Final.jar
- Start the receiver:
[quick@laptop]$ java -cp jgroups-3.3.3.Final.jar org.jgroups.tests.McastReceiverTest -mcast_addr 231.12.21.132 -port 45566
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/fe80:0:0:0:201:4aff:fe5e:5331%2
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/192.168.1.5
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/0:0:0:0:0:0:0:1%1
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/127.0.0.1
- Start the sender and type something:
[quick@centos ~]$ java -cp jgroups-3.3.3.Final.jar org.jgroups.tests.McastSenderTest -mcast_addr 231.12.21.132 -port 45566
Socket #1=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:fc54:ff:fedc:d6da%7
Socket #2=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:21d:7dff:fe03:4cf5%2
Socket #7=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/192.168.122.1
Socket #8=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:21d:7dff:fe03:4cf5%2
Socket #9=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:21d:7dff:fe03:4cf5%3
Socket #10=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/0:0:0:0:0:0:0:1%1
Socket #11=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/127.0.0.1
> helloworld
> quit
The message appears in the receiver window and displays the sender:
helloworld [sender=192.168.1.20:45566]
helloworld [sender=192.168.1.20:45566]
helloworld [sender=192.168.1.20:45566]
helloworld [sender=192.168.1.20:45566]
helloworld [sender=192.168.1.20:45566]
In my case, a firewall was blocking the multicast traffic. I opened a port and it works!