How can I diagnose our java IP multicast applicati

Posted 2019-03-22 18:14

Question:

For some reason, every multicast example I run fails to work (the computer runs OpenSUSE Linux). The clients all just sit silently. How do I figure out why the multicast is being blocked or ignored?

Some of the examples:

EXAMPLE 1

http://www.roseindia.net/java/example/java/net/udp/UDPMulticastServer.java

Example 2

http://docs.oracle.com/javase/tutorial/networking/datagrams/broadcasting.html, which uses these files:

  • http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/MulticastServer.java
  • http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/MulticastServerThread.java
  • http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/MulticastClient.java
  • http://docs.oracle.com/javase/tutorial/networking/datagrams/examples/one-liners.txt

Answer 1:

When troubleshooting IP multicast, there are some big-picture things you can do to isolate whether this is a host issue, software issue, or network issue:

  • Step 1: Ensure the receiver is sending IGMP group joins on the correct interface. Look for the multicast source's traffic on the receiver's interface.
  • Step 2: Ensure the server is sending traffic to the proper multicast group out of the correct interface.
  • Step 3: Perform something like a ping test for IP multicast (using Linux's socat tool).

The details for each step are outlined below...

Step 1

First, ensure that the Linux multicast receivers are correctly sending their group membership reports; keep in mind that a lot of things in multicast work backwards from unicast. For instance, before a receiver gets any multicast traffic, it must send an IGMP join packet that names the multicast group it wants to receive.
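On the Java side, the join is triggered by the receiver's own code, so it helps to have a receiver that names its interface explicitly and gives up after a timeout instead of sitting silently. The following is a minimal sketch, not the code from the linked examples: the class name `JoinOnInterface`, the 10 second timeout, and the defaults (`eth0`, `239.255.0.1`, port `11111`, borrowed from the values used in this answer) are all assumptions to adjust.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

class JoinOnInterface {
    /** Formats a received datagram as "sender:port -> payload". */
    static String summarize(DatagramPacket p) {
        String payload = new String(p.getData(), p.getOffset(), p.getLength(), StandardCharsets.UTF_8);
        return p.getAddress().getHostAddress() + ":" + p.getPort() + " -> " + payload;
    }

    public static void main(String[] args) throws Exception {
        // Assumed defaults; pass interface, group and port as arguments to override.
        String ifName = args.length > 0 ? args[0] : "eth0";
        InetAddress group = InetAddress.getByName(args.length > 1 ? args[1] : "239.255.0.1");
        int port = args.length > 2 ? Integer.parseInt(args[2]) : 11111;

        NetworkInterface nif = NetworkInterface.getByName(ifName);
        if (nif == null) {
            System.err.println("No such interface: " + ifName);
            return;
        }
        InetSocketAddress groupAddress = new InetSocketAddress(group, port);
        try (MulticastSocket socket = new MulticastSocket(port)) {
            // Joining on an explicit interface asks the kernel to send the
            // IGMP membership report out of that interface.
            socket.joinGroup(groupAddress, nif);
            socket.setSoTimeout(10_000); // report a timeout rather than block forever
            byte[] buf = new byte[1500];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            try {
                socket.receive(packet);
                System.out.println(summarize(packet));
            } catch (SocketTimeoutException e) {
                System.out.println("Joined " + group.getHostAddress() + " on " + ifName
                        + " but received nothing within 10 s");
            }
            socket.leaveGroup(groupAddress, nif);
        }
    }
}
```

Running this while a capture filtered on `igmp` is active on the same interface should show the membership report at the moment of the join; if the report appears on a different interface, the receiver is bound to the wrong one.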

Use tcpdump or tshark to examine the interface in question. In the example below, I have a machine on 192.168.12.238 that is announcing (via IGMP) that it wants to receive multicast traffic sent to the group 239.255.0.1:

[mpenning@Finger ~]$ sudo tshark -n -V -i eth0 igmp
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
Frame 1 (54 bytes on wire, 54 bytes captured)
    Arrival Time: Dec  6, 2011 09:08:45.156782000
    ... >snip< ...
Internet Protocol, Src: 192.168.12.238 (192.168.12.238), Dst: 224.0.0.22 (224.0.0.22)
    Version: 4
    Header length: 24 bytes
    Differentiated Services Field: 0xc0 (DSCP 0x30: Class Selector 6; ECN: 0x00)
        1100 00.. = Differentiated Services Codepoint: Class Selector 6 (0x30)
        .... ..0. = ECN-Capable Transport (ECT): 0
        .... ...0 = ECN-CE: 0
    Total Length: 40
    Identification: 0x0000 (0)
    Flags: 0x02 (Don't Fragment)
        0.. = Reserved bit: Not Set
        .1. = Don't fragment: Set
        ..0 = More fragments: Not Set
    Fragment offset: 0
    Time to live: 1
    Protocol: IGMP (0x02)
    Header checksum: 0x3663 [correct]
        [Good: True]
        [Bad : False]
    Source: 192.168.12.238 (192.168.12.238)
    Destination: 224.0.0.22 (224.0.0.22)
    Options: (4 bytes)
        Router Alert: Every router examines packet
Internet Group Management Protocol
    [IGMP Version: 3]
    Type: Membership Report (0x22)
    Header checksum: 0xe9fd [correct]
    Num Group Records: 1
    Group Record : 239.255.0.1  Change To Exclude Mode
        Record Type: Change To Exclude Mode (4)
        Aux Data Len: 0
        Num Src: 0
        Multicast Address: 239.255.0.1 (239.255.0.1)

^C1 packet captured

Now check and see whether the multicast source's traffic is getting to this interface (I'm assuming it was eth0, below):

sudo tshark -n -i eth0 ip and host 239.255.0.1

If you see traffic sent to the proper multicast group, then proceed directly to Step 3; otherwise go to Step 2.

Step 2

Next ensure that your multicast server is sending the traffic to the correct group. In the example below, I run a command to sniff eth0 for traffic sent to 239.255.0.1.

[mpenning@hotcoffee Models]$ sudo tshark -n -i eth0 ip and host 239.255.0.1

1.466991 192.168.12.236 -> 239.255.0.1  UDP Source port: 11111  Destination port: 11111
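If the capture shows nothing, the Java sender may be using a different interface than expected, or a TTL that is too small to cross a router (the default multicast TTL of a Java `MulticastSocket` is 1). The sketch below pins both settings explicitly. It is an illustration under assumptions, not the code from the linked examples: the class name `SendOnInterface`, the message text, the one second interval, and the defaults (`eth0`, `239.255.0.1`, port `11111`) are placeholders.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.nio.charset.StandardCharsets;

class SendOnInterface {
    /** Wraps a text message in a datagram addressed to the group and port. */
    static DatagramPacket buildPacket(String message, InetAddress group, int port) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        return new DatagramPacket(payload, payload.length, group, port);
    }

    public static void main(String[] args) throws Exception {
        // Assumed defaults; pass interface, group and port as arguments to override.
        String ifName = args.length > 0 ? args[0] : "eth0";
        InetAddress group = InetAddress.getByName(args.length > 1 ? args[1] : "239.255.0.1");
        int port = args.length > 2 ? Integer.parseInt(args[2]) : 11111;

        NetworkInterface nif = NetworkInterface.getByName(ifName);
        if (nif == null) {
            System.err.println("No such interface: " + ifName);
            return;
        }
        try (MulticastSocket socket = new MulticastSocket()) {
            socket.setNetworkInterface(nif); // outgoing interface for multicast datagrams
            socket.setTimeToLive(1);         // 1 keeps traffic on the local subnet; raise it to cross routers
            for (int i = 0; i < 10; i++) {
                socket.send(buildPacket("hi number " + i, group, port));
                Thread.sleep(1000);
            }
        }
    }
}
```

With this running, the capture command above should print one line per second on the chosen interface.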

If the multicast source is sending traffic to the right group here in Step 2, and in Step 1 you saw IGMP group joins but no multicast traffic arriving at the receiver's interface, then the network between the two hosts is dropping the traffic; contact your network administrators about this problem.

Step 3

Assuming all that works, and you still want an acid test in case your multicast receiver software is somehow discarding multicast it receives from the IP stack... make sure you have socat installed on your machine and do the following...

On the multicast sender (server), use this command to send test multicast packets to 239.255.0.1:

perl -e '$ii=0; while (1) { print "hi number $ii\n"; $ii++; }' | socat - UDP-SENDTO:239.255.0.1:11111,sp=11111

On the multicast receiver (client), use this command to listen to test multicast packets sent to 239.255.0.1 on eth0:

socat - UDP-DATAGRAM:239.255.0.1:11111,bind=:11111,ip-add-membership=239.255.0.1:eth0

Assuming your network administrators are allowing multicast on 239.255.0.1, you will see a lot of traffic like this in the multicast receiver's terminal window:

hi number 212289
hi number 212290
hi number 212291
hi number 212292
hi number 212293
hi number 212294
hi number 212295
hi number 212296
hi number 212297
hi number 212298

NOTE: do not try this with a multicast group address that is already in production use on your network.

Step 4

If steps 1, 2, and 3 reveal that multicast traffic is being sent and received through your network, then call up the software developer and tell them you think there is a problem with the application and explain the steps you have taken so far.

If step 1, 2, or 3 does not work, reconfigure your software / hosts / network until it does. Warning: multicast in IP networks is 3x harder to implement correctly than IP unicast.

Best of luck to you...



Answer 2:

How to check multicasting

When clustering breaks, it could be due to a number of reasons. One of them is multicasting (where applications subscribe to a certain IP address and listen for messages). If users find themselves intermittently logged out, this might indicate such a problem.

An IPv4 multicast address will be in the range 224.0.0.0 to 239.255.255.255. This post is just a reminder to me of which checks to do at the Linux level:
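Since the application in question is written in Java, the same range check can be done with `InetAddress.isMulticastAddress()`, which is useful for catching a misconfigured group address before any socket is opened. This is a small sketch; the class name `MulticastRangeCheck` and the sample addresses are illustrative only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class MulticastRangeCheck {
    /** Returns true if the literal IP address lies in the multicast range. */
    static boolean isMulticast(String literal) {
        try {
            // For a literal IP address, getByName only checks the format; no DNS lookup happens.
            return InetAddress.getByName(literal).isMulticastAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IP address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {"223.255.255.255", "224.0.0.0", "239.128.4.0", "239.255.255.255", "240.0.0.0"};
        for (String addr : samples) {
            System.out.println(addr + " multicast? " + isMulticast(addr));
        }
    }
}
```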

Run netstat -g to get the multicast addresses this host subscribes to.

[root@bruatwls001 ~]$ netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              1      all-systems.mcast.net
eth0            2      239.128.4.0
eth0            1      all-systems.mcast.net

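Note that the RefCnt column counts references held on this host (for example, sockets that have joined the group on that interface), not members elsewhere on the network. Here, the group 239.128.4.0 has been joined twice on eth0.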
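Java has no standard API that reports group memberships the way netstat -g does, but a related host-level check is available from inside the JVM: which interfaces are up and report multicast support. The sketch below lists them; the class name `MulticastInterfaces` is an assumption for illustration.

```java
import java.io.UncheckedIOException;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class MulticastInterfaces {
    /** Names of the interfaces that are up and report multicast support. */
    static List<String> list() {
        List<String> names = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (all == null) {
                return names; // defensive: older JVMs could return null when no interface exists
            }
            for (NetworkInterface nif : Collections.list(all)) {
                if (nif.isUp() && nif.supportsMulticast()) {
                    names.add(nif.getName());
                }
            }
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : list()) {
            System.out.println(name);
        }
    }
}
```

If the interface shown by netstat -g is missing from this list, the JVM cannot use it for multicast, whatever the application configuration says.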

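Pinging this multicast address can reveal which hosts have joined the group (or cluster). Linux hosts only reply when net.ipv4.icmp_echo_ignore_broadcasts is set to 0, so a missing reply alone does not prove that a host is not a member: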

[root@bruatwls001 ~]$ ping 239.128.4.0
PING 239.128.4.0 (239.128.4.0) 56(84) bytes of data.
64 bytes from 10.35.8.12: icmp_seq=0 ttl=64 time=0.032 ms
64 bytes from 10.35.8.13: icmp_seq=0 ttl=64 time=0.207 ms (DUP!)
64 bytes from 10.35.8.12: icmp_seq=1 ttl=64 time=0.029 ms
64 bytes from 10.35.8.13: icmp_seq=1 ttl=64 time=0.193 ms (DUP!)
64 bytes from 10.35.8.12: icmp_seq=2 ttl=64 time=0.028 ms
64 bytes from 10.35.8.13: icmp_seq=2 ttl=64 time=0.241 ms (DUP!)

jgroups

A good way to test multicasting is using jgroups. See http://www.jgroups.org/manual/html/ch02.html#ItDoesntWork for details, and download jgroups-3.3.3.Final.jar.

  • Start the receiver:

[quick@laptop]$ java -cp jgroups-3.3.3.Final.jar org.jgroups.tests.McastReceiverTest -mcast_addr 231.12.21.132 -port 45566
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/fe80:0:0:0:201:4aff:fe5e:5331%2
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/192.168.1.5
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/0:0:0:0:0:0:0:1%1
Socket=0.0.0.0/0.0.0.0:45566, bind interface=/127.0.0.1

  • Start the sender and type something:

[quick@centos ~]$ java -cp jgroups-3.3.3.Final.jar org.jgroups.tests.McastSenderTest -mcast_addr 231.12.21.132 -port 45566
Socket #1=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:fc54:ff:fedc:d6da%7  
Socket #2=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:21d:7dff:fe03:4cf5%2  
Socket #7=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/192.168.122.1  
Socket #8=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:21d:7dff:fe03:4cf5%2  
Socket #9=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/fe80:0:0:0:21d:7dff:fe03:4cf5%3  
Socket #10=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/0:0:0:0:0:0:0:1%1  
Socket #11=0.0.0.0/0.0.0.0:45566, ttl=32, bind interface=/127.0.0.1  
> helloworld  
> quit

The message appears in the receiver window and displays the sender:

helloworld [sender=192.168.1.20:45566]  
helloworld [sender=192.168.1.20:45566]  
helloworld [sender=192.168.1.20:45566]  
helloworld [sender=192.168.1.20:45566]  
helloworld [sender=192.168.1.20:45566]  



Answer 3:

There was a firewall blocking the multicast traffic. I opened a port and it works!