I am trying to run a second node on a different processor, either an ARM or a second x86_64. I have a DomMgr running on one x86_64 and am attempting to start a node on either another x86_64 or the ARM using nodeBooter. The DevMgr starts and registers with the DomMgr, but when it starts the GPP device it logs "Requesting IDM CHANNEL IDM_Channel" and then immediately "terminate called after throwing an instance of 'CORBA::OBJECT_NOT_EXIST'". The DomMgr printed "Domain Channel: IDM_Channel created" to its console. Is it supposed to register that channel in the NameService, and if so, why does the remote DevMgr get an invalid object reference when it tries to get it?
I did not realize I could clarify my question by editing it to add new findings.
I'll do that from now on.
By using ORBtraceLevel on the remote DevMgr I found that I had a different problem
on my remote x86-based DevMgr than on my ARM-based one, even though the normal error
messages were the same. The x86 case was simply that my exported DevMgr DCD
used the same name and id as one running locally on the domain. Once I fixed that,
the x86-based remote DevMgr starts its GPP device and registers with no problem.
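(For anyone with the same problem: the colliding values are the id and name attributes on the deviceconfiguration element at the top of the DCD, so a quick grep on each node shows whether two nodes share them. The node name and id below are placeholders.)
[redhawk@remote ~]$ grep '<deviceconfiguration' $SDRROOT/dev/nodes/DevMgr_remote/DeviceManager.dcd.xml
<deviceconfiguration id="DCE:0b6f8ad2-placeholder" name="DevMgr_remote">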
But this is NOT the problem for the ARM-based case. With traceLevel=10 I started the
DevMgr on both my x86 (successfully) and my ARM and compared the outputs. First I
should mention that my ARM is a Raspberry Pi 3 running Ubuntu 16.04. The CPU
is 64-bit, but no 64-bit distro is available for either Ubuntu or CentOS, so
the OS is 32-bit Ubuntu for now. I know that REDHAWK 2.0 now says it supports only
64-bit CentOS, so perhaps that is the problem, although I was able to build REDHAWK
with no trouble and most of it works fine. The trace does show two warnings
WARN Device_impl:172 - Cannot set allocation implementation: Property ### is of type 'CORBA::Long' (not 'int')
which do not appear in the x86 case and which I believe are due to the different sizes of int.
If I do not start an Event Service on the domain, these same warnings appear but I am
able to start the GPP fine and run waveforms. So I do not know whether this is related to
my OBJECT_NOT_EXIST error in GPP, but I thought I should mention it.
Trace shows one successful
Creating ref to remote: REDHAWK.DEV.IDM.Channel
target id :IDL:omg.org/CosEventChannelAdmin/EventChannel:1.0
most derived id:
Adding root/Files<3> (activating) to object table.
but in the ARM case it immediately shows
Adding root<3> (activating) to object table.
followed by
throw OBJECT_NOT_EXIST from GIOP_C.cc:281 (NO,OBJECT_NOT_EXIST_NoMatch)
throw OBJECT_NOT_EXIST from omniOrbRef.cc:829 (NO,OBJECT_NOT_EXIST_NoMatch)
and then GPP terminates with signal 6.
The successful x86 trace shows the same "Creating ref" and "Adding root<3>" lines but then has
Creating ref to remote: root/REDHAWK_DEV.IDM_Channel <...>
Can this be related to 32-bit vs. 64-bit, or why else would this happen only on the
ARM-based GPP?
Note that I have iptables accepting any traffic from my subnet on the x86 machines, and it is
not running at all on the ARM. There are plenty of successful connections, including queries
with nameclt, so this is not (as far as I can tell) a network connection issue.
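For reference, this is roughly what the rule set looks like on the x86 boxes (192.168.1.0/24 stands in for my actual subnet):
[redhawk@x86host ~]$ sudo iptables -S INPUT
-P INPUT DROP
-A INPUT -s 192.168.1.0/24 -j ACCEPT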
What version of REDHAWK are you running? What OS? Can you provide a list of all the omni rpms you have installed on your machine?
It sounds like something is misconfigured on your system, perhaps iptables or SELinux. Let's walk through a quick example showing the minimum configuration and running processes needed for a multi-node system. If this does not clear things up, I'd suggest rerunning the domain and device managers with TRACE-level debugging enabled and examining the output for anomalies, or temporarily disabling SELinux and iptables to rule them out.
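For example, on a CentOS 6 box those checks and temporary disables might look like this (the package list will vary with your install):
[redhawk@domainMgr ~]$ rpm -qa | grep -i omni   # list the installed omni packages
[redhawk@domainMgr ~]$ getenforce               # check the current SELinux state
Enforcing
[redhawk@domainMgr ~]$ sudo setenforce 0        # permissive mode until next reboot
[redhawk@domainMgr ~]$ sudo service iptables stop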
I'll use a REDHAWK 2.0.1 docker image as a tool to walk through the example. The installation steps used to build this image can be found here.
- First we'll drop into a REDHAWK 2.0.1 environment with nothing running and label this container as our domain manager
[youssef@axios(0) docker-redhawk]$docker run -it --name=domainMgr axios/redhawk:2.0
- Let's confirm that almost nothing is running on this container
[redhawk@ce4df2ff20e4 ~]$ ps -ef
UID PID PPID C STIME TTY TIME CMD
redhawk 1 0 0 12:55 ? 00:00:00 /bin/bash -l
redhawk 27 1 0 12:57 ? 00:00:00 ps -ef
- Let's take a look at the current omniORB configuration file. This will be the box on which we run omniNames, omniEvents, and the domain manager.
[redhawk@ce4df2ff20e4 ~]$ cat /etc/omniORB.cfg
InitRef = NameService=corbaname::127.0.0.1:2809
supportBootstrapAgent = 1
InitRef = EventService=corbaloc::127.0.0.1:11169/omniEvents
- Since this is the machine we are running omniNames and omniEvents on, the loopback address (127.0.0.1) is fine; however, other machines will need to reference this machine either by its hostname (domainMgr) or by its IP address, so we note its IP now.
[redhawk@ce4df2ff20e4 ~]$ ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:0E
inet addr:172.17.0.14 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:acff:fe11:e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6 errors:0 dropped:0 overruns:0 frame:0
TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:468 (468.0 b) TX bytes:558 (558.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Note that it has only a single interface, so we do not need to specify an endPoint. However, specifying the unix socket endpoint would provide a performance boost for any locally running components.
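If we did want explicit endpoints, the additions to /etc/omniORB.cfg might look like the following sketch, where the tcp line pins the external interface and the unix line enables the local socket transport:
# listen on the container's external interface (ephemeral port)
endPoint = giop:tcp:172.17.0.14:
# also serve local clients over a unix domain socket
endPoint = giop:unix: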
We can now start up omniNames, omniEvents, and the domain manager, checking after each step what is running. The "extra operand" output from omniNames is expected on newer versions of CentOS 6 and is an issue with the omniNames init script.
[redhawk@ce4df2ff20e4 ~]$ sudo service omniNames start
Starting omniNames: /usr/bin/dirname: extra operand `2>&1'
Try `/usr/bin/dirname --help' for more information.
[ OK ]
[redhawk@ce4df2ff20e4 ~]$ ps -ef
UID PID PPID C STIME TTY TIME CMD
redhawk 1 0 0 12:55 ? 00:00:00 /bin/bash -l
omniORB 50 1 0 13:01 ? 00:00:00 /usr/bin/omniNames -start -always -logdir /var/log/omniORB/ -errlog /var/log/omniORB/error.log
redhawk 53 1 0 13:01 ? 00:00:00 ps -ef
[redhawk@ce4df2ff20e4 ~]$ sudo service omniEvents start
Starting omniEvents [ OK ]
[redhawk@ce4df2ff20e4 ~]$ ps -ef
UID PID PPID C STIME TTY TIME CMD
redhawk 1 0 0 12:55 ? 00:00:00 /bin/bash -l
omniORB 50 1 0 13:01 ? 00:00:00 /usr/bin/omniNames -start -always -logdir /var/log/omniORB/ -errlog /var/log/omniORB/error.log
root 69 1 0 13:01 ? 00:00:00 /usr/sbin/omniEvents -P /var/run/omniEvents.pid -l /var/lib/omniEvents -p 11169
redhawk 79 1 0 13:01 ? 00:00:00 ps -ef
- I'm going to start up the domain manager in the foreground and grab the output of ps -ef via a "docker exec domainMgr ps -ef" in a different terminal
[redhawk@ce4df2ff20e4 ~]$ nodeBooter -D
2016-06-22 13:03:21 INFO DomainManager:257 - Loading DEFAULT logging configuration.
2016-06-22 13:03:21 INFO DomainManager:368 - Starting Domain Manager
2016-06-22 13:03:21 INFO DomainManager_impl:208 - Domain Channel: ODM_Channel created.
2016-06-22 13:03:21 INFO DomainManager_impl:225 - Domain Channel: IDM_Channel created.
2016-06-22 13:03:21 INFO DomainManager:455 - Starting ORB!
[youssef@axios(0) docker-redhawk]$docker exec domainMgr ps -ef
UID PID PPID C STIME TTY TIME CMD
redhawk 1 0 0 12:55 ? 00:00:00 /bin/bash -l
omniORB 50 1 0 13:01 ? 00:00:00 /usr/bin/omniNames -start -always -logdir /var/log/omniORB/ -errlog /var/log/omniORB/error.log
root 69 1 0 13:01 ? 00:00:00 /usr/sbin/omniEvents -P /var/run/omniEvents.pid -l /var/lib/omniEvents -p 11169
redhawk 80 1 0 13:03 ? 00:00:00 DomainManager DEBUG_LEVEL 3 DMD_FILE /domain/DomainManager.dmd.xml DOMAIN_NAME REDHAWK_DEV FORCE_REBIND false PERSISTENCE true SDRROOT /var/redhawk/sdr
redhawk 93 0 1 13:03 ? 00:00:00 ps -ef
So we can see that we have omniNames, omniEvents, and the DomainManager binaries running. Time to move on to a new node for the device manager.
- In a new terminal I create a new container and call it deviceManager
[youssef@axios(0) docker-redhawk]$docker run -it --name=deviceManager axios/redhawk:2.0
- Confirm nothing is really running, then take a look at the omniORB configuration file.
[redhawk@765ce325f145 ~]$ ps -ef
UID PID PPID C STIME TTY TIME CMD
redhawk 1 0 0 13:05 ? 00:00:00 /bin/bash -l
redhawk 28 1 0 13:06 ? 00:00:00 ps -ef
[redhawk@765ce325f145 ~]$ cat /etc/omniORB.cfg
InitRef = NameService=corbaname::127.0.0.1:2809
supportBootstrapAgent = 1
InitRef = EventService=corbaloc::127.0.0.1:11169/omniEvents
- We need to change the NameService and EventService references to point at either our domain manager's hostname (domainMgr) or its IP address (172.17.0.14); I will go with the IP address.
[redhawk@765ce325f145 ~]$ sudo sed -i 's,127.0.0.1,172.17.0.14,g' /etc/omniORB.cfg
[redhawk@765ce325f145 ~]$ cat /etc/omniORB.cfg
InitRef = NameService=corbaname::172.17.0.14:2809
supportBootstrapAgent = 1
InitRef = EventService=corbaloc::172.17.0.14:11169/omniEvents
- We can confirm this worked by using nameclt list to show the omniNames entries for the event channel factory and the domain.
[redhawk@765ce325f145 ~]$ nameclt list
EventChannelFactory
REDHAWK_DEV/
- Finally we can start up the device manager and inspect the running processes in a new shell via "docker exec deviceManager ps -ef"
[redhawk@765ce325f145 ~]$ nodeBooter -d /var/redhawk/sdr/dev/nodes/DevMgr_12ef887a9000/DeviceManager.dcd.xml
2016-06-22 13:09:09 INFO DeviceManager:446 - Starting Device Manager with /nodes/DevMgr_12ef887a9000/DeviceManager.dcd.xml
2016-06-22 13:09:09 INFO DeviceManager_impl:367 - Connecting to Domain Manager REDHAWK_DEV/REDHAWK_DEV
2016-06-22 13:09:09 INFO DeviceManager:494 - Starting ORB!
2016-06-22 13:09:09 INFO Device:995 - DEV-ID:DCE:c5029226-ce70-48d9-9533-e025fb9c2a34 Requesting IDM CHANNEL IDM_Channel
2016-06-22 13:09:09 INFO redhawk::events::Manager:573 - PUBLISHER - Channel:IDM_Channel Reg-Id21f4e766-c5c6-4c5b-8974-337736e71f87 RESOURCE:DCE:c5029226-ce70-48d9-9533-e025fb9c2a34
2016-06-22 13:09:09 INFO DeviceManager_impl:1865 - Registering device GPP_12ef887a9000 on Device Manager DevMgr_12ef887a9000
2016-06-22 13:09:09 INFO DeviceManager_impl:1907 - Device LABEL: GPP_12ef887a9000 SPD loaded: GPP' - 'DCE:4e20362c-4442-4656-af6d-aedaaf13b275
2016-06-22 13:09:09 INFO GPP:658 - initialize()
2016-06-22 13:09:09 INFO redhawk::events::Manager:626 - SUBSCRIBER - Channel:ODM_Channel Reg-Id0d18c1f4-71bf-42c2-9a2d-416f16af9fcf resource:DCE:c5029226-ce70-48d9-9533-e025fb9c2a34
2016-06-22 13:09:09 INFO GPP_i:679 - Component Output Redirection is DISABLED.
2016-06-22 13:09:09 INFO GPP:1611 - Affinity Disable State, disabled=1
2016-06-22 13:09:09 INFO GPP:1613 - Disabling affinity processing requests.
2016-06-22 13:09:09 INFO GPP_i:571 - SOCKET CPUS USER SYSTEM IDLE
2016-06-22 13:09:09 INFO GPP_i:577 - 0 8 0.00 0.00 0.00
2016-06-22 13:09:09 INFO GPP:616 - initialize CPU Montior --- wl size 8
2016-06-22 13:09:10 INFO GPP_i:602 - initializeNetworkMonitor: Adding interface (docker0)
2016-06-22 13:09:10 INFO GPP_i:602 - initializeNetworkMonitor: Adding interface (em1)
2016-06-22 13:09:10 INFO GPP_i:602 - initializeNetworkMonitor: Adding interface (lo)
2016-06-22 13:09:10 INFO GPP_i:602 - initializeNetworkMonitor: Adding interface (tun0)
2016-06-22 13:09:10 INFO GPP_i:602 - initializeNetworkMonitor: Adding interface (vboxnet0)
2016-06-22 13:09:10 INFO GPP_i:602 - initializeNetworkMonitor: Adding interface (veth70de860)
2016-06-22 13:09:10 INFO GPP_i:602 - initializeNetworkMonitor: Adding interface (vethd0227d6)
2016-06-22 13:09:10 INFO DeviceManager_impl:2087 - Registering device GPP_12ef887a9000 on Domain Manager
[youssef@axios(0) docker-redhawk]$docker exec deviceManager ps -ef
UID PID PPID C STIME TTY TIME CMD
redhawk 1 0 0 13:05 ? 00:00:00 /bin/bash -l
redhawk 35 1 0 13:09 ? 00:00:00 DeviceManager DCD_FILE /nodes/DevMgr_12ef887a9000/DeviceManager.dcd.xml DEBUG_LEVEL 3 DOMAIN_NAME REDHAWK_DEV SDRCACHE /var/redhawk/sdr/dev SDRROOT /var/redhawk/sdr
redhawk 40 35 1 13:09 ? 00:00:00 /var/redhawk/sdr/dev/devices/GPP/cpp/GPP PROFILE_NAME /devices/GPP/GPP.spd.xml DEVICE_ID DCE:c5029226-ce70-48d9-9533-e025fb9c2a34 DEVICE_LABEL GPP_12ef887a9000 DEBUG_LEVEL 3 DOM_PATH REDHAWK_DEV/DevMgr_12ef887a9000 DCE:218e612c-71a7-4a73-92b6-bf70959aec45 False DCE:3bf07b37-0c00-4e2a-8275-52bd4e391f07 1.0 DCE:442d5014-2284-4f46-86ae-ce17e0749da0 0 DCE:4e416acc-3144-47eb-9e38-97f1d24f7700 DCE:5a41c2d3-5b68-4530-b0c4-ae98c26c77ec 0 DEVICE_MGR_IOR IOR:010000001900000049444c3a43462f4465766963654d616e616765723a312e3000000000010000000000000070000000010102000c0000003137322e31372e302e313500a49e00001c000000ff4465766963654d616e61676572fef58d6a570100002300000000000200000000000000080000000100000000545441010000001c00000001000000010001000100000001000105090101000100000009010100
redhawk 398 0 0 13:09 ? 00:00:00 ps -ef
So we've successfully spun up two machines on the same network with unique IP addresses, designated one as the domain manager, omniNames, and omniEvents server and the other as a Device Manager / GPP node. At this point, we could connect to the domain manager either via the IDE or through a python interface and launch waveforms; we would expect these waveforms to launch on the sole device manager node.
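As a sketch of the python route, something like the following would attach to the domain and launch a waveform; the SAD file path is a placeholder for whatever is installed under $SDRROOT/dom/waveforms.
[redhawk@765ce325f145 ~]$ python
>>> from ossie.utils import redhawk
>>> dom = redhawk.attach("REDHAWK_DEV")  # attach to the running domain
>>> app = dom.createApplication("/waveforms/MyWaveform/MyWaveform.sad.xml")  # placeholder waveform
>>> app.start()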