I have two scripts:
#!/bin/bash
netcat -lk -p 12345 | while read line
do
    match=$(echo $line | grep -c 'Keep-Alive')
    if [ $match -eq 1 ]; then
        [start a command]
    fi
done
and
#!/bin/bash
netcat -lk -p 12346 | while read line
do
    match=$(echo $line | grep -c 'Keep-Alive')
    if [ $match -eq 1 ]; then
        [start a command]
    fi
done
I've put the two scripts in '/etc/init.d/'.
When I restart my Linux machine (Raspberry Pi), both scripts work fine.
I've tried them about 20 times, and they keep working fine.
But after around 12 hours, the whole system stops working. I've added some logging, and it seems that the scripts are no longer reacting. But when I run:
ps aux
I can see that the scripts are still running:
root 1686 0.0 0.2 2740 1184 ? S Aug12 0:00 /bin/bash /etc/init.d/script1.sh start
root 1689 0.0 0.1 2268 512 ? S Aug12 0:00 netcat -lk 12345
root 1690 0.0 0.1 2744 784 ? S Aug12 0:00 /bin/bash /etc/init.d/script1.sh start
root 1691 0.0 0.2 2740 1184 ? S Aug12 0:00 /bin/bash /etc/init.d/script2.sh start
root 1694 0.0 0.1 2268 512 ? S Aug12 0:00 netcat -lk 12346
root 1695 0.0 0.1 2744 784 ? S Aug12 0:00 /bin/bash /etc/init.d/script2.sh start
After a reboot they start working again... But that's a sin, rebooting a Linux machine periodically...
I've inserted some logging; here's the outcome:
Listening on [0.0.0.0] (family 0, port 12345)
[2013-08-14 11:55:00] Starting loop.
[2013-08-14 11:55:00] Starting netcat.
netcat: Address already in use
[2013-08-14 11:55:00] Netcat has stopped or crashed.
[2013-08-14 11:49:52] Starting loop.
[2013-08-14 11:49:52] Starting netcat.
Listening on [0.0.0.0] (family 0, port 12345)
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6333)
Connection closed, listening again.
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6334)
[2013-08-14 12:40:02] Starting loop.
[2013-08-14 12:40:02] Starting netcat.
netcat: Address already in use
[2013-08-14 12:40:02] Netcat has stopped or crashed.
[2013-08-14 12:17:16] Starting loop.
[2013-08-14 12:17:16] Starting netcat.
Listening on [0.0.0.0] (family 0, port 12345)
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6387)
Connection closed, listening again.
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6388)
[2013-08-14 13:10:08] Starting loop.
[2013-08-14 13:10:08] Starting netcat.
netcat: Address already in use
[2013-08-14 13:10:08] Netcat has stopped or crashed.
[2013-08-14 12:17:16] Starting loop.
[2013-08-14 12:17:16] Starting netcat.
Listening on [0.0.0.0] (family 0, port 12345)
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6167)
Connection closed, listening again.
Connection from [16.8.94.19] port 12345 [tcp/*] accepted (family 2, sport 6168)
Thanks
About the loop: it could be written with added double quotes to keep it safer.
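A minimal sketch of such a loop, assuming an outer for loop that restarts netcat whenever it exits. The function names and run_command (a stand-in for [start a command]) are hypothetical; the netcat invocation is copied from the question.

```shell
#!/bin/bash

# Hypothetical stand-in for "[start a command]" from the question.
run_command() {
    echo "command started"
}

# Read lines from stdin and run the command for each line containing 'Keep-Alive'.
# Every expansion is double-quoted so odd input cannot be word-split or globbed.
handle_lines() {
    local line match
    while read line; do
        match=$(echo "$line" | grep -c 'Keep-Alive')
        if [ "$match" -eq 1 ]; then
            run_command
        fi
    done
}

# Restart netcat whenever it exits. The pipe runs handle_lines in a subshell.
listen_forever() {
    for (( ;; )); do
        netcat -lk -p 12345 | handle_lines
        sleep 1   # assumption: short pause so a failing netcat does not spin the CPU
    done
}

# listen_forever   # uncomment to start listening
```

The handler is kept separate from the netcat call so it can be fed test input by hand, for example with printf.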
And you could try capturing errors and add some logging with this format:
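A rough sketch of that kind of logging, assuming a log file at /tmp/script1.log (a hypothetical path) and a netcat that accepts -v. The message texts are the ones that appear in the log output in the question; the function names are made up for illustration.

```shell
#!/bin/bash

LOGFILE=${LOGFILE:-/tmp/script1.log}   # hypothetical log location

# Append a timestamped message to the log file.
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" >> "$LOGFILE"
}

# Run netcat once on the given port, sending its own messages and errors to the log.
run_netcat_logged() {
    local port=$1 status
    log "Starting netcat."
    # -v makes netcat report connections on stderr; 2>> appends that to the log.
    netcat -vlk -p "$port" 2>> "$LOGFILE"
    status=$?
    log "Netcat has stopped or crashed."
    return "$status"
}
```

Inside the restart loop, run_netcat_logged 12345 would take the place of the bare netcat call on the left side of the pipe.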
Your read command could also be better in this format since it would read lines unmodified:
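The usual form for reading lines unmodified is sketched below: IFS= stops read from trimming leading and trailing whitespace, and -r stops it from treating backslashes as escape characters. The function name is only for illustration.

```shell
#!/bin/bash

# Print each input line exactly as it was received.
print_lines_unmodified() {
    local line
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done
}
```

In the scripts from the question, this would mean changing "while read line" to "while IFS= read -r line".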
Some might also suggest process substitution, but I don't recommend it this time: with the '| while ...' method, the while loop runs in a subshell, which keeps the outer for loop safe in case it crashes. Besides, no variable from the while loop is needed outside of it.
I now suspect that the issue is related to the input and how the 'while read line; do ...; done' block handles it, not to netcat itself. Your variables not being wrapped in double quotes could be one cause, and could even be the actual reason why your netcat is crashing.
Periodically netcat will print, not a line, but a block of binary data. The read builtin will likely fail as a result.
I think you're using this program to verify that a remote host is still connected to ports 12345 and 12346 and hasn't been rebooted.
My solution for you is to pipe the output of netcat to sed, then pipe that (much reduced) line to the read builtin...
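A sketch of that pipeline, assuming GNU sed (for the -u option, which flushes output after each line). The function name filter_keepalive is hypothetical.

```shell
#!/bin/bash

# Pass through only the lines that contain 'Keep-Alive'.
# -n suppresses sed's default output; -u (GNU sed, assumed here) avoids
# buffering delays when sed writes into another pipe.
filter_keepalive() {
    sed -u -n '/Keep-Alive/p'
}

# Example wiring (not run here):
#   netcat -lk -p 12345 | filter_keepalive | while IFS= read -r line; do
#       [start a command]
#   done
```

With the filter in place, read only ever sees the short matching lines, so the grep test inside the loop is no longer needed.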
Also, you'll need to review some of the other startup programs in /etc/init.d to make sure your scripts are compatible with whatever version of rc the system uses. It would be much easier, though, to call your script2.sh from a copy of some simple file in init.d. As it stands, script2 is the startup script but doesn't conform to the init package you use.
That sounds more complicated than I mean... Let me explain better:
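A minimal sketch of such a wrapper, assuming a Debian-style sysvinit system. The paths /usr/local/bin/script2.sh and /var/run/script2.pid and the service name are hypothetical; the idea is that the listener script lives outside /etc/init.d and a small file there only starts and stops it.

```shell
#!/bin/sh
### BEGIN INIT INFO
# Provides:          script2
# Required-Start:    $network $remote_fs
# Required-Stop:     $network $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Keep-Alive listener (illustrative)
### END INIT INFO

DAEMON=/usr/local/bin/script2.sh   # hypothetical location of the listener script
PIDFILE=/var/run/script2.pid       # hypothetical PID file

init_action() {
    case "$1" in
        start)
            # Detach from the terminal and remember the PID for "stop".
            nohup "$DAEMON" > /dev/null 2>&1 < /dev/null &
            echo $! > "$PIDFILE"
            ;;
        stop)
            # Signals only the script itself; its netcat child may need separate handling.
            [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")"
            rm -f "$PIDFILE"
            ;;
        restart)
            init_action stop
            init_action start
            ;;
        *)
            echo "Usage: $0 {start|stop|restart}" >&2
            return 1
            ;;
    esac
}

# init always passes an action; with no argument the file does nothing.
if [ "$#" -gt 0 ]; then
    init_action "$1"
fi
```

On a Debian-based system the file would then typically be registered with update-rc.d, which reads the header block to decide the start order.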
As an additional note, I think you could bind netcat to the specific IP that you are monitoring, instead of binding it to the wildcard address 0.0.0.0.
You must not use the -p option when waiting for an incoming connection request (see the man page of nc). The hostname and port are the last two arguments on the command line.
Maybe it connects to its own port and, after some hours, some resource runs out?
You mentioned "after around 12 hours, the whole system stops working". It is likely that the scripts are executing whatever you have in [start a command] and that it is bloating the memory. Are you sure that [start a command] is not forking many processes very frequently without releasing memory?
If none of your commands, including netcat, reads input from stdin, you can make it run completely independent of the terminal. Sometimes background processes that still depend on the terminal pause (S) when they try to read input from it in the background. Actually, since you're running a daemon, you should make sure that none of your commands reads input from the terminal.
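One way to do that, sketched here under the assumption that the command never needs terminal input, is to redirect its stdin from /dev/null. run_without_stdin is a hypothetical helper name.

```shell
#!/bin/bash

# Run a command with its stdin detached from the terminal (or from any pipe).
# The command sees an immediate end-of-file if it ever tries to read input.
run_without_stdin() {
    "$@" < /dev/null
}

# Example for launching a whole script in the background (not run here;
# the log path is hypothetical):
#   nohup /etc/init.d/script1.sh start < /dev/null >> /tmp/script1.log 2>&1 &
```

Wrapping [start a command] this way, for example run_without_stdin some_command, also keeps it from consuming the lines that the while loop is supposed to read.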
And I think we could try the logging thing again:
I have often experienced strange behaviour with nc or netcat. You should have a look at ncat: it's almost the same tool, but it behaves the same on all platforms (nc and netcat behave differently depending on the distribution and OS: Linux, BSD, Mac).