I am trying to ssh to a server and then run grep to get the count of different errors in the log file. Once I have those details, I log them into a CSV file. But when I try to run the grep command I get an error.
#!/usr/bin/perl
my $remote_addr = "user\@servername";
my $text = qq|Internal Server Error|;
my $remote_path = "/data/logs/error";
my $cmd = `ssh $remote_addr "grep -a $text $remote_path | awk -F " " '{print $4}' | sort -nr | uniq -c | sort -nr 2>/dev/null"`;
print $cmd;
But I am getting the below error when I run the script:
grep: Internal: No such file or directory
grep: Server: No such file or directory
grep: Error: No such file or directory
Is there any suggestion for how we can do this?
First, in order to avoid a quoting nightmare and the many chances for shell injection, I'd suggest using a module like String::ShellQuote.
Then, I don't see that you need all those external tools, and such a long pipeline is tricky and expensive: it invokes a number of programs for jobs done really well in Perl, and it requires very precise syntax.
Apart from ssh-ing, one other thing that an external tool may be good for here is extracting lines of interest with grep, in case that file is huge (otherwise you can read it into a scalar).
use warnings;
use strict;
use feature 'say';
use List::Util qw(uniq); # in List::MoreUtils prior to module v1.45
use String::ShellQuote qw(shell_quote);
my $remote_addr = ...
my $remote_path = ...
my $text = 'Internal Server Error';
my $remote_cmd = shell_quote('grep', '-a', $text, $remote_path);
my $cmd = shell_quote('ssh', $remote_addr, $remote_cmd);
my @lines = qx($cmd);
chomp @lines;
# Process @lines as needed, perhaps
my @result = sort { $b <=> $a } uniq map { (split)[3] } @lines;
say for @result;
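If you also need the count of each distinct value, like the uniq -c | sort -nr in your pipeline, and want to write those counts to a CSV file, a hash makes that simple. A minimal sketch, where the fourth whitespace-separated field and the output filename (error_counts.csv) are assumptions:

my %count;
$count{ (split)[3] }++ for @lines;    # tally the 4th whitespace-separated field

open my $fh, '>', 'error_counts.csv' or die "Can't open file: $!";
print $fh "error,count\n";
for my $err (sort { $count{$b} <=> $count{$a} } keys %count) {
    print $fh "$err,$count{$err}\n";    # one CSV row per distinct value, highest count first
}
close $fh;

If the field itself may contain commas or quotes, a module like Text::CSV is a safer way to produce the file.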
When it comes to running external commands there are many options. In the first place, consider using a module. They all simplify things a lot, in particular with error checking, and are in general more reliable, while some also make harder jobs far more manageable.
An example with IPC::System::Simple
use IPC::System::Simple qw(capturex);
my @lines = capturex('ssh', $remote_addr, $remote_cmd);
Since ssh executes the command when run with one, it doesn't need a shell (for that part), so capturex is used. See the documentation for more options and for how to check for errors.
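For example, since capturex throws an exception if the command can't be run or exits with a non-zero status, one simple way to check is to wrap the call in eval (a sketch, reusing the import and variables from above):

my @lines = eval { capturex('ssh', $remote_addr, $remote_cmd) };
if ($@) {
    die "Can't run remote command: $@";    # $@ carries the error message from capturex
}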
Some other options, from simple to more powerful, are Capture::Tiny, IPC::Run3, IPC::Run.
For more on all this see links assembled in this post (and search for more).
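To illustrate just one of these, Capture::Tiny captures the output and the error stream separately; a minimal sketch:

use Capture::Tiny qw(capture);

my ($stdout, $stderr, $exit) = capture {
    system('ssh', $remote_addr, $remote_cmd);
};

Here $exit is the return value of system (the same value as $?), so it needs the usual checks for that.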
I can't see a need to run that pipeline as it stands,† but if there is one (to stay on the remote host?) then form the commands as above and assemble the full pipeline
my $cgrep = shell_quote('grep', '-a', $text, $remote_path);
my $cawk = shell_quote('awk', '-F', ' ', '{print $4}');
my $csort = shell_quote('sort', '-nr');
my $cuniq = shell_quote('uniq', '-c');
my $remote_cmd = "$cgrep | $cawk | $csort | $cuniq | $csort 2>/dev/null";
Note that the needed shell functionality (| and the redirection) shouldn't be quoted away.
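With that, the full pipeline can be run just like the single grep command earlier, for instance

my @lines = capturex('ssh', $remote_addr, $remote_cmd);

and the bare | and the redirection are then interpreted by the remote shell.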
The mere space in the awk piece may be awkward-looking, but since it gets escaped it winds up right for -F. For me it's another sign of the trouble with running external programs in shell pipelines; I couldn't figure out that bare space myself, thanks to Charles Duffy for the comment.
In this case the sort and uniq parts of the pipeline could just be typed as one string, since it's only program names and options, but as soon as changes are made or any variables make their way in, that becomes tricky. So I use shell_quote throughout, for consistency and as a template.
Modules are reported missing and hard to obtain. Then escape what needs to be escaped (until you figure out how to get the modules, that is). In this case there happens to be little to fix, but that bit can serve as an example of common hoops to jump through with complicated pipelines.
The string with $text needs to reach grep as such, one string. Since it passes through the shell, which would break it by spaces into words, we need to protect (quote/escape) those spaces. Not to forget, we also need to get it to the shell in the first place, through Perl's parsing rules.
One way
my $text_quoted = q(') . quotemeta($text) . q(');
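# e.g. this yields 'Internal\ Server\ Error' -- spaces get backslash-escaped and the whole thing is wrapped in single quotes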
where quotemeta quotes all kinds of other things as well.
We also should protect the filename pattern, as it may rely on shell metacharacters (like *)
my $remote_path_quoted = quotemeta $remote_path;
but again, you have to inspect whether this is suitable in each case.
NOTE   If any dynamically generated variables (computed, coming from user input...) are interpolated into these commands they need to be validated, with things carefully escaped and quoted.
Now your pipeline should work (it does in my simulated tests)
my $cmd = "ssh $remote_addr grep -a $text_quoted $remote_path_quoted"
. q( | awk -F ' ' '{print $4}' | sort -nr | uniq -c | sort -nr 2>/dev/null);
This can be broken up into sensible components in their own variables, etc, but I really don't recommend such hand-patched solutions.
I suggest only using the first part (ssh + grep) and doing the rest in Perl, though, as in the main part of the answer. And then getting those modules installed and switching to them.
No major computing tool works without (the many) libraries, and every production install contains a lot of "additional" stuff. As the need for more libraries arises, they get installed. Why would it have to be different with Perl? Yes, you can do it with builtins only, but that may be much harder.
†   One good reason would be to use the system sort when files are huge, since it doesn't have to load the whole file at once, and for its speed. However, in this pipeline it is fed data over the pipe and invoked repeatedly, so these advantages don't apply.