I am new to regular expressions but I think people here may give me valuable inputs. I am using the logstash grok
filter in which I can supply only regular expressions.
I have a string like this
/app/webpf04/sns882A/snsdomain/logs/access.log
I want to use a regular expression to get the sns882A
part from the string, which is the substring after the third "/", how can I do that?
I am restricted to regex as grok only accepts regex. Is it possible to use regex for this?
Yes you can use regular expression to get what you want via grok:
/[^/]+/[^/]+/(?<field1>[^/]+)/
for your regex:
/\w*\/\w*\/(\w*)\/
You can also test with:
http://www.regextester.com/
By googling regex tester, you can have different UI.
If you are indeed using Perl then you should use the File::Spec
module like this
use strict;
use warnings;
use File::Spec;
my $path = '/app/webpf04/sns882A/snsdomain/logs/access.log';
my @path = File::Spec->splitdir($path);
print $path[3], "\n";
output
sns882A
This is how I would do it in Perl:
my ($name) = ($fullname =~ m{^(?:/.*?){2}/(.*?)/});
EDIT:
If your framework does not support Perl-ish non-grouping groups (?:xyz)
, this regex should work instead:
^/.*?/.*?/(.*?)/
If you are concerned about performance of .*?
, this works as well:
^/[^/]+/[^/]+/([^/]+)/
One more note: All of regexes above will match string /app/webpf04/sns882A/
.
But matching string is completely different from first matching group, which is sns882A
in all three cases.
Same answer but a small bug fix. If you doesnt specify ^ in starting,it will go for the next match(try longer paths adding more / for input.). To fix it just add ^ in the starting like this. ^ means starting of the input line. finally group1 is your answer.
^/[^/]+/[^/]+/([^/]+)/
If you are using any URI paths use below.(it will handle path aswell as URI).
^.*?/[^/]+/[^/]+/([^/]+)/