Extract part of a string from a URL - Java Regex

Posted 2019-07-07 00:48

Question:

I'm trying to extract the string between the last '/' and the '.' in a URL. For example, given a URL like "some.com/part1/part2/part3/stringINeed.xyz", I need to extract "stringINeed": the part between the last '/' and the '.', and nothing else.

So far I have tried the following, but it prints an empty line:

import java.util.*;
import java.lang.*;
import java.io.*;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

class Extract
{
    public static void main (String[] args) throws java.lang.Exception
    {
        String str = "part1/part2/part3/stringINeed.xyz" ;
        Pattern pattern = Pattern.compile("/(.*?).");
        Matcher matcher = pattern.matcher(str);
        if (matcher.find()) {
            System.out.println(matcher.group(1));
        }
    }
}

What is wrong with my code? Can anyone help?

Answer 1:

Use this regex:

[^/.]+(?=\.[^.]+$)


In Java:

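String subjectString = "some.com/part1/part2/part3/stringINeed.xyz";
String resultString = null;
Pattern regex = Pattern.compile("[^/.]+(?=\\.[^.]+$)");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
    resultString = regexMatcher.group(); // "stringINeed"
}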

Explanation

  • [^/.]+ matches one or more characters that are neither a slash nor a dot
  • The lookahead (?=\.[^.]+$) asserts that what follows is a dot, then one or more non-dot characters, then the end of the string
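
To see the pattern at work next to the one from the question, here is a minimal sketch. The class name RegexExtract and the helpers extract and originalAttempt are hypothetical names chosen for illustration; the sample input is the URL from the question. In the question's pattern the final dot is unescaped, so it matches any character, and the lazy (.*?) is satisfied by matching nothing, which is why group(1) comes back empty.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexExtract {
    // Pattern from the answer above: a run of non-slash, non-dot characters
    // followed by a dot, one or more non-dots, and the end of the string.
    private static final Pattern LAST_SEGMENT =
            Pattern.compile("[^/.]+(?=\\.[^.]+$)");

    // Hypothetical helper: returns the matched name, or null if nothing matches.
    static String extract(String url) {
        Matcher m = LAST_SEGMENT.matcher(url);
        return m.find() ? m.group() : null;
    }

    // Runs the question's pattern unchanged to show what group(1) captures.
    static String originalAttempt(String url) {
        Matcher m = Pattern.compile("/(.*?).").matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "some.com/part1/part2/part3/stringINeed.xyz";
        System.out.println(extract(url));                     // stringINeed
        System.out.println("[" + originalAttempt(url) + "]"); // []
    }
}
```

One caveat: [^.]+ inside the lookahead also accepts slashes, so an input with no extension on the last segment, such as "some.com/part1/noExtension", yields "some". Tightening the lookahead to (?=\.[^/.]+$) avoids that.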


Answer 2:

Without regex

str.substring(str.lastIndexOf('/') + 1, str.lastIndexOf('.'));
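
The one-liner above assumes the input contains a dot after its last slash; otherwise substring throws a StringIndexOutOfBoundsException. A slightly more defensive sketch could look like the following. The class name SubstringExtract and the helper baseName are hypothetical, and falling back to the whole last segment when there is no extension is a choice made for this example, not something the original answer specifies.

```java
class SubstringExtract {
    // Hypothetical helper: returns the text between the last '/' and the
    // last '.' after it, or the whole last segment if no such dot exists.
    static String baseName(String url) {
        int start = url.lastIndexOf('/') + 1; // 0 when there is no slash
        int end = url.lastIndexOf('.');
        if (end < start) {                    // no dot after the last slash
            end = url.length();
        }
        return url.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(baseName("some.com/part1/part2/part3/stringINeed.xyz")); // stringINeed
        System.out.println(baseName("stringINeed.xyz"));                            // stringINeed
        System.out.println(baseName("some.com/part1/noExtension"));                 // noExtension
    }
}
```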