I have a string like this: "This is a URL http://www.google.com/MyDoc.pdf which should be used"
I just need to extract the URL that starts with http and ends with pdf:
http://www.google.com/MyDoc.pdf
String sLeftDelimiter = "http://";
String[] tempURL = sValueFromAddAtt.split(sLeftDelimiter );
String sRequiredURL = sLeftDelimiter + tempURL[1];
This gives me "http://www.google.com/MyDoc.pdf which should be used", which still includes the text after the URL.
How can I extract only the URL?
This kind of problem is what regular expressions were made for:
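import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern findUrl = Pattern.compile("\\bhttp.*?\\.pdf\\b");
Matcher matcher = findUrl.matcher("This is a URL http://www.google.com/MyDoc.pdf which should be used");
while (matcher.find()) {
    // Print each substring that starts with "http" and ends with ".pdf"
    System.out.println(matcher.group());
}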
The regular expression explained:
\b
before the "http" there is a word boundary (i.e. xhttp does not match)
http
the string "http" (be aware that this also matches "https" and "httpsomething")
.*?
any character (.) any number of times (*), but try to use the least amount of characters (?)
\.pdf
the literal string ".pdf"
\b
after the ".pdf" there is a word boundary (i.e. .pdfoo does not match)
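To see the effect of the two word boundaries, here is a minimal sketch that runs the same pattern against a few inputs. The class name BoundaryDemo, the helper containsMatch, and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Same pattern as in the answer above.
    private static final Pattern FIND_URL = Pattern.compile("\\bhttp.*?\\.pdf\\b");

    // Hypothetical helper: true if the pattern matches anywhere in the text.
    public static boolean containsMatch(String text) {
        return FIND_URL.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsMatch("see http://a.org/b.pdf now")); // true
        System.out.println(containsMatch("xhttp://a.org/b.pdf"));        // false: no boundary before "http"
        System.out.println(containsMatch("http://a.org/b.pdfoo"));       // false: no boundary after ".pdf"
        System.out.println(containsMatch("httpsomething.pdf"));          // true: "http" is only a prefix check
    }
}
```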
If you would like to match only http and https, use this instead of http in your pattern:
https?:
This matches the string http, then an optional "s" (indicated by the ? after the s), and then a colon. The colon has no special meaning here, so it does not need to be escaped.
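Putting the stricter scheme into the full pattern could look like the sketch below. Two things here are my own assumptions, not part of the answer: the pattern also requires "//" after the colon, and it uses \S+? instead of .*? so that a match cannot run across whitespace. The names PdfUrlFinder and findPdfUrls are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PdfUrlFinder {
    // Assumed variant: http or https only, followed by "://", no whitespace inside the URL.
    private static final Pattern PDF_URL = Pattern.compile("\\bhttps?://\\S+?\\.pdf\\b");

    // Collects every match in the order it appears in the text.
    public static List<String> findPdfUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = PDF_URL.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
        System.out.println(findPdfUrls(text)); // [http://www.google.com/MyDoc.pdf]
    }
}
```

With this variant, a string such as "httpsomething.pdf" is no longer picked up, because the scheme must be followed directly by "://".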
Why don't you use the startsWith("http://") and endsWith(".pdf") methods of the String class (called on a single word, not on the whole sentence)?
Both methods return a boolean value: if both return true, the condition succeeds; otherwise it fails.
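One way to apply these two checks to the sentence from the question is to split it on whitespace and test each token. This is only a sketch under that assumption; the names TokenUrlFinder and firstPdfUrl are made up, and a URL followed directly by punctuation (for example "MyDoc.pdf.") would not pass the endsWith check.

```java
import java.util.Optional;

public class TokenUrlFinder {
    // Returns the first whitespace-separated token that starts with
    // "http://" and ends with ".pdf", if there is one.
    public static Optional<String> firstPdfUrl(String text) {
        for (String token : text.split("\\s+")) {
            if (token.startsWith("http://") && token.endsWith(".pdf")) {
                return Optional.of(token);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String text = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
        System.out.println(firstPdfUrl(text).orElse("not found")); // http://www.google.com/MyDoc.pdf
    }
}
```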
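Try this:
String text = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
int start = text.indexOf("http:");
// End right after ".pdf" instead of at the word "which", so no trailing space is kept
int end = text.indexOf(".pdf", start) + ".pdf".length();
String url = text.substring(start, end);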
You can use the power of regular expressions here.
First find the URL in the original string, then remove the other parts.
The following code shows my suggestion:
String regex = "\\b(http|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
String str = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
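// Splitting on the URL pattern leaves only the non-URL parts of the string
String[] splited = str.split(regex);
for (String current_part : splited) {
    // Remove each non-URL part so that only the URL remains
    str = str.replace(current_part, "");
}
System.out.println(str);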
This snippet retrieves a URL matching the pattern from any string.
You can add custom protocols such as https to the protocol part of the regular expression above.
I hope my answer helps you ;)
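As an alternative to splitting and removing the leftovers, the same regular expression can be handed to a Matcher to read the URLs directly. The sketch below is my assumption of how that might look, not the answerer's code: it adds https to the protocol group, and the names UrlExtractor and extractUrls are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlExtractor {
    // Regex from the answer, with "https" added to the protocol part (assumption).
    private static final Pattern URL = Pattern.compile(
            "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");

    // Returns all URLs found in the text, in order of appearance.
    public static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String str = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
        System.out.println(extractUrls(str)); // [http://www.google.com/MyDoc.pdf]
    }
}
```

Reading the matches directly also handles strings with several URLs, and it avoids the case where one of the leftover parts happens to occur inside the URL itself.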
public static String getStringBetweenStrings(String aString, String aPattern1, String aPattern2) {
    int start = aString.indexOf(aPattern1);
    if (start < 0) {
        return null; // left delimiter not found
    }
    int pos1 = start + aPattern1.length();
    // Look for the right delimiter only after the left one
    int pos2 = aString.indexOf(aPattern2, pos1);
    if (pos2 < 0) {
        return null; // right delimiter not found
    }
    return aString.substring(pos1, pos2);
}
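A helper like this returns only the text between the two delimiters, so for the URL in the question the "http://" and ".pdf" parts would have to be put back. A minimal sketch of a variant that keeps both delimiters is shown here; the names DelimitedExtractor and extractInclusive are hypothetical, and returning null when a delimiter is missing is my assumption.

```java
public class DelimitedExtractor {
    // Returns the substring from the start of `left` to the end of the first
    // `right` that follows it, or null if either delimiter is missing.
    public static String extractInclusive(String text, String left, String right) {
        int start = text.indexOf(left);
        if (start < 0) {
            return null;
        }
        int end = text.indexOf(right, start + left.length());
        if (end < 0) {
            return null;
        }
        return text.substring(start, end + right.length());
    }

    public static void main(String[] args) {
        String text = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
        System.out.println(extractInclusive(text, "http://", ".pdf")); // http://www.google.com/MyDoc.pdf
    }
}
```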
You can use String.replaceAll with a capturing group and back reference for a very concise solution:
String input = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
System.out.println(input.replaceAll(".*(http.*?\\.pdf).*", "$1"));
Here's a breakdown for the regex: https://regexr.com/3qmus
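One thing to keep in mind: when the regex does not match, replaceAll returns the input unchanged, so the caller cannot tell a URL from the original sentence. A possible guard, sketched with the same regex and the hypothetical names ReplaceAllExtractor and extractOrNull, is to check Matcher.matches() first and read the group directly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceAllExtractor {
    // Same regex as in the replaceAll call above.
    private static final Pattern WHOLE = Pattern.compile(".*(http.*?\\.pdf).*");

    // Returns the captured URL, or null when the input contains no match.
    public static String extractOrNull(String input) {
        Matcher matcher = WHOLE.matcher(input);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
        System.out.println(extractOrNull(input));        // http://www.google.com/MyDoc.pdf
        System.out.println(extractOrNull("no url here")); // null
    }
}
```

Because the leading .* is greedy, an input with several matching URLs yields the last one, and since . does not match line breaks by default, the input is assumed to be a single line.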