So I have a large list of websites and I want to put them all in a String variable. I know I can go to each link individually and escape the //, but there are a few hundred links. Is there a way to do a "block escape", so that everything inside the "block" is escaped? This is an example of what I want to save in the variable.
String links="http://website http://website http://website http://website http://website http://website"
Also can anyone think of any other problems I might run into while doing this?
I made it htp instead of http because I am not allowed to post "hyperlinks" according to stack overflow as I am not at that level :p
Thanks so much
Edit: I am making a program because I have about 50 pages of a Word document that is filled with both emails and other text. I want to filter out just the emails. I wrote the program to do this, which was very simple; now I just need to figure out a way to store the pages in a String variable that the program will be run on.
I suggest that you save your Word document as plain text. Then you can use Scanner (from the java.util package) or the classes in the java.io package to read the text.
To solve the issue of overwriting the String variable each time you read a line, you can use an array or ArrayList. This is much better than holding all the web addresses in a single String because you can easily access each address individually whenever you like.
Your question is not well-written. Improve it, please. In its current format it will be closed as "too vague".
Do you want to filter e-mails or websites? Your example is about websites, but your text is about e-mails. Since I don't know, and I decided to try to help you anyway, I did both.
Here goes the code:
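A minimal sketch of what such filtering code might look like. Only the method names filterEmails and filterWebsites come from this answer; the class name LinkFilter, the helper findAll, and both regular expressions are assumptions chosen for illustration, and the patterns are deliberately simple rather than complete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkFilter {
    // Assumed, deliberately simple patterns; they do not cover every valid address.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    private static final Pattern WEBSITE =
            Pattern.compile("https?://\\S+");

    // Collects every substring of text that matches the pattern, in order of appearance.
    private static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static List<String> filterEmails(String text) {
        return findAll(EMAIL, text);
    }

    public static List<String> filterWebsites(String text) {
        return findAll(WEBSITE, text);
    }
}
```

Returning a List<String> keeps each match individually accessible; the caller can still join the items later if a single String is wanted.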
To ensure that it works, first let's test the filterEmails and filterWebsites methods:
It outputs:
To test the readFileAsString method:
If that file exists, its content will be printed.
If you don't like the fact that it returns List<String> instead of a String with items divided by spaces, this is simple to solve:
Sticking all together:
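One way the pieces could fit together, as a hedged sketch: read the file into a String, filter the e-mails, and join the resulting list with spaces. The names readFileAsString and filterEmails are taken from this answer, but their bodies here, the class name EmailExtractor, and the e-mail pattern are assumptions, as is the UTF-8 encoding of the input file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractor {
    // Assumed, simplified e-mail pattern.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Reads the whole file into one String (assumes the file is UTF-8 plain text).
    public static String readFileAsString(String fileName) throws IOException {
        return new String(Files.readAllBytes(Paths.get(fileName)), StandardCharsets.UTF_8);
    }

    // Returns every e-mail-like substring found in text.
    public static List<String> filterEmails(String text) {
        List<String> emails = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            emails.add(m.group());
        }
        return emails;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EmailExtractor <plain-text-file>");
            return;
        }
        List<String> emails = filterEmails(readFileAsString(args[0]));
        // String.join turns the list into one String with items divided by spaces.
        System.out.println(String.join(" ", emails));
    }
}
```

Because the text is read from a file at run time, nothing has to be pasted into the source as a string literal, so no escaping is needed at all.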
I'm not sure what kind of 'list of websites' you're referring to, but for e.g. a comma-separated file of websites you could read the entire file and use the String split function to get an array, or you could use a BufferedReader to read the file line by line and add each line to an ArrayList.
From there you can simply loop over the array and append to a String, or if you need to:
You can use a Regular Expression to extract parts of each String according to a pattern:
The above expression would remove the xml tags due to the "$2", which means you're interested in the second group of the expression, where groups are identified by round brackets ( ). Using "$1$3" instead should then give you only the surrounding xml tags.
Another much simpler approach to removing certain "blocks" from a String is the String replace function, where to remove the block you could simply pass in an empty string as the new value.
I hope any of this helps, otherwise you could try to provide a full example with your input "list of websites" and the output you want.
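The group references described in this answer ("$2" versus "$1$3") can be illustrated with a small sketch. The pattern below is a hypothetical stand-in, not the expression the answer originally showed: group 1 is an opening tag, group 2 the enclosed text, group 3 the closing tag. The class name GroupDemo and the sample input are likewise assumptions.

```java
public class GroupDemo {
    // Hypothetical pattern: (opening tag)(content)(closing tag).
    static final String TAGGED = "(<[^>]+>)(.*?)(</[^>]+>)";

    // Keeps only group 2, i.e. drops the surrounding tags.
    static String contentOnly(String s) {
        return s.replaceAll(TAGGED, "$2");
    }

    // Keeps groups 1 and 3, i.e. drops the content between the tags.
    static String tagsOnly(String s) {
        return s.replaceAll(TAGGED, "$1$3");
    }

    public static void main(String[] args) {
        String input = "<link>http://website</link>";
        System.out.println(contentOnly(input)); // http://website
        System.out.println(tagsOnly(input));    // <link></link>
        // Plain String.replace with an empty string removes a fixed block.
        System.out.println(input.replace("<link>", "")); // http://website</link>
    }
}
```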
For your first problem, take all the text out of Word, put it in something that does regular expressions, and use regular expressions to quote each line and end each line with +. Now edit the last line and change + to ;. Above the first line write String links =. Copy this new file into your Java source. Here's an example using regexr.
To answer your second question (thinking of problems): there is an upper limit on the length of a Java string literal, 2^16 if I recall correctly.
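The quoting steps above could also be done in Java itself instead of a regex tool. The following is only a sketch under stated assumptions: the class LiteralBuilder and the method toJavaSource are hypothetical names, and a space is added inside each quoted line (except the last) so the links stay separated after concatenation. Note that only backslashes and double quotes need escaping in a Java string literal; the // in a URL does not.

```java
import java.util.List;

public class LiteralBuilder {
    // Turns raw lines into Java source text for a concatenated String declaration.
    static String toJavaSource(List<String> lines) {
        StringBuilder sb = new StringBuilder("String links =\n");
        for (int i = 0; i < lines.size(); i++) {
            // Escape backslashes first, then double quotes; "//" is left as-is.
            String escaped = lines.get(i).replace("\\", "\\\\").replace("\"", "\\\"");
            boolean last = (i == lines.size() - 1);
            // Every line ends with +, except the last, which ends with ;
            sb.append("    \"").append(escaped).append(last ? "\";" : " \" +").append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toJavaSource(List.of("http://website", "http://website")));
    }
}
```

For input anywhere near the length limit mentioned above, reading the text from a file at run time avoids the literal altogether.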
Oh, and Perl was basically written for you to do this kind of thing (take 50 pages of text and separate out what is a URL and what is an email)... not to mention grep.