I am trying to read a file that contains multiple lines, and for this I am using scanner.nextLine().
However, I want to read up to the full stop (the dot separator), which is usually followed by a space or an end-of-line character.
Can anybody help me with this?
If you want to search until a period, you can use a Matcher with a Pattern.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

//Pattern p = Pattern.compile("[^\\.]*\\.(\\s+)");
Pattern p = Pattern.compile(".*?\\.(\\s+)"); // Anything, any number of times,
                                             // followed by a dot and then some whitespace.
Matcher matcher = p.matcher("firstword. secondword.\n");
while (matcher.find()) {
    // group(1) is the captured whitespace that followed the dot
    boolean space = matcher.group(1).charAt(0) == ' ';
    System.out.println(matcher.start() + ": " + matcher.group() + " and is space: " + (space ? "TRUE" : "FALSE"));
}
.*? - The . matches any character (except line terminators, by default). The * matches 0 or more times. The ? makes the match lazy. Together this matches any number of characters of any type, but it stops before the first period that is followed by whitespace (because of the lazy operator).
\\. - This matches a literal period. In a Java string literal the backslash itself has to be escaped, so the regex \. is written as "\\.".
(\\s+) - This matches one or more whitespace characters (\s, which includes new lines). The parentheses "capture" this part of the regex, so every time you get a match you can ask what specific text was matched inside the parentheses. This lets you know whether it was a space or a newline.
matcher.group() - This gets the string that was just matched.
I added in the question mark and commented out the other pattern because it sounded like you could have a period in the middle of some of your data. The question mark does "lazy" matching. By default, matching is greedy and will take the longest matching string. So if there are multiple places in the string with a period followed by a whitespace, it will return all of that as one match. The laziness forces it to stop matching any character (.*) once it reaches the first period and space.
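To make the greedy versus lazy difference concrete, here is a minimal sketch. The class name LazyVsGreedyDemo and the helper findAll are made up for illustration, and the capture group is left out to keep it short. On this input, the greedy pattern should return everything as a single match, while the lazy pattern should return one match per sentence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LazyVsGreedyDemo {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "firstword. secondword. ";
        // Greedy: .* runs to the end and backtracks only to the LAST dot + whitespace.
        System.out.println(findAll(".*\\.\\s+", input).size());
        // Lazy: .*? stops at the FIRST dot + whitespace, so each sentence is its own match.
        System.out.println(findAll(".*?\\.\\s+", input).size());
    }
}
```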
Use the read() method and read character by character. When you hit a '.', treat it as your end-of-line character.
Another solution might be to set the line separator and then use readLine(). I have not tried this, however.
Or read the whole file at once and use the String.split method.
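A rough sketch of the first and last suggestions, under some assumptions of mine: a period only ends a sentence when it is followed by whitespace or the end of the input, and line breaks inside a sentence are turned into a single space. The class and method names (SentenceReaderDemo, readSentences, splitSentences) are invented for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SentenceReaderDemo {
    // Reads one character at a time; a sentence ends at a period followed by whitespace.
    static List<String> readSentences(Reader in) {
        List<String> sentences = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean afterPeriod = false;
        try {
            int c;
            while ((c = in.read()) != -1) {
                char ch = (char) c;
                if (Character.isWhitespace(ch)) {
                    if (afterPeriod) {
                        // period + whitespace: the sentence is complete
                        sentences.add(current.toString());
                        current.setLength(0);
                    } else if (current.length() > 0
                            && current.charAt(current.length() - 1) != ' ') {
                        // whitespace inside a sentence (including line breaks) becomes one space
                        current.append(' ');
                    }
                } else {
                    current.append(ch);
                }
                afterPeriod = (ch == '.');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // whatever is left is the last sentence, with or without a closing period
        String rest = current.toString().trim();
        if (!rest.isEmpty()) {
            sentences.add(rest);
        }
        return sentences;
    }

    // Alternative for text already in memory: split on whitespace that follows a period.
    static String[] splitSentences(String text) {
        return text.trim().split("(?<=\\.)\\s+");
    }

    public static void main(String[] args) {
        String text = "First one. Pi is 3.14\nhere.\nLast";
        System.out.println(readSentences(new StringReader(text)));
        System.out.println(String.join(" | ", splitSentences(text)));
    }
}
```

Note that the split version keeps any line break that falls in the middle of a sentence, whereas the character-by-character version replaces it with a space.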
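FileReader fin = new FileReader("yourfile.txt");
Scanner src = new Scanner(fin);
// Set the delimiter to a full stop. useDelimiter takes a regex, so the dot
// must be escaped; an unescaped "." would match every single character.
src.useDelimiter("\\.");
while (src.hasNext()) {
    String sentence = src.next(); // consume the token, otherwise the loop never ends
    // do what you want here
}
src.close();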
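Since the question asks for a dot that is followed by a space or a line end, the delimiter can also require whitespace after the dot, so that a value like 3.14 stays in one piece. This is only a sketch under that assumption. The class name ScannerSentenceDemo and the method sentences are made up, and it scans a String rather than a file to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class ScannerSentenceDemo {
    // Splits the text into sentences using a Scanner delimiter of "dot, then whitespace".
    static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            // The delimiter is a regex: a literal dot followed by one or more whitespace characters.
            scanner.useDelimiter("\\.\\s+");
            while (scanner.hasNext()) {
                String token = scanner.next().trim();
                // A final dot with nothing after it is not a delimiter, so strip it by hand.
                if (token.endsWith(".")) {
                    token = token.substring(0, token.length() - 1);
                }
                if (!token.isEmpty()) {
                    result.add(token);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sentences("First one. Pi is 3.14 here.\nLast."));
    }
}
```

For a file, the same delimiter should work with new Scanner(new FileReader("yourfile.txt")).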
Try this,
StringBuilder stringBuilder = new StringBuilder();
String line;
while ((line = bufferedReader.readLine()) != null)
{
    // join the lines with a space, so a dot at the end of a line is also followed by a space
    stringBuilder.append(line.trim()).append(' ');
    int end;
    // print every complete sentence buffered so far (text up to a dot followed by a space)
    while ((end = stringBuilder.indexOf(". ")) != -1)
    {
        System.out.println("stringBuilder : " + stringBuilder.substring(0, end + 1).trim());
        stringBuilder.delete(0, end + 2); // keep the rest of the line for the next sentence
    }
}
// when the last line does not end with a dot and does not contain a dot and space
if (stringBuilder.toString().trim().length() > 0)
{
    System.out.println("stringBuilder : " + stringBuilder.toString().trim());
}