I have a log file containing the following data:
Shortest path(2)::RV3280-RV0973C-RV2888C
Shortest path(1)::RV3280-RV2502C
Shortest path(2)::RV3280-RV2501C-RV1263
Shortest path(2)::RV2363-Rv3285-RV3280
From each line, I need the number within the brackets, the name of the first protein (RV3280 in the first line), and the name of the last protein (RV2888C in the first line).
I have written code for this using the Scanner object.
try {
    Scanner s = new Scanner(new File(args[0]));
    while (s.hasNextLine()) {
        s.findInLine("Shortest path\\((\\d+)\\)::(\\w+).*-(\\w+)"); // at each line, look for this pattern
        MatchResult result = s.match(); // result of the last match
        for (int i = 1; i <= result.groupCount(); i++) {
            System.out.println(result.group(i));
        }
        s.nextLine(); // line no. 29
    }
    s.close();
} catch (FileNotFoundException e) {
    System.out.print("cannot find file");
}
I get the desired results, but I also get an error message. The output I get for the above input file is:
Exception in thread "main" java.util.NoSuchElementException: No line found
at java.util.Scanner.nextLine(Scanner.java:1516)
at nearnessindex.Main.main(Main.java:29)
2
RV3280
RV2888C
1
RV3280
RV2502C
2
RV3280
RV1263
2
RV2363
RV3280
Java Result: 1
BUILD SUCCESSFUL (total time: 1 second)
Why does this error occur, and how can I correct it?
Your input data probably doesn't end with a line separator, which would cause this. A call to findInLine moves the Scanner past the matching pattern, and if you are already at the end of the input data when you call nextLine, it throws the NoSuchElementException.
An easy fix, without rearranging the code too much, would be to end the while loop with:
if (s.hasNextLine()) {
    s.nextLine();
}
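To put the fix in context, here is a minimal, self-contained sketch of the whole loop with the guard in place. It reads from an in-memory string rather than a file so that it runs on its own, and the class name GuardedNextLine and the helper extract are made up for illustration. The extra null check is an addition on top of the fix: findInLine returns null when the current line does not match, so lines without a match are simply skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.MatchResult;

class GuardedNextLine {

    // Hypothetical helper: returns "count first last" for each matching line.
    static List<String> extract(String input) {
        List<String> rows = new ArrayList<>();
        try (Scanner s = new Scanner(input)) {
            while (s.hasNextLine()) {
                // findInLine returns null when the current line does not match.
                String hit = s.findInLine("Shortest path\\((\\d+)\\)::(\\w+).*-(\\w+)");
                if (hit != null) {
                    MatchResult result = s.match();
                    rows.add(result.group(1) + " " + result.group(2) + " " + result.group(3));
                }
                // Only skip to the next line if there is still one to consume.
                if (s.hasNextLine()) {
                    s.nextLine();
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample input without a trailing line separator, as suspected for the file.
        String input = "Shortest path(2)::RV3280-RV0973C-RV2888C\n"
                + "Shortest path(1)::RV3280-RV2502C";
        for (String row : extract(input)) {
            System.out.println(row);
        }
    }
}
```

With the guard, the last line is processed the same way whether or not the input ends with a line separator.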
import java.util.Scanner;
import java.util.regex.MatchResult;

// Class name assumed; the snippet was posted without its class header.
public class Main {
    public static void main(String[] args) {
        Scanner s = new Scanner("Shortest path(2)::RV3280-RV0973C-RV2888C"
                + "\nShortest path(1)::RV3280-RV2502C"
                + "\nShortest path(2)::RV3280-RV2501C-RV1263"
                + "\nShortest path(2)::RV2363-Rv3285-RV3280");
        while (s.hasNextLine()) {
            s.findInLine("Shortest path\\((\\d+)\\)::(\\w+).*-(\\w+)"); // at each line, look for this pattern
            MatchResult result = s.match(); // result of the last match
            for (int i = 1; i <= result.groupCount(); i++) {
                System.out.println(result.group(i));
            }
            s.nextLine(); // skip to the next line
        }
        s.close();
    }
}
run:
2
RV3280
RV2888C
1
RV3280
RV2502C
2
RV3280
RV1263
2
RV2363
RV3280
BUILD SUCCESSFUL (total time: 0 seconds)
This works well for me. Maybe you have some weird characters or empty lines in your file?
Two empty lines at the end give me this:
Exception in thread "main" java.lang.IllegalStateException: No match result available
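That IllegalStateException comes from calling match() after findInLine found nothing on the blank line (findInLine returns null in that case). One way to sidestep it, sketched here on the assumption that reading each line in full first is acceptable, is to fetch the line with nextLine() and apply the same pattern through a Matcher. The class name LineByLineMatch and the helper parse are invented for this example.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineByLineMatch {

    // Same pattern as in the question, compiled once.
    private static final Pattern PATH =
            Pattern.compile("Shortest path\\((\\d+)\\)::(\\w+).*-(\\w+)");

    // Hypothetical helper: returns "count first last", or null if the line does not match.
    static String parse(String line) {
        Matcher m = PATH.matcher(line);
        return m.find() ? m.group(1) + " " + m.group(2) + " " + m.group(3) : null;
    }

    public static void main(String[] args) {
        // Sample input ending in two empty lines.
        String input = "Shortest path(2)::RV3280-RV2501C-RV1263\n"
                + "Shortest path(2)::RV2363-Rv3285-RV3280\n\n\n";
        try (Scanner s = new Scanner(input)) {
            while (s.hasNextLine()) {
                String parsed = parse(s.nextLine());
                if (parsed != null) { // blank or malformed lines are skipped
                    System.out.println(parsed);
                }
            }
        }
    }
}
```

Because nextLine() is only called right after hasNextLine() returned true, this loop also avoids the NoSuchElementException from the question.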
If your input file is that strictly formatted, you can do something like this, which is way easier because you can get rid of that nasty regex ;)
String[] lines = new String[]{
    "Shortest path(2)::RV3280-RV0973C-RV2888C",
    "Shortest path(1)::RV3280-RV2502C",
    "Shortest path(2)::RV3280-RV2501C-RV1263",
    "Shortest path(2)::RV2363-Rv3285-RV3280",
    "\n",
    "\n"};
final int positionOfIndex = 14;          // position of the digit inside the brackets
final int startPositionOfProteins = 18;  // first character after "::"
for (String line : lines) {
    if (!line.trim().isEmpty()) {        // skip blank lines
        System.out.print(line.charAt(positionOfIndex) + ": ");
        String[] proteins = line.substring(startPositionOfProteins).split("-");
        // arrays have a length field, not a size() method
        System.out.println(proteins[0] + " " + proteins[proteins.length - 1]);
    }
}
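Note that the fixed offsets above only hold while the number in the brackets has a single digit. If longer numbers can occur (an assumption, since the sample data shows only 1 and 2), a rough variation is to locate the delimiters with indexOf instead. The class name DelimiterParse, the helper parse, and the sample line with 12 in the brackets are made up for illustration.

```java
class DelimiterParse {

    // Hypothetical helper: returns {count, first protein, last protein}.
    // Assumes the line has the form "Shortest path(N)::A-B-...-Z".
    static String[] parse(String line) {
        int open = line.indexOf('(');
        int close = line.indexOf(')', open);
        int start = line.indexOf("::", close) + 2; // first character after "::"
        String[] proteins = line.substring(start).split("-");
        return new String[] {
            line.substring(open + 1, close),
            proteins[0],
            proteins[proteins.length - 1]
        };
    }

    public static void main(String[] args) {
        // Made-up line with a two-digit count.
        String[] parts = parse("Shortest path(12)::RV3280-RV0973C-RV2888C");
        System.out.println(parts[0] + ": " + parts[1] + " " + parts[2]);
    }
}
```

This sketch does no validation, so blank or malformed lines still need to be filtered out before calling it, as in the loop above.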