Hi, I am reading from a text file and saving each line (split on commas) into an array. The problem is that most of the elements are double values, whereas two are strings, so I had to make the array a String[]. As a result, whenever I want to do calculations on the numeric values, I first have to parse them into doubles. I am running 1000+ iterations of these calculations, so my code is constantly parsing the same strings into doubles, which is costly and slows my program down. Is there a better way to convert the values from the string array to doubles, or is there a better approach to storing the lines from the text file in the first place? Thanks.
Here is what one of the arrays looks like after I have read from the text file:
String[] details = {"24.9", "100.0", "19.2" , "82.0", "Harry", "Smith", "45.0"};
I now need to multiply the first two elements and add the result to the sum of the 3rd, 4th and 7th elements. In other words, I am only using the numerical elements (which are, of course, stored as strings):
double score = (Double.parseDouble(details[0]) * Double.parseDouble(details[1])) + Double.parseDouble(details[2]) + Double.parseDouble(details[3]) + Double.parseDouble(details[6]);
I have to do this for every single line in the text file (1000+ lines), and as a result my program runs very slowly. Is there a better way to convert the string values into doubles, or a better way to store them in the first place?
EDIT: I have used a profiler to check which part of the code is the slowest, and it is indeed the code shown above.
Here's an example that generates an input file like the one you describe, 10000 lines long, then reads it back in, does the calculation you posted, and prints each result to stdout. I deliberately give the reader a buffer size of 1, which effectively disables buffering, in order to get the worst possible read performance. I'm also not doing any of the caching that others have suggested. The entire process, including generating the file, doing the calculation, and printing the results, consistently takes around 520-550 ms. That's hardly "slow", unless you're repeating this same process for hundreds or thousands of files. If you see drastically different performance from this, then maybe it's a hardware problem. A failing hard disk can drop read performance to nearly nothing.
import java.io.*;
import java.util.Random;
public class ReadingDoublesFromFileEfficiency {
private static Random random = new Random();
public static void main(String[] args) throws IOException {
long start = System.currentTimeMillis();
String filePath = createInputFile();
BufferedReader reader = new BufferedReader(new FileReader(filePath), 1);
String line;
while ((line = reader.readLine()) != null) {
String[] details = line.split(",");
double score = (Double.parseDouble(details[0]) * Double.parseDouble(details[1])) + Double.parseDouble(details[2]) + Double.parseDouble(details[3]) + Double.parseDouble(details[6]);
System.out.println(score);
}
reader.close();
long elapsed = System.currentTimeMillis() - start;
System.out.println("Took " + elapsed + " ms");
}
private static String createInputFile() throws IOException {
File file = File.createTempFile("testbed", null);
PrintWriter writer = new PrintWriter(new FileWriter(file));
for (int i = 0; i < 10000; i++) {
writer.println(randomLine());
}
writer.close();
return file.getAbsolutePath();
}
private static String randomLine() {
return String.format("%f,%f,%f,%f,%s,%s,%f",
score(), score(), score(), score(), name(), name(), score());
}
private static String name() {
String name = "";
for (int i = 0; i < 10; i++) {
name += (char) (random.nextInt(26) + 97);
}
return name;
}
private static double score() {
return random.nextDouble() * 100;
}
}
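For comparison, the caching idea mentioned above could look something like the following. This is only a minimal sketch, not code from the question: the class name `ParseOnceSketch`, the helper `parseNumbers`, and the column positions are assumptions based on the sample row. Each line is parsed exactly once into a `double[]`, and the repeated iterations then work on primitives only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original post: parses each row once,
// then reuses the primitive values on every later pass.
final class ParseOnceSketch {

    // Positions of the numeric fields in the question's sample row (assumed).
    private static final int[] NUMERIC_COLUMNS = {0, 1, 2, 3, 6};

    // Converts one comma-separated line into its five numeric values.
    static double[] parseNumbers(String line) {
        String[] details = line.split(",");
        double[] values = new double[NUMERIC_COLUMNS.length];
        for (int i = 0; i < values.length; i++) {
            values[i] = Double.parseDouble(details[NUMERIC_COLUMNS[i]].trim());
        }
        return values;
    }

    // Same formula as the question, applied to already-parsed values.
    static double score(double[] v) {
        return v[0] * v[1] + v[2] + v[3] + v[4];
    }

    public static void main(String[] args) {
        // Parse phase: runs once per line.
        List<double[]> rows = new ArrayList<>();
        rows.add(parseNumbers("24.9,100.0,19.2,82.0,Harry,Smith,45.0"));

        // Compute phase: may run as often as needed without re-parsing.
        double total = 0;
        for (int iteration = 0; iteration < 1000; iteration++) {
            for (double[] row : rows) {
                total += score(row);
            }
        }
        System.out.println(total);
    }
}
```

Whether this makes a measurable difference depends on how often each line is actually reused; if every line is scored only once, the parse count stays the same either way.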
You'd do better to create a proper object and store the values in that. This gives you two major benefits: 1) your code will be faster, since you avoid needlessly recomputing double values, and 2) your code will be clearer, since the fields will be named rather than accessed through calls like details[0], where it's completely unclear what [0] is referring to.
Because of 2), I don't know what the fields are supposed to represent, so your class will obviously look different, but the idea is the same:
public class PersonScore {
private double[] multipliers = new double[2];
private double[] summers = new double[3];
private String first;
private String last;
// expects a parsed CSV String
public PersonScore(String[] arr) {
if(arr.length != 7)
throw new IllegalArgumentException("Must pass exactly 7 fields");
multipliers[0] = Double.parseDouble(arr[0]);
multipliers[1] = Double.parseDouble(arr[1]);
// each summed field gets its own slot
summers[0] = Double.parseDouble(arr[2]);
summers[1] = Double.parseDouble(arr[3]);
summers[2] = Double.parseDouble(arr[6]);
first = arr[4];
last = arr[5];
}
public double score() {
double ret = 1;
for(double mult : multipliers)
ret *= mult;
for(double sum : summers)
ret += sum;
return ret;
}
public String toString() {
return first+" "+last+": "+score();
}
}
Notice there's an additional benefit: the score method is now more robust. Your implementation above hard-coded the indices of the fields you wanted to use, but by parsing the fields once and storing them in the object, we're able to implement a more readable, more scalable score calculation.
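To show how such an object might be used, here is a rough sketch of loading every line into a list once and then scoring from the list as many times as needed. The names `LoadOnceSketch`, `Row`, and `load` are made up for this example, and `Row` is a compact stand-in for the PersonScore class above with placeholder field names. A `StringReader` replaces the real file so the snippet runs on its own.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

final class LoadOnceSketch {

    // Compact stand-in for PersonScore; the field names are placeholders.
    static final class Row {
        final double a, b, c, d, e;
        final String first, last;

        Row(String[] f) {
            if (f.length != 7) {
                throw new IllegalArgumentException("Must pass exactly 7 fields");
            }
            a = Double.parseDouble(f[0]);
            b = Double.parseDouble(f[1]);
            c = Double.parseDouble(f[2]);
            d = Double.parseDouble(f[3]);
            first = f[4];
            last = f[5];
            e = Double.parseDouble(f[6]);
        }

        // Same formula as the question, with no parsing involved.
        double score() {
            return a * b + c + d + e;
        }
    }

    // Reads all lines, skips empty ones, and parses each line exactly once.
    static List<Row> load(BufferedReader reader) {
        return reader.lines()
                .filter(line -> !line.isEmpty())
                .map(line -> new Row(line.split(",")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String data = "24.9,100.0,19.2,82.0,Harry,Smith,45.0\n";
        List<Row> rows = load(new BufferedReader(new StringReader(data)));

        // Later passes only touch primitive fields.
        for (Row row : rows) {
            System.out.println(row.first + " " + row.last + ": " + row.score());
        }
    }
}
```

With a real file, the `StringReader` would be swapped for a `FileReader`, ideally opened in a try-with-resources block so the reader is closed even if parsing fails.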