I am implementing the LZW algorithm. I have implemented it for strings and text files successfully and am currently modifying my code to work with binary files, such as images or executables (since I cannot read these files as strings).
I have replaced the String
type in my code with the ArrayList<Byte>
type. My code now compresses and decompresses binary files correctly, however it is at least 10x slower! This is unacceptable in a compression application where speed is a critical element.
Have I made the correct substitution of ArrayList<Byte>
for String
. Is there a faster alternative with similar functionality? Note that the LZW algorithm requires array resizing hence standard arrays[]
are not suitable.
Regards.
Using a List<Byte>
will box every byte into a separate object instance.
In bulk, that is one of the worst possible things you can do for performance.
By contrast, an array or string can occupy a solid block of memory.
Instead, you should use ByteArrayOutputStream
, or use byte[]
directly and resize as needed (you can make a wrapper class for that)
You are boxing byte
s within an ArrayList
, which uses much more memory than simple String
s. What this means is each byte
is wrapped in a whole object, and referred to by a reference. Note that such a reference is itself 4 to 8 times larger than the original byte!
You would be much better off using primitive byte []
arrays, or perhaps a primitive collections library (which properly abstracts primitive arrays as collections) such as this or this.
You need to locate the section of the code that is causing the slow down. There is not enough information in the question to gain any useful answers.
You should use a profiler. See this thread: Which Java Profiling tool do you use and which tool you think is the best?
ArrayList
implements an array so it is not ideal for lots of resizing. LinkedList
should give better performance if resizing is creating the bottleneck.
https://stackoverflow.com/a/322742/1487030