Fastest way of processing Java IO using ASCII lines

Posted 2019-06-07 06:59

I'm working with an ASCII input/output stream over a Socket and speed is critical. I've heard that using the right Java technique really makes a difference. I have a textbook that says using buffers is the best way, but it also suggests chaining with DataInputStreamReader.

For output I'm using a BufferedOutputStream with an OutputStreamWriter, which seems to be fine, but I'm unsure what to use for the input stream. My data is line-oriented, so would Scanner be of any use? Speed is critical: I need to get the data off the network as fast as possible.

Thanks.

PH

4 answers
冷血范
#2 · 2019-06-07 07:32

Just for laughs...

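// Server side (needs java.io.* and java.net.*): accept one connection and read lines.
ServerSocket socket = new ServerSocket(2004, 10);
Socket connection = socket.accept();
InputStream in = connection.getInputStream();
InputStreamReader isr = new InputStreamReader(in);
BufferedReader br = new BufferedReader(isr);
String line = null;
do {
    line = br.readLine(); // null means the client closed the connection
} while (line != null && !"done".equals(line));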

I ran this over loopback, i.e. both processes on my machine talking through localhost, with a suitably "stupid" client:

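// Client side (needs java.io.* and java.net.*).
Socket requestSocket = new Socket("localhost", 2004);
OutputStream out = requestSocket.getOutputStream();
PrintWriter pw = new PrintWriter(out);
String line = "...1000 characters long...";
// 1,999,999 payload lines plus the final "done" line: 2M lines in total.
for (int i = 0; i < 2000000 - 1; i++) {
    pw.println(line);
}
line = "done";
pw.println(line);
pw.flush();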

You'll note that this sends 2M "1000 char" lines. It's simply a crude throughput test.

On my machine, over loopback, I get a transfer rate of ~190 MB/sec (bytes, not bits), i.e. roughly 190,000 lines/sec.
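For anyone who wants to reproduce a figure like this, here is a rough, self-contained sketch that runs both ends in one process and times the transfer. The class name `LoopbackThroughput`, the `run` helper, the ephemeral port and the smaller line count are my own choices for illustration, not part of the test above, and the MB/s calculation ignores the platform's line-separator length beyond one byte.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class LoopbackThroughput {

    /** Sends lineCount lines of lineLength 'x' characters over loopback; returns the lines received. */
    static long run(final int lineCount, final int lineLength) throws Exception {
        StringBuilder sb = new StringBuilder(lineLength);
        for (int i = 0; i < lineLength; i++) {
            sb.append('x');
        }
        final String payload = sb.toString();

        // Port 0 asks the OS for any free port (an assumption; the original test used 2004).
        try (ServerSocket server = new ServerSocket(0, 10, InetAddress.getLoopbackAddress())) {
            final int port = server.getLocalPort();

            // The "stupid" client: write everything, then "done", then flush.
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
                     PrintWriter pw = new PrintWriter(new OutputStreamWriter(
                             s.getOutputStream(), StandardCharsets.US_ASCII))) {
                    for (int i = 0; i < lineCount; i++) {
                        pw.println(payload);
                    }
                    pw.println("done");
                    pw.flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            client.start();

            long received = 0;
            try (Socket conn = server.accept();
                 BufferedReader br = new BufferedReader(new InputStreamReader(
                         conn.getInputStream(), StandardCharsets.US_ASCII))) {
                String line;
                // Stop on "done" or when the client closes the connection.
                while ((line = br.readLine()) != null && !"done".equals(line)) {
                    received++;
                }
            }
            client.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        int lines = 200_000;
        int length = 1000;
        long start = System.nanoTime();
        long received = run(lines, length);
        double seconds = (System.nanoTime() - start) / 1e9;
        double megabytes = received * (length + 1) / 1e6; // +1 approximates the newline
        System.out.printf("%d lines, %.1f MB/s%n", received, megabytes / seconds);
    }
}
```

The number you get depends heavily on the machine and JVM warm-up, so treat it as a sanity check rather than a benchmark.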

My point is that the "unsophisticated" way using bone stock Java sockets is quite fast. This will saturate any common network connection (meaning the network will be slowing you down more than your I/O here will).

Likely "fast enough".

What kind of traffic are you expecting?

三岁会撩人
#3 · 2019-06-07 07:36

If speed is absolutely critical, consider using NIO. Here's a code example posted for the exact same question.

http://lists.apple.com/archives/java-dev/2004/Apr/msg00051.html

EDIT: Here's another example

http://www.java2s.com/Code/Java/File-Input-Output/UseNIOtoreadatextfile.htm

EDIT 2: I wrote this microbenchmark to get you started on measuring the performance of various approaches. Some folks have commented that NIO will not be faster because you need to do more work to 'massage' the data into a usable form, so you can validate that against whatever it is you're trying to do. When I ran this code on my machine, the NIO code was approximately 3 times faster than the Scanner code with a 45 megabyte file, and 5 times faster with a 100 megabyte file.

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Scanner;

public class TestStuff {

    public static void main(final String[] args)
            throws IOException, InterruptedException {

        final String file_path = "c:\\test-nio.txt";
        readFileUsingNIO(file_path);
        readFileUsingScanner(file_path);

    }

    private static void readFileUsingScanner(final String path_to_file)
            throws FileNotFoundException {
        Scanner s = null;

        final StringBuilder builder = new StringBuilder();
        try {
            System.out.println("Starting to read the file using SCANNER");
            final long start_time = System.currentTimeMillis();
            s = new Scanner(new BufferedReader(new FileReader(path_to_file)));
            while (s.hasNext()) {
                builder.append(s.next());
            }
            System.out.println("Finished!  Read took " + (System.currentTimeMillis() - start_time) + " ms");
        }
        finally {
            if (s != null) {
                s.close();
            }
        }

    }

    private static void readFileUsingNIO(final String path_to_file)
            throws IOException {
        FileInputStream fIn = null;
        FileChannel fChan = null;
        long fSize;
        ByteBuffer mBuf;

        final StringBuilder builder = new StringBuilder();
        try {
            System.out.println("Starting to read the file using NIO");
            final long start_time = System.currentTimeMillis();
            fIn = new FileInputStream(path_to_file);
            fChan = fIn.getChannel();
            fSize = fChan.size();
            mBuf = ByteBuffer.allocate((int) fSize);
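            // A single read() is not guaranteed to fill the buffer, so loop until full or EOF.
            while (mBuf.hasRemaining() && fChan.read(mBuf) != -1) {
                // keep reading
            }
            mBuf.rewind();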
            for (int i = 0; i < fSize; i++) {
                //System.out.print((char) mBuf.get());
                builder.append((char) mBuf.get());
            }
            fChan.close();
            fIn.close();
            System.out.println("Finished!  Read took " + (System.currentTimeMillis() - start_time) + " ms");
        }
        catch (final IOException exc) {
            System.out.println(exc);
            System.exit(1);
        }
        finally {
            if (fChan != null) {
                fChan.close();
            }
            if (fIn != null) {
                fIn.close();
            }
        }

    }
}
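On the point about 'massaging' the data: the benchmark above only copies bytes into a StringBuilder, whereas the question needs lines. Here is a minimal sketch of what that extra step might look like with a channel, assuming pure ASCII input and '\n' or '\r\n' line endings. The class name `NioLineReader` and the `readLines` helper are hypothetical names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class NioLineReader {

    /** Reads the channel to end of stream and returns its ASCII lines (bufferSize must be > 0). */
    static List<String> readLines(ReadableByteChannel channel, int bufferSize) throws IOException {
        List<String> lines = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.allocate(bufferSize);
        StringBuilder current = new StringBuilder();

        while (channel.read(buf) != -1) {
            buf.flip(); // switch the buffer from filling to draining
            while (buf.hasRemaining()) {
                byte b = buf.get();
                if (b == '\n') {
                    int len = current.length();
                    if (len > 0 && current.charAt(len - 1) == '\r') {
                        current.setLength(len - 1); // drop the CR of a CRLF pair
                    }
                    lines.add(current.toString());
                    current.setLength(0);
                } else {
                    current.append((char) (b & 0xFF)); // valid for ASCII only
                }
            }
            buf.clear(); // ready for the next read
        }
        if (current.length() > 0) {
            lines.add(current.toString()); // final line without a trailing newline
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "first\nsecond\r\nthird".getBytes(StandardCharsets.US_ASCII);
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(data));
        System.out.println(readLines(ch, 8));
    }
}
```

For a socket you would pass a blocking `SocketChannel` instead of the in-memory channel used in `main`. Whether this beats a plain BufferedReader is something to measure rather than assume.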
SAY GOODBYE
#4 · 2019-06-07 07:38

A Scanner is used for delimited text. You didn't talk about what your data looks like so I can't comment on that.
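To illustrate what "delimited text" means here, this is a small sketch of a Scanner splitting input on commas and newlines. The sample data, the class name `ScannerTokens` and the `tokens` helper are invented for the example, since the actual data format wasn't given.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTokens {

    /** Splits text into tokens using the given delimiter regex. */
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner s = new Scanner(text).useDelimiter(delimiterRegex)) {
            while (s.hasNext()) {
                out.add(s.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical comma-separated records, one per line.
        System.out.println(tokens("id=7,name=PH\nid=8,name=XY", "[,\\n]"));
    }
}
```

The same Scanner could wrap a socket's InputStream instead of a String; the regex-based tokenizing is the extra work you pay for compared with readLine().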

If you just want to read until each newline character, use

BufferedReader r = new BufferedReader(new InputStreamReader(socket.getInputStream()));

and

r.readLine()

When you get a null value, you will know you have exhausted the data in the stream.
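Put together, the read-until-null loop might look like the sketch below. It names the charset explicitly because the question says the data is ASCII. `LineLoop`, `forEachLine` and the `handler` callback are hypothetical names, and `main` uses an in-memory stream in place of a socket so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class LineLoop {

    /** Passes each line of the stream to handler and returns the number of lines read. */
    static int forEachLine(InputStream in, Consumer<String> handler) throws IOException {
        BufferedReader r = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII));
        int count = 0;
        String line;
        // readLine() returns null once the stream is exhausted.
        while ((line = r.readLine()) != null) {
            handler.accept(line);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "one\ntwo\n".getBytes(StandardCharsets.US_ASCII));
        int n = forEachLine(in, System.out::println);
        System.out.println(n + " lines");
    }
}
```

With a real connection you would pass `socket.getInputStream()` as the first argument.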

As far as speed is concerned, they are both just reading data out of the stream. So assuming you don't need the extra functionality of a Scanner, I don't see any particular reason to use one.

迷人小祖宗
#5 · 2019-06-07 07:43

I would do something with a BufferedReader along the lines of:

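Collection<String> lines = new ArrayList<String>();
BufferedReader reader = new BufferedReader(new InputStreamReader(Foo.getInputStream()));
String line;
// readLine() returns null at end of stream; ready() only reports whether data is
// available right now, so it can end the loop early on a slow connection.
while ((line = reader.readLine()) != null)
{
    lines.add(line);
}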

myClass.processData(lines); //Process the data after it is off the network.

Depending on your situation, you could have an additional thread that processes the items in 'lines' as it's getting filled, but then you would need a different structure to back the collection: one that can be used concurrently.
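One possible shape for that, as a sketch only: the reading thread puts lines on a `LinkedBlockingQueue` and a second thread takes them off, with a sentinel object marking the end. The class name `ConcurrentLines`, the `readAndProcess` helper and the upper-casing "processing" step are placeholders I made up for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ConcurrentLines {

    // Unique sentinel object, compared by identity, that tells the consumer to stop.
    private static final String EOF = new String("EOF-marker");

    static List<String> readAndProcess(InputStream in) throws IOException, InterruptedException {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final List<String> processed = new ArrayList<>();

        // Consumer: processes lines while the reader is still pulling data off the stream.
        Thread consumer = new Thread(() -> {
            try {
                for (String line = queue.take(); line != EOF; line = queue.take()) {
                    processed.add(line.toUpperCase(Locale.ROOT)); // stand-in for real work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: this thread only reads and enqueues, so it drains the stream quickly.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII))) {
            String line;
            while ((line = reader.readLine()) != null) {
                queue.put(line);
            }
        } finally {
            queue.put(EOF); // always release the consumer, even after an I/O error
        }
        consumer.join(); // after join() it is safe to read 'processed' from this thread
        return processed;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream(
                "alpha\nbeta\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readAndProcess(in));
    }
}
```

The queue here is unbounded, so a slow consumer means growing memory use; a bounded queue would instead push back on the reader, which may or may not be what you want when getting data off the network quickly is the priority.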
