Iterate over a large JSON Array with JSONPath

2020-02-13 04:53发布

问题:

I have a simple Java app that needs to traverse a large JSON array (with around 20K items), and within each array, I parse a sub-array. Each item looks like this:

{"index":0,"f1":[2,16,16,16,16,16,32,16],"f2":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"startTime":0.0}

I am using JSONPath to iterate over each item. What I do is first I read the length, and simply iterate over the whole array. But it is very slow (like, 1 item per second).

int length = JsonPath.read(response, "$.result.length()");
for (int i = 0; i < length; i++) {
    double start_time = JsonPath.read(response, "$.result["+i+"].startTime");
    ArrayList<Integer> f1= JsonPath.read(response, "$.result["+i+"].f1");
    //...other things
}

Is there a way to optimise it?

回答1:

You should minimise number of read operation. First time you scan the whole file and and next you scan n times file partially. Reading from disk is slower than from memory: Latency Numbers Every Programmer Should Know, so you should load the file to memory once and then iterate over items. Also, from JsonPath documentation:

If you only want to read once this is OK. In case you need to read an other path as well this is not the way to go since the document will be parsed every time you call JsonPath.read(...). To avoid the problem you can parse the json first.

String json = "...";
Object document = Configuration.defaultConfiguration().jsonProvider().parse(json);

List<Integer> f10 = JsonPath.read(document, "$.result[0].f1");
List<Integer> f11 = JsonPath.read(document, "$.result[1].f1");

You can improve your JsonPath: $.result and read only what you need by: $.result..['f1','startTime'].

Example app which loads only required fields:

import com.jayway.jsonpath.JsonPath;

import java.io.File;
import java.util.List;
import java.util.Map;

public class JsonPathApp {

    public static void main(String[] args) throws Exception {
        File jsonFile = new File("./resource/test.json").getAbsoluteFile();
        List<Object> array = JsonPath.read(jsonFile, "$.result..['f1','startTime']");
        for (Object item : array) {
            Map<String, Object> map = (Map<String, Object>) item;
            System.out.println(map.get("f1"));
            System.out.println(map.get("startTime"));
        }
    }
}


回答2:

Got it. Thanks to Erwin, I can parse the whole JSON at once into a HASHMap simply like this:

ArrayList<HashMap> json= JsonPath.read(response, "$.result");

And then we can simply call get(i) to access a specific item in the loop:

double start_time = (double) json.get(i).get("startTime");