Java XML Parser for huge files

Posted 2019-01-04 14:13

I need an XML parser for a file that is approximately 1.8 GB,
so the parser must not load the whole file into memory.

Any suggestions?

Tags: java xml parsing
9 answers
The star\"
#2 · 2019-01-04 14:50

As others have said, use a SAX parser, since it is a streaming parser. Using the various events, you extract the information you need and store it somewhere else on the fly (a database, another file, what have you).

You can even store it in memory if you truly just need a minor subset, or if you're simply summarizing the file. Depends on the use case of course.
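To make the event-driven approach concrete, here is a minimal sketch of the summarizing case using the JDK's built-in SAX parser. The class name ElementCounter and the idea of counting element names are assumptions for illustration only; your handler would pull out whatever your use case needs.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: summarizes a document while streaming it.
public class ElementCounter {

    /** Streams the XML and returns how often each element name occurs. */
    public static Map<String, Long> countElements(InputStream in) {
        Map<String, Long> counts = new HashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Called once per opening tag; only the small map is kept in memory.
                counts.merge(qName, 1L, Long::sum);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(in, handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ElementCounter /path/to/huge.xml
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            System.out.println(countElements(in));
        }
    }
}
```

Memory use depends on the size of the summary, not on the size of the file, which is the point of streaming.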

If you're spooling to a DB, take some care to make your process restartable. A lot can go wrong partway through 1.8 GB.
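One possible way to get restartability, sketched under assumptions: count the records as they stream past and skip those already stored by a previous run. The element name record, its id attribute, and the class ResumableImport are all hypothetical, and the list stands in for real database inserts.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: resumes an import after a failure.
public class ResumableImport {

    /**
     * Streams the input and returns the ids of <record> elements, skipping
     * the first alreadyDone records (those stored before an earlier failure).
     */
    public static List<String> importFrom(InputStream in, long alreadyDone) {
        List<String> stored = new ArrayList<>(); // stand-in for database inserts
        DefaultHandler handler = new DefaultHandler() {
            private long seen = 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (!"record".equals(qName)) {
                    return;
                }
                seen++;
                if (seen > alreadyDone) {
                    // A real job would insert the row here and persist `seen`
                    // as the checkpoint, ideally in the same transaction.
                    stored.add(attrs.getValue("id"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return stored;
    }
}
```

On restart the file is still read from the beginning, but skipped records cost only parsing time, and nothing is written twice.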

▲ chillily
#3 · 2019-01-04 14:53

Try VTD-XML. I've found it to be more performant, and more importantly, easier to use than SAX.

我想做一个坏孩纸
#4 · 2019-01-04 14:58

Use almost any SAX parser to stream the file a bit at a time.
