How can I store large amount of data from a databa

Posted 2020-06-03 07:38

First, I had a problem getting the data from the database: it took too much memory and failed. I've set -Xmx1500M and I'm using a scrolling ResultSet, so that was taken care of. Now I need to make an XML file from the data, but I can't put it all in one file. At the moment, I'm doing it like this:

while (rs.next()) {
    i++;
    xmlStringBuilder.append("\n\t<row>");
    xmlStringBuilder.append("\n\t\t<ID>" + Util.transformToHTML(rs.getInt("id")) + "</ID>");
    xmlStringBuilder.append("\n\t\t<JED_ID>" + Util.transformToHTML(rs.getInt("jed_id")) + "</JED_ID>");
    xmlStringBuilder.append("\n\t\t<IME_PJ>" + Util.transformToHTML(rs.getString("ime_pj")) + "</IME_PJ>");
    // etc.
    xmlStringBuilder.append("\n\t</row>");
    if (i % 100000 == 0) {
        // stores the data to a file with the name i.xml
        storeKBR(xmlStringBuilder.toString(), i);
        // start a fresh builder so the rows already written can be garbage collected
        xmlStringBuilder = new StringBuilder();
    }
}

and it works; I get 12 files of 100 MB each. Now, what I'd like to do is have all that data in one file (which I then compress), but if I just remove the if part, I run out of memory. I thought about writing to a file, closing it, and then opening it again, but that wouldn't get me much, since I'd have to load the file into memory when I open it.

Tags: java oracle
4 answers
骚年 ilove
#2 · 2020-06-03 07:41

You are assembling the complete file in memory: what you should be doing is writing the data directly to the file.

Additionally, you might consider using a proper XML API rather than assembling XML as a text file. A short tutorial is available here.
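As a rough illustration of what "a proper XML API" could look like here, the sketch below uses the JDK's StAX XMLStreamWriter, which writes elements to the output as they are produced and escapes text content itself. The class name StaxRowWriterSketch, the writeRows and writeField helpers, and the String[] row layout are all made up for this example; in the real code the rows would come from the ResultSet loop rather than a list.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of the original post.
class StaxRowWriterSketch {

    // Writes one <row> per input array; each array is assumed to hold {id, jedId, imePj}.
    static void writeRows(Path target, Iterable<String[]> rows)
            throws IOException, XMLStreamException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            XMLStreamWriter xml =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            xml.writeStartDocument("UTF-8", "1.0");
            xml.writeStartElement("PROSTORNE_JEDINICE");
            for (String[] row : rows) {
                // Each row goes to the stream right away; nothing accumulates in memory.
                xml.writeStartElement("row");
                writeField(xml, "ID", row[0]);
                writeField(xml, "JED_ID", row[1]);
                writeField(xml, "IME_PJ", row[2]);
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.flush();
            xml.close(); // does not close the underlying stream; the try block does that
        }
    }

    // Writes <name>value</name>; writeCharacters escapes &, < and > in the value.
    static void writeField(XMLStreamWriter xml, String name, String value)
            throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(value);
        xml.writeEndElement();
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("rows", ".xml");
        writeRows(target, Collections.singletonList(new String[] {"1", "10", "A & B"}));
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

With this approach a separate escaping helper such as Util.transformToHTML would probably not be needed for element text, since the writer handles the escaping.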

狗以群分
#3 · 2020-06-03 07:45

Why not write all data to one file and open the file with the "append" option? There is no need to read in all the data in the file if you are just going to write to it.
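A minimal sketch of the "append" idea, assuming each chunk is already available as a String: passing true as the second FileOutputStream argument opens the file in append mode, so new bytes go to the end and the existing content is never read. AppendChunkSketch and appendChunk are hypothetical names, and the explicit UTF-8 charset is an assumption of this example.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original post.
class AppendChunkSketch {

    // Appends one chunk of text to the end of the file without loading the file.
    static void appendChunk(String path, String chunk) throws IOException {
        // The second constructor argument (true) selects append mode.
        try (Writer out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(path, true), StandardCharsets.UTF_8))) {
            out.write(chunk);
        }
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("append", ".xml");
        appendChunk(target.getPath(), "\n\t<row>1</row>");
        appendChunk(target.getPath(), "\n\t<row>2</row>");
        System.out.println(target.length() + " bytes written to " + target);
    }
}
```

Reopening the file for every chunk costs an open and a close each time, which is one reason keeping a single writer open for the whole loop tends to be simpler.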

However, this might be a better solution:

PrintWriter writer = new PrintWriter(new BufferedOutputStream(new FileOutputStream("data.xml")));

while(rs.next()){
    i++;
    writer.print("\n\t<row>");
    writer.print("\n\t\t<ID>" + Util.transformToHTML(rs.getInt("id")) + "</ID>");
    writer.print("\n\t\t<JED_ID>" + Util.transformToHTML(rs.getInt("jed_id")) + "</JED_ID>");
    writer.print("\n\t\t<IME_PJ>" + Util.transformToHTML(rs.getString("ime_pj")) + "</IME_PJ>");
    //...

    writer.print("\n\t</row>");
}

writer.close();

The BufferedOutputStream buffers the data before it is written to the file, and you can specify the buffer size in the constructor if the default value does not suit your needs. See the Java API for details: http://java.sun.com/javase/6/docs/api/.
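For illustration, here is one way the buffer size could be passed to the constructor. The 64 KiB value is an arbitrary choice for this sketch, not a recommendation from the answer, and the names BufferedXmlWriterSketch and open are hypothetical. The explicit UTF-8 OutputStreamWriter is also an addition of this example; the answer's code relies on the platform default charset.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original answer.
class BufferedXmlWriterSketch {

    // Arbitrary illustrative buffer size (64 KiB).
    static final int BUFFER_BYTES = 64 * 1024;

    // Opens a PrintWriter over a BufferedOutputStream with an explicit buffer size.
    static PrintWriter open(String path) throws IOException {
        return new PrintWriter(new OutputStreamWriter(
                new BufferedOutputStream(new FileOutputStream(path), BUFFER_BYTES),
                StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("buffered", ".xml");
        PrintWriter writer = open(target.getPath());
        writer.print("\n\t<row>");
        writer.print("\n\t</row>");
        // PrintWriter does not throw IOException from print(); checkError() flushes
        // the writer and reports whether a write has failed.
        if (writer.checkError()) {
            System.err.println("write failed");
        }
        writer.close();
    }
}
```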

做个烂人
#4 · 2020-06-03 07:46

Ok, so the code is rewritten and I'll include the whole operation:

// This is the calling/writing function. I have 8 types of "proizvod", which makes
// 8 XML files. After an XML file is created, it needs to be zipped by a custom zip class.
generateXML(tmpParam, queryRBR, proizvod.getOznaka());
writeToZip(proizvod.getOznaka());



//inside writeToZip

    ZipEntry ze = new ZipEntry(oznaka + ".xml");
    FileOutputStream fos = new FileOutputStream(new File(zipFolder + oznaka + ".zip"));
    ZipOutputStream zos = new ZipOutputStream(fos);
    zos.putNextEntry(ze);
    FileInputStream fis = new FileInputStream(new File(zipFolder + oznaka + ".xml"));
    final byte[] buffer = new byte[1024];
    int n;
    while ((n = fis.read(buffer)) != -1)
        zos.write(buffer, 0, n);
    zos.closeEntry();
    zos.flush();
    zos.close();
    fis.close();
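The writeToZip step above copies a finished .xml file into a zip entry. If the intermediate .xml file is not needed for anything else, a possible variation is to write the XML text straight into the zip entry. The sketch below only illustrates that idea: ZipDirectSketch, writeZippedXml, and the rowFragments parameter are hypothetical names, and the row fragments stand in for the strings built inside the ResultSet loop.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the original post.
class ZipDirectSketch {

    // Streams the XML text into a single zip entry without a temporary .xml file.
    static void writeZippedXml(String zipPath, String entryName, Iterable<String> rowFragments)
            throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(
                new BufferedOutputStream(new FileOutputStream(zipPath)))) {
            zos.putNextEntry(new ZipEntry(entryName));
            Writer out = new OutputStreamWriter(zos, StandardCharsets.UTF_8);
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            out.write("\n<PROSTORNE_JEDINICE>");
            for (String fragment : rowFragments) {
                out.write(fragment); // compressed as it is written
            }
            out.write("\n</PROSTORNE_JEDINICE>\n");
            out.flush();      // push the encoder's pending bytes into the entry
            zos.closeEntry(); // the try block then closes the zip stream itself
        }
    }

    public static void main(String[] args) throws IOException {
        File zip = File.createTempFile("data", ".zip");
        writeZippedXml(zip.getPath(), "data.xml",
                Arrays.asList("\n\t<row>1</row>", "\n\t<row>2</row>"));
        System.out.println("wrote " + zip);
    }
}
```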

// inside generateXML
PrintWriter writer = new PrintWriter(new BufferedOutputStream(new FileOutputStream(zipFolder + oznaka + ".xml")));
// The XML declaration must be the very first thing in the file, so no leading newline here.
writer.print("<?xml version=\"1.0\" encoding=\"UTF-8\" ?>");
writer.print("\n<PROSTORNE_JEDINICE>");
stmt = cm.getConnection().createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
        ResultSet.CONCUR_READ_ONLY);
String q = "";
rs = stmt.executeQuery(q);
if (rs != null) {
    System.out.println("Početak u : " + Util.nowTime());
    while (rs.next()) {
        writer.print("\n\t<row>");
        writer.print("\n\t\t<ID>" + Util.transformToHTML(rs.getInt("id")) + "</ID>");
        writer.print("\n\t\t<JED_ID>" + Util.transformToHTML(rs.getInt("jed_id")) + "</JED_ID>");
        // etc.
        writer.print("\n\t</row>");
    }
    System.out.println("Kraj u : " + Util.nowTime());
}
writer.print("\n</PROSTORNE_JEDINICE>");

But the generateXML part still takes a lot of memory (if I'm guessing correctly, it takes as much as it can, bit by bit), and I don't see how I could optimize it. Is there an alternative way to feed the writer.print calls?

chillily
#5 · 2020-06-03 08:08

I have never encountered this use case, but I am pretty sure vtd-xml supports XML documents larger than 1 GB. It is worth checking out at http://vtd-xml.sourceforge.net

Alternatively, you can follow the article series "Output large XML documents" at http://www.ibm.com/developerworks/
