Execute query on a table that contains billions of rows

Published 2019-09-04 16:30

Question:

This question already has an answer here:

  • MySQL LIMIT clause equivalent for SQL SERVER (5 answers)

I want to fetch some records (the count can be 50, 100, or anything else configured by the user) from the database without using a LIMIT clause, because our application may work with multiple databases such as MySQL, Oracle, MSSQL, DB2, and so on.

I tried the following solution:

package com.test;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Date;

public class BatchRetrieveTest {
    private static final int FETCH_SIZE = 10;

    public BatchRetrieveTest() {
    }

    public static void main(String[] args) {
        BatchRetrieveTest batchRetrieveTest = new BatchRetrieveTest();
        batchRetrieveTest.test();
    }

    void test() {
        Connection conn = null;
        Statement stmt2 = null;
        Date start = null;
        Date end = null;
        int i = 0;
        try {
            conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/test",
                    "root", "root");
            // Scrollable, read-only cursor so that absolute() can be used below
            stmt2 = conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE,
                    ResultSet.CONCUR_READ_ONLY);
            conn.setAutoCommit(false);
            // Hint to the driver to fetch rows in small batches
            stmt2.setFetchSize(FETCH_SIZE);
            stmt2.setPoolable(true);
            start = new Date();
            System.out.println(new Date() + " second execute start "
                    + new Date().getTime());
            ResultSet rs2 = stmt2.executeQuery(
                    "SELECT * FROM sample_final_attendance LIMIT 1000");

            end = new Date();
            System.out.println(new Date() + " ************* second execute end "
                    + (end.getTime() - start.getTime()));
            // Jump directly to row 200000, then read the next 100 rows
            rs2.absolute(200000);
            i = 0;
            while (rs2.next()) {
                if (i++ > 100) {
                    break;
                }
            }
            rs2.close();
            stmt2.close();
            end = new Date();
            System.out.println(new Date() + " second read end "
                    + (end.getTime() - start.getTime()));
            conn.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (stmt2 != null) {
                    stmt2.close();
                }
                if (conn != null) {
                    conn.close();
                }
            } catch (Exception e) {
                // ignore errors during cleanup
            }
        }
    }
}

Here the sample_final_attendance table contains 15 columns and 3.2 million records. Executing this program requires 2 GB of memory and 47 seconds of execution time.

What worries me is that if some table has billions of records, then the program will fail to execute.

I also used setFetchSize as suggested, but the problem is the same.
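
As I understand it, MySQL Connector/J honours setFetchSize only when the statement is forward-only and read-only and the fetch size is Integer.MIN_VALUE (streaming mode); with a scrollable statement it buffers the entire result set in memory. A minimal sketch of that streaming variant, reusing the URL and table above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingFetchSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "root", "root");
        // Streaming works only with a forward-only, read-only statement
        // and a fetch size of Integer.MIN_VALUE (a MySQL-specific hint)
        Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);
        ResultSet rs = stmt.executeQuery(
                "SELECT * FROM sample_final_attendance");
        int count = 0;
        while (rs.next() && count < 100) {
            count++; // rows arrive one by one, so memory use stays flat
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}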

Please suggest a solution.

Thanks in advance.

Answer 1:

Well, AFAIK and as far as I understood, the problem is more related to handling data in polyglot storage. If you need to solve it the same way in all cases, independent of the database type, one common approach is to build a serving layer.

The serving layer can be a cache library or even a map of maps that you create yourself. Do not attempt to query the database for a large number of records at once; instead, bring the data in as batches and store it as a pool of POJOs. When the user asks for data, serve it from the serving layer, as in the sketch below.
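
A rough sketch of that idea (the Attendance POJO, the id/name columns, and the batch size are illustrative, not a drop-in implementation):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.LinkedHashMap;
import java.util.Map;

class Attendance {
    final long id;
    final String name;

    Attendance(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

public class AttendanceServingLayer {
    private final Connection conn;
    private final int batchSize; // configured by the user: 50, 100, ...
    private final Map<Long, Attendance> pool = new LinkedHashMap<>();
    private long lastId = 0; // keyset cursor, portable across databases

    public AttendanceServingLayer(Connection conn, int batchSize) {
        this.conn = conn;
        this.batchSize = batchSize;
    }

    // Pulls the next batch into the pool without a LIMIT clause:
    // setMaxRows caps what the driver returns, and the WHERE clause
    // advances the cursor past the rows already loaded.
    public void loadNextBatch() throws Exception {
        PreparedStatement ps = conn.prepareStatement(
                "SELECT id, name FROM sample_final_attendance"
                        + " WHERE id > ? ORDER BY id");
        ps.setMaxRows(batchSize);
        ps.setLong(1, lastId);
        ResultSet rs = ps.executeQuery();
        while (rs.next()) {
            lastId = rs.getLong("id");
            pool.put(lastId, new Attendance(lastId, rs.getString("name")));
        }
        rs.close();
        ps.close();
    }

    // Serve user requests from the pool instead of hitting the database
    public Attendance get(long id) {
        return pool.get(id);
    }
}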

You can make use of Memcached or Hazelcast or many other cache libraries, which can be integrated directly with databases. I really don't know how complex your situation is; this is only a suggestion. It makes up a data grid that can be populated in the background with data from any database.
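
For example, a minimal Hazelcast sketch (assuming the 4.x API; the map name and value type are illustrative):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

public class GridSketch {
    public static void main(String[] args) {
        // A distributed map that a background job can fill from any database
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<Long, String> attendance = hz.getMap("attendance");
        attendance.put(1L, "row payload loaded from the database");
        System.out.println(attendance.get(1L));
        hz.shutdown();
    }
}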



Answer 2:

We have setMaxRows(int max) in the Statement object; this limits the number of rows generated by the Statement and simply ignores the remaining ones. See the sketch below.
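
For instance (a minimal sketch against the question's table; setMaxRows is part of the standard JDBC API, so the same code runs on MySQL, Oracle, MSSQL, and DB2):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MaxRowsSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "root", "root");
        Statement stmt = conn.createStatement();
        stmt.setMaxRows(100); // the driver silently drops rows beyond 100
        ResultSet rs = stmt.executeQuery(
                "SELECT * FROM sample_final_attendance");
        while (rs.next()) {
            // at most 100 rows are ever delivered here
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}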

Take a look at the documentation.



Tags: java jdbc