Java Iterator backed by a ResultSet

2019-01-21 20:08发布

站内文章 / Java

33 0

戒情不戒烟

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I've got a class that implements Iterator with a ResultSet as a data member. Essentially the class looks like this:

public class A implements Iterator{
    private ResultSet entities;
    ...
    public Object next(){
        entities.next();
        return new Entity(entities.getString...etc....)
    }

    public boolean hasNext(){
        //what to do?
    }
    ...
}

How can I check if the ResultSet has another row so I can create a valid hasNext method since ResultSet has no hasNext defined itself? I was thinking doing SELECT COUNT(*) FROM... query to get the count and managing that number to see if there's another row but I'd like to avoid this.

回答1:

This is a bad idea. This approach requires that the connection is open the whole time until the last row is read, and outside the DAO layer you never know when it will happen, and you also seem to leave the resultset open and risk resource leaks and application crashes in the case the connection times out. You don't want to have that.

The normal JDBC practice is that you acquire Connection, Statement and ResultSet in the shortest possible scope. The normal practice is also that you map multiple rows into a List or maybe a Map and guess what, they do have an Iterator.

public List<Data> list() throws SQLException {
    List<Data> list = new ArrayList<Data>();

    try (
        Connection connection = database.getConnection();
        Statement statement = connection.createStatement("SELECT id, name, value FROM data");
        ResultSet resultSet = statement.executeQuery();
    ) {
        while (resultSet.next()) {
            list.add(map(resultSet));
        }
    }

    return list;
}

private Data map(ResultSet resultSet) throws SQLException {
    Data data = new Data(); 
    data.setId(resultSet.getLong("id"));
    data.setName(resultSet.getString("name"));
    data.setValue(resultSet.getInteger("value"));
    return data;
}

And use it as below:

List<Data> list = dataDAO.list(); 
int count = list.size(); // Easy as that.
Iterator<Data> iterator = list.iterator(); // There is your Iterator.

Do not pass expensive DB resources outside the DAO layer like you initially wanted to do. For more basic examples of normal JDBC practices and the DAO pattern you may find this article useful.

回答2:

You can get out of this pickle by performing a look-ahead in the hasNext() and remembering that you did a lookup to prevent consuming too many records, something like:

public class A implements Iterator{
    private ResultSet entities;
    private boolean didNext = false;
    private boolean hasNext = false;
    ...
    public Object next(){
        if (!didNext) {
            entities.next();
        }
        didNext = false;
        return new Entity(entities.getString...etc....)
    }

    public boolean hasNext(){
        if (!didNext) {
            hasNext = entities.next();
            didNext = true;
        }
        return hasNext;
    }
    ...
}

回答3:

ResultSet has an 'isLast()' method that might suit your needs. The JavaDoc says it is quite expensive though since it has to read ahead. There is a good chance it is caching the look-ahead value like the others suggest trying.

回答4:

Its not a really bad idea in the cases where you need it, it's just that you often do not need it.

If you do need to do something like, say, stream your entire database.... you could pre-fetch the next row - if the fetch fails your hasNext is false.

Here is what I used:

/**
 * @author Ian Pojman <pojman@gmail.com>
 */
public abstract class LookaheadIterator<T> implements Iterator<T> {
    /** The predetermined "next" object retrieved from the wrapped iterator, can be null. */
    protected T next;

    /**
     * Implement the hasNext policy of this iterator.
     * Returns true of the getNext() policy returns a new item.
     */
    public boolean hasNext()
    {
        if (next != null)
        {
            return true;
        }

        // we havent done it already, so go find the next thing...
        if (!doesHaveNext())
        {
            return false;
        }

        return getNext();
    }

    /** by default we can return true, since our logic does not rely on hasNext() - it prefetches the next */
    protected boolean doesHaveNext() {
        return true;
    }

    /**
     * Fetch the next item
     * @return false if the next item is null. 
     */
    protected boolean getNext()
    {
        next = loadNext();

        return next!=null;
    }

    /**
     * Subclasses implement the 'get next item' functionality by implementing this method. Implementations return null when they have no more.
     * @return Null if there is no next.
     */
    protected abstract T loadNext();

    /**
     * Return the next item from the wrapped iterator.
     */
    public T next()
    {
        if (!hasNext())
        {
            throw new NoSuchElementException();
        }

        T result = next;

        next = null;

        return result;
    }

    /**
     * Not implemented.
     * @throws UnsupportedOperationException
     */
    public void remove()
    {
        throw new UnsupportedOperationException();
    }
}

then:

    this.lookaheadIterator = new LookaheadIterator<T>() {
        @Override
        protected T loadNext() {
            try {
                if (!resultSet.next()) {
                    return null;
                }

                // process your result set - I use a Spring JDBC RowMapper
                return rowMapper.mapRow(resultSet, resultSet.getRow());
            } catch (SQLException e) {
                throw new IllegalStateException("Error reading from database", e);
            }
        }
    };
}

回答5:

One option is the ResultSetIterator from the Apache DBUtils project.

BalusC rightly points out the the various concerns in doing this. You need to be very careful to properly handle the connection/resultset lifecycle. Fortunately, the DBUtils project also has solutions for safely working with resultsets.

If BalusC's solution is impractical for you (e.g. you are processing large datasets that can't all fit in memory) you might want to give it a shot.

回答6:

public class A implements Iterator<Entity>
{
    private final ResultSet entities;

    // Not required if ResultSet.isLast() is supported
    private boolean hasNextChecked, hasNext;

    . . .

    public boolean hasNext()
    {
        if (hasNextChecked)
           return hasNext;
        hasNext = entities.next();
        hasNextChecked = true;
        return hasNext;

        // You may also use !ResultSet.isLast()
        // but support for this method is optional 
    }

    public Entity next()
    {
        if (!hasNext())
           throw new NoSuchElementException();

        Entity entity = new Entity(entities.getString...etc....)

        // Not required if ResultSet.isLast() is supported
        hasNextChecked = false;

        return entity;
    }
}

回答7:

You could try the following:

public class A implements Iterator {
    private ResultSet entities;
    private Entity nextEntity;
    ...
    public Object next() {
        Entity tempEntity;
        if ( !nextEntity ) {
            entities.next();
            tempEntity = new Entity( entities.getString...etc....)
        } else {
            tempEntity = nextEntity;
        }

        entities.next();
        nextEntity = new Entity( entities.getString...ext....)

        return tempEntity;
    }

    public boolean hasNext() {
        return nextEntity ? true : false;
    }
}

This code caches the next entity, and hasNext() returns true, if the cached entity is valid, otherwise it returns false.

回答8:

There are a couple of things you could do depending on what you want your class A. If the major use case is to go through every single result then perhaps its best to preload all the Entity objects and throw away the ResultSet.

If however you don't want to do that you could use the next() and previous() method of ResultSet

public boolean hasNext(){
       boolean next = entities.next();

       if(next) {

           //reset the cursor back to its previous position
           entities.previous();
       }
}

You do have to be careful to make sure that you arent currently reading from the ResultSet, but, if your Entity class is a proper POJO (or at least properly disconnected from ResultSet then this should be a fine approach.

回答9:

entities.next returns false if there are no more rows, so you could just get that return value and set a member variable to keep track of the status for hasNext().

But to make that work you would also have to have some sort of init method that reads the first entity and caches it in the class. Then when calling next you would need to return the previously cached value and cache the next value, etc...

回答10:

You can use ResultSetIterator, just put your ResultSet in the constructor.

ResultSet rs = ...    
ResultSetIterator = new ResultSetIterator(rs);

回答11:

I agree with BalusC. Allowing an Iterator to escape from your DAO method is going to make it difficult to close any Connection resources. You will be forced to know about the connection lifecycle outside of your DAO, which leads to cumbersome code and potential connection leaks.

However, one choice that I've used is to pass a Function or Procedure type into the DAO method. Basically, pass in some sort of callback interface that will take each row in your result set.

For example, maybe something like this:

public class MyDao {

    public void iterateResults(Procedure<ResultSet> proc, Object... params)
           throws Exception {

        Connection c = getConnection();
        try {
            Statement s = c.createStatement(query);
            ResultSet rs = s.executeQuery();
            while (rs.next()) {
                proc.execute(rs);
            }

        } finally {
            // close other resources too
            c.close();
        }
    }

}


public interface Procedure<T> {
   void execute(T t) throws Exception;
}


public class ResultSetOutputStreamProcedure implements Procedure<ResultSet> {
    private final OutputStream outputStream;
    public ResultSetOutputStreamProcedure(OutputStream outputStream) {
        this.outputStream = outputStream;
    }

    @Override
    public void execute(ResultSet rs) throws SQLException {
        MyBean bean = getMyBeanFromResultSet(rs);
        writeMyBeanToOutputStream(bean);
    }    
}

In this way, you keep your database connection resources inside your DAO, which is proper. But, you are not necessarily required to fill a Collection if memory is a concern.

Hope this helps.

回答12:

Iterators are problematic for traversing ResultSets for reasons mentioned above but Iterator like behaviour with all the required semantics for handling errors and closing resources is available with reactive sequences (Observables) in RxJava. Observables are like iterators but include the notions of subscriptions and their cancellations and error handling.

The project rxjava-jdbc has implementations of Observables for jdbc operations including traversals of ResultSets with proper closure of resources, error handling and the ability to cancel the traversal as required (unsubscribe).

回答13:

Here's my iterator that wraps a ResultSet. The rows are returned in the form a Map. I hope you'll find it helpful. The strategy is that I always bring one element in advance.

public class ResultSetIterator implements Iterator<Map<String,Object>> {

    private ResultSet result;
    private ResultSetMetaData meta;
    private boolean hasNext;

    public ResultSetIterator( ResultSet result ) throws SQLException {
        this.result = result;
        meta = result.getMetaData();
        hasNext = result.next();
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public Map<String, Object> next() {
        if (! hasNext) {
            throw new NoSuchElementException();
        }
        try {
            Map<String,Object> next = new LinkedHashMap<>();
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                String column = meta.getColumnName(i);
                Object value = result.getObject(i);
                next.put(column,value);
            }
            hasNext = result.next();
            return next;
        }
        catch (SQLException ex) {
            throw new RuntimeException(ex);
        }
    }
}

回答14:

Do you expect most of the data in your result set to actually be used? If so, pre-cache it. It's quite trivial using eg Spring

  List<Map<String,Object>> rows = jdbcTemplate.queryForList(sql);
  return rows.iterator();

Adjust to suit your taste.

回答15:

I think there's enough decry over why it's a really bad idea to use ResultSet in an Iterator (in short, ResultSet maintains an active connection to DB and not closing it ASAP can lead to problems).

But in a different situation, if you're getting ResultSet (rs) and are going to iterate over the elements, but you also wanted to do something before the iteration like this:

if (rs.hasNext()) { //This method doesn't exist
    //do something ONCE, *IF* there are elements in the RS
}
while (rs.next()) {
    //do something repeatedly for each element
}

You can achieve the same effect by writing it like this instead:

if (rs.next()) {
    //do something ONCE, *IF* there are elements in the RS
    do {
        //do something repeatedly for each element
    } while (rs.next());
}

回答16:

It can be done like this:

public boolean hasNext() {
    ...
    return !entities.isLast();
    ...
}

回答17:

It sounds like you are stuck between either providing an inefficient implementation of hasNext or throwing an exception stating that you do not support the operation.

Unfortunately there are times when you implement an interface and you don't need all of the members. In that case I would suggest that you throw an exception in that member that you will not or cannot support and document that member on your type as an unsupported operation.