Why Hibernate inlines Integer parameter list passe

Posted 2020-03-01 12:45

I am building a query using the JPA Criteria API. When I created two restriction predicates using the javax.persistence.criteria.Path#in(Collection<?>) method, the generated SQL query was a little different from what I expected.

The first predicate, which was built over an int attribute, produced SQL with all elements of the parameter collection inlined: in (10, 20, 30).

The second predicate, which was built over a String attribute, produced parameterized SQL: in (?, ?, ?).

Let me show:

Entity:

import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class A {
    @Id
    private Integer id;
    private int intAttr;
    private String stringAttr;
    // getters/setters omitted
}

Query:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<A> q = cb.createQuery(A.class);
Root<A> root = q.from(A.class);
q.where(
    root.get("intAttr").in(Arrays.asList(10, 20, 30)),
    root.get("stringAttr").in(Arrays.asList("a", "b", "c"))
);
entityManager.createQuery(q).getResultList();

Log:

select
    a0_.id as id1_0_,
    a0_.intAttr as intAttr2_0_,
    a0_.stringAttr as stringAt3_0_ 
from
    A a0_ 
where
    (
        a0_.intAttr in (
            10 , 20 , 30
        )
    ) 
    and (
        a0_.stringAttr in (
            ? , ? , ?
        )
    ) 
org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [1] as [VARCHAR] - [a] 
org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [2] as [VARCHAR] - [b] 
org.hibernate.type.descriptor.sql.BasicBinder - binding parameter [3] as [VARCHAR] - [c] 

My questions:

  1. Why are the elements of the Integer list inlined directly into the SQL, while the elements of the String list are handled as prepared statement parameters?
  2. Is this feature Hibernate-specific, or is it guaranteed by JPA?
  3. From the DB perspective, which of the two should be preferred?
  4. Is this int-yes, string-no inlining somehow related to SQL injection?
  5. Is this somehow related to the limit on the number of values in an SQL IN clause that the RDBMS can process?
  6. How do I write a criteria query that handles an Integer parameter list the same way as a String parameter list?

4 answers
我命由我不由天
#2 · 2020-03-01 12:46
  1. Because Strings can contain SQL and Integers cannot, so from a security standpoint (SQL injection) there is no need to bind Integers.
  2. The JPA spec does not specify this as explicitly as you might like. It seems to be an implementation detail.
  3. Prepared statement parameters for String parameters. For int parameters it doesn't matter, since they cannot be misused by attackers.
  4. Yes.
  5. You should look that up in the documentation of the specific database you're using. JPA does not care about such things.
  6. Why? What are the benefits? Don't try to improve things when you don't know what you're improving.
别忘想泡老子
#3 · 2020-03-01 13:06

A new configuration property was added in issue HHH-9576 to address this; it is available since version 5.2.12 (?)

<property name="hibernate.criteria.literal_handling_mode" value="bind"/>

With this property set, you no longer need the verbose solution proposed by Pace.

From the Hibernate documentation of literal_handling_mode:

This enum defines how literals are handled by JPA Criteria. By default (AUTO), Criteria queries uses bind parameters for any literal that is not a numeric value. However, to increase the likelihood of JDBC statement caching, you might want to use bind parameters for numeric values too. The BIND mode will use bind variables for any literal value. The INLINE mode will inline literal values as-is. To prevent SQL injection, never use INLINE with String variables. Always use constants with the INLINE mode.
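
To illustrate the statement-caching point in the quoted documentation, here is a minimal sketch in plain Java. It does not call Hibernate: the class StatementCacheSketch and its methods inlineIn and bindIn are hypothetical names that only mimic how an inlined and a bound IN list would look as SQL text, and the simplified select statement is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of Hibernate.
class StatementCacheSketch {

    // INLINE-style rendering: the values become part of the SQL text.
    static String inlineIn(String column, List<Integer> values) {
        String joined = values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
        return "select * from A where " + column + " in (" + joined + ")";
    }

    // BIND-style rendering: one placeholder per value; the values travel separately.
    static String bindIn(String column, int valueCount) {
        String placeholders = String.join(", ", Collections.nCopies(valueCount, "?"));
        return "select * from A where " + column + " in (" + placeholders + ")";
    }

    public static void main(String[] args) {
        List<List<Integer>> calls = Arrays.asList(
                Arrays.asList(10, 20, 30),
                Arrays.asList(40, 50, 60));

        Set<String> inlined = new HashSet<>();
        Set<String> bound = new HashSet<>();
        for (List<Integer> values : calls) {
            inlined.add(inlineIn("intAttr", values));
            bound.add(bindIn("intAttr", values.size()));
        }
        // Each distinct SQL text would need its own prepared statement.
        System.out.println("distinct inlined statements: " + inlined.size());
        System.out.println("distinct bound statements: " + bound.size());
    }
}
```

With the two value lists in main, the inlined variant yields two distinct SQL strings and the bound variant yields one, so a statement cache keyed on the SQL text would presumably get more reuse out of the bound form.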

迷人小祖宗
#4 · 2020-03-01 13:09

Why are strings bound and numeric literals not bound?

One should always do parameter binding for strings (as opposed to putting the literal in the query) to avoid SQL injection.
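
A minimal sketch of that risk, in plain Java with no JPA provider involved. The class InliningRiskSketch and its methods are hypothetical and only mimic naive literal inlining; this is an illustration under that assumption, not how Hibernate renders literals.

```java
// Hypothetical illustration class; not Hibernate code.
class InliningRiskSketch {

    // Naive inlining of a String: the value is pasted between quotes without escaping.
    static String inlineString(String column, String value) {
        return column + " = '" + value + "'";
    }

    // Inlining of an Integer: the rendered text is only an optional sign and digits.
    static String inlineInteger(String column, Integer value) {
        return column + " = " + value;
    }

    public static void main(String[] args) {
        // A harmless value renders as expected.
        System.out.println(inlineString("stringAttr", "a"));
        // A hostile value closes the quote and appends its own condition.
        System.out.println(inlineString("stringAttr", "x' or '1'='1"));
        // An Integer value cannot carry SQL syntax.
        System.out.println(inlineInteger("intAttr", 10));
    }
}
```

The hostile String value changes the structure of the condition, whereas an Integer can only render as an optional minus sign followed by digits.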

However, the real question is why Hibernate inserts the literal directly into the query instead of binding it. The original reason was:

So iirc the issue that lead me to use literals here had to do with scale and operations. Meaning (again, iirc) some databases needed to know type information to be able to properly handle something like ... ? + ? ..., etc. So the choice was to either wrap all such params in CAST function calls and hope/pray the db implemented a proper CAST function or use literals. In the end I opted for the literal route because, well, thats what the user asked for up front. Wrapping in function calls will limit the databases ability to leverage indexes in quite a few databases.

Which is better for the db?

It depends on the database and the query and likely won't make a huge difference. For example, Oracle can only do certain partitions when the value is a literal, other databases can only do certain optimizations when the value is a bound parameter. If it becomes an issue (e.g. you profile it and you know that is what is slowing you down) then just switch to the other method.

Is this in the JPA spec?

No.

Is this related to the number of values allowed in an IN clause?

No.

Can I have a numeric literal bound instead of inserted directly into the query?

Yes, but it is somewhat verbose.

CriteriaBuilder cb = getEntityManager().getCriteriaBuilder();
CriteriaQuery<Foo> query = cb.createQuery(Foo.class);
Root<Foo> root = query.from(Foo.class);
ParameterExpression<Long> paramOne = cb.parameter(Long.class);
Predicate versionPredicate = root.get("bar").in(paramOne);
query.select(root).where(versionPredicate);
TypedQuery<Foo> typedQuery = getEntityManager().createQuery(query);
typedQuery.setParameter(paramOne, 1L);

That will use parameter binding for the long. It is only one parameter, but one could easily extrapolate from here to multiple parameters, and helper methods could clean things up.

References:

Most of the reasoning is explained and discussed in HHH-6280. The particular method in question that does this rendering is LiteralExpression.render.

地球回转人心会变
#5 · 2020-03-01 13:11

I fully agree with Niels that the string parameters should not be inlined in order to prevent SQL injection.

But I checked it with DataNucleus 4.1.9 and a Derby database, and to my surprise the log shows inlining for Strings too. It also shows that DataNucleus implements the "IN" criteria query as a combination of "OR" conditions. This is probably inferior to Hibernate, and probably a security risk. It is an example of the possible dangers of ever-higher-level abstractions. You cannot be too cautious :-).

The log:

Begin compiling prepared statement: 
SELECT 'pack.entities.I' AS NUCLEUS_TYPE,DN_THIS.ID,DN_THIS.INTATTR,DN_THIS.STRINGATTR FROM I DN_THIS 
WHERE (((DN_THIS.INTATTR = 10) OR (DN_THIS.INTATTR = 20)) OR (DN_THIS.INTATTR = 30)) 
AND (((DN_THIS.STRINGATTR = 'a') OR (DN_THIS.STRINGATTR = 'b')) OR (DN_THIS.STRINGATTR = 'c')) :End prepared statement

Tue Apr 26 15:46:01 CEST 2016 Thread[DRDAConnThread_3,5,derby.daemons] 
End compiling prepared statement: 
SELECT 'pack.entities.I' AS NUCLEUS_TYPE,DN_THIS.ID,DN_THIS.INTATTR,DN_THIS.STRINGATTR FROM I DN_THIS 
WHERE (((DN_THIS.INTATTR = 10) OR (DN_THIS.INTATTR = 20)) OR (DN_THIS.INTATTR = 30)) 
AND (((DN_THIS.STRINGATTR = 'a') OR (DN_THIS.STRINGATTR = 'b')) OR (DN_THIS.STRINGATTR = 'c')) :End prepared statement

Tue Apr 26 15:46:01 CEST 2016 Thread[DRDAConnThread_3,5,derby.daemons] Executing prepared statement: 
SELECT 'pack.entities.I' AS NUCLEUS_TYPE,DN_THIS.ID,DN_THIS.INTATTR,DN_THIS.STRINGATTR FROM I DN_THIS 
WHERE (((DN_THIS.INTATTR = 10) OR (DN_THIS.INTATTR = 20)) OR (DN_THIS.INTATTR = 30)) 
AND (((DN_THIS.STRINGATTR = 'a') OR (DN_THIS.STRINGATTR = 'b')) OR (DN_THIS.STRINGATTR = 'c')) :End prepared statement