I know that the only really correct way to protect SQL queries against SQL injection in Java is using PreparedStatements.
However, such a statement requires that the basic structure (selected attributes, joined tables, the structure of the WHERE condition) will not vary.
I have a JSP application that contains a search form with about a dozen fields. But the user does not have to fill in all of them, just the ones they need. Thus my WHERE condition is different every time.
What should I do to still prevent SQL injection?
Escape the user-supplied values? Write a wrapper class that builds a PreparedStatement each time? Or something else?
The database is PostgreSQL 8.4, but I would prefer a general solution.
Thanks a lot in advance.
Have you seen Spring's NamedParameterJdbcTemplate?
The NamedParameterJdbcTemplate class adds support for programming JDBC statements using named parameters (as opposed to programming JDBC statements using only classic placeholder ('?') arguments).
You can do stuff like:
String sql = "select count(0) from T_ACTOR where first_name = :first_name";
SqlParameterSource namedParameters = new MapSqlParameterSource("first_name", firstName);
return namedParameterJdbcTemplate.queryForObject(sql, namedParameters, Integer.class);
and build your query string dynamically, and then build your SqlParameterSource similarly.
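A minimal sketch of that idea, kept free of Spring so it runs on its own. The class name NamedQueryBuilder and the field-to-column whitelist are made up for illustration; only the whitelisted column names ever reach the SQL text, while the user's values stay in the parameter map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Spring): builds a WHERE clause with named
// placeholders from whichever form fields the user actually filled in.
final class NamedQueryBuilder {
    // Form field name -> column name. Assumed mapping, for illustration only.
    private static final Map<String, String> COLUMNS = new LinkedHashMap<>();
    static {
        COLUMNS.put("firstName", "first_name");
        COLUMNS.put("lastName", "last_name");
    }

    private final StringBuilder sql = new StringBuilder("select count(0) from T_ACTOR");
    private final Map<String, Object> params = new LinkedHashMap<>();

    NamedQueryBuilder(Map<String, String> form) {
        // Iterate over the whitelist, not over the form, so unknown fields are ignored.
        for (Map.Entry<String, String> column : COLUMNS.entrySet()) {
            String value = form.get(column.getKey());
            if (value == null || value.isEmpty()) {
                continue; // field left blank: no condition for it
            }
            sql.append(params.isEmpty() ? " where " : " and ");
            sql.append(column.getValue()).append(" = :").append(column.getKey());
            params.put(column.getKey(), value);
        }
    }

    String sql() { return sql.toString(); }

    Map<String, Object> params() { return params; }
}
```

With Spring on the classpath, the result could then be handed to the template, for example as queryForObject(q.sql(), new MapSqlParameterSource(q.params()), Integer.class).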
I think that fundamentally, this question is the same as the other questions that I referred to in my comment above, but I do see why you disagree: you're changing what's in your where clause based on what the user supplied.
That still isn't the same as using user-supplied data in the SQL query, though, which is what you definitely want to use PreparedStatement for. It's actually very similar to the standard problem of needing to use an in statement with PreparedStatement (e.g., where fieldName in (?, ?, ?) but you don't know in advance how many ? you'll need). You just need to build the query dynamically, and add the parameters dynamically, based on information the user supplied (but not directly including that information in the query).
Here's an example of what I mean:
// You'd have just the one instance of this map somewhere:
Map<String,String> fieldNameToColumnName = new HashMap<String,String>();
// You'd actually load these from configuration somewhere rather than hard-coding them
fieldNameToColumnName.put("title", "TITLE");
fieldNameToColumnName.put("firstname", "FNAME");
fieldNameToColumnName.put("lastname", "LNAME");
// ...etc.

// Then in a class somewhere that's used by the JSP, have the code that
// processes requests from users:
public AppropriateResultBean[] doSearch(Map<String,String> parameters)
throws SQLException, IllegalArgumentException
{
    StringBuilder sql;
    String columnName;
    List<String> paramValues;
    AppropriateResultBean[] rv;

    // Start the SQL statement; again you'd probably load the prefix SQL
    // from configuration somewhere rather than hard-coding it here.
    sql = new StringBuilder(2000);
    sql.append("select appropriate,fields from mytable where ");

    // Loop through the given parameters.
    // This loop assumes you don't need to preserve some sort of order
    // in the params, but is easily adjusted if you do.
    paramValues = new ArrayList<String>(parameters.size());
    for (Map.Entry<String,String> entry : parameters.entrySet())
    {
        // Only process fields that aren't blank.
        if (entry.getValue().length() > 0)
        {
            // Get the DB column name that corresponds to this form
            // field name.
            columnName = fieldNameToColumnName.get(entry.getKey());
            // ^-- You'll probably need to prefix this with something, it's not likely to be part of this instance
            if (columnName == null)
            {
                // Somehow, the user got an unknown field into the request
                // and that got past the code calling us (perhaps the code
                // calling us just used `request.getParameterMap` directly).
                // We don't allow unknown fields.
                throw new IllegalArgumentException(/* ... */);
            }
            if (paramValues.size() > 0)
            {
                sql.append("and ");
            }
            sql.append(columnName);
            sql.append(" = ? ");
            paramValues.add(entry.getValue());
        }
    }

    // I'll assume no parameters is an invalid case, but you can adjust the
    // below if that's not correct.
    if (paramValues.size() == 0)
    {
        // My read of the problem being solved suggests this is not an
        // exceptional condition (users frequently forget to fill things
        // in), and so I'd use a flag value (null) for this case. But you
        // might go with an exception (you'd know best), either way.
        rv = null;
    }
    else
    {
        // Do the DB work (below)
        rv = this.buildBeansFor(sql.toString(), paramValues);
    }

    // Done
    return rv;
}

private AppropriateResultBean[] buildBeansFor(
    String sql,
    List<String> paramValues
)
throws SQLException
{
    PreparedStatement ps = null;
    Connection con = null;
    ResultSet rs = null;
    int index;
    AppropriateResultBean[] rv;

    assert sql != null && sql.length() > 0;
    assert paramValues != null && paramValues.size() > 0;

    try
    {
        // Get a connection
        con = /* ...however you get connections, whether it's JNDI or some conn pool or ... */;

        // Prepare the statement
        ps = con.prepareStatement(sql);

        // Fill in the values (JDBC parameter indexes start at 1)
        index = 0;
        for (String value : paramValues)
        {
            ps.setString(++index, value);
        }

        // Execute the query
        rs = ps.executeQuery();

        /* ...loop through results, creating AppropriateResultBean instances
         * and filling in your array/list/whatever...
         */
        rv = /* ...convert the result to what we'll return */;

        // Close the DB resources (you probably have utility code for this)
        rs.close();
        rs = null;
        ps.close();
        ps = null;
        con.close(); // ...assuming pool overrides `close` and expects it to mean "release back to pool", most good pools do
        con = null;

        // Done
        return rv;
    }
    finally
    {
        /* If `rs`, `ps`, or `con` is !null, we're processing an exception.
         * Clean up the DB resources *without* allowing any exception to be
         * thrown, as we don't want to hide the original exception.
         */
    }
}
Note how we use information the user supplied (the fields they filled in), but we never put anything they actually supplied directly into the SQL we executed; their values always go through PreparedStatement parameters.
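For the related in (?, ?, ?) case mentioned earlier, the placeholder list can be generated from the number of values. A small sketch, assuming a hypothetical helper class InClause; the column name is expected to come from your own code or configuration, never from the request.

```java
import java.util.Collections;

// Hypothetical helper: produces an "in" condition with one "?" per value.
final class InClause {
    // Returns e.g. "fieldName in (?, ?, ?)" for count == 3.
    // columnName must be a trusted, application-supplied identifier.
    static String in(String columnName, int count) {
        if (count < 1) {
            throw new IllegalArgumentException("need at least one value");
        }
        return columnName + " in ("
                + String.join(", ", Collections.nCopies(count, "?"))
                + ")";
    }
}
```

The values themselves are then bound one by one with setString (or a similar setter), exactly as in buildBeansFor above.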
The best solution is to use a middle tier that does data validation and binding and acts as an intermediary between the JSP and the database.
There might be a list of column names, but it's finite and countable. Let the JSP worry about making the user's selection known to the middle tier; let the middle tier bind and validate before sending it on to the database.
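As a rough sketch of what such a validation step might look like: the class name SearchRequestValidator and the allowed field names are assumptions for illustration, and the input has the same shape as the map returned by ServletRequest.getParameterMap().

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical middle-tier check: accepts only known search fields and
// drops blank ones before anything gets near the database layer.
final class SearchRequestValidator {
    // Assumed list of searchable form fields.
    private static final Set<String> ALLOWED =
            new HashSet<>(Arrays.asList("title", "firstname", "lastname"));

    static Map<String, String> validate(Map<String, String[]> rawParams) {
        Map<String, String> clean = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> entry : rawParams.entrySet()) {
            if (!ALLOWED.contains(entry.getKey())) {
                throw new IllegalArgumentException("Unknown search field: " + entry.getKey());
            }
            String[] values = entry.getValue();
            // Use the first value only; treat missing values as blank.
            String value = (values == null || values.length == 0 || values[0] == null)
                    ? ""
                    : values[0].trim();
            if (!value.isEmpty()) {
                clean.put(entry.getKey(), value);
            }
        }
        return clean;
    }
}
```

The cleaned map contains only known fields with non-blank values, which the data access code can then bind as statement parameters.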
I'm not sure whether JDBC offers a quote() method like the one widely used with PHP's PDO. That would allow a more flexible query-building approach.
Another possible idea is to create a special class that processes the filter criteria and saves all the placeholders and their values on a stack.
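One way such a class might look, as a sketch only: FilterCriteria and its methods are invented names, and the column names and operators are assumed to come from application code rather than from the request.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical criteria collector: each condition appends a "?" placeholder
// to the WHERE clause and remembers the matching value in the same order.
final class FilterCriteria {
    private final StringBuilder where = new StringBuilder();
    private final List<Object> values = new ArrayList<>();

    // column and operator must be trusted, application-supplied strings.
    private FilterCriteria add(String column, String operator, Object value) {
        where.append(values.isEmpty() ? " where " : " and ")
             .append(column).append(' ').append(operator).append(" ?");
        values.add(value);
        return this;
    }

    FilterCriteria eq(String column, Object value) {
        return add(column, "=", value);
    }

    FilterCriteria like(String column, String pattern) {
        return add(column, "like", pattern);
    }

    // Appends the collected conditions to a fixed SELECT prefix.
    String toSql(String selectPrefix) {
        return selectPrefix + where;
    }

    List<Object> values() {
        return values;
    }

    // Binds the remembered values in order; JDBC indexes start at 1.
    void bind(PreparedStatement ps) throws SQLException {
        for (int i = 0; i < values.size(); i++) {
            ps.setObject(i + 1, values.get(i));
        }
    }
}
```

A caller would chain only the conditions the user filled in, prepare the statement from toSql(...), and then call bind(ps) before executing it.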
Here is a useful technique for this particular case, where you have a number of clauses in your WHERE but you don't know in advance which ones you need to apply.
Will your user search by title?
select id, title, author from book where title = :title
Or by author?
select id, title, author from book where author = :author
Or both?
select id, title, author from book where title = :title and author = :author
Bad enough with only 2 fields. The number of combinations (and therefore of distinct PreparedStatements) goes up exponentially with the number of conditions. True, chances are you have enough room in your PreparedStatement pool for all those combinations, and to build the clauses programmatically in Java, you just need one if branch per condition. Still, it's not that pretty.
You can fix this in a neat way by simply composing a SELECT that looks the same regardless of whether each individual condition is needed.
I hardly need mention that you use a PreparedStatement as suggested by the other answers, and a NamedParameterJdbcTemplate is nice if you're using Spring.
Here it is:
select id, title, author
from book
where coalesce(:title, title) = title
and coalesce(:author, author) = author
Then you supply NULL for each unused condition. coalesce() is a function that returns its first non-null argument. Thus if you pass NULL for :title, the first clause is where coalesce(NULL, title) = title, which evaluates to where title = title. That is true for every row whose title is not NULL, so the clause has no effect on the results as long as the column contains no NULLs.
Depending on how the optimiser handles such queries, you may take a performance hit. But probably not in a modern database.
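A sketch of how the binding side might look with plain JDBC, which uses positional ? placeholders rather than :title and :author. The class BookSearch and its helper methods are hypothetical; the point is that the SQL text is a constant and blank form fields are bound as SQL NULL.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical example: one fixed statement, with NULL bound for unused conditions.
final class BookSearch {
    static final String SQL =
            "select id, title, author from book "
          + "where coalesce(?, title) = title and coalesce(?, author) = author";

    // Form fields in the same order as the placeholders in SQL.
    private static final String[] FIELDS = {"title", "author"};

    // Turns the form into an ordered list of bind values; blank or missing -> null.
    static List<String> bindValues(Map<String, String> form) {
        List<String> values = new ArrayList<>();
        for (String field : FIELDS) {
            String value = form.get(field);
            values.add(value == null || value.isEmpty() ? null : value);
        }
        return values;
    }

    // Binds each value, using setNull for conditions the user did not fill in.
    static void bind(PreparedStatement ps, List<String> values) throws SQLException {
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) == null) {
                ps.setNull(i + 1, Types.VARCHAR);
            } else {
                ps.setString(i + 1, values.get(i));
            }
        }
    }
}
```

Because the statement text never changes, a single prepared statement covers every combination of filled-in fields.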
(Though similar, this problem is not the same as the IN (?, ?, ?) clause problem where you don't know the number of values in the list, since here you do have a fixed number of possible clauses and you just need to activate/deactivate them individually.)