可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
In terms of SQL injection, I completely understand the necessity to parameterize a string
parameter; that's one of the oldest tricks in the book. But when can it be justified to not parameterize an SqlCommand
? Are any data types considered "safe" to not parameterize?
For example: I don't consider myself anywhere near an expert in SQL, but I can't think of any cases where it would be potentially vulnerable to SQL injection to accept a bool
or an int
and just concatenate it right into the query.
Is my assumption correct, or could that potentially leave a huge security vulnerability in my program?
For clarification, this question is tagged c# which is a strongly-typed language; when I say "parameter," think something like public int Query(int id)
.
回答1:
I think it's safe... technically, but it's a terrible habit to get into. Do you really want to be writing queries like this?
var sqlCommand = new SqlCommand("SELECT * FROM People WHERE IsAlive = " + isAlive +
" AND FirstName = @firstName");
sqlCommand.Parameters.AddWithValue("firstName", "Rob");
It also leaves you vulnerable in the situation where a type changes from an integer to a string (Think employee number which, despite its name - may contain letters).
So, we've changed the type of EmployeeNumber from int
to string
, but forgot to update our sql queries. Oops.
回答2:
When using a strongly-typed platform on a computer you control (like a web server), you can prevent code injection for queries with only bool
, DateTime
, or int
(and other numeric) values. What is a concern are performance issues caused by forcing sql server to re-compile every query, and by preventing it from getting good statistics on what queries are run with what frequency (which hurts cache management).
But that "on a computer you control" part is important, because otherwise a user can change the behavior used by the system for generating strings from those values to include arbitrary text.
I also like to think long-term. What happens when today's old-and-busted strongly-typed code base gets ported via automatic translation to the new-hotness dynamic language, and you suddenly lose the type checking, but don't have all the right unit tests yet for the dynamic code?
Really, there's no good reason not to use query parameters for these values. It's the right way to go about this. Go ahead and hard-code values into the sql string when they really are constants, but otherwise, why not just use a parameter? It's not like it's hard.
Ultimately, I wouldn't call this a bug, per se, but I would call it a smell: something that falls just short of a bug by itself, but is a strong indication that bugs are nearby, or will be eventually. Good code avoids leaving smells, and any good static analysis tool will flag this.
I'll add that this is not, unfortunately, the kind of argument you can win straight up. It sounds like a situation where being "right" is no longer enough, and stepping on your co-workers toes to fix this issue on your own isn't likely to promote good team dynamics; it could ultimately hurt more than it helps. A better approach in this case may be to promote the use of a static analysis tool. That would give legitimacy and credibility to efforts aimed and going back and fixing existing code.
回答3:
In some cases, it IS possible to perform SQL injection attack with non-parametrized (concatenated) variables other than string values - see this article by Jon: http://codeblog.jonskeet.uk/2014/08/08/the-bobbytables-culture/ .
Thing is that when ToString
is called, some custom culture provider can transform a non-string parameter into its string representation which injects some SQL into the query.
回答4:
This is not safe even for non-string types. Always use parameters. Period.
Consider following code example:
var utcNow = DateTime.UtcNow;
var sqlCommand = new SqlCommand("SELECT * FROM People WHERE created_on <= '" + utcNow + "'");
At the first glance code looks safe, but everything changes if you make some changes in Windows Regional Settings and add injection in short date format:
Now resulting command text looks like this:
SELECT * FROM People WHERE created_on <= '26.09.2015' OR '1'<>' 21:21:43'
The same can be done for int
type as user can define custom negative sign which can be easily changed into SQL injection.
One could argue that invariant culture should be used instead of current culture, but I have seen string concatenations like this so many times and it is quite easy to miss when concatenating strings with objects using +
.
回答5:
"SELECT * FROM Table1 WHERE Id=" + intVariable.ToString()
Security
It is OK.
Attackers can not inject anything in your typed int variable.
Performance
Not OK.
It's better to use parameters, so the query will be compiled once and cached for next usage. Next time even with different parameter values, query is cached and doesn't need to compile in database server.
Coding Style
Bad practice.
- Parameters are more readable
- Maybe it makes you get used to queries without parameters, then maybe you made a mistake once and use a string value this way and then you probably should say goodbye to your data. Bad habit!
"SELECT * FROM Product WHERE Id=" + TextBox1.Text
Although it is not your question, but maybe useful for future readers:
Security
Disaster!
Even when the Id
field is integer, your query may be subject to SQL Injection.
Suppose you have a query in your application "SELECT * FROM Table1 WHERE Id=" + TextBox1.Text
, An attacker can insert into text box 1; DELETE Table1
and the query will be:
"SELECT * FROM Table1 WHERE Id=1; DELETE Table1"
If you don't want to use parametrized query here, you should use typed values:
string.Format("SELECT * FROM Table1 WHERE Id={0}", int.Parse(TextBox1.Text))
Your Question
My question arose because a coworker wrote a bunch of queries
concatenating integer values, and I was wondering whether it was a
waste of my time to go through and fix all of them.
I think changing those codes is not waste of time. Indeed change is Recommended!
if your coworker uses int variables, it has no security risk , But I think changing those codes is not waste of time and indeed changing those codes is recommended. It makes code more readable, more maintainable and makes execution faster.
回答6:
There are actually two questions in one. And question from the title has very little to do with concerns expressed by the OP in the comments afterwards.
Although I realize that for the OP it is their particular case that matters, for the readers coming from Google, it is important to answer to the more general question, that can be phrased as "is concatenation as safe as prepared statements if I made sure that every literal I am concatenating is safe?". So, I would like to concentrate on this latter one. And the answer is
Definitely NO.
The explanation is not that direct as most readers would like, but I'll try my best.
I have been pondering on the matter for a while, resulting in the article (though based on the PHP environment) where I tried to sum everything up. It occurred to me that the question of protection from SQL injection is often eludes toward some related but narrower topics, like string escaping, type casting and such. Although some of the measures can be considered safe when taken by themselves, there is no system, nor a simple rule to follow. Which makes it very slippery ground, putting too much on the developer's attention and experience.
The question of SQL injection cannot be simplified to a matter of some particular syntax issue. It is wider than average developer used to think. It's a methodological question as well. It is not only "Which particular formatting we have to apply", but "How it have to be done" as well.
(From this point of view, an article from Jon Skeet cited in the other answer is doing rather bad than good, as it is again nitpicking on some edge case, concentrating on a particular syntax issue and failing to address the problem at whole.)
When you're trying to address the question of protection not as whole but as a set of different syntax issues, you're facing multitude of problems.
- the list of possible formatting choices is really huge. Means one can easily overlook some. Or confuse them (by using string escaping for identifier for example).
- Concatenation means that all protection measures have to be done by the programmer, not program. This issue alone leads to several consequences:
- such a formatting is manual. Manual means extremely error prone. One could simply forget to apply.
- moreover, there is a temptation to move formatting procedures into some centralized function, messing things even more, and spoiling data that is not going to database.
- when more than one developers involved, problems multiply by a factor of ten.
- when concatenation is used, one cannot tell a potentially dangerous query at glance: they all potentially dangerous!
Unlike that mess, prepared statements are indeed The Holy Grail:
- it can be expressed in the form of one simple rule that is easy to follow.
- it is essentially undetacheable measure, means the developer cannot interfere, and, willingly or unwillingly, spoil the process.
- protection from injection is really only a side effect of the prepared statements, which real purpose is to produce syntactically correct statement. And a syntactically correct statement is 100% injection proof. Yet we need our syntax to be correct despite of any injection possibility.
- if used all the way around, it protects the application regardless of the developer's experience. Say, there is a thing called second order injection. And a very strong delusion that reads "in order to protect, Escape All User Supplied Input". Combined together, they lead to injection, if a developer takes the liberty to decide, what needs to be protected and what not.
(Thinking further, I discovered that current set of placeholders is not enough for the real life needs and have to be extended, both for the complex data structures, like arrays, and even SQL keywords or identifiers, which have to be sometimes added to the query dynamically too, but a developer is left unarmed for such a case, and forced to fall back to string concatenation but that's a matter of another question).
Interestingly, this question's controversy is provoked by the very controversial nature of Stack Overflow. The site's idea is to make use of particular questions from users who ask directly to achieve the goal of having a database of general purpose answers suitable for users who come from search. The idea is not bad per se, but it fails in a situation like this: when a user asks a very narrow question, particularly to get an argument in a dispute with a colleague (or to decide if it worth to refactor the code). While most of experienced participants are trying to write an answer, keeping in mind the mission of Stack Overflow at whole, making their answer good for as many readers as possible, not the OP only.
回答7:
Let's not just think about security or type-safe considerations.
The reason you use parametrized queries is to improve performance at the database level. From a database perspective, a parametrized query is one query in the SQL buffer (to use Oracle's terminology although I imagine all databases have a similar concept internally). So, the database can hold a certain amount of queries in memory, prepared and ready to execute. These queries do not need to be parsed and will be quicker. Frequently run queries will usually be in the buffer and will not need parsing every time they are used.
UNLESS
Somebody doesn't use parametrized queries. In this case, the buffer gets continually flushed through by a stream of nearly identical queries each of which needs to be parsed and run by the database engine and performance suffers all-round as even frequently run queries end up being re-parsed many times a day. I have tuned databases for a living and this has been one of the biggest sources of low-hanging fruit.
NOW
To answer your question, IF your query has a small number of distinct numeric values, you will probably not be causing issues and may in fact improve performance infinitesimally. IF however there are potentially hundreds of values and the query gets called a lot, you are going to affect the performance of your system so don't do it.
Yes you can increase the SQL buffer but it's always ultimately at the expense of other more critical uses for memory like caching Indexes or Data. Moral, use parametrized queries pretty religiously so you can optimize your database and use more server memory for the stuff that matters...
回答8:
To add some info to Maciek answer:
It is easy to alter the culture info of a .NET third party app by calling the main-function of the assembly by reflection:
using System;
using System.Globalization;
using System.Reflection;
using System.Threading;
namespace ConsoleApplication2
{
class Program
{
static void Main(string[] args)
{
Assembly asm = Assembly.LoadFile(@"C:\BobbysApp.exe");
MethodInfo mi = asm.GetType("Test").GetMethod("Main");
mi.Invoke(null, null);
Console.ReadLine();
}
static Program()
{
InstallBobbyTablesCulture();
}
static void InstallBobbyTablesCulture()
{
CultureInfo bobby = (CultureInfo)CultureInfo.InvariantCulture.Clone();
bobby.DateTimeFormat.ShortDatePattern = @"yyyy-MM-dd'' OR ' '=''";
bobby.DateTimeFormat.LongTimePattern = "";
bobby.NumberFormat.NegativeSign = "1 OR 1=1 OR 1=";
Thread.CurrentThread.CurrentCulture = bobby;
}
}
}
This only works if the Main function of BobbysApp is public. If Main is not public, there might be other public functions you might call.
回答9:
In my opinion if you can guarantee that the parameter you working with will never contain a string it is safe but I would not do it in any case. Also, you will see a slight performance drop due to the fact that you are performing concatenation. The question I would ask you is why don't you want to use parameters?
回答10:
It is ok but never safe.. and the security always depend on the inputs, for example if the input object is TextBox, the attackers can do something tricky since the textbox can accept string, so you have to put some kind of validation/conversion to be able prevent users the wrong input. But the thing is, it is not safe. As simply as that.
回答11:
No you can get an SQL injection attack that way. I have written an old article in Turkish which shows how here. Article example in PHP and MySQL but concept works same in C# and SQL Server.
Basically you attack following way. Lets consider you have a page which shows information according to integer id value. You do not parametrized this in value, like below.
http://localhost/sqlEnjeksiyon//instructors.aspx?id=24
Okay, I assume you are using MySQL and I attack following way.
http://localhost/sqlEnjeksiyon//instructors.aspx?id=ASCII((SELECT%20DATABASE()))
Note that here injected value is not string. We are changing char value to int using ASCII function. You can accomplish same thing in SQL Server using "CAST(YourVarcharCol AS INT)".
After that I use length and substring functions to find about your database name.
http://localhost/sqlEnjeksiyon//instructors.aspx?id=LEN((SELECT%20DATABASE()))
http://localhost/sqlEnjeksiyon//instructors.aspx?id=ASCII(SUBSTR(SELECT%20DATABASE(),1,1))
Then using database name, you start to get table names in database.
http://localhost/sqlEnjeksiyon//instructors.aspx?id=ASCII(SUBSTR((SELECT table_name FROM INFORMATION_SCHEMA.TABLES LIMIT 1),1,1))
Of course you have to automate this process, since you only get ONE character per query. But you can easily automate it. My article shows one example in watir. Using only one page and not parameterized ID value. I can learn every table name in your database. After that I can look for important tables. It will take time but it is doable.
回答12:
Well... one thing is sure:
Security it is NOT ok, when you concatenate a string (taken by the user) with your SQL command string.
It is not matter whenever the where clause refers to an Integer or to any type; injections could occur.
What matters in SQL Injection is the data type of the variable that used to get the value from the user.
Supposing we have an integer in the where clause and:
the user-variable is a string. Then ok, it is not very easy to
inject (using UNION) but it is very easy to bypass using 'OR 1=1' -
like attacks...
If the user-variable is a integer. Then again we can 'test' the
strength of the system by passing unusual big numbers testing for
system crashes or even for a hidden buffer overflow (on the final
string)... ;)
Maybe the parameters to queries or (even better - imo) to Stored Procedures are not a 100% Threats safe, but they are the least required measure (or the elementary one if you prefer) to minimize them.