Lexing partial SQL in C#

Posted 2019-03-25 15:19

I need to parse partial SQL queries (it's for a SQL injection auditing tool). For example:

'1' AND 1=1--

Should break down into tokens like

[0] => [SQL_STRING, '1']
[1] => [SQL_AND]
[2] => [SQL_INT, 1]
[3] => [SQL_EQUALS]
[4] => [SQL_INT, 1]
[5] => [SQL_COMMENT]
[6] => [SQL_QUERY_END]

Are there any lexers for SQL that I could base mine on, or any good tools like bison for C#? I'd rather not write my own grammar, as I need to support most, if not all, of the MySQL 5 grammar.

2 answers
疯言疯语
#2 · 2019-03-25 15:33

There may also be a way to use Microsoft's own full T-SQL parser, which ships with the Database Edition of Visual Studio:

The crown jewels of the Database Edition product are the SQL parsers and script generator, these two pieces form the foundation of what the database project system does internally.

http://blogs.msdn.com/b/gertd/archive/2008/08/21/getting-to-the-crown-jewels.aspx

Fickle 薄情
#3 · 2019-03-25 15:35

It seems there are a few good parsers out there.

This SO article has a sample using MS's Entity Framework:
Parsing SQL code in C#

Seems someone else rolled their own and put it up on Code Project:
http://www.codeproject.com/KB/dotnet/SQL_parser.aspx

Personally, I'd go with the Entity Framework solution, since it was created and is maintained by MS, but for that same reason it is probably closely coupled to SQL Server. Since you're looking at MySQL, you may want to go with the custom solution on Code Project, as you can then extend it yourself wherever the MySQL grammar requires.

I'll be using this soon (for Oracle, not MySQL), so please let the community know how the solution works out!
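For what it's worth, here is a rough sketch of what the hand-rolled route could look like for the fragment in the question. It is not taken from the Code Project parser or any other library. The names SqlTokenKind, SqlToken and PartialSqlLexer are made up for illustration, and I'm assuming the `=` gets its own token kind. It only knows quoted strings, integers, AND/OR, single-character symbols and line comments, so it is nowhere near the full MySQL 5 grammar.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token kinds, loosely mirroring the names in the question.
public enum SqlTokenKind { String, Int, And, Or, EqualsSign, Identifier, Comment, Symbol, QueryEnd }

public sealed class SqlToken
{
    public SqlTokenKind Kind { get; }
    public string Text { get; }
    public SqlToken(SqlTokenKind kind, string text) { Kind = kind; Text = text; }
    public override string ToString() => Text.Length == 0 ? $"[{Kind}]" : $"[{Kind}, {Text}]";
}

public static class PartialSqlLexer
{
    public static List<SqlToken> Tokenize(string input)
    {
        var tokens = new List<SqlToken>();
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (char.IsWhiteSpace(c)) { i++; continue; }

            int start = i;
            if (c == '\'')
            {
                // Quoted string; '' counts as an escaped quote. An unterminated
                // string simply runs to the end of the input, since partial
                // fragments often leave a quote open.
                i++;
                while (i < input.Length)
                {
                    if (input[i] == '\'')
                    {
                        if (i + 1 < input.Length && input[i + 1] == '\'') { i += 2; continue; }
                        i++;
                        break;
                    }
                    i++;
                }
                tokens.Add(new SqlToken(SqlTokenKind.String, input.Substring(start, i - start)));
            }
            else if (c == '#' || (c == '-' && i + 1 < input.Length && input[i + 1] == '-'))
            {
                // Line comment, consumed up to the end of the line.
                // Simplification: any "--" starts a comment here, whereas MySQL
                // wants whitespace or a control character after the second dash.
                while (i < input.Length && input[i] != '\n') i++;
                tokens.Add(new SqlToken(SqlTokenKind.Comment, input.Substring(start, i - start)));
            }
            else if (char.IsDigit(c))
            {
                while (i < input.Length && char.IsDigit(input[i])) i++;
                tokens.Add(new SqlToken(SqlTokenKind.Int, input.Substring(start, i - start)));
            }
            else if (char.IsLetter(c) || c == '_')
            {
                while (i < input.Length && (char.IsLetterOrDigit(input[i]) || input[i] == '_')) i++;
                string word = input.Substring(start, i - start);
                string upper = word.ToUpperInvariant();
                SqlTokenKind kind = upper == "AND" ? SqlTokenKind.And
                                  : upper == "OR" ? SqlTokenKind.Or
                                  : SqlTokenKind.Identifier;
                tokens.Add(new SqlToken(kind, word));
            }
            else
            {
                // Anything else becomes a one-character token.
                i++;
                tokens.Add(new SqlToken(c == '=' ? SqlTokenKind.EqualsSign : SqlTokenKind.Symbol, c.ToString()));
            }
        }
        // Explicit end marker, like SQL_QUERY_END in the question.
        tokens.Add(new SqlToken(SqlTokenKind.QueryEnd, ""));
        return tokens;
    }
}
```

Calling PartialSqlLexer.Tokenize("'1' AND 1=1--") should give seven tokens: the string, AND, an integer, the equals sign, another integer, the comment and the end marker. A lexer like this never rejects input, which is probably what you want for an auditing tool that sees deliberately broken SQL, but anything beyond tokenizing (nesting, operator precedence) would still need a real grammar.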

UPDATE:
I just came back to this and read the comments... upon further reflection, I'd really recommend ANTLR, since it supports multiple grammars. Once again, I haven't used it, so it'll be good to hear how it worked out, and it's up to you to decide.
https://stackoverflow.com/questions/76083/parsing-sql-in-net/76151
