Keywords in SQL script data causing problems when

2019-06-03 20:24发布

问题:

I'm fairly new to sql and am having a problem with keywords causing havoc in my sql script. I'm trying to execute a list of premade .sql script files in C#. I'm currently reading the file to a string and executing it with command.ExecuteNonQuery(). This works great for most of the scripts, but I'm running into one that inadvertently contains a keyword:

INSERT INTO [thetable]
SELECT '123123', 'abcabc', 'I WANT TO GO TO BED'
UNION ALL
SELECT '123124', 'abcdef', 'SOOO TIRED'

Essentially, when it hits that GO the command fails.

I am in charge of creating these insert files, so if I need to reformat them in some way then that is possible; however, the data in them is non negotiable. Also, since I am loading them from a file with many rows, parameterization to avoid this stuff doesn't seem feasible either.

Any help would be greatly appreciated at this point. Thanks a lot!

EDIT to add info:

To clarify, the actual string is something more like 'ASVFDS4+23eF3da34sddsdf3d3t4g...100charslater...sd5OAyGOsiISIssdsd/sNUIGsdisd354f'. When I try and execute the command, I catch the exception, which says:

"Unclosed quotation mark after character string 'ASVFDS4+23eF3da34sddsdf3d3t4g...100charslater...sd5OAy'

Note that 5OAy is immediately followed by GOsiIS..., which lead me to believe that the GO is actually being read as a command, causing it to expect an end to the string before that command.

Running .NET 3.5

EDIT 2 I also should clarify, I am currently splitting on the actual GO statements and executing the commands individually.

i.e.

USE MyDatabase
GO
INSERT INTO [thetable]
SELECT '123123', 'abcabc', 'I WANT TO GO TO BED'
UNION ALL
SELECT '123124', 'abcdef', 'SOOO TIRED'
UNION ALL
...
SELECT '123189', 'abcabc', 'HAD SOME SLEEP'
GO

splits, so I execute

USE MyDatabase

and

INSERT INTO [thetable]
SELECT '123123', 'abcabc', 'I WANT TO GO TO BED'
UNION ALL
SELECT '123124', 'abcdef', 'SOOO TIRED'
UNION ALL
...
SELECT '123189', 'abcabc', 'HAD SOME SLEEP'

separately. So my issue isn't with actual GO statements, it's with the characters 'GO' appearing in the data string.

ANSWER: The problem is that I've made a terribly dumb mistake. I split on "GO", which splits the command string right in the middle of that parameter where the letters GO appear.

/facepalm

Thanks for all of the help!

回答1:

You need to recognise the GO yourself, and use it to split the file up into batches, then execute each one individually.

Use a regex something like m/^\s+GO\s+$/i to recognise the GO lines.



回答2:

If you need to parse any Sql scripts with comments and string values with 'go' ("smth go smth") and etc. you can use gplex tool. Gplex rules for sql script parsing:

%namespace LexScanner
%option verbose, summary, noparser, unicode

%x QUOTE
%x COMMENT

%{
    static string line = "";
    static List<string> butch = new List<string>();
    enum TokenType {
        SL_COMMENT,
        ML_COMMENT,
        STRING,
        WORD,
        OTHER,
        ending
    };
%}

dotchr [^\r\n] 
eol (\r\n?|\n)
%%
\-\-[^\n]*$             { add(yytext, TokenType.SL_COMMENT); }

\/\*                    { add(yytext, TokenType.ML_COMMENT); BEGIN(COMMENT); }
<COMMENT>\*\/           { add(yytext, TokenType.ML_COMMENT); BEGIN(INITIAL); }
<COMMENT>[^\*]+         { add(yytext, TokenType.ML_COMMENT); }
<COMMENT>\*             { add(yytext, TokenType.ML_COMMENT); }

\'                      { add(yytext, TokenType.STRING); BEGIN(QUOTE); }
<QUOTE>\'\'             { add(yytext, TokenType.STRING); }
<QUOTE>[^\']+           { add(yytext, TokenType.STRING); }
<QUOTE>\'               { add(yytext, TokenType.STRING); BEGIN(INITIAL); }

[gG][oO]                { push(); }

[a-zA-Z0-9]+            { add(yytext, TokenType.WORD); }
.                       { add(yytext, TokenType.OTHER); }
\r?\n                   { add(yytext, TokenType.OTHER); }
<<EOF>>                 { push(); }
%%

Then you generate C# class and use it.

EDIT:
Some more comments how to use it.
Functions add(string text, TokenType token) and push() - what to do with parsed string. add() function collects parsed strings between GO keywords and writes results to output file (just to control):

private void add(string text, TokenType token)
{
    //write to the file for output control (for test only)
    using (StreamWriter str = new StreamWriter("C:\\temp\\temp.txt", true))
    {
        str.WriteLine(token + " : " + text); 
    }
    line += text;
}

push() collects butch strings for execution:

private void push()
{
    //write to the file for output control (for test only)
    using (StreamWriter str = new StreamWriter("C:\\temp\\butch.txt", true))
    {
        str.WriteLine("GO: " + line); 
    }

    butch.Add(line);
    line = "";
}

To use this class from C# code your should specify entry point. For example:

 public static List<string> ParseFile(String fileToParse)
 {
     int tok;
     Scanner scnr = new Scanner();
     scnr.SetSource(fileToParse, 0);
     do {
             tok = scnr.yylex();
         } while (tok > (int)Tokens.EOF);
     return butch;
 }

Or define a Main function to use it as standalone application.

All the above code should be placed in the .lex file. The file sqlparser.cs is created by invoking from command line:

gplex sqlparser.lex

Gplex has a good documentation and examples how to use it.