SQL error on inserting UTF8 string into SQL Server

2019-07-25 09:45发布

I'm having problems trying to insert strings containing UTF-8 encoded Chinese characters and punctuations into a SQL Server 2008 table (default installation) from my Delphi 7 application using Zeosdb native SQL Server library.

I remembered in the past I had problems inserting UTF8 string into SQL Server even using PHP and other methods so I believe that this problem is not unique to Zeosdb.

It doesn't happen all the time, some UTF8 encoded strings can get inserted successfully but some not. I can't figure out what is it in the string that caused the failure.

Table schema:

CREATE TABLE [dbo].[incominglog](
    [number] [varchar](50) NULL,
    [keyword] [varchar](1000) NULL,
    [message] [varchar](1000) NULL,
    [messagepart1] [varchar](1000) NULL,
    [datetime] [varchar](50) NULL,
    [recipient] [varchar](50) NULL
) ON [PRIMARY]

SQL statement template:

INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient]) 
VALUES('{N}', '{KEYWORD}', '{M}', '{M1}', '{TIMESTAMP}', '{NAME}')

The parameter {KEYWORD}, {M} and {M1} can contain UTF8 string.

For example, the following statement will return an error:

Incorrect syntax near 'é¢'. Unclosed quotation mark after the character string '全力克æœå››ç§å±é™©','2013-06-19 17:07:28','')'.

INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient]) 
VALUES('+6590621005', '题', '题 [全力克æœå››ç§å±é™© åšå†³æ‰«é™¤ä½œé£Žä¹‹å¼Š]', '[全力克æœå››ç§å±é™©','2013-06-19 17:07:28', '')

Note: Please ignore the actual characters as the utf8 encoding is lost after copy and paste.

I've also tried using NVARCHAR instead of VARCHAR:

CREATE TABLE [dbo].[incominglog](
    [number] [varchar](50) NULL,
    [keyword] [nvarchar](max) NULL,
    [message] [nvarchar](max) NULL,
    [messagepart1] [nvarchar](max) NULL,
    [datetime] [varchar](50) NULL,
    [recipient] [varchar](50) NULL
) ON [PRIMARY]

And also tried amending the SQL statement into:

INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('{N}',N'{KEYWORD}',N'{M}',N'{M1}','{TIMESTAMP}','{NAME}')

They don't work either. I would appreciate any pointer. Thanks.

EDITED: As indicated by marc_s below, the N prefix must be outside the single quotes. It is correct in my actual test, the initial statement is a typo, which I've corrected.

The test with the N prefix also returned an error:

Incorrect syntax near '原标é¢'. Unclosed quotation mark after the character string '全力克æœ?å››ç§?å?±é™©','2013-06-19 21:22:08','')'.

The SQL statement:

INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('+6590621005',N'原标题',N'原标题 [全力克�四��险 �决扫除作风之弊]',N'[全力克�四��险','2013-06-19','')

. .

REPLY TO gbn's Answer: I've tried using parameterized SQL but still encountering "Unclosed quotation mark after the character string" error.

For the new test, I used a simplified SQL statement:

INSERT INTO INCOMINGLOG ([Keyword],[Message]) VALUES(:KEYWORD,:M)

The error returned for the above statement:

Incorrect syntax near '原标é¢'. Unclosed quotation mark after the character string '')'.

For info, the values of KEYWORD and M are:

KEYWORD:原标题

M:原标题 [

. . .

Further tests on 20th June Parametarized SQL query don't work so I tried a different approach by trying to isolate the character that caused the error. After trial and error, I managed to identify the problematic character.

The following character produces an error: 题

SQL Statement: INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题')

Interestingly, note that the string in the return error tax contains a "?" character which didn't exist in the original statement.

Error: Unclosed quotation mark after the character string '�)'. Incorrect syntax near '�)'.

If I were to place some latin characters immediately after the culprit character, there will be no error. For example, INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题Ok') works ok. Note: It doesn't work with all characters.

1条回答
Luminary・发光体
2楼-- · 2019-07-25 10:19

There are ' characters in the UTF-8 which abnormally terminate the SQL.

Classic SQL injection.

Use proper parametrisation, not string concatenation basically.

Edit, after Question updates...

Without the Delphi code, I don't think we can help you
All SQL side code works. For example, this works in SSMS

DECLARE @t TABLE ([Keyword] nvarchar(100) COLLATE Chinese_PRC_CI_AS);
INSERT INTO @t ([Keyword]) VALUES('题');
INSERT INTO @t ([Keyword]) VALUES(N'题');
SELECT * FROM @t T;

Something is missing to help us fic this

Also see

查看更多
登录 后发表回答