Python / pypyODBC: Row Insert Using String and NUL

Posted 2019-04-09 14:43

Question:

I am pretty new to Python and had not really worked with SQL Server or XML parsing before this project (which might not even be the best approach to take). Long story short, my IT team is badly backlogged and I need to get data, parsed from XML, into a sandbox table.

There is a segment of the XML that contains attributes [xx]001 to [xx]025. Not all of these attributes are included in each XML file I will be parsing. Therefore, I loop over all the possible attributes and append the results to a list. Since not every attribute is present in each XML file, I end up with None values for the missing ones. That is not an issue until I get to building my SQL insert statement, where I need them converted to NULLs. There is probably some very simple way to handle this, but my newbie status is impeding my progress. Relevant pieces of the code follow...

Maybe there is a better way to do it in general? I am concerned that sqlList might only be able to get so large before I hit a limit as well.

#this list contains Nones
sqlList = (", ".join(map(repr,appendedList)))

#the Nones are an issue when I get to here
curs.execute("USE Sandbox INSERT INTO myTable VALUES (%s)" % (sqlList))
curs.commit()

Here is an example of what sqlList looks like:

'20_2014', '20_2014_3/25/2015 2:01 PM', 'FBR', 'A', '0', '0', '3', '1', '134', None, None, '0', None, '0', '0', '0', '0', '0', None, None, '2', None, None, None, None

I get the following error message:

pypyodbc.ProgrammingError: ('42S22', "[42S22] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid column name 'None'.")

Answer 1:

You wrote: "In general would still like to know if this is the 'right' way to do this."

Your solution is still dynamic SQL, which always suffers from:

  • SQL injection issues (e.g., What if one of the strings contains a single quote?),
  • type juggling (e.g., None -> NULL as in your question),
  • having to use the right delimiters for your literal values (strings, dates, etc.)
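The problems listed above can be reproduced without a database by building the statement the same way the question does. This is only a sketch: the row values are hypothetical, and the remarks about how SQL Server parses the result assume the default quoted-identifier setting of the ODBC driver.

```python
# Hypothetical row: one value contains a single quote, one is missing.
row = ["O'Brien", None, "0"]

# Same technique as in the question: repr() each value, then join.
sql_list = ", ".join(map(repr, row))
sql = "INSERT INTO myTable VALUES (%s)" % sql_list
print(sql)

# The statement now contains a bare None, which SQL Server reads as a
# column name (the 42S22 error in the question), and repr() has switched
# to double quotes for "O'Brien", which SQL Server would likely treat as
# an identifier rather than a string literal.
```

Neither value reaches the server as the data it was meant to be, which is why escaping by hand tends to be fragile.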

All of those issues go away if you use a parameterized query. This example uses only 2 attributes for clarity (instead of the 25 in your question), but the principle is exactly the same. Notice that I don't have to do anything special to handle None values in order for them to be inserted as NULLs.

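import pypyodbc

# Each tuple is one row; None stands for a missing value.
myData = [
    (1, 'foo'),
    (2, None),
    (3, 'bar'),
    ]
connStr = """
DSN=myDb_SQLEXPRESS;
"""
cnxn = pypyodbc.connect(connStr)
crsr = cnxn.cursor()
# One "?" placeholder per column; the driver supplies the values.
sql = """
INSERT INTO myTable VALUES (?, ?)
"""
for dataRow in myData:
    print(dataRow)
    # The row is passed separately from the SQL text, so None becomes NULL.
    crsr.execute(sql, dataRow)
cnxn.commit()
crsr.close()
cnxn.close()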

The console output is ...

(1, 'foo')
(2, None)
(3, 'bar')

... and the three rows are inserted into the table correctly, including the NULL (None) in the second row.
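For the 25 attributes in the question, the placeholder list does not have to be typed out by hand. The sketch below generates it and inserts several rows in one call. So that it runs anywhere, it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for SQL Server. The column names c001 to c025 and the sample values are hypothetical. Since pypyodbc uses the same "?" placeholder style shown above, the generated statement should carry over, but that has not been verified against a real SQL Server instance.

```python
import sqlite3

N_COLUMNS = 25  # number of attributes per row in the question

# In-memory SQLite database as a stand-in for the SQL Server sandbox.
cnxn = sqlite3.connect(":memory:")
crsr = cnxn.cursor()
column_defs = ", ".join("c%03d TEXT" % i for i in range(1, N_COLUMNS + 1))
crsr.execute("CREATE TABLE myTable (%s)" % column_defs)

# Build "?, ?, ..., ?" with one placeholder per column. Only "?" marks
# are formatted into the SQL text, never the data itself.
placeholders = ", ".join(["?"] * N_COLUMNS)
sql = "INSERT INTO myTable VALUES (%s)" % placeholders

# Hypothetical rows; missing attributes are simply left as None.
rows = [
    ["20_2014", "FBR", None] + [None] * (N_COLUMNS - 3),
    ["O'Brien", "None of the above", "0"] + [None] * (N_COLUMNS - 3),
]
crsr.executemany(sql, rows)
cnxn.commit()

# Read back the first three columns to check what was stored.
stored = crsr.execute(
    "SELECT c001, c002, c003 FROM myTable ORDER BY rowid"
).fetchall()
print(stored)
cnxn.close()
```

Because the values travel separately from the SQL text, the single quote and the literal text "None of the above" are stored unchanged, and only real None values become NULL.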



Answer 2:

Never mind. See below:

#this string contains Nones
sqlList = (", ".join(map(repr,appendedList)))

#this converts Nones to NULLs
sqlList = sqlList.replace("None", "NULL")

#all good now
curs.execute("USE Sandbox INSERT INTO myTable VALUES (%s)" % (sqlList))
curs.commit()

In general would still like to know if this is the "right" way to do this.
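One caveat with the replace() approach above: it acts on the whole joined string, so it cannot tell a missing value from real data that happens to contain the text "None". A minimal illustration, using a hypothetical row:

```python
# Hypothetical row: the first value is real data, the second is missing.
row = ["None of the above", None]

sql_list = ", ".join(map(repr, row))
sql_list = sql_list.replace("None", "NULL")
print(sql_list)  # 'NULL of the above', NULL
```

The missing value is converted as intended, but the stored text is silently altered as well.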