I have a python file that creates and populates a table in ms sql. The only sticking point is that the code breaks if there are any non-ascii characters or single apostrophes (and there are quite a few of each). Although I can run the replace function to rid the strings of apostrophes, I would prefer to keep them intact. I have also tried converting the data into utf-8, but no luck there either.
Below are th error messages I get:
"'ascii' codec can't encode character u'\2013' in position..." (for non-ascii characters)
and for the single quotes
class 'pyodbc.ProgrammingError'>: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server] Incorrect syntax near 'S, 230 X 90M.; Eligibilty....
When I try to encode string in utf-8, I instead get the following error message:
<type 'exceptions.UnicodeDecodeError'>: ascii' codec can't decode byte 0xe2 in position 219: ordinal not in range(128)
The python code is included below. I believe the point in the code where this break occurs is after the following line: InsertValue = str(row.GetValue(CurrentField['Name'])).
# -*- coding: utf-8 -*-
import pyodbc
import sys
import arcpy
import arcgisscripting
gp = arcgisscripting.create(9.3)
SQL_KEYWORDS = ['PERCENT', 'SELECT', 'INSERT', 'DROP', 'TABLE']
#SourceFGDB = '###'
#SourceTable = '###'
SourceTable = sys.argv[1]
TempInputName = sys.argv[2]
SourceTable2 = sys.argv[3]
#---------------------------------------------------------------------------------------------------------------------
# Target Database Settings
#---------------------------------------------------------------------------------------------------------------------
TargetDatabaseDriver = "{SQL Server}"
TargetDatabaseServer = "###"
TargetDatabaseName = "###"
TargetDatabaseUser = "###"
TargetDatabasePassword = "###"
# Get schema from FGDB table.
# This should be an ordered list of dictionary elements [{'FGDB_Name', 'FGDB_Alias', 'FGDB_Type', FGDB_Width, FGDB_Precision, FGDB_Scale}, {}]
if not gp.Exists(SourceTable):
print ('- The source does not exist.')
sys.exit(102)
#### Should see if it is actually a table type. Could be a Feature Data Set or something...
print(' - Processing Items From : ' + SourceTable)
FieldList = []
Field_List = gp.ListFields(SourceTable)
print(' - Getting number of rows.')
result = gp.GetCount_management(SourceTable)
Number_of_Features = gp.GetCount_management(SourceTable)
print(' - Number of Rows: ' + str(Number_of_Features))
print(' - Getting fields.')
Field_List1 = gp.ListFields(SourceTable, 'Layer')
Field_List2 = gp.ListFields(SourceTable, 'Comments')
Field_List3 = gp.ListFields(SourceTable, 'Category')
Field_List4 = gp.ListFields(SourceTable, 'State')
Field_List5 = gp.ListFields(SourceTable, 'Label')
Field_List6 = gp.ListFields(SourceTable, 'DateUpdate')
Field_List7 = gp.ListFields(SourceTable, 'OBJECTID')
for Current_Field in Field_List1 + Field_List2 + Field_List3 + Field_List4 + Field_List5 + Field_List6 + Field_List7:
print(' - Field Found: ' + Current_Field.Name)
if Current_Field.AliasName in SQL_KEYWORDS:
Target_Name = Current_Field.Name + '_'
else:
Target_Name = Current_Field.Name
print(' - Alias : ' + Current_Field.AliasName)
print(' - Type : ' + Current_Field.Type)
print(' - Length : ' + str(Current_Field.Length))
print(' - Scale : ' + str(Current_Field.Scale))
print(' - Precision: ' + str(Current_Field.Precision))
FieldList.append({'Name': Current_Field.Name, 'AliasName': Current_Field.AliasName, 'Type': Current_Field.Type, 'Length': Current_Field.Length, 'Scale': Current_Field.Scale, 'Precision': Current_Field.Precision, 'Unique': 'UNIQUE', 'Target_Name': Target_Name})
# Create table in SQL Server based on FGDB table schema.
cnxn = pyodbc.connect(r'DRIVER={SQL Server};SERVER=###;DATABASE=###;UID=sql_webenvas;PWD=###')
cursor = cnxn .cursor()
#### DROP the table first?
try:
DropTableSQL = 'DROP TABLE dbo.' + TempInputName + '_Test;'
print DropTableSQL
cursor.execute(DropTableSQL)
dbconnection.commit()
except:
print('WARNING: Can not drop table - may not exist: ' + TempInputName + '_Test')
CreateTableSQL = ('CREATE TABLE ' + TempInputName + '_Test '
' (Layer varchar(500), Comments varchar(5000), State int, Label varchar(500), DateUpdate DATETIME, Category varchar(50), OBJECTID int)')
cursor.execute(CreateTableSQL)
cnxn.commit()
# Cursor through each row in the FGDB table, get values, and insert into the SQL Server Table.
# We got Number_of_Features earlier, just use that.
Number_Processed = 0
print(' - Processing ' + str(Number_of_Features) + ' features.')
rows = gp.SearchCursor(SourceTable)
row = rows.Next()
while row:
if Number_Processed % 10000 == 0:
print(' - Processed ' + str(Number_Processed) + ' of ' + str(Number_of_Features))
InsertSQLFields = 'INSERT INTO ' + TempInputName + '_Test ('
InsertSQLValues = 'VALUES ('
for CurrentField in FieldList:
InsertSQLFields = InsertSQLFields + CurrentField['Target_Name'] + ', '
InsertValue = str(row.GetValue(CurrentField['Name']))
if InsertValue in ['None']:
InsertValue = 'NULL'
# Use an escape quote for the SQL.
InsertValue = InsertValue.replace("'","' '")
if CurrentField['Type'].upper() in ['STRING', 'CHAR', 'TEXT']:
if InsertValue == 'NULL':
InsertSQLValues = InsertSQLValues + "NULL, "
else:
InsertSQLValues = InsertSQLValues + "'" + InsertValue + "', "
elif CurrentField['Type'].upper() in ['GEOMETRY']:
## We're not handling geometry transfers at this time.
if InsertValue == 'NULL':
InsertSQLValues = InsertSQLValues + '0' + ', '
else:
InsertSQLValues = InsertSQLValues + '1' + ', '
else:
InsertSQLValues = InsertSQLValues + InsertValue + ', '
InsertSQLFields = InsertSQLFields[:-2] + ')'
InsertSQLValues = InsertSQLValues[:-2] + ')'
InsertSQL = InsertSQLFields + ' ' + InsertSQLValues
## print InsertSQL
cursor.execute(InsertSQL)
cnxn.commit()
Number_Processed = Number_Processed + 1
row = rows.Next()
print(' - Processed all ' + str(Number_Processed))
del row
del rows
In general you want to convert to unicode on data input, and convert to the desired encoding on output.
So it will be easier to find your problem if you do this. This means changing all strings to unicode, 'INSERT INTO ' to u'INSERT INTO '. (Notice the "u" before the string) Then when you send the string to be executed convert to the desired encoding, "utf8".
Also, you should add the encoding string to the top of your source code. This means adding the encoding cookie to one of the first two lines of the file:
If your pulling data from a file to build your string you can uses codecs.open to auto convert from a specific encoding to unicode on load.
James, I believe the real issue is that your are not using Unicode accross the board. Try to do the following:
Also, please provide us the full stack trace and the exception name.
I'm going to use my psychic debugging skills and say you are trying to
str()
ify something and getting an error with the ascii codec. What you really should do is to use the utf-8 codec instead like this:When I converted my str() to unicode, that solved the problem. A simple answer, and I appreciate everyone's help on this.
Or you can take the view that only ASCII is allowed and use the awesomely named Unicode Hammer