SQL multiple inserts with Python

Posted 2020-05-29 08:33

Question:

UPDATE
After passing execute() a list of rows, as Nathan suggests below, the code gets further but still fails inside execute(). The error message reads:

    query = query % db.literal(args)
TypeError: not all arguments converted during string formatting

So it still isn't working. Does anybody know why there is a type error now?
END UPDATE

I have a large mailing list in .xls format. I am using Python with xlrd to read the name and email from the xls file into two lists. Now I want to put each name and email into a MySQL database. I'm using MySQLdb for this part. Obviously I don't want to do an insert statement for every list item.
Here's what I have so far.

from xlrd import open_workbook, cellname
import MySQLdb

dbname = 'h4h'
host = 'localhost'
pwd = 'P@ssw0rd'
user = 'root'

book = open_workbook('h4hlist.xls')
sheet = book.sheet_by_index(0)
mailing_list = {}
name_list = []
email_list = []

for row in range(sheet.nrows):
    """name is in the 0th col. email is the 4th col."""
    name = sheet.cell(row, 0).value  
    email =  sheet.cell(row, 4).value
    if name and email:
        mailing_list[name] = email

for n, e in sorted(mailing_list.iteritems()):
    name_list.append(n)
    email_list.append(e)

db = MySQLdb.connect(host=host, user=user, db=dbname, passwd=pwd)
cursor = db.cursor()
cursor.execute("""INSERT INTO mailing_list (name,email) VALUES (%s,%s)""",
              (name_list, email_list))

The problem occurs when the cursor executes. This is the error:

_mysql_exceptions.OperationalError: (1241, 'Operand should contain 1 column(s)')

I tried putting my query into a var initially, but then it just barfed up a message about passing a tuple to execute().

What am I doing wrong? Is this even possible?

The list is huge and I definitely can't afford to put the insert into a loop. I looked at using LOAD DATA INFILE, but I really don't understand how to format the file or the query and my eyes bleed when I have to read MySQL docs. I know I could probably use some online xls to mysql converter, but this is a learning exercise for me as well. Is there a better way?

Answer 1:

To fix "TypeError: not all arguments converted during string formatting", use the cursor.executemany(...) method. It accepts an iterable of tuples (more than one row), while cursor.execute(...) expects the parameters for a single row.
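
The TypeError can be reproduced without a database. The traceback in the question shows the driver building the final statement with Python's % operator (query = query % db.literal(args)), so a query with two %s placeholders needs exactly two values. The sketch below is plain Python 3 string formatting standing in for that step, not MySQLdb itself, and the sample rows are made up for illustration.

```python
# Plain string formatting, standing in for what the driver does internally.
query = "INSERT INTO mailing_list (name,email) VALUES (%s,%s)"

one_row = ("Jim", "jim@yahoo.com")
rows = [
    ("Jim", "jim@yahoo.com"),
    ("Lucy", "lucy@gmail.com"),
    ("Ann", "ann@example.com"),
]

# Two placeholders, two values: formatting succeeds.
formatted = query % one_row

# Two placeholders, three values (each value is a whole row): formatting
# fails with the same message as in the question.
try:
    query % tuple(rows)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(formatted)
print(error_message)
```

This is why a list of rows belongs in executemany(), which applies the two placeholders to each row in turn.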

After the command has executed, call db.commit() so that the transaction is committed and the changes take effect in the database.
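
A minimal sketch of the executemany-then-commit pattern. To keep it runnable without a MySQL server it uses Python's built-in sqlite3 module as a stand-in, which is an assumption of this example and not part of the original question. sqlite3 uses ? placeholders where MySQLdb uses %s. The table layout and sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL connection (assumption).
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE mailing_list (name TEXT, email TEXT)")

rows = [("Jim", "jim@yahoo.com"), ("Lucy", "lucy@gmail.com")]

# executemany() takes the statement once plus a sequence of row tuples.
# With MySQLdb the placeholders would be %s instead of ?.
cursor.executemany("INSERT INTO mailing_list (name, email) VALUES (?, ?)", rows)

# Commit the transaction so the inserts are made permanent.
db.commit()

inserted = cursor.execute(
    "SELECT name, email FROM mailing_list ORDER BY name"
).fetchall()
print(inserted)
```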



Answer 2:

You need to give executemany() a list of rows. You don't need to break the name and email out into separate lists; just create one list of (name, email) tuples.

rows = []

for row in range(sheet.nrows):
    # name is in the 0th col. email is the 4th col.
    name = sheet.cell(row, 0).value
    email = sheet.cell(row, 4).value
    rows.append((name, email))

db = MySQLdb.connect(host=host, user=user, db=dbname, passwd=pwd)
cursor = db.cursor()
cursor.executemany("""INSERT INTO mailing_list (name,email) VALUES (%s,%s)""", rows)
db.commit()  # commit the transaction so the inserted rows are saved

Update: as @JonClements mentions, it should be executemany() not execute().
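
The question's code skipped blank cells and used a dict so that each name appears only once. Here is a sketch of how that filtering might be kept while still producing a single list of rows for executemany(). build_rows is a hypothetical helper, and the sample cells stand in for the values read with sheet.cell(...).

```python
def build_rows(cells):
    """Return sorted (name, email) tuples from an iterable of (name, email) pairs.

    Pairs with a blank name or email are skipped; if a name occurs more
    than once, the last email seen for it wins (as with the question's dict).
    """
    mailing_list = {}
    for name, email in cells:
        if name and email:
            mailing_list[name] = email
    return sorted(mailing_list.items())


# Made-up sample data in place of the spreadsheet cells.
cells = [
    ("Lucy", "lucy@gmail.com"),
    ("", "nobody@example.com"),      # blank name: skipped
    ("Jim", "jim@yahoo.com"),
    ("Lucy", "lucy@example.com"),    # duplicate name: replaces the first entry
]

rows = build_rows(cells)
print(rows)
```

The resulting rows list can be passed to executemany() unchanged.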



Answer 3:

If performance matters to you, this answer may be a better fit.

Compared with the executemany() method, a single execute() with a multi-row INSERT like the one below can be much faster:

INSERT INTO mailing_list (name,email) VALUES ('Jim','jim@yahoo.com'),('Lucy','Lucy@gmail.com')

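You can easily modify the answer from @Nathan Villaescusa to build such a statement. The version below keeps %s placeholders, one (%s,%s) group per row, so the driver still escapes the values instead of str(row) being pasted into the SQL:

values_sql = ",".join(["(%s,%s)"] * len(rows))
cursor.execute("INSERT INTO mailing_list (name,email) VALUES " + values_sql, [v for row in rows for v in row])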

Here is my own test result:

executemany: 10000 runs take 220 seconds

execute: 10000 runs take 12 seconds

In this test the speed difference is about 18 times (220 s / 12 s).
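
For a very large list, one giant statement can exceed the MySQL server's max_allowed_packet limit, so it may be worth sending the rows in batches. The sketch below only builds the statements and parameter lists. build_multi_row_insert and chunked are hypothetical helper names, the batch size of 2 is chosen purely for illustration, and the sample rows are made up.

```python
def build_multi_row_insert(rows):
    """Return (sql, params) for one INSERT covering every row in `rows`."""
    placeholders = ",".join(["(%s,%s)"] * len(rows))
    sql = "INSERT INTO mailing_list (name,email) VALUES " + placeholders
    # Flatten [(name, email), ...] into [name, email, name, email, ...].
    params = [value for row in rows for value in row]
    return sql, params


def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` rows each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


rows = [
    ("Jim", "jim@yahoo.com"),
    ("Lucy", "lucy@gmail.com"),
    ("Ann", "ann@example.com"),
]
statements = [build_multi_row_insert(batch) for batch in chunked(rows, 2)]

# With a live MySQLdb cursor, each pair would be run as
#     cursor.execute(sql, params)
# followed by db.commit() after the last batch.
for sql, params in statements:
    print(sql, params)
```

A realistic batch size would be much larger, for example a few hundred or a few thousand rows per statement, depending on row size and server settings.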