Optimizing MySQL Import (Converting a Verbose SQL

2019-05-16 04:05发布

问题:

We are using mysqldump with the options --complete-insert --skip-extended-insert to create database dumps that are kept in VCS. We use these options (and the VCS) to have the possibility to easily compare different database versions.

Now importing of the dump takes quite a while because there are - of course - single inserts per database row.

Is there an easy way to convert such a verbose dump to one with a single insert per table? Does anyone maybe already have a some script at hand?

回答1:

I wrote a little python script that converts this:

LOCK TABLES `actor` WRITE;
/*!40000 ALTER TABLE `actor` DISABLE KEYS */;
INSERT INTO `actor` (`actor_id`, `first_name`, `last_name`, `last_update`) VALUES (1,'PENELOPE','GUINESS','2006-02-15 12:34:33');
INSERT INTO `actor` (`actor_id`, `first_name`, `last_name`, `last_update`) VALUES (2,'NICK','WAHLBERG','2006-02-15 12:34:33');
INSERT INTO `actor` (`actor_id`, `first_name`, `last_name`, `last_update`) VALUES (3,'ED','CHASE','2006-02-15 12:34:33');

into this:

LOCK TABLES `actor` WRITE;
/*!40000 ALTER TABLE `actor` DISABLE KEYS */;
INSERT INTO `actor` VALUES(1,'PENELOPE','GUINESS','2006-02-15 12:34:33'),(2,'NICK','WAHLBERG','2006-02-15 12:34:33'),(3,'ED','CHASE','2006-02-15 12:34:33');

It's not very pretty or well tested, but it works on the Sakila test database dumps, so it can handle non-trivial dump files.

Anyway, here's the script:

#!/usr/bin/env python
# -*- coding: utf-8 -*- #

import re
import sys

re_insert = re.compile(r'^insert into `(.*)` \(.*\) values (.*);', re.IGNORECASE)

current_table = ''

for line in sys.stdin:
    if line.startswith('INSERT INTO'):
        m = re_insert.match(line)
        table = m.group(1)
        values = m.group(2)

        if table != current_table:
            if current_table != '':
                sys.stdout.write(";\n\n")
            current_table = table
            sys.stdout.write('INSERT INTO `' + table + '` VALUES ' + values)
        else:
            sys.stdout.write(',' + values)
    else:
        if current_table != '':
            sys.stdout.write(";\n")
            current_table = ''
        sys.stdout.write(line)

if current_table != '':
    sys.stdout.write(';')

It expects piped input on stdin and prints to stdout. If you saved the script as mysqldump-convert.py, you'd use it like this:

cat ./sakila-db/sakila-full-dump.sql | python mysqldump-convert.py > test.sql

Let me know how you get on!