I'm trying to set up a MySQL connection pool and have my worker processes access the already established pool instead of setting up a new connection each time.
I'm confused if I should pass the database cursor to each process, or if there's some other way to do this? Shouldn't MySql.connector do the pooling automatically? When I check my log files, many, many connections are opened and closed ... one for each process.
My code looks something like this:
PATH = "/tmp"
class DB(object):
def __init__(self):
connected = False
while not connected:
try:
cnxpool = mysql.connector.pooling.MySQLConnectionPool(pool_name = "pool1",
**config.dbconfig)
self.__cnx = cnxpool.get_connection()
except mysql.connector.errors.PoolError:
print("Sleeping.. (Pool Error)")
sleep(5)
except mysql.connector.errors.DatabaseError:
print("Sleeping.. (Database Error)")
sleep(5)
self.__cur = self.__cnx.cursor(cursor_class=MySQLCursorDict)
def execute(self, query):
return self.__cur.execute(query)
def isValidFile(self, name):
return True
def readfile(self, fname):
d = DB()
d.execute("""INSERT INTO users (first_name) VALUES ('michael')""")
def main():
queue = multiprocessing.Queue()
pool = multiprocessing.Pool(None, init, [queue])
for dirpath, dirnames, filenames in os.walk(PATH):
full_path_fnames = map(lambda fn: os.path.join(dirpath, fn),
filenames)
full_path_fnames = filter(is_valid_file, full_path_fnames)
pool.map(readFile, full_path_fnames)
if __name__ == '__main__':
sys.exit(main())
You created multiple DB object instance. In mysql.connector.pooling.py, pool_name is only a attribute to let you make out which pool it is. There is no mapping in the mysql pool.
So, you create multiple DB instance in
def readfile()
, then you will have several connection pool.A Singleton is useful in this case.
(I spent several hours to find it out. In Tornado framework, each http get create a new handler, which leads to making a new connection.)
First, you're creating a different connection pool for each instance of your
DB
class. The pools having the same name doesn't make them the same poolFrom the documentation:
Besides that, sharing a database connection (or connection pool) between different processes would be a bad idea (and i highly doubt it would even work correctly), so each process using it's own connections is actually what you should aim for.
You could just initialize the pool in your
init
initializer as a global variable and use that instead.Very simple example:
Or just use a simple connection instead of a connection pool, as only one connection will be active in each process at a time anyway.
The number of concurrently used connections is implicitly limited by the size of the
multiprocessing.Pool
.Code above creates a connection pool at the beginning, and get connections from it in
execute()
, once the connection pool has been created, the work is to remain it, since the pool is created only once, it will save the time to request for a connection every time you would like to connect to MySQL. Hope it helps!