I'm trying to run a Python (version 2.7.1) script that uses the pymysql package to create a table in a database from a CSV file.
It runs correctly on my local system; however, the problem appears when I run the same script as part of a pipeline in Google Cloud Dataflow.
My Python function is the following:
    class charge_to_db(beam.DoFn):
        def process(self, element):
            import pymysql
            # read the CSV file and split it into rows of fields
            with open(element, 'r') as f:
                data = f.read().decode("UTF-8")
            datalist = []
            for line in data.split('\n'):
                datalist.append(line.split(','))

            db = pymysql.connect(host='IPadress', user='root', password='mypassword', database='stack_model')
            cursor = db.cursor()
            cursor.execute("DROP TABLE IF EXISTS stack_convergence")
            # column names for the stack_convergence table
            up = "upper_bnd"
            primal = "primal"
            d = "dualit"
            gap = "gap_rel"
            teta = "teta"
            alpha = "alpha"
            imba = "imba_avg"
            price = "price_avg"
            # create the stack_convergence table; place a comma after each new column except the last
            queryCreateConvergenceTable = """CREATE TABLE stack_convergence(
                {} float not null,
                {} float not null,
                {} float not null,
                {} float not null,
                {} float not null,
                {} float not null,
                {} float not null,
                {} float not null )""".format(up, primal, d, gap, teta, alpha, imba, price)
            cursor.execute(queryCreateConvergenceTable)
            db.close()
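For context, I apply this DoFn in the pipeline roughly like this (a simplified sketch; the CSV path and pipeline options here are placeholders, not my actual values):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # placeholder options; the real pipeline passes the Dataflow runner and project settings
    pipeline_options = PipelineOptions()

    with beam.Pipeline(options=pipeline_options) as p:
        (p
         | 'ListFiles' >> beam.Create(['/path/to/convergence.csv'])  # placeholder CSV path
         | 'LoadToCloudSQL' >> beam.ParDo(charge_to_db()))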
When I run this function in the cloud, I get the following error:
RuntimeError: OperationalError: (2003, 'Can\'t connect to MySQL server on \'35.195.1.40\' (110 "Connection timed out")')
I don't know why this error occurs, because the script runs correctly locally: from my local system I have access to my Cloud SQL instance, but not from the Dataflow workers in the cloud.
Why is this error occurring?