Python - Dynamic variable in Redshift SQL Query

Published 2019-06-01 17:28

Question:

I'm developing in a Python environment and I want to run an SQL query using psycopg2.

Let's say I have the following UNLOAD command in an .sql file:

UNLOAD 
(
'
Some SQL Query
'
)
TO 's3://%PATH%'
...

In the .sql file the %PATH% should be explicit, like 'folder1/folder3/file_name'.

But I want the Python program to set this %PATH% at runtime, meaning the .sql file contains the %PATH% placeholder and it is only filled in at runtime.

Any idea of how to do it?

Answer 1:

You simply specify replacement fields in your SQL file, and then use Python's str.format to fill them in.

Create your file like this:

UNLOAD ('Some SQL Query')
TO 's3://{bucket}/{key}'

And use this file in Python like this:

template = open('file1.sql', 'r').read()
query = template.format(bucket='mybucket', key='folder/file.csv')
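For completeness, a minimal end-to-end sketch; the file name, bucket, key, and the psycopg2 connection details are all assumptions, not from the original answer:

```python
def render_unload(template_path, bucket, key):
    """Read the SQL template and fill the S3 path in at run time."""
    with open(template_path) as f:
        template = f.read()
    return template.format(bucket=bucket, key=key)

# Hypothetical usage -- connection details are placeholders:
# import psycopg2
# query = render_unload('file1.sql', 'mybucket', 'folder/file.csv')
# with psycopg2.connect(host='...', dbname='...', user='...') as conn:
#     with conn.cursor() as cur:
#         cur.execute(query)
```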


Answer 2:

Implementing it this way will give you a tough time.

The best way is to dump the file at a static location:

UNLOAD 
(
'
Some SQL Query
'
)
TO 's3://path/to/static/s3_bucket'
...

and then, via a shell script (or a suitable command in any other script), move it:

aws s3 mv $source $destination

Here, you may pass any value for $destination, which can easily be populated at run time.

In short, you dump the file to a fixed location in S3 (using UNLOAD) and then move it to a location of your choice, populated at run time (using aws s3 mv ...).
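The move step above can also be driven from Python; here is a sketch where the bucket and key values are hypothetical run-time inputs:

```python
import subprocess

def build_move_command(bucket, key):
    """Build the `aws s3 mv` invocation for a destination chosen at run time.

    The source matches the fixed UNLOAD target above; bucket and key
    are hypothetical values supplied at run time.
    """
    source = "s3://path/to/static/s3_bucket"
    destination = "s3://{}/{}".format(bucket, key)
    return ["aws", "s3", "mv", source, destination]

cmd = build_move_command("mybucket", "folder1/folder3/file_name")
# subprocess.run(cmd, check=True)  # uncomment to actually perform the move
```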



Answer 3:

You would not be able to set the UNLOAD path dynamically at runtime from the .sql file alone; however, you could put your SQL statement in something like a shell/Python script where you create variables holding the path you'd like and then pass them into the query.
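A shell-script sketch of that approach, reusing the %PATH% placeholder from the question; the bucket, key, and the commented-out connection command are assumptions:

```shell
#!/bin/sh
# Hypothetical run-time values
BUCKET="mybucket"
KEY="folder1/folder3/file_name"

# Substitute the %PATH% placeholder into the query template
QUERY=$(sed -e "s|%PATH%|${BUCKET}/${KEY}|" <<'SQL'
UNLOAD ('Some SQL Query')
TO 's3://%PATH%'
SQL
)
echo "$QUERY"
# psql "$REDSHIFT_CONNECTION_URL" -c "$QUERY"  # hypothetical connection string
```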

This UNLOAD utility by AWS will get you started if you decide to go with a python script.