Amazon Data Pipeline: How to use a script argument

2019-02-05 01:06发布

When trying to use a Script Argument in the sqlActivity:

 {
"id" : "ActivityId_3zboU",
  "schedule" : { "ref" : "DefaultSchedule" },
  "scriptUri" : "s3://location_of_script/unload.sql",
  "name" : "unload",
  "runsOn" : { "ref" : "Ec2Instance" },
  "scriptArgument" : [ "'s3://location_of_unload/#format(minusDays(@scheduledStartTime,1),'YYYY/MM/dd/hhmm/')}'", "'aws_access_key_id=????;aws_secret_access_key=*******'" ],
  "type" : "SqlActivity",
  "dependsOn" : { "ref" : "ActivityId_YY69k" },
  "database" : { "ref" : "RedshiftCluster" }
}

where the unload.sql script contains:

 unload ('
    select *
    from tbl1 
 ')  
 to ?
 credentials  ?
 delimiter ',' GZIP;

or :

 unload ('
    select *
    from tbl1 
 ')  
 to ?::VARCHAR(255)
 credentials  ?::VARCHAR(255) 
 delimiter ',' GZIP;

process fails:

syntax error at or near "$1" Position

Any idea what i'm doing wrong?

3条回答
我想做一个坏孩纸
2楼-- · 2019-02-05 01:30

in shellcommand activity we specify two scriptArguments to acces using $1 $2 in shell script(.sh)

"scriptArgument" : "'s3://location_of_unload/#format(minusDays(@scheduledStartTime,1),'YYYY/MM/dd/hhmm/')}'", # can be accesed using $1 "scriptArgument" : "'aws_access_key_id=????;aws_secret_access_key=*******'" # can be accesed using $2

I dont know will this work for you.

查看更多
太酷不给撩
3楼-- · 2019-02-05 01:33

This is the script that works fine from psql shell :

insert into tempsdf select * from source where source.id = '123';

Here are some of my tests on SqlActivity using Data-Pipelines :


Test 1 : Using ?'s

insert into mytable select * from source where source.id = ?; - works fine if used via both 'script' and 'scriptURI' option on SqlActivity object.

where "ScriptArgument" : "123"

here ? can replace the value of the condition, but not the condition itself.


Test 2 : Using parameters works when command is specified using 'script' option only

insert into #{myTable} select * from source where source.id = ?; - Works fine if used via 'script' option only

insert into #{myTable} select * from source where source.id = #{myId};
  • works fine if used via 'script' option only

where #{myTable} , #{myId} are Parameters whose value can be declared in template.

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-templates.html

(when you are using only parameters, make sure you delete an unused scriptArguments - otherwise it will still throw and error)


FAILED TESTS and inferences:

insert into ? select * from source where source.id = ?;

insert into ? select * from source where source.id = '123';

Both the above commands does not work because

Table names cannot be used for placeholders for script arguments. '?''s can only be used to pass values for a comparison condition and column values.


insert into #{myTable} select * from source where source.id = #{myId}; - doesn't work if used as 'SciptURI'

insert into tempsdf select * from source where source.id = #{myId}; - does not work when used with 'ScriptURI'

Above 2 commands does not work because

Parameters cannot be evaluated if script is stored in S3.


insert into tempsdf select * from source where source.id = $1 ; - doesnt work with 'scriptURI'

insert into tempsdf values ($1,$2,$3); - does not work.

using $'s - doesn't not work in any combination


Other tests :

"ScriptArgument" : "123" "ScriptArgument" : "456" "ScriptArgument" : "789"

insert into tempsdf values (?,?,?); - works as both scriptURI , script and translates to insert into tempsdf values ('123','456','789');

scriptArguments will follow the order you insert and replaces "?" in the script.


查看更多
乱世女痞
4楼-- · 2019-02-05 01:41

I believe you are using this sql activity for Redshift. Can you modify your sql script to refer to parameters using their positional notation. To refer to the parameters in the sql statement itself, use $1, $2, etc.

See http://www.postgresql.org/docs/9.1/static/sql-prepare.html

查看更多
登录 后发表回答