I am trying to run an HQL file stored in Cloud Storage using an Airflow script. There are two parameters through which we can pass the path to DataprocHiveOperator:
- query: 'gs://bucketpath/filename.q'
  Error occurring: cannot recognize input near 'gs' ':' '/'
- query_uri: 'gs://bucketpath/filename.q'
  Error occurring: PendingDeprecationWarning: Invalid arguments were passed to DataProcHiveOperator. Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:
  *args: ()
  **kwargs: {'query_uri': 'gs://poonamp_pcloud/hive_file1.q'}
Using the query parameter, I have successfully run inline Hive queries (e.g. select * from table).
Is there any way to run an HQL file stored in a Cloud Storage bucket through DataprocHiveOperator?
That's because you are using both query as well as query_uri. If you are querying using a file, you have to use query_uri and query = None, or you can simply omit query. If you are using an inline query, then you have to use query. Here is a sample for querying through a file.
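A minimal sketch of such a task, assuming an Airflow version new enough to support query_uri (see the other answer); the DAG id, dates, bucket path, and cluster name below are placeholders, not taken from the question:

```python
# Sketch: run an HQL file from Cloud Storage via DataProcHiveOperator.
# Assumes the contrib operator accepts query_uri and cluster_name;
# all names and dates here are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.dataproc_operator import DataProcHiveOperator

dag = DAG(
    dag_id='hive_from_gcs',          # placeholder DAG id
    start_date=datetime(2017, 1, 1),  # placeholder date
    schedule_interval=None,
)

run_hql_file = DataProcHiveOperator(
    task_id='run_hql_file',
    # Pass the file via query_uri and leave query unset (None by default).
    query_uri='gs://bucketpath/filename.q',
    cluster_name='my-dataproc-cluster',  # placeholder cluster name
    dag=dag,
)
```

The key point is that query_uri and query are mutually exclusive: set one, not both.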
query_uri is indeed the correct parameter for running an HQL file off of Cloud Storage. However, it was only added to the DataProcHiveOperator in https://github.com/apache/incubator-airflow/pull/2402. Based on the warning message you got, I don't think you're running code that supports the parameter. The change is not in the latest release (1.8.2), so you'll need to wait for another release or grab it off the master branch.