How to run dynamic second query in google cloud da

2020-07-28 12:04发布

问题:

I'm attempting to do an operation wherein I get a list of Ids via a query, transform them into a string separated by commas (i.e. "1,2,3") and then use it in a secondary query. When attempting to run the second query, I'm given a syntax error:

"Target type of a lambda conversion must be an interface"

String query = "SELECT DISTINCT campaignId FROM `" + options.getEligibilityInputTable() + "` ";

    Pipeline p = Pipeline.create(options);
    p.apply("GetCampaignIds", BigQueryIO.readTableRows().withTemplateCompatibility().fromQuery(query).usingStandardSql())
      .apply("TransformCampaignIds",
        MapElements.into(TypeDescriptors.strings())
        .via((TableRow row) -> (String)row.get("campaignId")))
      .apply(Combine.globally(new StringToCsvCombineFn()))
      .apply("GetAllCampaigns", campaignIds -> BigQueryIO.readTableRows().withTemplateCompatibility().fromQuery("SELECT id AS campaignId, dataQuery FROM `{projectid}.mysql_standard.campaigns` WHERE campaignId IN (" + campaignIds + ")").usingStandardSql())
....

How can I chain queries together?

回答1:

You cannot do this with the existing sources, unfortunately. Your options here are two:

  • You manually call the BQ API from a ParDo.
  • You write a complex SQL query that does this for you.

The second option looks something like so:

String query = "SELECT id AS campaignId, dataQuery \
               FROM `{projectid}.mysql_standard.campaigns` \
               WHERE campaignId IN ( \
                   SELECT DISTINCT campaignId \
                   FROM `" + options.getEligibilityInputTable() 
                   + "`)";

Pipeline p = Pipeline.create(options);
p.apply("GetAllCampaigns", BigQueryIO.readTableRows()
                                     .withTemplateCompatibility()
                                     .fromQuery(query)
                                     .usingStandardSql());