I just started working with the new Google Cloud Datalab and IPython last week (though I've been using BigQuery for a few months). The tutorials and samples in github are very helpful, but as my scripts and queries become more complex I'm wondering a few things. The first one is this: can I refer to queries defined in one %%sql module in a later %%sql module? The other, somewhat related question is can I somehow store the results from one %%sql module and then put that information into something like an IN clause in a subsequent %%sql module?
相关问题
- How to export crash-free users from firebase?
- Updating a value in an ARRAY in a BigQuery table
- Import json data with null values
- How to transform IP addresses into geolocation in
- How to count frequency of elements in a bigquery a
相关文章
- BigQuery - Concatenate multiple rows into a single
- Where do you get Google Bigquery usage info (mainl
- How do I configure Google BigQuery command line to
- How can I make integration tests with google cloud
- How to get gcloud auth activate-service-account pe
-
Error in Google BigQuery
- CLI “bq load” - how to use non-printable character
- How to capitalize a string?
Here's some things to try and see if they meet your needs. If they don't, I welcome you to file issues in github, as I think both of your scenarios are things we want to make sure work well.
For the first, it requires a combination of sql cells and code cells [for now]
Cell 1
Cell 2
Cell 3
Essentially, %%sql modules are turned into auto-imported python modules behind the scenes.
I used to split out queries per %%sql cell myself, but since the introduction of modules, I also depending on the scenario, define multiple queries within a single module, where you don't need a bit of python code stitching together. Depends on your scenario, which is better.
For your second question, again, if the queries are split across cells, you'll need some python glue in the middle. Execute one query, get its result, and use that as a parameter for the next query. This would work for general scalar values, but for IN clauses and tuples/lists of values, we have this issue we need to address: https://github.com/GoogleCloudPlatform/datalab/issues/615
For more ideas on how you can use JOINs in BigQuery to produce scalar results in one query that you consume in the next query, you can also see the query under Step 3 in the BigQuery tutorial notebook titled "SQL Query Composition".
Hope that helps.
As mentioned, if you hit specific issues where something didn't work as you expected, please do file an issue, and we can see if it makes sense to address, and possibly you or someone else might even step up to make a contribution. :)