U-Sql use rowset variable for decision making

2019-07-24 06:30发布

问题:

I want to use rowset variable as scaler variable.

@cnt = Select count(*) from @tab1; If (@cnt > 0) then @cnt1= select * from @tab2; End;

Is it possible?

======================================

I want to block the complex u-sql code based on some condition, lets say based on some control table. In my original code, I wrote 10-15 u-sql statements and I want to bound them within the If statement. I don't want to do cross join because it again start trying to join the table. If I use cross join, there is no significant save in execution time. Use of IF statement is, If the condition does not met, complete piece of code should not execute. Is it possible?

回答1:

To add to wBob's and Alex's answers:

U-SQL does not provide data driven control flow within a script. The current IF statement requires the expression to be evaluated at compile time.

Consider a U-SQL script as just a single declarative query. So you have the following options:

  1. Express your problem with relational expressions. This means that you will have to write a (cross) join to guard the execution. If you feel that the query optimizer does a bad job at optimizing such guards (e.g., it evaluates the expensive side of the join before the cheap guard), please report an issue and we will take a look.

  2. Split your script into several scripts and look at the result of each script before doing your next step. This is a form of orchestration that you can do with ADF or writing your own orchestration with Powershell or any of the SDKs. The caveat here is that you will have to write intermediate results into files and download the files into your orchestration layer.

Having said this, it theoretically is possible to extent the language algebra with a "don't execute the remaining part of this operator tree if a condition is not satisfied" operator. However that is a major work item and can lead to very large query plans during compilation that may be going beyond the current limits. If you feel that neither 1 nor 2 above are sufficient to help with your scenario, please add your vote to https://feedback.azure.com/forums/327234-data-lake/suggestions/17635906-please-add-dynamic-if-evaluation-to-u-sql.



回答2:

@cnt1 =
    SELECT @tab2.*
    FROM @tab2
    CROSS JOIN (SELECT COUNT(*) AS cnt FROM @tab1) AS c
    WHERE c.cnt > 0;

(Adding explanation) CROSS JOIN returns a cartesian product of all rows from @tab2 and the single row generated by the COUNT query. There WHERE condition then ensures the result of the query is all rows from @tab2 if COUNT(*)>0, no rows otherwise.



标签: u-sql