Why does using rank() windowing function break the

Posted 2019-08-11 04:10

Question:

The online documentation for window functions in Spark SQL includes the following example:

https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html

SELECT
  product,
  category,
  revenue
FROM (
  SELECT
    product,
    category,
    revenue,
    dense_rank() OVER (PARTITION BY category ORDER BY revenue DESC) as rank
  FROM productRevenue) tmp
WHERE
  rank <= 2

I wrote a query with what seems to be the same structure, but it does not work:

select id,r from (
          select id, name, 
          rank() over (partition by name order by name) as r
          from tt) v 
          where v.r >= 7 and v.r <= 12

Here is the error:

Exception in thread "main" java.lang.RuntimeException: [3.25] 
      failure: ``)'' expected but `(' found

            rank() over (partition by fp order by fp) as myrank
                        ^

Can anyone see where they differ structurally? I am on Spark 1.6.0-SNAPSHOT from 11/18/15.

Answer 1:

I checked the source code, and it appears that rank() requires Hive support. I am rebuilding Spark with:

 -Phive -Phive-thriftserver

I confirmed that the query works when it is run through a HiveContext.
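
For illustration, here is a minimal sketch of running the query from the question through a HiveContext on Spark 1.6. It assumes a Spark build that includes the Hive module (as with the -Phive flag above). The object name, the local master setting, and the sample rows standing in for the table tt are hypothetical and not from the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object RankWithHiveContext {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode, purely for trying the query out.
    val conf = new SparkConf().setAppName("rank-example").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Use HiveContext instead of the plain SQLContext, so that the
    // OVER (...) clause is accepted by the SQL parser.
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // Hypothetical sample data standing in for the table `tt`.
    val tt = sc.parallelize(Seq((1, "a"), (2, "a"), (3, "b"))).toDF("id", "name")
    tt.registerTempTable("tt")

    // Query text copied from the question.
    val ranked = sqlContext.sql(
      """select id, r from (
        |  select id, name,
        |  rank() over (partition by name order by name) as r
        |  from tt) v
        |where v.r >= 7 and v.r <= 12""".stripMargin)

    ranked.show()
    sc.stop()
  }
}
```

Note that the query parses here but may still return no rows: because it orders by the same column it partitions by, every row in a partition ties and rank() assigns 1 to all of them. Ordering by a different column, such as id, would produce distinct ranks.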