Question:

I have a DataFrame called productPrice with columns ID and Price. I want the ID with the highest price; if two IDs share the same highest price, I want only the one with the smaller ID. I use

val highestprice = productPrice.orderBy(asc("ID")).orderBy(desc("price")).limit(1)

But the result is not the row with the smaller ID; I get the one with the larger ID instead. I don't know what's wrong with my logic. Any ideas?

Answer 1:

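Pass both sort keys to a single sort call. A second orderBy does not refine the first one; it re-sorts the whole DataFrame by price alone, so the order of rows with equal prices is not guaranteed. Try this.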

scala> val df = Seq((4, 30),(2,50),(3,10),(5,30),(1,50),(6,25)).toDF("id","price")
df: org.apache.spark.sql.DataFrame = [id: int, price: int]

scala> df.show
+---+-----+
| id|price|
+---+-----+
|  4|   30|
|  2|   50|
|  3|   10|
|  5|   30|
|  1|   50|
|  6|   25|
+---+-----+


scala> df.sort(desc("price"), asc("id")).show
+---+-----+
| id|price|
+---+-----+
|  1|   50|
|  2|   50|
|  4|   30|
|  5|   30|
|  6|   25|
|  3|   10|
+---+-----+
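
To get only the single row the question asks for, the two-key sort can be followed by limit(1). This is a minimal sketch, not a definitive implementation: it assumes a local SparkSession (in spark-shell the `spark` value already exists and the builder simply returns it), and it reuses the question's productPrice name with the sample data shown above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{asc, desc}

// Assumption: a local session for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("highest-price")
  .getOrCreate()
import spark.implicits._

// Same sample rows as in the answer, with the question's column names.
val productPrice = Seq((4, 30), (2, 50), (3, 10), (5, 30), (1, 50), (6, 25))
  .toDF("ID", "Price")

// One orderBy with both keys: price descending first,
// then ID ascending to break ties between equal prices.
val highestPrice = productPrice
  .orderBy(desc("Price"), asc("ID"))
  .limit(1)

highestPrice.show()
```

With this sample data, IDs 1 and 2 both have price 50, and the row kept is the one with ID 1.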

Answer 2:

Approaching the same problem using Spark SQL:

val df = Seq((4, 30),(2,50),(3,10),(5,30),(1,50),(6,25)).toDF("id","price")

df.createOrReplaceTempView("prices")

%sql
SELECT id, price
FROM prices
ORDER BY price DESC, id ASC
LIMIT(1)
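
The %sql line above is a notebook cell magic. Outside a notebook, the same query can probably be run through spark.sql, which returns a DataFrame. A minimal sketch, assuming a local SparkSession (already available as `spark` in spark-shell) and the same prices view:

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("highest-price-sql")
  .getOrCreate()
import spark.implicits._

val df = Seq((4, 30), (2, 50), (3, 10), (5, 30), (1, 50), (6, 25))
  .toDF("id", "price")

// Register the DataFrame so it can be queried by name.
df.createOrReplaceTempView("prices")

// Highest price first; the smaller id wins when prices are equal.
val top = spark.sql(
  """SELECT id, price
    |FROM prices
    |ORDER BY price DESC, id ASC
    |LIMIT 1""".stripMargin)

top.show()
```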
