Distinct on specific column in Hive

Published 2019-04-19 12:54

Question:

I am running Hive 071. I have a table with multiple rows that share the same value in one column, e.g.:

 x | y |
---------
 1 | 2 |
 1 | 3 |
 1 | 4 |
 2 | 2 |
 3 | 2 |
 3 | 1 |

I want the x column to be unique, keeping only one row for each x value, e.g. either of the following:

 x | y |
---------
 1 | 2 |
 2 | 2 |
 3 | 2 |

 x | y |
---------
 1 | 4 |
 2 | 2 |
 3 | 1 |

Either result is fine. Because DISTINCT in Hive applies only to the whole row, I couldn't find a way to do this.

Please help. Thanks.

You can use the DISTINCT keyword, although note that this returns only the x column and no y value for each x:

SELECT DISTINCT x FROM table

Try the following query to get the result:

SELECT A.x, A.y
FROM (
  SELECT x, y,
         rank() OVER (PARTITION BY x ORDER BY y) AS ranked
  FROM testingg
) A
WHERE A.ranked = 1;
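If window functions are not available (to my knowledge they were only added in Hive 0.11, so an older installation may reject the query above), a plain GROUP BY may be enough here. The following is a minimal sketch, assuming the table is called testingg as in the query above and has only the columns x and y from the question:

```sql
-- Assumption: table testingg with columns x and y, as in the question.
-- Keep one row per x by taking the smallest y in each group.
SELECT x, MIN(y) AS y
FROM testingg
GROUP BY x;

-- Alternatively, keep the largest y for each x.
SELECT x, MAX(y) AS y
FROM testingg
GROUP BY x;
```

On the sample data the MIN variant returns (1, 2), (2, 2), (3, 1) and the MAX variant returns (1, 4), (2, 2), (3, 2). Since any single row per x is acceptable, either should do. Be aware that if the table had further columns, applying MIN or MAX to each one separately could combine values from different rows, in which case the ranking approach is the safer choice.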

Tags: unique distinct hive
