How to get the positions of rows where a condition is met

Posted 2020-05-03 10:47

This is my sample data set...

CREATE TABLE blockhashtable (
    id SERIAL PRIMARY KEY 
    ,pos int
    ,filehash varchar(35)
    ,blockhash varchar(130) 
);    

insert into blockhashtable 
(pos,filehash,blockhash) values 
(1, 'randommd51', 'randstr1'),
(2, 'randommd51', 'randstr2'),
(3, 'randommd51', 'randstr3'),
(1, 'randommd52', 'randstr2'),
(2, 'randommd52', 'randstr2'),
(3, 'randommd52', 'randstr1'),
(4, 'randommd52', 'randstr7'),
(1, 'randommd53', 'randstr2'),
(2, 'randommd53', 'randstr1'),
(3, 'randommd53', 'randstr2'),
(4, 'randommd53', 'randstr3'),
(1, 'randommd54', 'randstr4'),
(2, 'randommd54', 'randstr55');

...and a fiddle of the same: http://sqlfiddle.com/#!9/e5b201/14

This is my current SQL query and output:

select pos,filehash,avg( (blockhash in ('randstr1', 'randstr2', 'randstr3') )) as matching_ratio from blockhashtable group by filehash;

pos filehash    matching_ratio
1   randommd51  1
1   randommd52  0.75
1   randommd53  1
1   randommd54  0

My expected output is something like this:

pos       filehash      matching_ratio
1,2       randommd51    1
1,3       randommd52    0.5
1,2,4     randommd53    0.75
0         randommd54    0

The pos in the last row can also be 1; I can remove it later with a custom condition in Python.

Basically, randstr2 appears only once in my Python list, so I want each value in the list to be matched at most once per filehash in the SQL query. That is why matching_ratio differs in my expected output.
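
To make the intended rule concrete, here is a minimal Python sketch of one possible reading of it: within each filehash, walk the rows in pos order, keep only the first position at which each listed blockhash appears, and divide the number of kept positions by the total row count. The function name first_match_positions and the ROWS list are hypothetical, introduced only for illustration. Under this reading randommd51 comes out as 1,2,3 rather than the 1,2 shown in the expected output, so the reading may not be exactly what was meant.

```python
from collections import defaultdict

# Rows copied from the sample data set: (pos, filehash, blockhash).
ROWS = [
    (1, "randommd51", "randstr1"),
    (2, "randommd51", "randstr2"),
    (3, "randommd51", "randstr3"),
    (1, "randommd52", "randstr2"),
    (2, "randommd52", "randstr2"),
    (3, "randommd52", "randstr1"),
    (4, "randommd52", "randstr7"),
    (1, "randommd53", "randstr2"),
    (2, "randommd53", "randstr1"),
    (3, "randommd53", "randstr2"),
    (4, "randommd53", "randstr3"),
    (1, "randommd54", "randstr4"),
    (2, "randommd54", "randstr55"),
]


def first_match_positions(rows, wanted):
    """Hypothetical helper: map filehash -> (positions, ratio),
    counting each wanted blockhash at most once per file."""
    wanted = set(wanted)
    by_file = defaultdict(list)
    for pos, filehash, blockhash in rows:
        by_file[filehash].append((pos, blockhash))

    result = {}
    for filehash, blocks in by_file.items():
        seen = set()      # wanted blockhashes already matched in this file
        positions = []    # pos of the first occurrence of each match
        for pos, blockhash in sorted(blocks):
            if blockhash in wanted and blockhash not in seen:
                seen.add(blockhash)
                positions.append(pos)
        result[filehash] = (positions, len(positions) / len(blocks))
    return result


summary = first_match_positions(ROWS, ["randstr1", "randstr2", "randstr3"])
for filehash, (positions, ratio) in sorted(summary.items()):
    # An empty position list is printed as 0, as in the expected output.
    print(filehash, ",".join(map(str, positions)) or "0", ratio)
```

For randommd52 this keeps positions 1 and 3 (the second randstr2 at pos 2 is skipped), giving 2/4 = 0.5, which agrees with the expected output for that file.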

Tags: python mysql sql
1 answer
chillily
#2 · 2020-05-03 11:01

I don't see how your result set corresponds to your data set, but you seem to be after something like this...

SELECT filehash
     , GROUP_CONCAT(pos ORDER BY pos) pos
     , 1-(COUNT(DISTINCT blockhash IN('randstr1','randstr2','randstr3'))/(COUNT(*))) ratio
  FROM blockhashtable
 GROUP
    BY filehash;
+------------+---------+--------+
| filehash   | pos     | ratio  |
+------------+---------+--------+
| randommd51 | 1,2,3   | 0.6667 |
| randommd52 | 1,2,3,4 | 0.5000 |
| randommd53 | 1,2,3,4 | 0.7500 |
| randommd54 | 1,2     | 0.5000 |
+------------+---------+--------+
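
If the goal is instead to count each listed blockhash at most once per filehash, one possible approach is to take MIN(pos) per (filehash, blockhash) pair and join that back to the per-file row counts. The sketch below is only an illustration under that assumption. It runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, and it assembles the position list in Python; the names WANTED, ROWS, and report are hypothetical.

```python
import sqlite3

# Values to match; assumed to be the asker's Python list.
WANTED = ["randstr1", "randstr2", "randstr3"]

# Rows copied from the sample data set: (pos, filehash, blockhash).
ROWS = [
    (1, "randommd51", "randstr1"), (2, "randommd51", "randstr2"),
    (3, "randommd51", "randstr3"), (1, "randommd52", "randstr2"),
    (2, "randommd52", "randstr2"), (3, "randommd52", "randstr1"),
    (4, "randommd52", "randstr7"), (1, "randommd53", "randstr2"),
    (2, "randommd53", "randstr1"), (3, "randommd53", "randstr2"),
    (4, "randommd53", "randstr3"), (1, "randommd54", "randstr4"),
    (2, "randommd54", "randstr55"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blockhashtable ("
    "id INTEGER PRIMARY KEY, pos INT, filehash TEXT, blockhash TEXT)"
)
conn.executemany(
    "INSERT INTO blockhashtable (pos, filehash, blockhash) VALUES (?, ?, ?)",
    ROWS,
)

# One "?" placeholder per wanted value, so the list is passed as parameters.
placeholders = ", ".join("?" for _ in WANTED)
query = f"""
SELECT t.filehash, t.total, f.first_pos
  FROM (SELECT filehash, COUNT(*) AS total
          FROM blockhashtable
         GROUP BY filehash) AS t
  LEFT JOIN (SELECT filehash, MIN(pos) AS first_pos
               FROM blockhashtable
              WHERE blockhash IN ({placeholders})
              GROUP BY filehash, blockhash) AS f
    ON f.filehash = t.filehash
 ORDER BY t.filehash, f.first_pos
"""

# filehash -> (first-match positions, matched / total rows)
report = {}
for filehash, total, first_pos in conn.execute(query, WANTED):
    positions, _ = report.get(filehash, ([], 0.0))
    if first_pos is not None:  # None means the file has no match at all
        positions.append(first_pos)
    report[filehash] = (positions, len(positions) / total)

for filehash, (positions, ratio) in report.items():
    print(filehash, ",".join(map(str, positions)) or "0", ratio)
```

The LEFT JOIN keeps files with no match at all, such as randommd54, which end up with an empty position list and a ratio of 0. In MySQL the same subqueries could presumably be wrapped in an outer GROUP BY with GROUP_CONCAT(first_pos ORDER BY first_pos) and COUNT(first_pos) / total, though that variant is not tested here.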