Column contains column 4

Posted 2020-02-07 06:12

I have a DataFrame. For each row, I would like to test whether the number in column B appears as a complete number in the string in column A, and store the result in a new column C.

import pandas as pd

df = pd.DataFrame({'A': ["me 123", "me-123", "1234", "me 12", "123 me", "6 you 123-me"],
                   'B': [123,       123,      123,    123,     6,        123]})

I can almost do that using extract:

df['C'] = df.A.str.extract(r'(\d+)', expand=False).astype(float).eq(df.B).astype(int)

              A    B  C
0        me 123  123  1
1        me-123  123  1
2          1234  123  0
3         me 12  123  0
4        123 me    6  0
5  6 you 123-me  123  0

However, on the bottom row it does not see the number 123 because extract only captures the first number, 6. I would like to get:

              A    B  C
0        me 123  123  1
1        me-123  123  1
2          1234  123  0
3         me 12  123  0
4        123 me    6  0
5  6 you 123-me  123  1

Tags: pandas

3 answers

一夜七次

#2 · 2020-02-07 06:49

Using findall:

[y in x for x, y in zip(df.A.str.findall(r'(\d+)'), df.B.astype(str))]
Out[733]: [True, True, False, False, False, True]
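
The same findall idea can be written out step by step and stored as the 0/1 column C the question asks for. This is a minimal sketch that assumes the question's sample data; the intermediate name digit_runs is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["me 123", "me-123", "1234", "me 12", "123 me", "6 you 123-me"],
    "B": [123, 123, 123, 123, 6, 123],
})

# For each row, findall returns the list of all digit runs found in A,
# e.g. "6 you 123-me" -> ["6", "123"].
digit_runs = df["A"].str.findall(r"\d+")

# Compare as strings: B is converted to str so that "123" is looked up
# in the list of digit runs (a whole-number match, so "1234" does not count).
df["C"] = [int(b in runs) for runs, b in zip(digit_runs, df["B"].astype(str))]

print(df)
```

Because the comparison is done on strings, a value such as "0123" in A would not be treated as equal to 123 in B; whether that is desirable depends on the data.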


Rolldiameter

#3 · 2020-02-07 06:52

Use Series.str.extractall to get all the numbers from the column, reshape with Series.unstack so each match gets its own column, compare the values with column B, and use DataFrame.any to test for at least one True per row:

df['C'] = (df.A.str.extractall(r'(\d+)')[0]
               .unstack()
               .astype(float)
               .eq(df.B, axis=0)
               .any(axis=1)
               .astype(int))
print(df)

              A    B  C
0        me 123  123  1
1        me-123  123  1
2          1234  123  0
3         me 12  123  0
4        123 me    6  0
5  6 you 123-me  123  1
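
One thing to watch with extractall: rows of A that contain no digits at all produce no matches, so they are missing from the reshaped frame. The sketch below uses made-up sample data (the "no digits" row is an assumption, not part of the question) and reindexes back to the original index so that such rows end up as 0.

```python
import pandas as pd

# Illustrative data; the middle row has no digits on purpose.
df = pd.DataFrame({
    "A": ["me 123", "no digits", "6 you 123-me"],
    "B": [123, 123, 123],
})

# One column per match position; rows without any match are absent
# after unstack, so reindex restores them as all-NaN rows.
wide = (df["A"].str.extractall(r"(\d+)")[0]
          .unstack()
          .astype(float)
          .reindex(df.index))

# NaN never equals B, so rows without digits evaluate to False.
df["C"] = wide.eq(df["B"], axis=0).any(axis=1).astype(int)

print(df)
```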


仙女界的扛把子

#4 · 2020-02-07 06:55

`re.split`

Split on 'one or more non-digits' so that only the digit runs remain, then check whether str(b) is among them:

import re

df.assign(C=[int(str(b) in re.split(r'\D+', a)) for a, b in zip(df.A, df.B)])

              A    B  C
0        me 123  123  1
1        me-123  123  1
2          1234  123  0
3         me 12  123  0
4        123 me    6  0
5  6 you 123-me  123  1
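
A different technique from splitting is to search for the number directly, with lookarounds that forbid a digit on either side. This is only a sketch on the question's sample data; the helper name contains_number is hypothetical.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "A": ["me 123", "me-123", "1234", "me 12", "123 me", "6 you 123-me"],
    "B": [123, 123, 123, 123, 6, 123],
})

def contains_number(text: str, number: int) -> bool:
    """Hypothetical helper: True if `number` occurs in `text` as a
    complete digit run (no digit directly before or after it)."""
    pattern = rf"(?<!\d){number}(?!\d)"
    return re.search(pattern, text) is not None

# assign returns a new DataFrame, so the result is bound back to df.
df = df.assign(C=[int(contains_number(a, b)) for a, b in zip(df["A"], df["B"])])

print(df)
```

Like the re.split version, this compares text rather than numeric values, so "1234" does not count as containing 123.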

