How to Split or Slice the text inside a column in

Posted 2019-06-14 07:54

  1. Row1_1368083_US_PBPR_STD
  2. Row215_1368083_US_PBPR_ENH
  3. Row216_60902413_US_PBPR_ENH
  4. Row227_37758281_US_PBPR_ENH

The final output should be a column containing only the numbers, e.g. 1368083.

3 answers
别忘想泡老子
#2 · 2019-06-14 08:29

Use sed to extract the digits between two '_' characters:

sed 's/^.*_\([0-9]*\)_.*/\1/'

Or use awk to print the second field, with '_' as the field separator:

awk -F'_' '{print $2}'
chillily
#3 · 2019-06-14 08:37

Use str.split:

s1 = "Row1_1368083_US_PBPR_STD"
s2 = "Row215_1368083_US_PBPR_ENH"

print(s1.split("_")[1])
print(s2.split("_")[1])

Output:

1368083
1368083
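
To run the same split over every row instead of one string at a time, something like the following might work. This is only a sketch: it assumes the column has already been read into a plain Python list, and the names `rows` and `extract_id` are made up for illustration.

```python
# Sample column values copied from the question.
rows = [
    "Row1_1368083_US_PBPR_STD",
    "Row215_1368083_US_PBPR_ENH",
    "Row216_60902413_US_PBPR_ENH",
    "Row227_37758281_US_PBPR_ENH",
]


def extract_id(value):
    """Return the second underscore-separated field (hypothetical helper)."""
    return value.split("_")[1]


# Build the output column, one extracted ID per input row.
ids = [extract_id(row) for row in rows]
print("\n".join(ids))
```

Note that `extract_id` raises `IndexError` on a value with no underscore, so rows in a different format would need their own handling.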

Or use a regex:

import re

s1 = "Row216_60902413_US_PBPR_ENH"
s2 = "Row227_37758281_US_PBPR_ENH"

print(re.findall(r"\d{6,}", s1)[0])
print(re.findall(r"\d{6,}", s2)[0])
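
The `\d{6,}` pattern depends on the ID being at least six digits long. If that is not guaranteed, one alternative is to anchor on the underscores instead, similar in spirit to the sed answer. A rough sketch, where `ID_PATTERN` and `extract_id_regex` are hypothetical names:

```python
import re

# Capture the run of digits that sits between the first two underscores.
ID_PATTERN = re.compile(r"^[^_]*_(\d+)_")


def extract_id_regex(value):
    """Return the captured digits, or None when the value does not match."""
    match = ID_PATTERN.match(value)
    return match.group(1) if match else None


print(extract_id_regex("Row216_60902413_US_PBPR_ENH"))
print(extract_id_regex("Row227_37758281_US_PBPR_ENH"))
```

Returning `None` for non-matching values is a design choice of this sketch; it lets the caller decide whether to skip or report such rows.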
仙女界的扛把子
#4 · 2019-06-14 08:38
awk -F_ '$2 ~/1368083/{print $2}' file
1368083
1368083
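
If only the rows for one particular ID are wanted, as the awk filter above does, a Python version could look roughly like this. It is a sketch under assumptions: the rows are in a list, `rows`, `target` and `matches` are illustrative names, and it uses exact equality where awk's `~` performs a regex (substring) match.

```python
# Sample column values copied from the question.
rows = [
    "Row1_1368083_US_PBPR_STD",
    "Row215_1368083_US_PBPR_ENH",
    "Row216_60902413_US_PBPR_ENH",
    "Row227_37758281_US_PBPR_ENH",
]

target = "1368083"  # the ID to keep

# Split each row once, then keep the second field only when it equals the target.
matches = [
    fields[1]
    for fields in (row.split("_") for row in rows)
    if len(fields) > 1 and fields[1] == target
]
print("\n".join(matches))
```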