String separation in required format, Pythonic way

I have a string in the format:

t='@abc @def Hello this part is text'

I want to get this:

l=["abc", "def"] 
s='Hello this part is text'

I did this:

a=t[t.find(' ',t.rfind('@')):].strip()
s=t[:t.find(' ',t.rfind('@'))].strip()
b=a.split('@')
l=[i.strip() for i in b][1:]

It works for the most part, but it fails when the text part has the '@'. Eg, when:

t='@abc @def My email is red@hjk.com'

it fails. The @names are there in the beginning and there can be text after @names, which may possibly contain @.

Clearly I can append initally with a space and find out first word without '@'. But that doesn't seem an elegant solution.

What is a pythonic way of solving this?

标签： regex string format python

7条回答

狗以群分

2楼-- · 2019-03-29 05:54

[edit: this is implementing what was suggested by Osama above]

This will create L based on the @ variables from the beginning of the string, and then once a non @ var is found, just grab the rest of the string.

t = '@one @two @three some text   afterward with @ symbols@ meow@meow'

words = t.split(' ')         # split into list of words based on spaces
L = []
s = ''
for i in range(len(words)):  # go through each word
    word = words[i]
    if word[0] == '@':       # grab @'s from beginning of string
        L.append(word[1:])
        continue
    s = ' '.join(words[i:])  # put spaces back in
    break                    # you can ignore the rest of the words

You can refactor this to be less code, but I'm trying to make what is going on obvious.

0人赞添加讨论(0) 举报

神经病院院长

3楼-- · 2019-03-29 05:57

t='@abc @def Hello this part is text'

words = t.split(' ')

names = []
while words:
    w = words.pop(0)
    if w.startswith('@'):
        names.append(w[1:])
    else:
        break

text = ' '.join(words)

print names
print text

0人赞添加讨论(0) 举报

Bombasti

4楼-- · 2019-03-29 05:59

 [i.strip('@') for i in t.split(' ', 2)[:2]]     # for a fixed number of @def
 a = [i.strip('@') for i in t.split(' ') if i.startswith('@')]
 s = ' '.join(i for i in t.split(' ') if not i.startwith('@'))

0人赞添加讨论(0) 举报

Emotional °昔

5楼-- · 2019-03-29 06:03

You might also use regular expressions:

import re
rx = re.compile("@([\w]+) @([\w]+) (.*)")
t='@abc @def Hello this part is text and my email is foo@ba.r'
a,b,s = rx.match(t).groups()

But this all depends on how your data can look like. So you might need to adjust it. What it does is basically creating group via () and checking for what's allowed in them.

0人赞添加讨论(0) 举报

太酷不给撩

6楼-- · 2019-03-29 06:11

Here's just another variation that uses split() and no regexpes:

t='@abc @def My email is red@hjk.com'
tags = []
words = iter(t.split())

# iterate over words until first non-tag word
for w in words:
  if not w.startswith("@"):
    # join this word and all the following
    s = w + " " + (" ".join(words))
    break
  tags.append(w[1:])
else:
  s = "" # handle string with only tags

print tags, s

Here's a shorter but perhaps a bit cryptic version that uses a regexp to find the first space followed by a non-@ character:

import re
t = '@abc @def My email is red@hjk.com @extra bye'
m = re.search(r"\s([^@].*)$", t)
tags = [tag[1:] for tag in t[:m.start()].split()]
s = m.group(1)
print tags, s # ['abc', 'def'] My email is red@hjk.com @extra bye

This doesn't work properly if there are no tags or no text. The format is underspecified. You'll need to provide more test cases to validate.

0人赞添加讨论(0) 举报

爱情/是我丢掉的垃圾

7楼-- · 2019-03-29 06:13

How about this:

Splitting by space.
foreach word, check

2.1. if word starts with @ then Push to first list

2.2. otherwise just join the remaining words by spaces.

0人赞添加讨论(0) 举报

1 2 下一页

String separation in required format, Pythonic way

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间