Adding a DataFrame column with len() of another co

Posted 2019-01-15 14:25

I'm having a problem trying to get a character count column of the string values in another column, and haven't figured out how to do it efficiently.

for index in range(len(df)):
    df['char_length'][index] = len(df['string'][index])

This apparently involves first creating a column of nulls and then rewriting it, and it takes a really long time on my data set. So what's the most effective way of getting something like

'string'     'char_length'
abcd          4
abcde         5

I've checked around quite a bit, but I haven't been able to figure it out.

Tags: python string pandas dataframe string-length

2 Answers

啃猪蹄的小仙女

#2 · 2019-01-15 14:38

Here's one way to do it.

In [3]: df
Out[3]:
  string
0   abcd
1  abcde

In [4]: df['len'] = df['string'].str.len()

In [5]: df
Out[5]:
  string  len
0   abcd    4
1  abcde    5


小情绪 Triste *

#3 · 2019-01-15 14:45

Pandas has a vectorised string method for this: str.len(). To create the new column you can write:

df['char_length'] = df['string'].str.len()

For example:

>>> df
  string
0   abcd
1  abcde

>>> df['char_length'] = df['string'].str.len()
>>> df
  string  char_length
0   abcd            4
1  abcde            5

This should be considerably faster than looping over the DataFrame with a Python for loop.
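As a rough sketch of what the vectorised call does, here it is next to an explicit loop on a small made-up DataFrame (the data, including the missing value, is an assumption for illustration and not from the question). One thing worth knowing: str.len() yields NaN for missing values, whereas plain len() would raise a TypeError on them, so the loop needs an explicit guard.

```python
import pandas as pd

# Illustrative data only (an assumption, not the asker's data set)
df = pd.DataFrame({'string': ['abcd', 'abcde', None]})

# Vectorised: one call over the whole column; missing values become NaN
df['char_length'] = df['string'].str.len()

# Explicit loop shown only for comparison; it needs a guard for non-strings
loop_lengths = [len(s) if isinstance(s, str) else float('nan')
                for s in df['string']]

print(df)
```

Because of the NaN, the resulting column is floating point here; with no missing values it would hold integers.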

Pandas exposes many other familiar Python string methods through the same .str accessor, for example lower (converts to lowercase), count (counts occurrences of a particular substring), and replace (swaps one substring for another).
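A minimal sketch of those three methods, using a made-up Series (the values and variable names are assumptions for illustration):

```python
import pandas as pd

# Illustrative values only
s = pd.Series(['Abcd', 'abcDE'])

# Lowercase every element
lowered = s.str.lower()

# Count occurrences of 'b' in each element (the pattern is treated as a regex)
b_counts = s.str.count('b')

# Replace the literal substring 'b' with 'X'; regex=False forces a plain-text match
replaced = s.str.replace('b', 'X', regex=False)

print(lowered.tolist(), b_counts.tolist(), replaced.tolist())
```

Each call returns a new Series aligned with the original index, so the result can be assigned straight to a DataFrame column the same way as str.len().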
