Question:

I have a column called SSN in a CSV file with values like this

289-31-9165

I need to loop through the values in this column and replace the first five characters so it looks like this

***-**-9165

Here's the code I have so far:

emp_file = "Resources/employee_data1.csv"

emp_pd = pd.read_csv(emp_file) 

new_ssn = emp_pd["SSN"].str.replace([:5], "*")

emp_pd["SSN"] = new_ssn

How do I loop through the values and replace only the first five digits with asterisks, keeping the hyphens as they are?

Answer 1:

Similar to Mr. Me's answer, but this one replaces the first six characters of each value (the five digits and the hyphen between them, e.g. "289-31") with "***-**" and keeps the rest of the string unchanged.

emp_pd["SSN"] = emp_pd["SSN"].apply(lambda x: "***-**" + x[6:])
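If you would rather not hard-code the "***-**" prefix, a variation on the same slicing idea is to replace only the digits inside the first six characters and leave whatever separators are there. This is a rough sketch, not the answerer's code: the helper name mask_ssn and its keep_from parameter are made up for illustration, and it assumes every value is a string in the NNN-NN-NNNN layout.

```python
import re

import pandas as pd


def mask_ssn(value: str, keep_from: int = 6) -> str:
    """Replace digits in the first `keep_from` characters with '*'.

    Hyphens (and any other non-digit characters) are left as they are.
    `mask_ssn` and `keep_from` are hypothetical names for this sketch.
    """
    return re.sub(r"\d", "*", value[:keep_from]) + value[keep_from:]


# Small stand-in for the SSN column read from the CSV file.
emp_pd = pd.DataFrame({"SSN": ["289-31-9165", "111-22-3333"]})
emp_pd["SSN"] = emp_pd["SSN"].apply(mask_ssn)
print(emp_pd)
```

Because only digits are substituted, the hyphen at position 3 survives without being retyped in the replacement string.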

Answer 2:

You can do this with the replace() method and a regular expression.

Example DataFrame (borrowed from @AkshayNevrekar):

>>> df
           ssn
0  111-22-3333
1  121-22-1123
2  345-87-3425

Result:

>>> df.replace(r'^\d{3}-\d{2}', "***-**", regex=True)
           ssn
0  ***-**-3333
1  ***-**-1123
2  ***-**-3425

>>> df.ssn.replace(r'^\d{3}-\d{2}', "***-**", regex=True)
0    ***-**-3333
1    ***-**-1123
2    ***-**-3425
Name: ssn, dtype: object

OR:

df['ssn'] = df['ssn'].str.replace(r'^\d{3}-\d{2}', "***-**", regex=True)
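One thing to watch with the patterns above: they mask any value that starts with three digits, a hyphen and two digits, whatever follows. If the column might contain malformed entries, a stricter variant is to match the whole value and carry the last block over with a capture group. The sketch below assumes that non-matching values should simply be left untouched; the sample data is invented. Note that in pandas 2.0 and later Series.str.replace treats the pattern as a literal string unless regex=True is passed.

```python
import pandas as pd

# Invented sample data, including one value that is not a well-formed SSN.
df = pd.DataFrame({"ssn": ["111-22-3333", "121-22-1123", "not available"]})

# Match the full NNN-NN-NNNN layout and capture the trailing "-NNNN" block.
pattern = r"^\d{3}-\d{2}(-\d{4})$"

# \1 re-inserts the captured block after the masked prefix.
df["ssn"] = df["ssn"].str.replace(pattern, r"***-**\1", regex=True)
print(df)
```

Values that do not fit the pattern pass through unchanged, which makes bad rows easy to spot afterwards.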

Answer 3:

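Put your asterisks in front, then grab the last 4 digits of each value. Use the .str accessor so that the slice is applied to each string rather than to the rows of the Series.

new_ssn = '***-**-' + emp_pd["SSN"].str[-4:]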
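When slicing here, it helps to keep in mind that a slice on a Series itself selects rows, while a slice through the .str accessor is applied to every string. A small sketch with invented data that shows the difference:

```python
import pandas as pd

# Invented sample column with five rows.
ssn = pd.Series(
    ["111-22-3333", "121-22-1123", "345-87-3425", "289-31-9165", "555-66-7777"]
)

last_rows = ssn[-4:]        # the last four ROWS of the Series
last_chars = ssn.str[-4:]   # the last four CHARACTERS of each value

masked = "***-**-" + last_chars
print(masked)
```

Only the .str version gives one masked value per original row.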

Answer 4:

You can use a regular expression with re.sub:

import re
import pandas as pd

df = pd.DataFrame({'ssn':['111-22-3333','121-22-1123','345-87-3425']})

def func(x):
    return re.sub(r'\d{3}-\d{2}','***-**', x)

df['ssn'] = df['ssn'].apply(func)    

print(df)

Output:

           ssn
0  ***-**-3333
1  ***-**-1123
2  ***-**-3425
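To tie the answers back to the CSV workflow in the question, here is a minimal end-to-end sketch. It is only an illustration: the CSV text, the Name column and the in-memory buffers stand in for the real Resources/employee_data1.csv file, whose other columns are not shown in the question.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file; the rows and the Name column are invented.
csv_text = "Name,SSN\nAlice,289-31-9165\nBob,111-22-3333\n"

# Read SSN as text so the values are never interpreted as anything else.
emp_pd = pd.read_csv(io.StringIO(csv_text), dtype={"SSN": str})

# Mask the first two digit groups, keeping the hyphens and the last four digits.
emp_pd["SSN"] = emp_pd["SSN"].str.replace(r"^\d{3}-\d{2}", "***-**", regex=True)

# Write the masked data back out (to a buffer here instead of a file path).
out = io.StringIO()
emp_pd.to_csv(out, index=False)
result = out.getvalue()
print(result)
```

With a real file, pass the path to pd.read_csv and to_csv instead of the StringIO buffers.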