How to display/convert a string of utf-8 to the pr

Posted 2019-06-01 08:18


老娘就宠你


Question:

I have a list of WhatsApp emoticons encoded as UTF-8 byte sequences. The table I am using to decode the emoticons is at http://apps.timwhitlock.info/emoji/tables/unicode

With this table I am trying to count the number of emoticons used, which I have done successfully with regular expressions. The problem is that I have created a dictionary whose keys are the UTF-8 byte sequences as strings and whose values are integer counts. The following code:

print d_emo
for k, v in d_emo.items():
    print k.encode('utf8'), v

produces this output:

{'\\xF0\\x9F\\x98\\xA2': 2, '\\xF0\\x9F\\x98\\x82': 1, '\\xF0\\x9F\\x98\\x86': 2, '\\xF0\\x9F\\x98\\x89': 1, '\\xF0\\x9F\\x8D\\xB5': 2, '\\xF0\\x9F\\x8D\\xB0': 4, '\\xF0\\x9F\\x8D\\xAB': 2, '\\xF0\\x9F\\x8D\\xA9': 2, '\\xF0\\x9F\\x98\\x98': 1, '\\xE2\\x98\\xBA': 33, '\\xE2\\x98\\x95': 1}
\xF0\x9F\x98\xA2 2
\xF0\x9F\x98\x82 1
\xF0\x9F\x98\x86 2
\xF0\x9F\x98\x89 1
\xF0\x9F\x8D\xB5 2
\xF0\x9F\x8D\xB0 4
\xF0\x9F\x8D\xAB 2
\xF0\x9F\x8D\xA9 2
\xF0\x9F\x98\x98 1
\xE2\x98\xBA 33
\xE2\x98\x95 1

If I use this code:

for k, v in d_emo.items():
    print k.encode('utf-8').decode('unicode_escape'), v

I get

ð¢ 2
ð 1
ð 2
ð 1
ðµ 2
ð° 4
ð« 2
ð© 2
ð 1
âº 33
â 1

I should be getting smiley faces and the like. Any suggestions? This is in Python 2.7.

Answer 1:

This will decode the Unicode characters correctly, but in Python 2.X you are somewhat limited when using characters outside the BMP (Basic Multilingual Plane, characters U+0000 to U+FFFF):

import unicodedata as ud
D = {'\\xF0\\x9F\\x98\\xA2': 2, '\\xF0\\x9F\\x98\\x82': 1, '\\xF0\\x9F\\x98\\x86': 2, '\\xF0\\x9F\\x98\\x89': 1, '\\xF0\\x9F\\x8D\\xB5': 2, '\\xF0\\x9F\\x8D\\xB0': 4, '\\xF0\\x9F\\x8D\\xAB': 2, '\\xF0\\x9F\\x8D\\xA9': 2, '\\xF0\\x9F\\x98\\x98': 1, '\\xE2\\x98\\xBA': 33, '\\xE2\\x98\\x95': 1}
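for k, v in D.iteritems():
    # Interpret the literal \xNN escapes, map each code point back to one byte
    # with latin1, then decode the resulting bytes as UTF-8.
    k = k.decode('unicode-escape').encode('latin1').decode('utf8')
    try:
        n = ud.name(k)
    except ValueError:
        # Raised when the character has no name in this Unicode database.
        n = 'no such name'
    print k, repr(k), n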

Output:

☺ u'\u263a' WHITE SMILING FACE
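
For readers on Python 3, the same three-step chain can be sketched as follows. This is a minimal illustration, not part of the original answer: decode_escaped_utf8 is a hypothetical helper name, and the dictionary holds only three of the keys from the question. It also suggests why the attempt in the question printed characters such as ð and â: that code stopped after the first step, so each \xNN escape was shown as a separate Latin-1 character instead of being reassembled into UTF-8 bytes.

```python
import unicodedata


def decode_escaped_utf8(escaped):
    """Turn a literal string such as r'\xF0\x9F\x98\xA2' into the character it encodes."""
    # 1. Interpret the literal \xNN escapes; each becomes a code point in 0-255.
    as_code_points = escaped.encode('ascii').decode('unicode_escape')
    # 2. Map those code points back to raw bytes, one to one, via latin-1.
    raw_bytes = as_code_points.encode('latin-1')
    # 3. Decode the raw bytes as the UTF-8 sequence they really are.
    return raw_bytes.decode('utf-8')


# Three entries copied from the dictionary in the question.
d_emo = {
    '\\xF0\\x9F\\x98\\xA2': 2,
    '\\xF0\\x9F\\x8D\\xB0': 4,
    '\\xE2\\x98\\xBA': 33,
}

# Re-key the counts by the decoded character.
decoded = {decode_escaped_utf8(k): v for k, v in d_emo.items()}

for char, count in decoded.items():
    # ascii() shows the code point even if the terminal cannot render the emoji.
    print(char, ascii(char), unicodedata.name(char, 'no such name'), count)
```

Python 3 strings hold full code points, so characters outside the BMP need no special handling here; whether the emoji actually render still depends on the terminal's encoding and font.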