Does python support unicode beyond basic multiling

Below is a simple test. repr seems to work fine, yet len and [x for x in ...] don't seem to split the Unicode text correctly in Python 2.6 and 2.7:

In [1]: u"

1 answer

一纸荒年 Trace。 · #2 · 2020-06-20 16:46

Yes, provided you compiled your Python with wide-unicode support.

By default, Python 2 is built with narrow unicode support only. Enable wide support when building the interpreter with:

./configure --enable-unicode=ucs4


You can verify what configuration was used by testing sys.maxunicode:

import sys

# sys.maxunicode is the largest code point a single unicode character can hold.
if sys.maxunicode == 0x10FFFF:
    print('Python built with UCS4 (wide unicode) support')
else:
    print('Python built with UCS2 (narrow unicode) support')


A wide build stores every unicode value as UCS4 (4 bytes per character), doubling memory usage compared with a narrow build. Python 3.3 switched to variable-width storage (PEP 393): each string uses 1, 2, or 4 bytes per character, whichever is enough to represent the widest character in that value.
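
To make the variable-width storage concrete, here is a rough sketch that estimates how many bytes one character costs inside a string. It assumes CPython 3.3 or later, and bytes_per_char is a hypothetical helper name, not part of any library. It relies on sys.getsizeof, whose results are implementation-specific, so treat the numbers as an illustration rather than a guarantee.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character by comparing two string lengths.

    Subtracting the two sizes cancels the fixed object header, leaving
    only the cost of the n extra characters.
    """
    bigger = sys.getsizeof(ch * (2 * n))
    smaller = sys.getsizeof(ch * n)
    return (bigger - smaller) // n


# ASCII, a BMP character (EURO SIGN), and a character outside the BMP.
for ch in (u"a", u"\u20ac", u"\U0002f920"):
    print(repr(ch), bytes_per_char(ch))
```

On CPython 3.3 and later the three strings should cost 1, 2 and 4 bytes per character respectively, because the width is chosen per string from its widest character.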

Quick demo showing that a wide build handles your sample Unicode string correctly:

$ python2.6
Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40) 
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.maxunicode
1114111
>>> [x for x in u'\U0002f920\U0002f921']
[u'\U0002f920', u'\U0002f921']
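
For comparison, a narrow build stores each character above U+FFFF as two UTF-16 code units (a surrogate pair), which is why len and iteration report two items per character there. If rebuilding Python is not an option, one possible workaround is to join the pairs yourself. The sketch below is only an illustration: to_surrogate_pair and iter_code_points are hypothetical helper names, and the code is written to run on Python 3 as well as on either kind of Python 2 build.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def iter_code_points(text):
    """Yield integer code points, joining surrogate pairs where present."""
    i = 0
    while i < len(text):
        unit = ord(text[i])
        if 0xD800 <= unit <= 0xDBFF and i + 1 < len(text):
            nxt = ord(text[i + 1])
            if 0xDC00 <= nxt <= 0xDFFF:
                # High surrogate followed by low surrogate: one code point.
                yield 0x10000 + ((unit - 0xD800) << 10) + (nxt - 0xDC00)
                i += 2
                continue
        yield unit
        i += 1


sample = u"\U0002f920\U0002f921"
print([hex(cp) for cp in iter_code_points(sample)])
print([hex(unit) for unit in to_surrogate_pair(0x2F920)])
```

iter_code_points yields two code points for the sample string whether the interpreter stores it as two wide characters or as four surrogate code units.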

    
                                                                    
                                                        
            
              