
# Python Unicode Casting on Variable Bug

**user2287463** · 1# · Published 2018-02-13 22:41:13Z
I've found this weird Python 2 behavior related to Unicode and variables:

```python
>>> u"\u2730".encode('utf-8').encode('hex')
'e29cb0'
```

This is the expected result, but I want to control the first part (`u"\u2730"`) dynamically:

```python
>>> type(u"\u2027")
<type 'unicode'>
```

Good, so the literal is typed as `unicode`. Now I declare a string variable and cast it to `unicode`:

```python
>>> a = '20'
>>> b = '27'
>>> myvar = '\u' + a + b.decode('utf-8')
>>> type(myvar)
<type 'unicode'>
>>> print myvar
\u2027
```

It seems that now I can use the variable in my original code, right?

```python
>>> myvar.encode('utf-8').encode('hex')
'5c7532303237'
```

The result, as you can see, is not the original one. It seems that Python is treating `myvar` as a plain string instead of a Unicode escape. Am I missing something?

Anyway, my final goal is to loop over Unicode code points from `\u0000` to `\uFFFF`, cast each one to a string, and cast the string to hex. Is there an easy way?
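A quick sanity check (written in Python 3, which has `bytes.fromhex` instead of the Python 2 `'hex'` codec) confirms that `'5c7532303237'` is just the hex dump of the six literal characters `\`, `u`, `2`, `0`, `2`, `7`:

```python
# Decode the unexpected hex result byte by byte: it is plain ASCII,
# showing that myvar held the characters '\u2027' literally, not the
# single code point U+2027.
hex_result = '5c7532303237'
chars = bytes.fromhex(hex_result).decode('ascii')
print(chars)  # prints: \u2027  (a backslash, a 'u', and four digits)
```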
**juanpa.arrivillaga** · 2#
You are confusing the Unicode escape sequence with the literal `\u` characters. It's like confusing `r"\n"` (or `"\\n"`) with an actual newline. You want to use `codecs.raw_unicode_escape_decode`, or decode the `str` with `'unicode_escape'`:

```python
>>> import codecs
>>> a = '20'
>>> b = '27'
>>> myvar = '\u' + a + b.decode('utf-8')
>>> myvar
u'\\u2027'
>>> codecs.raw_unicode_escape_decode(myvar)
(u'\u2027', 6)
>>> print(codecs.raw_unicode_escape_decode(myvar)[0])
‧
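For reference, the same escape-decoding trick in Python 3 has to round-trip through `bytes`, since `str` there has no `decode` method. A minimal sketch:

```python
# Build the six literal characters \u2027, then interpret them as a
# Unicode escape sequence via the 'unicode_escape' bytes codec.
a = '20'
b = '27'
myvar = '\\u' + a + b                           # the literal text "\u2027"
decoded = myvar.encode('ascii').decode('unicode_escape')
print(decoded)                                   # prints: ‧  (U+2027 HYPHENATION POINT)
print(decoded == '\u2027')                       # prints: True
```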
`unichr()` in Python 2 (or `chr()` in Python 3) is the way to construct a character from a number. `\uxxxx` escape codes can only be typed directly in source code.

Python 2:

```python
>>> a = '20'
>>> b = '27'
>>> unichr(int(a + b, 16))
u'\u2027'
```

Python 3:

```python
>>> a = '20'
>>> b = '27'
>>> chr(int(a + b, 16))
'‧'
```
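Building on that, the asker's final goal (looping from U+0000 to U+FFFF and converting each character to hex) can be sketched in Python 3. The helper name `codepoint_to_hex` is my own, and I assume "cast the string as HEX" means the hex digits of the UTF-8 encoding, as in the question's first example:

```python
def codepoint_to_hex(cp):
    """Return the UTF-8 encoding of code point `cp` as a hex string."""
    return chr(cp).encode('utf-8').hex()

# The question's original example: U+2730 -> 'e29cb0'
print(codepoint_to_hex(0x2730))  # prints: e29cb0

# Loop over the whole Basic Multilingual Plane, skipping the surrogate
# range U+D800..U+DFFF, which cannot be encoded to UTF-8:
table = {cp: codepoint_to_hex(cp)
         for cp in range(0x10000)
         if not 0xD800 <= cp <= 0xDFFF}
```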