Python Unicode Casting on Variable Bug

user2287463
1#
user2287463 Published in 2018-02-13 22:41:13Z

I've found this weird Python 2 behavior related to unicode and variables:

>>> u"\u2730".encode('utf-8').encode('hex')
'e29cb0'

This is the result I expect, but I want to control the first part (u"\u2730") dynamically.

>>> type(u"\u2027")
<type 'unicode'>

Good, so the first part is cast as unicode. Now I declare string variables and cast the result to unicode:

>>> a='20'
>>> b='27'
>>> myvar='\u'+a+b.decode('utf-8')
>>> type(myvar)
<type 'unicode'>
>>> print myvar
\u2027

It seems that now I can use the variable in my original code, right?

>>> myvar.encode('utf-8').encode('hex')
'5c7532303237'

The result, as you can see, is not the original one. It seems that Python is treating 'myvar' as a string instead of unicode. Am I missing something?

Anyway, my final goal is to loop over Unicode from \u0000 to \uFFFF, cast each code point to a string, and convert that string to hex. Is there an easy way?

juanpa.arrivillaga
2#
juanpa.arrivillaga Reply to 2018-02-14 07:35:49Z

You are confusing the Unicode escape sequence with the literal characters \ and u. It's like confusing r"\n" (or "\\n") with an actual newline. You want to decode the str with 'unicode_escape':

>>> a='20'
>>> b='27'
>>> myvar='\u'+a+b.decode('utf-8')
>>> myvar
u'\\u2027'
>>> myvar.decode('unicode_escape')
u'\u2027'
>>> print(myvar.decode('unicode_escape'))
‧
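
For comparison, the same distinction can be checked in Python 3, where str has no decode() method. This is only a sketch: the names `escaped` and `decoded` are mine, not from the question, and it assumes the escape text is plain ASCII.

```python
# Python 3 sketch; variable names are illustrative, not from the original post.
a = '20'
b = '27'

# Six separate characters: backslash, 'u', '2', '0', '2', '7'.
escaped = '\\u' + a + b
assert len(escaped) == 6

# The unicode_escape codec decodes bytes, so encode to ASCII first,
# then let the codec interpret the escape sequence.
decoded = escaped.encode('ascii').decode('unicode_escape')
assert len(decoded) == 1

print(escaped.encode('utf-8').hex())  # hex of the six literal characters
print(decoded.encode('utf-8').hex())  # hex of the single character U+2027
```

The first line printed matches the '5c7532303237' from the question, which shows that the hex came from the literal backslash-u text rather than from the character it names.
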
Mark Tolonen
3#
Mark Tolonen Reply to 2018-02-14 05:17:19Z

unichr() in Python 2 or chr() in Python 3 is the way to construct a character from a number. \uxxxx escape codes can only be typed directly in source code, as part of a string literal.

Python 2:

>>> a='20'
>>> b='27'
>>> unichr(int(a+b,16))
u'\u2027'

Python 3:

>>> a='20'
>>> b='27'
>>> chr(int(a+b,16))
'‧'
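
For the loop from \u0000 to \uFFFF that the question asks about, a minimal Python 3 sketch might look like the following. The function name `utf8_hex_table` is made up for illustration, and skipping the surrogate range U+D800 to U+DFFF is my assumption: Python 3 refuses to encode those code points as UTF-8 by default.

```python
def utf8_hex_table(start=0x0000, stop=0xFFFF):
    """Map each code point in [start, stop] to the hex of its UTF-8 bytes.

    Hypothetical helper for illustration. Surrogates are skipped because
    chr(cp).encode('utf-8') raises UnicodeEncodeError for them.
    """
    table = {}
    for cp in range(start, stop + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue
        # chr() builds the character from its number; bytes.hex() needs Python 3.5+.
        table[cp] = chr(cp).encode('utf-8').hex()
    return table


table = utf8_hex_table()
print(table[0x2730])  # the character from the question
print(table[0x2027])
```

In Python 2 the same idea would use unichr(cp) and .encode('utf-8').encode('hex'), as in the question.
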