Several errors can arise when converting a value from one data type to another, because not every value of one type can be represented in the other. One of the most common is UnicodeEncodeError, which is raised when a string containing a character that the target encoding cannot represent is encoded to bytes. This article explains how to fix UnicodeEncodeError in Python.
Why does the UnicodeEncodeError arise?
The error occurs when a string is encoded with a scheme that cannot represent one or more of its characters, because their code points lie outside the scheme's range. ASCII, for example, is a 7-bit encoding that defines only 128 code points, 0 through 127, so any character with a code point of 128 or above raises the error. To solve the issue, encode the string with an encoding that can represent that code point. UTF-8 (Unicode Transformation Format, 8-bit), UTF-16, UTF-32, ASCII, and others are examples of frequently used encodings. UTF-8 can represent every Unicode character, so it usually fixes the problem.
For demonstration, the error is first reproduced and then fixed:
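As a quick check of the range described above, the following sketch looks up the code point of a few characters with the built-in ord() and compares it with the ASCII upper bound. The sample characters and the names ASCII_MAX and report are illustrative assumptions, not part of any library.

```python
# Sample characters chosen for illustration (assumption, not from the article).
samples = ["A", "~", "\xa0", "\xe9", "\u20ac"]

ASCII_MAX = 127  # ASCII defines code points 0 through 127

# Map each character to (code point, whether ASCII can represent it).
report = {}
for ch in samples:
    code_point = ord(ch)
    report[ch] = (code_point, code_point <= ASCII_MAX)

for ch, (code_point, fits) in report.items():
    # ascii() escapes non-ASCII characters, so printing is safe on any console.
    print(ascii(ch), code_point, "fits in ASCII" if fits else "outside ASCII")
```

Any character reported as outside ASCII would raise UnicodeEncodeError if encoded with the ASCII codec.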
Python3
a = 'neveropen1234567\xa0'.encode("ASCII")
print(a)
Output:
Traceback (most recent call last):
  File "C:/Users/test.py", line 1, in <module>
    a = 'neveropen1234567\xa0'.encode("ASCII")
UnicodeEncodeError: 'ascii' codec can't encode character '\xa0' in position 16: ordinal not in range(128)
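The details shown in the error message are also available as attributes on the exception object, which can help locate the offending character in a longer string. The sketch below is one possible way to collect them; the variable name details is a hypothetical choice.

```python
text = 'neveropen1234567\xa0'

details = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    # The exception records the codec, the failing index range, and a reason.
    details = {
        "encoding": exc.encoding,
        "start": exc.start,
        "end": exc.end,
        "bad_text": exc.object[exc.start:exc.end],
        "reason": exc.reason,
    }

if details is not None:
    print(details["encoding"], details["start"], details["end"])
    print(ascii(details["bad_text"]), details["reason"])
```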
How to solve this UnicodeEncodeError?
This is the error described above. It arose because the string contains a character outside the range of the ASCII encoding: ASCII can only represent code points from 0 to 127, but \xa0 (the no-break space, U+00A0) has the code point 160. To rectify the error, we have to encode the text in a scheme that covers more code points than ASCII. UTF-8 serves this purpose.
Python3
a = 'neveropen1234567\xa0'.encode("UTF-8")
print(a)
Output:
b'neveropen1234567\xc2\xa0'
The program ran successfully this time because the string was encoded with a standard that can represent code points above 127. The character \xa0 (code point 160) was converted to \xc2\xa0, a two-byte representation.
Similarly, the UnicodeEncodeError can be resolved by encoding to another Unicode format such as UTF-16 or UTF-32.
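To make the two-byte representation concrete, the following minimal sketch compares character and byte counts and decodes the result back to the original string. The variable names are illustrative.

```python
text = 'neveropen1234567\xa0'

encoded = text.encode("utf-8")

# U+00A0 on its own becomes the two bytes 0xC2 0xA0 in UTF-8.
nbsp_bytes = '\xa0'.encode("utf-8")

# The 16 ASCII characters take one byte each, so only one extra byte is added.
char_count = len(text)
byte_count = len(encoded)

# Decoding with the same codec recovers the original string.
round_trip = encoded.decode("utf-8")

print(nbsp_bytes, char_count, byte_count, round_trip == text)
```

Decoding with the same codec that was used for encoding is what keeps the round trip lossless.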
Python3
a = 'neveropen1234567\xa0'.encode("UTF-16")
print(a, end="\n\n\n")
a = 'neveropen1234567\xa0'.encode("UTF-32")
print(a)
Output:
b'\xff\xfen\x00e\x00v\x00e\x00r\x00o\x00p\x00e\x00n\x001\x002\x003\x004\x005\x006\x007\x00\xa0\x00'
b'\xff\xfe\x00\x00n\x00\x00\x00e\x00\x00\x00v\x00\x00\x00e\x00\x00\x00r\x00\x00\x00o\x00\x00\x00p\x00\x00\x00e\x00\x00\x00n\x00\x00\x001\x00\x00\x002\x00\x00\x003\x00\x00\x004\x00\x00\x005\x00\x00\x006\x00\x00\x007\x00\x00\x00\xa0\x00\x00\x00'
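The leading \xff\xfe bytes in these outputs are a byte order mark (BOM), which Python's "utf-16" and "utf-32" codecs prepend when encoding; the byte order itself follows the machine the code runs on. The sketch below, with illustrative variable names, checks for the BOM using constants from the standard codecs module and compares the encoded sizes.

```python
import codecs

text = 'neveropen1234567\xa0'

utf16 = text.encode("utf-16")        # BOM followed by native byte order
utf16_le = text.encode("utf-16-le")  # explicit little-endian, no BOM
utf32 = text.encode("utf-32")        # BOM followed by native byte order

# The generic codecs start with a BOM in whichever byte order was used.
has_bom16 = utf16[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
has_bom32 = utf32[:4] in (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)

# 17 characters: 2 bytes each in UTF-16, 4 bytes each in UTF-32, plus the BOM.
sizes = (len(utf16), len(utf16_le), len(utf32))

# Decoding with the generic codec reads the BOM and strips it.
round_trip = utf16.decode("utf-16") == text and utf32.decode("utf-32") == text

print(has_bom16, has_bom32, sizes, round_trip)
```

For text that is mostly ASCII, UTF-8 produces noticeably smaller output than either of these, which is one reason it is the usual first choice.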