Python Strings encode() method

28 July 2024

0

Python String encode() converts a string value into a collection of bytes, using an encoding scheme specified by the user.

Python String encode() Method Syntax:

Syntax: encode(encoding, errors)

Parameters:

encoding: Specifies the encoding on the basis of which encoding has to be performed.

errors: Decides how to handle the errors if they occur, e.g ‘strict’ raises Unicode error in case of exception and ‘ignore’ ignores the errors that occurred. There are six types of error response

strict – default response which raises a UnicodeDecodeError exception on failure

ignore – ignores the unencodable unicode from the result

replace – replaces the unencodable unicode to a question mark ?

xmlcharrefreplace – inserts XML character reference instead of unencodable unicode

backslashreplace – inserts a \uNNNN escape sequence instead of unencodable unicode

namereplace – inserts a \N{…} escape sequence instead of unencodable unicode

Return: Returns the string in the encoded form

Python String encode() Method Example:

Python3

print("¶".encode('utf-8'))

Output:

b'\xc2\xb6'

Example 1: Code to print encoding schemes available

There are certain encoding schemes supported by Python String encode() method. We can get the supported encodings using the Python code below.

Python3

from encodings.aliases import aliases
 
# Printing list available
print("The available encodings are : ")
print(aliases.keys())

Output:

The available encodings are : 
dict_keys(['ibm039', 'iso_ir_226', '1140', 'iso_ir_110', '1252', 'iso_8859_8', 'iso_8859_3', 'iso_ir_166', 'cp367', 'uu', 'quotedprintable', 'ibm775', 'iso_8859_16_2001', 'ebcdic_cp_ch', 'gb2312_1980', 'ibm852', 'uhc', 'macgreek', '850', 'iso2022jp_2', 'hz_gb_2312', 'elot_928', 'iso8859_1', 'eucjp', 'iso_ir_199', 'ibm865', 'cspc862latinhebrew', '863', 'iso_8859_5', 'latin4', 'windows_1253', 'csisolatingreek', 'latin5', '855', 'windows_1256', 'rot13', 'ms1361', 'windows_1254', 'ibm863', 'iso_8859_14_1998', 'utf8_ucs2', '500', 'iso8859', '775', 'l7', 'l2', 'gb18030_2000', 'l9', 'utf_32be', 'iso_ir_100', 'iso_8859_4', 'iso_ir_157', 'csibm857', 'shiftjis2004', 'iso2022jp_1', 'iso_8859_2_1987', 'cyrillic', 'ibm861', 'ms950', 'ibm437', '866', 'csibm863', '932', 'iso_8859_14', 'cskoi8r', 'csptcp154', '852', 'maclatin2', 'sjis', 'korean', '865', 'u32', 'csshiftjis', 'dbcs', 'csibm037', 'csibm1026', 'bz2', 'quopri', '860', '1255', '861', 'iso_ir_127', 'iso_celtic', 'chinese', 'l8', '1258', 'u_jis', 'cspc850multilingual', 'iso_2022_jp_2', 'greek8', 'csibm861', '646', 'unicode_1_1_utf_7', 'ibm862', 'latin2', 'ecma_118', 'csisolatinarabic', 'zlib', 'iso2022jp_3', 'ksx1001', '858', 'hkscs', 'shiftjisx0213', 'base64', 'ibm857', 'maccentraleurope', 'latin7', 'ruscii', 'cp_is', 'iso_ir_101', 'us_ascii', 'hebrew', 'ansi_x3.4_1986', 'csiso2022jp', 'iso_8859_15', 'ibm860', 'ebcdic_cp_us', 'x_mac_simp_chinese', 'csibm855', '1250', 'maciceland', 'iso_ir_148', 'iso2022jp', 'u16', 'u7', 's_jisx0213', 'iso_8859_6_1987', 'csisolatinhebrew', 'csibm424', 'quoted_printable', 'utf_16le', 'tis260', 'utf', 'x_mac_trad_chinese', '1256', 'cp866u', 'jisx0213', 'csiso58gb231280', 'windows_1250', 'cp1361', 'kz_1048', 'asmo_708', 'utf_16be', 'ecma_114', 'eucjis2004', 'x_mac_japanese', 'utf8', 'iso_ir_6', 'cp_gr', '037', 'big5_tw', 'eucgb2312_cn', 'iso_2022_jp_3', 'euc_cn', 'iso_8859_13', 'iso_8859_5_1988', 'maccyrillic', 'ks_c_5601_1987', 'greek', 'ibm869', 'roman8', 'csibm500', 'ujis', 'arabic', 'strk1048_2002', '424', 'iso_8859_11_2001', 'l5', 'iso_646.irv_1991', '869', 'ibm855', 'eucjisx0213', 'latin1', 'csibm866', 'ibm864', 'big5_hkscs', 'sjis_2004', 'us', 'iso_8859_7', 'macturkish', 'iso_2022_jp_2004', '437', 'windows_1255', 's_jis_2004', 's_jis', '1257', 'ebcdic_cp_wt', 'iso2022jp_2004', 'ms949', 'utf32', 'shiftjis', 'latin', 'windows_1251', '1125', 'ks_x_1001', 'iso_8859_10_1992', 'mskanji', 'cyrillic_asian', 'ibm273', 'tis620', '1026', 'csiso2022kr', 'cspc775baltic', 'iso_ir_58', 'latin8', 'ibm424', 'iso_ir_126', 'ansi_x3.4_1968', 'windows_1257', 'windows_1252', '949', 'base_64', 'ms936', 'csisolatin2', 'utf7', 'iso646_us', 'macroman', '1253', '862', 'iso_8859_1_1987', 'csibm860', 'gb2312_80', 'latin10', 'ksc5601', 'iso_8859_10', 'utf8_ucs4', 'csisolatin4', 'ebcdic_cp_be', 'iso_8859_1', 'hzgb', 'ansi_x3_4_1968', 'ks_c_5601', 'l3', 'cspc8codepage437', 'iso_8859_7_1987', '8859', 'ibm500', 'ibm1026', 'iso_8859_6', 'csibm865', 'ibm866', 'windows_1258', 'iso_ir_138', 'l4', 'utf_32le', 'iso_8859_11', 'thai', '864', 'euc_jis2004', 'cp936', '1251', 'zip', 'unicodebigunmarked', 'csHPRoman8', 'csibm858', 'utf16', '936', 'ibm037', 'iso_8859_8_1988', '857', 'csibm869', 'ebcdic_cp_he', 'cp819', 'euccn', 'iso_8859_2', 'ms932', 'iso_2022_jp_1', 'iso_2022_kr', 'csisolatin6', 'iso_2022_jp', 'x_mac_korean', 'latin3', 'csbig5', 'hz_gb', 'csascii', 'u8', 'csisolatin5', 'csisolatincyrillic', 'ms_kanji', 'cspcp852', 'rk1048', 'iso2022jp_ext', 'csibm273', 'iso_2022_jp_ext', 'ibm858', 'ibm850', 'sjisx0213', 'tis_620_2529_1', 'l10', 'iso_ir_109', 'ibm1125', '1254', 'euckr', 'tis_620_0', 'l1', 'ibm819', 'iso2022kr', 'ibm367', '950', 'r8', 'hex', 'cp154', 'tis_620_2529_0', 'iso_8859_16', 'pt154', 'ebcdic_cp_ca', 'ibm1140', 'l6', 'csibm864', 'csisolatin1', 'csisolatin3', 'latin6', 'iso_8859_9_1989', 'iso_8859_3_1988', 'unicodelittleunmarked', 'macintosh', '273', 'latin9', 'iso_8859_4_1988', 'iso_8859_9', 'ebcdic_cp_nl', 'iso_ir_144'])

Example 2: Code to encode the string

Python3

string = "¶"  # utf-8 character
 
# trying to encode using utf-8 scheme
print(string.encode('utf-8'))

Output:

b'\xc2\xb6'

Errors when using wrong encoding scheme

Example 1: Python String encode() method will raise UnicodeEncodeError if wrong encoding scheme is used

Python3

string = "¶"  # utf-8 character
 
# trying to encode using ascii scheme
print(string.encode('ascii'))

Output:

UnicodeEncodeError: 'ascii' codec can't encode character '\xb6' in position 0: ordinal not in range(128)

Example 2: Using ‘errors’ parameter to ignore errors while encoding

Python String encode() method with errors parameter set to ‘ignore’ will ignore the errors in conversion of characters into specified encoding scheme.

Python3

string = "123-¶"  # utf-8 character
 
# ignore if there are any errors
print(string.encode('ascii', errors='ignore'))

Output:

b'123-'

Python Strings encode() method

Python String encode() Method Syntax:

Python String encode() Method Example:

Python3

Example 1: Code to print encoding schemes available

Python3

Example 2: Code to encode the string

Python3

Errors when using wrong encoding scheme

Example 1: Python String encode() method will raise UnicodeEncodeError if wrong encoding scheme is used

Python3

Example 2: Using ‘errors’ parameter to ignore errors while encoding

Python3

Java Program for Longest Common Subsequence

Maximum height of Tree when any Node can be considered as Root

Print Fibonacci sequence using 2 variables

LEAVE A REPLY Cancel reply

Most Popular

Surfshark Black Friday & Cyber Monday Deals in 2024 by Gjurgjica Panova

7 Best Offline Password Managers in 2024: Just Updated by Manual Thomas

7 Best Parental Controls for WhatsApp in 2024 by Penka Hristovska

NordVPN Black Friday & Cyber Monday Deals in 2024 by Gjurgjica Panova

Recent Comments

EDITOR PICKS

Surfshark Black Friday & Cyber Monday Deals in 2024 by Gjurgjica Panova

7 Best Offline Password Managers in 2024: Just Updated by Manual Thomas

7 Best Parental Controls for WhatsApp in 2024 by Penka Hristovska

POPULAR POSTS

Surfshark Black Friday & Cyber Monday Deals in 2024 by Gjurgjica Panova

7 Best Offline Password Managers in 2024: Just Updated by Manual Thomas

7 Best Parental Controls for WhatsApp in 2024 by Penka Hristovska

POPULAR CATEGORY

ABOUT US

FOLLOW US