Introduction
MySQL string functions allow users to manipulate data strings or query information about a string returned by the SELECT
query.
In this article, you will learn how to use MySQL string functions.
Prerequisites
- MySQL Server and MySQL Shell installed
- A MySQL user account with root privileges
MySQL String Functions Cheat Sheet
Every string function is explained and exemplified in the article below. If it is more convenient for you, you can save the cheat sheet PDF by clicking the Download MySQL String Functions Cheat Sheet link.
Download MySQL String Functions Cheat Sheet
ASCII()
The syntax for the ASCII()
function is:
ASCII('str')
The ASCII()
string returns the ASCII (numeric) value of the leftmost character of the specified str
string. The function returns 0 if no str
is specified. Returns NULL
if str
is NULL
.
Use ASCII()
for characters with numeric values from 0 to 255.
For example:
In this example, the ASCII()
function returns the numeric value of p, the leftmost character of the specified str
string.
Note: Refer to our article to learn about different MySQL Data Types.
BIN()
The syntax for the BIN()
function is:
BIN(number)
The BIN()
function returns a binary value of the specified number
argument, where the number
is a BIGINTEGER
number. Returns NULL
if the number
argument is NULL
.
For example, the following query returns a binary representation of the number 25:
BIT_LENGTH()
The syntax for the BIT_LENGTH()
function is:
BIT_LENGTH('str')
The function outputs the length of the specified str
string in bits.
For example, the following query returns the bit length of the specified ‘example‘ string:
CHAR()
The syntax for the CHAR()
function is:
CHAR(number,... [USING charset_name])
CHAR()
interprets each specified number
argument as an integer and outputs a binary string of characters from the ASCII table. The function skips NULL
values.
For example:
Note: CHAR()
converts arguments larger than 255 into multiple result bytes. For example, CHAR(256)
is equivalent to CHAR(1,0)
.
If you want to produce an output other than binary, use the optional USING
clause and specify the desired character set. MySQL issues a warning if the result string is illegal for the specified character set.
CHAR_LENGTH(), i.e., CHARACTER_LENGTH()
The syntax for the CHAR_LENGTH
function is:
CHAR_LENGTH(str)
The function outputs the length of the specified str
string, measured in characters.
CHAR_LENGTH()
treats a multibyte character as a single character, which means that a string containing four 2-byte characters returns 4 as a result, whereas LENGTH()
returns 8.
For example:
CHARACTER_LENGTH()
is a synonym for CHAR_LENGTH()
.
Note: A multibyte character is a character composed of sequences of one or more bytes. Each byte sequence represents a single character in the extended character set.
CONCAT()
The CONCAT()
function concatenates two or more specified strings. The syntax is:
CONCAT(string1,string2,...)
The CONCAT
function converts all arguments to the string type before concatenating. If all arguments are nonbinary strings, the result is a non-binary string. On the other hand, concatenating binary strings results in a binary string. A numeric argument is converted to its equivalent nonbinary string form.
If any of the specified arguments are NULL
, CONCAT()
returns NULL
as a result.
For example:
The function puts the specified strings together into one, in this case, ‘phoenixNAP‘.
CONCAT_WS()
The syntax for CONCAT_WS()
is:
CONCAT_WS(separator,str1,str2,...)
CONCAT_WS()
is a special form of CONCAT()
that puts two or more expressions together and includes a separator. The separator splits the strings that you want to concatenate. If the separator is NULL
, the result is NULL
.
For example:
In this example, the separator is a blank space that separates the specified strings in the output.
ELT()
The syntax for the ELT()
function is:
ELT(N,str1,str2,str3,...)
The N
argument defines which of the specified strings to return as a result. ELT()
returns NULL
if N
is less than 1 or greater than the number of specified strings.
For example:
EXPORT_SET()
The syntax for EXPORT_SET()
is:
EXPORT_SET(bits,on,off[,separator[,number_of_bits]])
The EXPORT_SET()
function returns an ON
or OFF
string for every bit of the first argument, checking from right to left. The argument is an integer, but the function converts it into bits.
If the bit is 1, the function returns the ON
string. If the bit is 0, the function returns OFF
. EXPORT_SET()
places a separator between the return values. The default separator is a comma, but you can specify a different one as the fourth argument.
The strings are added to the output result from left to right, separated by the separator string. The number_of_bits
argument specifies how many bits to examine.
For example:
Explanation:
1. After conversion, the first argument 5 stands for 00000101.
2. Checking from right to left, the first bit is 1, so the function returns the ‘Yes‘ argument (the ON
string). The second bit is 0, so the function returns ‘No‘ (the OFF
string). For the third bit, it returns ‘Yes.’ For all the remaining bits (zeros), it returns ‘No.’
3. The fourth argument ‘–‘ is specified as a separator in the return result.
FIELD()
The syntax for the FIELD()
syntax is:
FIELD(str,str1,str2,str3,...)
The function returns the index position of a string in a string list. If there is no such string, the output is 0. If the string is NULL
, the function returns 0. The FIELD()
function is case insensitive.
For example:
The function returns 6, which is the position of the string ‘f‘ in the list.
FIND_IN_SET()
The syntax for the FIND_IN_SET()
function is:
FIND_IN_SET(str,strlist)
The function returns the position of a string in a list of strings. If there are several string instances, the output returns only the first position of the specified string.
For example:
FORMAT()
The syntax for the FORMAT()
function is:
FORMAT(X,D)
The function outputs the specified number X
in a format like ‘#,###,###.##’, rounded to the specified number of decimal places D
. The result has no decimal point if D
is 0.
Users can also specify the locale after the D
argument, which affects the output.
For example:
The output rounds the number to 3 decimal places, and the German locale causes a .
symbol to denote thousands and the ,
character to denote fractions.
FROM_BASE64()
The syntax for the FROM_BASE64()
function is:
FROM_BASE64(str)
The function decodes the specified base-64 encoded string and returns the result as a binary string. If the argument is NULL
or an invalid base-64 string, the result is NULL
.
FROM_BASE64()
is the reverse of TO_BASE64()
as TO_BASE64()
encodes a query in base64.
For example:
The first query encodes the specified string in base64. The second query decodes the base64 encoded string and returns the original value.
HEX()
The syntax for the HEX()
function is:
HEX(N_or_S)
The function returns a string representation of a hexadecimal value of the specified N
decimal value or S
string value.
If the argument is a string
, HEX
converts each character to two hexadecimal digits. On the other hand, if the argument is a decimal
, the output is a hexadecimal string representation of the argument and treats it as a BIGINTEGER
number.
The HEX()
string function is equivalent to the mathematical function CONV(N,10,16)
.
For example:
The output returns the hexadecimal value of the specified string.
INSERT()
The syntax for the INSERT()
function is:
INSERT(str,pos,len,newstr)
The function inserts a newstr
string within the str
string and removes the len
number of original characters beginning at the pos
position.
If the pos
argument isn’t within the original string length, INSERT()
returns the original string.
If the len
argument isn’t within the length of the rest of the string, INSERT()
replaces the rest of the string from the pos
position.
If any argument is NULL
, INSERT()
returns NULL
.
For example:
The output is the original string with the new string inserted at position 5, with no original characters removed.
INSTR()
The syntax for the INSTR()
function is:
INSTR(str,substr)
The function outputs the position of the first appearance of the substr
substring in the original str
string.
The function works the same way as LOCATE()
, except the argument order is reversed.
For example:
The output indicates the substring location – position 8.
LEFT()
The syntax for the LEFT()
function is:
LEFT('str', chars)
The function outputs the number of leftmost characters chars
from the specified str
string.
If any argument is NULL
, the output is also NULL
.
For example:
LENGTH(), i.e., OCTET_LENGTH()
The syntax for the LENGTH()
function is:
LENGTH(str)
The function outputs the str
string length in bytes. Multibyte characters count as multiple bytes.
For example:
The OCTET_LENGTH()
function is a synonym for LENGTH()
.
LIKE
The syntax for the LIKE
function is:
expr LIKE pat
The function performs pattern matching by finding the specified string pattern within other strings.
LIKE
supports wildcards:
%
-Matches any number of characters, even zero._
– Matches exactly one character.
LIKE
returns 1 (true) or 0 (false). If the expr
expression or pat
pattern is NULL
, the output is also NULL
.
For example:
In this example, we retrieved all the customers whose name begins with ‘A‘.
Note: Learn how to speed up your database using the MySQL CREATE INDEX statement.
LOAD_FILE()
The syntax for the LOAD_FILE()
function is:
LOAD_FILE(file_name)
The function reads the file and outputs a string containing the file contents. The prerequisites for this function are:
- Having the file on the server host.
- Specifying the full file path in place of the file_name argument.
- Having the FILE privilege.
The server must be able to read the file, and its size must be less than max_allowed_packet
bytes. If the secure_file_priv
system variable is a non-empty directory name, place the file in that directory.
If the file doesn’t exist or the function cannot read it for one of the above reasons, the output is NULL
.
For example:
LOCATE(), i.e., POSITION()
The syntax for the LOCATE()
function is:
LOCATE(substring,str,[position])
The function outputs the position of the first occurrence of the specified substring
argument within the str
string. The position
argument is optional and used to specify from which str
string position to start searching. Omitting the position
argument starts searching from the beginning.
If the substring
is not in the str
string, LOCATE()
returns 0. If any argument is NULL
, the function returns NULL
.
For example:
The POSITION(substring IN str)
function is a synonym for LOCATE(substr,str)
.
LOWER(), i.e., LCASE()
The syntax for the LOWER()
function is:
LOWER(str)
The function changes all characters of the specified str
string to lowercase and outputs the result. The default character set mapping it uses is utf8mb4. LOWER()
is multibyte safe.
For example:
The LCASE()
function is a synonym for LOWER()
.
LPAD()
The syntax for the LPAD()
function is:
LPAD(str,len,padstr)
The function outputs the specified str
string, left-padded with the padstr
string, to a length of len
characters. The function shortens the output to len
characters if the str
argument is longer than len
.
LPAD()
is multibyte safe.
For example:
In this example, the LPAD()
function left-pads the specified argument with the specified padstr
, up to 10 characters.
Note: Follow these tips to tune up and optimize your MySQL database.
LTRIM()
The syntax for the LTRIM()
function is:
LTRIM(str)
The function outputs the specified str
string without the leading space characters.
For example:
MAKE_SET()
The syntax for the MAKE_SET()
function is:
MAKE_SET(bits,str1,str2,...)
The function outputs a set value, i.e., a string containing the specified substrings with the corresponding bit specified in the bits
argument.
The str1
argument corresponds to bit 0, str2
corresponds to bit 1, etc. If any of the arguments is NULL
, they don’t appear in the result.
For example:
In this example, the first bit is 1, i.e., 001. The rightmost digit is 1, so the function returns ‘phoenix.’ The second bit is 2, i.e., 010, the middle number is 1, so the function returns ‘NAP,’ thus completing the output.
MATCH()
The syntax for the MATCH()
function is:
MATCH(col1, col2,…) AGAINST(expr[search_modifier])
The function allows users to perform full-text searches by specifying a list of columns separated by commas. Enter a string you want to search for in place of the expr
argument.
The search_modifier
argument is optional and indicates the search type. The accepted values are:
IN NATURAL LANGUAGE MODE
(default)IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
IN BOOLEAN MODE
WITH QUERY EXPANSION
Note: When performing a full-text search, make sure your tables have a FULLTEXT
index.
For example:
NOT LIKE
The syntax for the NOT LIKE
function is:
expr NOT LIKE pat [ESCAPE 'escape_char']
NOT LIKE
is a negation of LIKE
, meaning that it operates under the same conditions as LIKE
and uses the same wildcards.
For example:
The output lists all customers and their city except the customers whose name starts with ‘A.’
NOT REGEXP
The syntax for the NOT REGEXP
function is:
expr NOT REGEXP pat
The function performs a pattern match of the expr
string against the pat
pattern. The pattern can be an extended regular expression.
NOT REGEXP
is a negation of REGEXP
.
If the expr
argument matches the pat
argument, the output is 1. Otherwise, the output is 0. If either argument is NULL
, the output is NULL
.
For example:
The above example outputs all customers who don’t live in cities starting with L. The ‘^‘ character marks the start of the city name.
OCT()
The syntax for the OCT()
function is:
OCT(N)
The function outputs the octal value of the specified N
argument, where N
is a BIGINTEGER
number. If N
is NULL
, the function returns NULL
.
For example:
ORD()
The syntax for the ORD()
function is:
ORD(str)
The function finds the code of the leftmost multibyte character in a string. If the leftmost character isn’t multibyte, ORD()
returns the character’s ASCII value.
The function calculates the character code from the numeric values of its constituent bytes. The formula used for this operation is:
(1st byte code) + (2nd byte code * 256) + (3rd byte code * 256^2) …
For example:
QUOTE()
The syntax for the QUOTE()
function is:
QUOTE(str)
The function outputs a string that represents properly escaped data value usable in an SQL statement. Single quotes enclose the string and it contains a backslash (\) before each instance of backslash (\), single quote (‘), ASCII NUL, and Control+Z.
If the str
argument is NULL
, the output is NULL
.
For example:
The example above selects all customers that live in the UK and encloses their addresses in single quotes.
Note: Learn about MyISAM and InnoDB, the two types of MySQL database storage engines.
REGEXP_LIKE(), REGEXP, RLIKE
The syntax for the REGEXP_LIKE()
function is:
REGEXP_LIKE(expr, pat, [match_type])
The function outputs 1 if the expr
string matches the expression specified in place of the pat
argument. Otherwise, the output is 0. If the expr
or pat
argument is NULL
, the output value is NULL
.
The match_type
argument is optional and represents a string that may contain any or all of the following flags that specify the matching type:
- Case-sensitive matching (
c
). Handle the arguments as binary strings with case sensitivity if either argument is a binary string. Thec
flag means case sensitivity is adopted even if thei
flag is also specified. - Case-insensitive matching (
i
). Handle the arguments without case sensitivity. - Multiple-line mode (
m
). Recognize line terminators within the string. The default setting is to match line terminators only at the string expression start and end. - The . character matches line terminators (
n
). Used to modify the . (dot) character to match line terminators. By default, . matching stops at the end of a line. - Unix-only line endings (
u
). Unix-only line endings that recognize only the newline character by the ., ^, and $ match operators.
If contradictory flags are specified within match_type
, the rightmost one takes precedence.
REGEXP
and RLIKE
are synonyms for REGEXP_LIKE()
.
Note: MySQL uses C escape syntax in strings. For example, \n
represents the newline character. If you want your expr or pat argument to contain a literal \, you must double it. In case the NO_BACKSLASH_ESCAPES SQL
mode is enabled no escape character is used.
For example:
In this example, the regular expression can specify any character in place of the dot, so the function outputs a 1 to indicate a match.
REGEXP_INSTR()
The syntax for the REGEXP_INSTR()
function is:
REGEXP_INSTR(expr, pat[, pos[, occurrence[, return_option[, match_type]]]])
The function outputs the starting index of a substring that matches the expr
expression’s pat pattern. If there is no match, the output is 0. If either argument is NULL
, the output is NULL
. Character indexes begin at 1.
The optional arguments are:
pos
– Specify the position inexpr
where to start the search. If omitted, the default is 1.occurrence
– Specify which occurrence of a match to search for. If omitted, the default is 1.return_option
– Which position type to return. If set to 0,REGEXP_INSTR()
returns the matched substring’s first character position. If set to 1,REGEXP_INSTR()
returns the position following the matched substring. If omitted, the default is 0.match_type
– Specifies how to match. The argument is the same as inREGEXP_LIKE()
and takes the same flags.
For example:
In this example, there is a match, and the substring starts at position 1.
REGEXP_REPLACE()
The syntax for the REGEXP_REPLACE()
function is:
REGEXP_REPLACE(expr, pat, repl[, pos[, occurrence[, match_type]]])
The function replaces every occurrence in the expr
string specified by the pat
pattern with the repl
string, and outputs the resulting string. If there is a match, the output is the whole string with the replacements. If there is no match, the output is the original expr
string. If any argument is NULL
, the output is NULL
.
The optional REGEXP_REPLACE()
arguments are:
pos
– The position inexpr
where to start the search. If omitted, the default is 1.occurrence
– Which match occurrence to replace. If omitted, the default is 0 and replaces all occurrences.match_type
– Specifies how to match. The argument is the same as inREGEXP_LIKE()
and takes the same flags.
For example:
REGEXP_SUBSTR()
The syntax for the REGEXP_SUBSTR()
function is:
REGEXP_SUBSTR(expr, pat[, pos[, occurrence[, match_type]]])
The function outputs the substring of the expr
string that matches the regular expression specified by the pat
pattern. If there is no match, the result is NULL
. If any argument is NULL
, the output is NULL
.
The optional arguments are:
pos
– The position inexpr
where to start the search. If omitted, the default is 1.occurrence
– Which match occurrence to replace. If omitted, the default is 1.match_type
– Specifies how to match. The argument is the same as inREGEXP_LIKE()
and takes the same flags.
For example:
In this example, the result outputs the matching substring from the specified expr
string.
REPEAT()
The syntax for the REPEAT()
function is:
REPEAT(str,count)
The function outputs a string that repeats the str
string count
times. If the count
argument is less than 1, the function outputs an empty string. If either argument is NULL
, the result is NULL
.
For example:
In the example above, the function outputs a string consisting of the ‘Work‘ string repeated six times.
REPLACE()
The syntax for the REPLACE()
function is:
REPLACE(str,from_str,to_str)
The function replaces all instances of from_str
within the str
string with the specified to_str
string. The function is case-sensitive and multibyte safe.
For example:
REVERSE()
The syntax for the REVERSE()
function is:
REVERSE(str)
The function outputs the str
string with a reversed character order. REVERSE()
is a multibyte-safe function.
For example:
RIGHT()
The syntax for the RIGHT()
function is:
RIGHT(str,len)
The function outputs the rightmost len
number of characters from the str
string. If any argument is NULL
, the result is NULL
. RIGHT()
is a multibyte-safe function.
For example:
RPAD()
The syntax for the RPAD()
function is:
RPAD(str,len,padstr)
The function outputs the specified str
string, right-padded with the padstr
string, to a length of len
characters. The str
argument being longer than len
shortens the output to len
characters.
RPAD()
is multibyte safe.
For example:
RTRIM()
The syntax for the RTRIM()
function is:
RTRIM(str)
The function outputs the str
string without the trailing space characters. The RTRIM()
function is multibyte safe.
For example:
SOUNDEX(), i.e., SOUNDS LIKE
The syntax for the SOUNDEX()
function is:
SOUNDEX(str)
The function outputs a soundex string, i.e., a phonetic representation of the input str
string. The SOUNDEX()
function allows users to compare English words that are spelled differently but sound alike.
SOUNDEX()
ignores all non-alphabetic characters in the input string and treats all characters outside the A-Z range as vowels.
Important: The SOUNDEX()
function works well only with strings in English. Results are unreliable for strings in other languages and for strings that use multibyte character sets, including utf-8.
For example:
The (expr1) SOUNDS LIKE (expr2)
function is the same as SOUNDEX(expr1) = SOUNDEX(expr2)
.
SPACE()
The syntax for the SPACE()
function is:
SPACE(N)
The function outputs a string consisting of N
number of space characters.
For example:
STRCMP()
The syntax for the STRCMP()
function is:
STRCMP(expr1,expr2)
The function compares the two expressions and outputs:
0
– If the two expressions are the same.-1
– If the first expression is smaller than the second depending on the current sort order.1
– If the second expression is smaller than the first one.
For example:
In this example, the output is 1 because the second argument is smaller than the first one.
Note: STRCMP()
compares the arguments using collation. If the collations are incompatible, one argument must be converted to ensure compatibility.
SUBSTRING(), i.e., SUBSTR(), MID()
The syntax for the SUBSTRING()
function is:
SUBSTRING(str, pos, length)
or:
SUBSTRING(str FROM pos FOR length)
The function extracts a substring from a string, starting at a specified position.
The length
argument is optional and used to return a substring length
characters long from the str
string, starting at pos
position.
The pos
argument specifies from which position to extract the substring. If pos
is a positive number, the function extracts a substring from the beginning of the string. If pos
is a negative number, the function extracts a substring from the end of the string.
For example:
MID(str,pos,length)
and SUBSTR()
are synonyms for SUBSTRING(str,pos,length)
.
SUBSTRING_INDEX()
The syntax for the SUBSTRING_INDEX()
function is:
SUBSTRING_INDEX(str,delim,count)
The function outputs a substring from the str
string before a specified count
number of delim
delimiter occurs.
If the count
argument is positive, the function outputs everything left of the final delimiter, counting from the left side.
If the count
argument is negative, the function outputs everything right of the final delimiter, counting from the right side.
SUBSTRING_INDEX()
searches for the delimiter in a case-sensitive fashion, and it is multibyte safe.
For example:
The example above shows the different outputs when the count argument is positive and negative.
TO_BASE64()
The syntax for the TO_BASE64()
function is:
TO_BASE64(str)
The function encodes a string argument to a base-64 encoded form and returns the result. If the argument isn’t a string, the function converts it to a string before base-64 encoding.
If the argument is NULL
, the result is NULL
.
TO_BASE64()
is the reverse of FROM_BASE64()
.
For example:
The output is a base-64 encoded string.
TRIM()
The syntax for the TRIM()
function is:
TRIM([{BOTH | LEADING | TRAILING} [remstr] FROM] str)
The function removes all remstr
prefixes and suffixes from the specified str
string and outputs the result.
Unless specifying the BOTH, LEADING
,or TRAILING
specifiers, the function assumes BOTH
.
The remstr
argument is optional, and omitting it removes the space characters from the string.
TRIM()
is multibyte safe.
For example:
In this example, the function removes the specified leading prefix from the string.
UPPER(), i.e., UCASE()
The syntax for the UPPER()
function is:
UPPER(str)
The function changes all characters of the specified str
string to uppercase and outputs the result. The default character set mapping it uses is utf8mb4. UPPER()
is multibyte safe.
For example:
The UCASE()
function is a synonym for UPPER()
.
UNHEX()
The syntax for the UNHEX()
function is:
UNHEX(str)
The function interprets each pair of characters in a string argument as a hexadecimal number and converts it to the byte represented by the number. The output is a binary result.
If the str
argument contains non-hexadecimal digits, the output is NULL
. A NULL
output can also occur if the argument is a BINARY
column.
UNHEX()
is the opposite of HEX()
. However, you shouldn’t use UNHEX()
to inverse the HEX()
result of numeric arguments. Instead, use the mathematical function CONV(HEX(N),16,10)
.
For example:
WEIGHT_STRING()
The syntax for the WEIGHT_STRING()
function is:
WEIGHT_STRING(str [AS {CHAR|BINARY}(N)] [flags])
str
– The input string argument.AS
– Optional clause, permits casting the input string to a binary or non-binary string, and to a specific length.flags
– Optional argument, currently unused.
The function outputs the weight string for the input str
string. The output value represents the string’s sorting and comparison value.
If used, the AS BINARY(N)
argument measures the length in bytes rather than characters, and right-pads with 0x00 bytes to the specified length.
On the other hand, the AS CHAR(N)
argument measures the characters’ length and right-pads with spaces to the specified length.
N
has a minimum value of 1. If N
is less than the input string length, the string is truncated without issuing a warning.
If the input string is a non-binary value (CHAR
, VARCHAR
, or TEXT
), the output contains the collation weights for the string. If the input string is a binary value (BINARY
, VARBINARY
, or BLOB
), the output is the same as the input string because the weight for each byte in a binary string is the byte value.
If the input string is NULL
, the output is NULL
.
Important: WEIGHT_STRING()
is a debugging function intended for internal use and collation testing and debugging. Its behavior is subject to change between different MySQL versions.
For example:
In this example, we used HEX()
to display the output because HEX()
can display binary results containing nonprinting values in a printable form.
Conclusion
You now know the different MySQL string functions and how to use them. Feel free to test them out for yourself to make sure you learn all the ropes.
Find other useful commands in our MySQL Commands Cheat Sheet.