Expressions can be used at several points in SQL statements, such as
in the ORDER BY or HAVING
clauses of SELECT statements, in the
WHERE clause of a SELECT,
DELETE, or UPDATE statement,
or in SET statements. Expressions can be written
using literal values, column values, NULL,
built-in functions, stored functions, user-defined functions, and
operators. This chapter describes the functions and operators that
are allowed for writing expressions in MySQL. Instructions for
writing stored functions and user-defined functions are given in
Chapter 17, Stored Procedures and Functions, and
Section 24.2, “Adding New Functions to MySQL”. See
Section 9.2.3, “Function Name Parsing and Resolution”, for the rules describing how
the server interprets references to different kinds of functions.
An expression that contains NULL always produces
a NULL value unless otherwise indicated in the
documentation for a particular function or operator.
Note: By default, there must be no
whitespace between a function name and the parenthesis following it.
This helps the MySQL parser distinguish between function calls and
references to tables or columns that happen to have the same name as
a function. However, spaces around function arguments are permitted.
You can tell the MySQL server to accept spaces after function names
by starting it with the --sql-mode=IGNORE_SPACE
option. (See Section 5.2.6, “SQL Modes”.) Individual client
programs can request this behavior by using the
CLIENT_IGNORE_SPACE option for
mysql_real_connect(). In either case, all
function names become reserved words.
For the sake of brevity, most examples in this chapter display the
output from the mysql program in abbreviated
form. Rather than showing examples in this format:
mysql> SELECT MOD(29,9);
+-----------+
| mod(29,9) |
+-----------+
|         2 |
+-----------+
1 row in set (0.00 sec)
This format is used instead:
mysql> SELECT MOD(29,9);
-> 2
12.1. Operator and Function Reference
Note
This table is part of an ongoing process to expand and simplify
the information provided on these elements. Further improvements
to the table and the corresponding descriptions will be applied
over the coming months.
Operator precedences are shown in the following list, from
lowest precedence to the highest. Operators that are shown
together on a line have the same precedence.
:=
||, OR, XOR
&&, AND
NOT
BETWEEN, CASE, WHEN, THEN, ELSE
=, <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, IN
|
&
<<, >>
-, +
*, /, DIV, %, MOD
^
- (unary minus), ~ (unary bit inversion)
!
BINARY, COLLATE
The precedence shown for NOT is as of MySQL
5.0.2. For earlier versions, or from 5.0.2 on if the
HIGH_NOT_PRECEDENCE SQL mode is enabled, the
precedence of NOT is the same as that of the
! operator. See
Section 5.2.6, “SQL Modes”.
The precedence of operators determines the order of evaluation
of terms in an expression. To override this order and group
terms explicitly, use parentheses. For example, 1+2*3 evaluates
to 7, whereas (1+2)*3 evaluates to 9.
When an operator is used with operands of different types, type
conversion occurs to make the operands compatible. Some
conversions occur implicitly. For example, MySQL automatically
converts numbers to strings as necessary, and vice versa.
It is also possible to perform explicit conversions. If you want
to convert a number to a string explicitly, use the
CAST() or CONCAT()
function (CAST() is preferable). For example,
CAST(38.8 AS CHAR) and CONCAT(38.8) both return the string '38.8'.
The following rules describe how conversion occurs for
comparison operations:
If one or both arguments are NULL, the
result of the comparison is NULL, except
for the NULL-safe
<=> equality comparison operator.
For NULL <=> NULL, the result is
true.
If both arguments in a comparison operation are strings,
they are compared as strings.
If both arguments are integers, they are compared as
integers.
Hexadecimal values are treated as binary strings if not
compared to a number.
If one of the arguments is a TIMESTAMP or
DATETIME column and the other argument is
a constant, the constant is converted to a timestamp before
the comparison is performed. This is done to be more
ODBC-friendly. Note that this is not done for the arguments
to IN()! To be safe, always use complete
datetime, date, or time strings when doing comparisons.
In all other cases, the arguments are compared as
floating-point (real) numbers.
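When a string is converted to a number for a comparison
operation, only its leading numeric portion is used: '6x' is
compared as 6 and 'x6' as 0. Consequently, 7 > '6x' and
0 = 'x6' both return 1, whereas 1 > '6x' and 0 > 'x6' both return 0.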
Note that when you are comparing a string column with a number,
MySQL cannot use an index on the column to look up the value
quickly. If str_col is an indexed
string column, the index cannot be used when performing the
lookup in the following statement:
SELECT * FROM tbl_name WHERE str_col=1;
The reason for this is that there are many different strings
that may convert to the value 1, such as
'1', ' 1', or
'1a'.
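The many-to-one nature of this conversion can be modelled outside the server. The following is a minimal Python sketch, not MySQL's implementation: the helper name to_number and its regular expression are assumptions made for illustration, based only on the rule that the leading numeric portion of a string is used and that a string with no such portion counts as 0.

```python
import re

# Hypothetical helper (not a MySQL API): reads optional leading whitespace,
# then the longest numeric prefix; a string with no numeric prefix counts as 0.
_NUMERIC_PREFIX = re.compile(r"\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)")

def to_number(s):
    match = _NUMERIC_PREFIX.match(s)
    return float(match.group(1)) if match else 0.0

# Several distinct strings collapse to the same number, so an index that is
# ordered by string value cannot be used to find "all strings equal to 1".
candidates = ["1", " 1", "1a", "x1"]
matches = [s for s in candidates if to_number(s) == 1]
print(matches)
```

Because the matching strings are scattered across the string ordering, every row has to be converted and compared individually.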
Comparisons that use floating-point numbers (or values that are
converted to floating-point numbers) are approximate because
such numbers are inexact. This might lead to results that appear
inconsistent.
Furthermore, the conversion from string to floating-point and
from integer to floating-point do not necessarily occur the same
way. The integer may be converted to floating-point by the CPU,
whereas the string is converted digit by digit in an operation
that involves floating-point multiplications.
The results of such comparisons vary on different systems, and can be
affected by factors such as computer architecture or the
compiler version or optimization level. One way to avoid such
problems is to use CAST() so that a value
will not be converted implicitly to a floating-point number:
mysql> SELECT CAST('18015376320243459' AS UNSIGNED) = 18015376320243459;
-> 1
Comparison operations result in a value of 1
(TRUE), 0
(FALSE), or NULL. These
operations work for both numbers and strings. Strings are
automatically converted to numbers and numbers to strings as
necessary.
Some of the functions in this section (such as
LEAST() and GREATEST())
return values other than 1
(TRUE), 0
(FALSE), or NULL. However,
the value they return is based on comparison operations
performed according to the rules described in
Section 12.2.2, “Type Conversion in Expression Evaluation”.
To convert a value to a specific type for comparison purposes,
you can use the CAST() function. String
values can be converted to a different character set using
CONVERT(). See
Section 12.9, “Cast Functions and Operators”.
By default, string comparisons are not case sensitive and use
the current character set. The default is
latin1 (cp1252 West European), which also
works well for English.
<=>
NULL-safe equal. This operator performs
an equality comparison like the =
operator, but returns 1 rather than
NULL if both operands are
NULL, and 0 rather
than NULL if one operand is
NULL.
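The difference between the two equality operators can be sketched in a few lines of Python. This is an illustrative model only: None stands in for NULL, and the function names eq and null_safe_eq are invented for this sketch.

```python
def eq(a, b):
    """Model of '=': any NULL operand makes the result NULL."""
    if a is None or b is None:
        return None
    return int(a == b)

def null_safe_eq(a, b):
    """Model of '<=>': never returns NULL."""
    if a is None or b is None:
        return int(a is None and b is None)
    return int(a == b)

print(eq(1, 1), eq(None, None), eq(1, None))
print(null_safe_eq(1, 1), null_safe_eq(None, None), null_safe_eq(1, None))
```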
IS boolean_value, IS NOT boolean_value
Tests a value against a boolean value, where
boolean_value can be
TRUE, FALSE, or
UNKNOWN.
mysql> SELECT 1 IS TRUE, 0 IS FALSE, NULL IS UNKNOWN;
-> 1, 1, 1
mysql> SELECT 1 IS NOT UNKNOWN, 0 IS NOT UNKNOWN, NULL IS NOT UNKNOWN;
-> 1, 1, 0
IS [NOT]
boolean_value syntax
was added in MySQL 5.0.2.
IS NULL, IS NOT NULL
Tests whether a value is or is not NULL.
mysql> SELECT 1 IS NULL, 0 IS NULL, NULL IS NULL;
-> 0, 0, 1
mysql> SELECT 1 IS NOT NULL, 0 IS NOT NULL, NULL IS NOT NULL;
-> 1, 1, 0
To work well with ODBC programs, MySQL supports the
following extra features when using IS
NULL:
You can find the row that contains the most recent
AUTO_INCREMENT value by issuing a
statement of the following form immediately after
generating the value:
SELECT * FROM tbl_name WHERE auto_col IS NULL
For DATE and
DATETIME columns that are declared as
NOT NULL, you can find the special
date '0000-00-00' by using a
statement like this:
SELECT * FROM tbl_name WHERE date_column IS NULL
This is needed to get some ODBC applications to work
because ODBC does not support a
'0000-00-00' date value.
expr BETWEEN
min AND
max
If expr is greater than or equal
to min and
expr is less than or equal to
max, BETWEEN
returns 1, otherwise it returns
0. This is equivalent to the expression
(min <=
expr AND
expr <=
max) if all the
arguments are of the same type. Otherwise type conversion
takes place according to the rules described in
Section 12.2.2, “Type Conversion in Expression Evaluation”, but applied to all the
three arguments.
mysql> SELECT 1 BETWEEN 2 AND 3;
-> 0
mysql> SELECT 'b' BETWEEN 'a' AND 'c';
-> 1
mysql> SELECT 2 BETWEEN 2 AND '3';
-> 1
mysql> SELECT 2 BETWEEN 2 AND 'x-3';
-> 0
For best results when using BETWEEN with
date or time values, you should use
CAST() to explicitly convert the values
to the desired data type. Examples: If you compare a
DATETIME to two DATE
values, convert the DATE values to
DATETIME values. If you use a string
constant such as '2001-1-1' in a
comparison to a DATE, cast the string to
a DATE.
expr NOT BETWEEN
min AND
max
This is the same as NOT
(expr BETWEEN
min AND
max).
COALESCE(value,...)
Returns the first non-NULL value in the
list, or NULL if there are no
non-NULL values.
Before MySQL 5.0.13, GREATEST() returns
NULL only if all arguments are
NULL. As of 5.0.13, it returns
NULL if any argument is
NULL.
expr IN
(value,...)
Returns 1 if
expr is equal to any of the
values in the IN list, else returns
0. If all values are constants, they are
evaluated according to the type of
expr and sorted. The search for
the item then is done using a binary search. This means
IN is very quick if the
IN value list consists entirely of
constants. Otherwise, type conversion takes place according
to the rules described in Section 12.2.2, “Type Conversion in Expression Evaluation”,
but applied to all the arguments.
mysql> SELECT 2 IN (0,3,5,7);
-> 0
mysql> SELECT 'wefwf' IN ('wee','wefwf','weg');
-> 1
You should never mix quoted and unquoted values in an
IN list because the comparison rules for
quoted values (such as strings) and unquoted values (such as
numbers) differ. Mixing types may therefore lead to
inconsistent results. For example, do not write an
IN expression like this:
SELECT val1 FROM tbl1 WHERE val1 IN (1,2,'a');
Instead, write it like this:
SELECT val1 FROM tbl1 WHERE val1 IN ('1','2','a');
The number of values in the IN list is
only limited by the max_allowed_packet
value.
To comply with the SQL standard, IN
returns NULL not only if the expression
on the left hand side is NULL, but also
if no match is found in the list and one of the expressions
in the list is NULL.
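The NULL handling just described can be modelled as follows. This is a Python sketch under stated assumptions, not the server's algorithm: None stands in for NULL, the name sql_in is invented, and a plain linear scan is used instead of the binary search mentioned earlier.

```python
def sql_in(expr, values):
    """Model of 'expr IN (values...)' with SQL NULL semantics."""
    if expr is None:
        return None          # NULL on the left-hand side
    saw_null = False
    for value in values:
        if value is None:
            saw_null = True  # remember that the outcome may be unknown
        elif value == expr:
            return 1         # a definite match wins
    # No match: the result is unknown if any list element was NULL.
    return None if saw_null else 0

print(sql_in(2, [0, 3, 5, 7]), sql_in(3, [3, None]), sql_in(2, [0, None]))
```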
ISNULL(expr)
ISNULL() can be used instead of
= to test whether a value is
NULL. (Comparing a value to
NULL using = always
yields false.)
The ISNULL() function shares some special
behaviors with the IS NULL comparison
operator. See the description of IS NULL.
INTERVAL(N,N1,N2,N3,...)
Returns 0 if N
< N1, 1 if
N <
N2 and so on or
-1 if N is
NULL. All arguments are treated as
integers. It is required that N1
< N2 <
N3 < ...
< Nn for this function to work
correctly. This is because a binary search is used (very
fast).
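The lookup that INTERVAL() performs can be approximated with Python's standard bisect module. This is a sketch of the described behavior rather than the server code; the function name interval is an assumption, and the thresholds must already be sorted in ascending order, as the description requires.

```python
from bisect import bisect_right

def interval(n, *thresholds):
    """Model of INTERVAL(N, N1, N2, ...): number of thresholds <= N."""
    if n is None:
        return -1
    # Binary search; only meaningful when thresholds are in ascending order.
    return bisect_right(thresholds, n)

print(interval(23, 1, 15, 17, 30, 44, 200))
print(interval(10, 1, 10, 100, 1000))
print(interval(22, 23, 30, 44, 200))
```

If the thresholds are not sorted, a binary search can stop in the wrong half of the list, which is why the ordering requirement matters.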
Note that the preceding conversion rules can produce strange
results in some borderline cases:
mysql> SELECT CAST(LEAST(3600, 9223372036854775808.0) as SIGNED);
-> -9223372036854775808
This happens because MySQL reads
9223372036854775808.0 in an integer
context. The integer representation is not good enough to
hold the value, so it wraps to a signed integer.
In SQL, all logical operators evaluate to
TRUE, FALSE, or
NULL (UNKNOWN). In MySQL,
these are implemented as 1 (TRUE), 0
(FALSE), and NULL. Most of
this is common to different SQL database servers, although some
servers may return any non-zero value for
TRUE.
Note that MySQL evaluates any non-zero,
non-NULL value to TRUE.
For example, the following statements all evaluate to
TRUE:
mysql> SELECT 10 IS TRUE;
-> 1
mysql> SELECT -10 IS TRUE;
-> 1
mysql> SELECT 'string' IS NOT NULL;
-> 1
NOT, !
Logical NOT. Evaluates to 1 if the
operand is 0, to 0 if
the operand is non-zero, and NOT NULL
returns NULL.
mysql> SELECT NOT 10;
-> 0
mysql> SELECT NOT 0;
-> 1
mysql> SELECT NOT NULL;
-> NULL
mysql> SELECT ! (1+1);
-> 0
mysql> SELECT ! 1+1;
-> 1
The last example produces 1 because the
expression evaluates the same way as
(!1)+1.
OR, ||
Logical OR. When both operands are
non-NULL, the result is
1 if any operand is non-zero, and
0 otherwise. With a
NULL operand, the result is
1 if the other operand is non-zero, and
NULL otherwise. If both operands are
NULL, the result is
NULL.
XOR
Logical XOR. Returns NULL if either
operand is NULL. For
non-NULL operands, evaluates to
1 if an odd number of operands is
non-zero, otherwise 0 is returned.
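The three-valued behavior of OR and XOR described above can be written out as a small truth-table model. This is a Python sketch for illustration only; None stands in for NULL and the function names are invented.

```python
def truth(value):
    """Map a value to True, False, or None (unknown)."""
    return None if value is None else value != 0

def sql_or(a, b):
    ta, tb = truth(a), truth(b)
    if ta is True or tb is True:
        return 1              # one true operand decides the result
    if ta is None or tb is None:
        return None           # otherwise a NULL operand leaves it unknown
    return 0

def sql_xor(a, b):
    if a is None or b is None:
        return None           # XOR is unknown as soon as either side is NULL
    return int(truth(a) != truth(b))

print(sql_or(1, None), sql_or(0, None), sql_or(0, 0))
print(sql_xor(1, 1), sql_xor(1, 0), sql_xor(sql_xor(1, 1), 1))
```

Chaining sql_xor shows the "odd number of non-zero operands" rule: three non-zero operands yield 1.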
CASE value WHEN
[compare_value] THEN
result [WHEN
[compare_value] THEN
result ...] [ELSE
result] END
CASE WHEN [condition] THEN
result [WHEN
[condition] THEN
result ...] [ELSE
result] END
The first version returns the
result where
value=compare_value.
The second version returns the result for the first condition
that is true. If there was no matching result value, the
result after ELSE is returned, or
NULL if there is no ELSE
part.
mysql> SELECT CASE 1 WHEN 1 THEN 'one'
-> WHEN 2 THEN 'two' ELSE 'more' END;
-> 'one'
mysql> SELECT CASE WHEN 1>0 THEN 'true' ELSE 'false' END;
-> 'true'
mysql> SELECT CASE BINARY 'B'
-> WHEN 'a' THEN 1 WHEN 'b' THEN 2 END;
-> NULL
The default return type of a CASE
expression is the compatible aggregated type of all return
values, but also depends on the context in which it is used.
If used in a string context, the result is returned as a
string. If used in a numeric context, then the result is
returned as a decimal, real, or integer value.
Note: The syntax of the
CASE expression shown
here differs slightly from that of the SQL
CASE statement
described in Section 17.2.10.2, “CASE Statement”, for use inside
stored routines. The CASE statement cannot
have an ELSE NULL clause, and it is
terminated with END CASE instead of
END.
IF(expr1,expr2,expr3)
If expr1 is TRUE
(expr1 <>
0 and expr1
<> NULL) then IF() returns
expr2; otherwise it returns
expr3. IF()
returns a numeric or string value, depending on the context in
which it is used.
If only one of expr2 or
expr3 is explicitly
NULL, the result type of the
IF() function is the type of the
non-NULL expression.
expr1 is evaluated as an integer
value, which means that if you are testing floating-point or
string values, you should do so using a comparison operation.
For example, IF(0.1,1,0) returns
0 because 0.1 is
converted to an integer value, resulting in a test of
IF(0). This may not be what you expect. By
contrast, in IF(0.1<>0,1,0) the comparison tests the original
floating-point value to see whether it is non-zero. The result
of the comparison is used as an integer, so 1 is returned.
The default return type of IF() (which may
matter when it is stored into a temporary table) is calculated
as follows:
Expression                                       Return Value
expr2 or expr3 returns a string                  string
expr2 or expr3 returns a floating-point value    floating-point
expr2 or expr3 returns an integer                integer
If expr2 and
expr3 are both strings, the result
is case sensitive if either string is case sensitive.
IFNULL(expr1,expr2)
If expr1 is not
NULL, IFNULL() returns
expr1; otherwise it returns
expr2. IFNULL()
returns a numeric or string value, depending on the context in
which it is used.
The default result value of
IFNULL(expr1,expr2)
is the more “general” of the two expressions, in
the order STRING, REAL,
or INTEGER. Consider the case of a table
based on expressions or where MySQL must internally store a
value returned by IFNULL() in a temporary
table:
mysql> CREATE TABLE tmp SELECT IFNULL(1,'test') AS test;
mysql> DESCRIBE tmp;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| test  | char(4) |      |     |         |       |
+-------+---------+------+-----+---------+-------+
In this example, the type of the test
column is CHAR(4).
NULLIF(expr1,expr2)
Returns NULL if
expr1 =
expr2 is true, otherwise
returns expr1. This is the same as
CASE WHEN expr1 =
expr2 THEN NULL ELSE
expr1 END.
String-valued functions return NULL if the
length of the result would be greater than the value of the
max_allowed_packet system variable. See
Section 7.5.2, “Tuning Server Parameters”.
For functions that operate on string positions, the first position
is numbered 1.
For functions that take length arguments, non-integer arguments
are rounded to the nearest integer.
ASCII(str)
Returns the numeric value of the leftmost character of the
string str. Returns
0 if str is the
empty string. Returns NULL if
str is NULL.
ASCII() works for characters with numeric
values from 0 to 255.
BIN(N)
Returns a string representation of the binary value of
N, where
N is a longlong
(BIGINT) number. This is equivalent to
CONV(N,10,2).
Returns NULL if
N is NULL.
mysql> SELECT BIN(12);
-> '1100'
BIT_LENGTH(str)
Returns the length of the string
str in bits.
mysql> SELECT BIT_LENGTH('text');
-> 32
CHAR(N,... [USING
charset_name])
CHAR() interprets each argument
N as an integer and returns a
string consisting of the characters given by the code values
of those integers. NULL values are skipped.
As of MySQL 5.0.15, CHAR() arguments larger
than 255 are converted into multiple result bytes. For
example, CHAR(256) is equivalent to
CHAR(1,0), and
CHAR(256*256) is equivalent to
CHAR(1,0,0).
If USING is given and the result string is
illegal for the given character set, a warning is issued.
Also, if strict SQL mode is enabled, the result from
CHAR() becomes NULL.
Before MySQL 5.0.15, CHAR() returns a
string in the connection character set and the
USING clause is unavailable. In addition,
each argument is interpreted modulo 256, so
CHAR(256) and
CHAR(256*256) both are equivalent to
CHAR(0).
CHAR_LENGTH(str)
Returns the length of the string
str, measured in characters. A
multi-byte character counts as a single character. This means
that for a string containing five two-byte characters,
LENGTH() returns 10,
whereas CHAR_LENGTH() returns
5.
CHARACTER_LENGTH(str)
CHARACTER_LENGTH() is a synonym for
CHAR_LENGTH().
CONCAT(str1,str2,...)
Returns the string that results from concatenating the
arguments. May have one or more arguments. If all arguments
are non-binary strings, the result is a non-binary string. If
the arguments include any binary strings, the result is a
binary string. A numeric argument is converted to its
equivalent binary string form; if you want to avoid that, you
can use an explicit type cast, for example
CONCAT(CAST(int_col AS CHAR), char_col).
CONCAT_WS(separator,str1,str2,...)
CONCAT_WS() stands for Concatenate With
Separator and is a special form of
CONCAT(). The first argument is the
separator for the rest of the arguments. The separator is
added between the strings to be concatenated. The separator
can be a string, as can the rest of the arguments. If the
separator is NULL, the result is
NULL.
CONCAT_WS() does not skip empty strings.
However, it does skip any NULL values after
the separator argument.
CONV(N,from_base,to_base)
Converts numbers between different number bases. Returns a
string representation of the number
N, converted from base
from_base to base
to_base. Returns
NULL if any argument is
NULL. The argument
N is interpreted as an integer, but
may be specified as an integer or a string. The minimum base
is 2 and the maximum base is
36. If to_base
is a negative number, N is regarded
as a signed number. Otherwise, N is
treated as unsigned. CONV() works with
64-bit precision.
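The signed versus unsigned treatment can be illustrated with a short Python model. This is a sketch under assumptions, not the server routine: the name conv is reused for readability, a positive from_base is assumed, and raising ValueError for an out-of-range base is a choice made here (the text above does not say how the server reports that case).

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def conv(n, from_base, to_base):
    """Model of CONV(N, from_base, to_base) for a positive from_base."""
    if n is None or from_base is None or to_base is None:
        return None
    base = abs(to_base)
    if not 2 <= base <= 36:
        raise ValueError("to_base out of range")   # assumption of this sketch
    value = int(str(n), from_base)                 # N may be an int or a string
    negative = False
    if to_base < 0:
        negative = value < 0                       # signed interpretation
        value = abs(value)
    else:
        value %= 2 ** 64                           # unsigned 64-bit interpretation
    digits = ""
    while True:
        value, remainder = divmod(value, base)
        digits = DIGITS[remainder] + digits
        if value == 0:
            break
    return ("-" if negative else "") + digits

print(conv("a", 16, 2), conv("6E", 18, 8), conv(-17, 10, -18))
```

With a positive to_base, a negative input wraps around: in this model conv(-1, 10, 16) yields sixteen F digits.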
ELT(N,str1,str2,str3,...)
Returns str1 if
N = 1,
str2 if
N = 2, and so
on. Returns NULL if
N is less than 1
or greater than the number of arguments.
ELT() is the complement of
FIELD().
EXPORT_SET(bits,on,off[,separator[,number_of_bits]])
Returns a string such that for every bit set in the value
bits, you get an
on string and for every bit not set
in the value, you get an off
string. Bits in bits are examined
from right to left (from low-order to high-order bits).
Strings are added to the result from left to right, separated
by the separator string (the
default being the comma character
‘,’). The number of bits
examined is given by number_of_bits
(defaults to 64).
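The right-to-left bit scan and left-to-right output order are easy to confuse, so a small Python model may help. It is a sketch of the description above, with the function name export_set reused for readability; it is not the server implementation.

```python
def export_set(bits, on, off, separator=",", number_of_bits=64):
    """Model of EXPORT_SET(): bit 0 produces the first (leftmost) item."""
    return separator.join(
        on if (bits >> position) & 1 else off
        for position in range(number_of_bits)
    )

print(export_set(5, "Y", "N", ",", 4))
print(export_set(6, "1", "0", ",", 10))
```

The value 5 is 0101 in binary, but because the lowest bit is emitted first, the output reads Y,N,Y,N.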
FIELD(str,str1,str2,str3,...)
Returns the index (position) of str
in the str1,
str2,
str3, ... list.
Returns 0 if str
is not found.
If all arguments to FIELD() are strings,
all arguments are compared as strings. If all arguments are
numbers, they are compared as numbers. Otherwise, the
arguments are compared as double.
If str is NULL,
the return value is 0 because
NULL fails equality comparison with any
value. FIELD() is the complement of
ELT().
FIND_IN_SET(str,strlist)
Returns a value in the range of 1 to
N if the string
str is in the string list
strlist consisting of
N substrings. A string list is a
string composed of substrings separated by
‘,’ characters. If the first
argument is a constant string and the second is a column of
type SET, the
FIND_IN_SET() function is optimized to use
bit arithmetic. Returns 0 if
str is not in
strlist or if
strlist is the empty string.
Returns NULL if either argument is
NULL. This function does not work properly
if the first argument contains a comma
(‘,’) character.
mysql> SELECT FIND_IN_SET('b','a,b,c,d');
-> 2
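The lookup rules, including the limitation on commas in the first argument, can be modelled in Python. This is an illustrative sketch: None stands in for NULL, and the comparison here is an exact, case-sensitive match, which is a simplification of whatever collation the server would apply.

```python
def find_in_set(s, strlist):
    """Model of FIND_IN_SET(str, strlist) with exact string matching."""
    if s is None or strlist is None:
        return None
    if strlist == "":
        return 0
    items = strlist.split(",")
    return items.index(s) + 1 if s in items else 0

print(find_in_set("b", "a,b,c,d"))
print(find_in_set("a,b", "a,b,c"))   # a comma in the first argument never matches
```

The second call returns 0 because the list is split on commas before matching, so no single item can ever contain one.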
FORMAT(X,D)
Formats the number X to a format
like '#,###,###.##', rounded to
D decimal places, and returns the
result as a string. If D is
0, the result has no decimal point or
fractional part.
HEX(N_or_S)
If N_or_S is a number, returns a
string representation of the hexadecimal value of
N, where
N is a longlong
(BIGINT) number. This is equivalent to
CONV(N,10,16).
If N_or_S is a string, returns a
hexadecimal string representation of
N_or_S where each character in
N_or_S is converted to two
hexadecimal digits.
INSERT(str,pos,len,newstr)
Returns the string str, with the
substring beginning at position pos
and len characters long replaced by
the string newstr. Returns the
original string if pos is not
within the length of the string. Replaces the rest of the
string from position pos if
len is not within the length of the
rest of the string. Returns NULL if any
argument is NULL.
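The three cases in this description (normal replacement, position out of range, length running past the end) can be sketched in Python. The name insert_str is invented for this example, None stands in for NULL, and the model follows only the rules stated above.

```python
def insert_str(s, pos, length, newstr):
    """Model of INSERT(str, pos, len, newstr); positions start at 1."""
    if any(arg is None for arg in (s, pos, length, newstr)):
        return None
    if pos < 1 or pos > len(s):
        return s                               # pos outside the string: unchanged
    start = pos - 1
    if length < 0 or start + length > len(s):
        return s[:start] + newstr              # replace the rest of the string
    return s[:start] + newstr + s[start + length:]

print(insert_str("Quadratic", 3, 4, "What"))
print(insert_str("Quadratic", -1, 4, "What"))
print(insert_str("Quadratic", 3, 100, "What"))
```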
INSTR(str,substr)
Returns the position of the first occurrence of substring
substr in string
str. This is the same as the
two-argument form of LOCATE(), except that
the order of the arguments is reversed.
This function is multi-byte safe, and is case sensitive only
if at least one argument is a binary string.
LCASE(str)
LCASE() is a synonym for
LOWER().
LEFT(str,len)
Returns the leftmost len characters
from the string str, or
NULL if any argument is
NULL.
mysql> SELECT LEFT('foobarbar', 5);
-> 'fooba'
LENGTH(str)
Returns the length of the string
str, measured in bytes. A
multi-byte character counts as multiple bytes. This means that
for a string containing five two-byte characters,
LENGTH() returns 10,
whereas CHAR_LENGTH() returns
5.
mysql> SELECT LENGTH('text');
-> 4
LOAD_FILE(file_name)
Reads the file and returns the file contents as a string. To
use this function, the file must be located on the server
host, you must specify the full pathname to the file, and you
must have the FILE privilege. The file must
be readable by all and its size less than
max_allowed_packet bytes.
If the file does not exist or cannot be read because one of
the preceding conditions is not satisfied, the function
returns NULL.
As of MySQL 5.0.19, the
character_set_filesystem system variable
controls interpretation of filenames that are given as literal
strings.
mysql> UPDATE t SET blob_col=LOAD_FILE('/tmp/picture') WHERE id=1;
LOCATE(substr,str),
LOCATE(substr,str,pos)
The first syntax returns the position of the first occurrence
of substring substr in string
str. The second syntax returns the
position of the first occurrence of substring
substr in string
str, starting at position
pos. Returns 0
if substr is not in
str.
This function is multi-byte safe, and is case-sensitive only
if at least one argument is a binary string.
LOWER(str)
Returns the string str with all
characters changed to lowercase according to the current
character set mapping. The default is
latin1 (cp1252 West European).
LPAD(str,len,padstr)
Returns the string str, left-padded
with the string padstr to a length
of len characters. If
str is longer than
len, the return value is shortened
to len characters.
LTRIM(str)
Returns the string str with leading
space characters removed.
mysql> SELECT LTRIM(' barbar');
-> 'barbar'
This function is multi-byte safe.
MAKE_SET(bits,str1,str2,...)
Returns a set value (a string containing substrings separated
by ‘,’ characters) consisting
of the strings that have the corresponding bit in
bits set.
str1 corresponds to bit 0,
str2 to bit 1, and so on.
NULL values in
str1,
str2, ... are
not appended to the result.
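The bit-to-string mapping can be expressed compactly in Python. This is a sketch of the rule described above, with None standing in for NULL; it is not the server implementation.

```python
def make_set(bits, *strings):
    """Model of MAKE_SET(): keep strings whose bit is set, skipping NULLs."""
    return ",".join(
        s for position, s in enumerate(strings)
        if (bits >> position) & 1 and s is not None
    )

print(make_set(1, "a", "b", "c"))
print(make_set(1 | 4, "hello", "nice", "world"))
print(make_set(1 | 4, "hello", "nice", None, "world"))
```

In the last call bit 2 selects the NULL argument, which is skipped, and "world" sits at bit 3, which is not set.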
MID(str,pos,len)
is a synonym for
SUBSTRING(str,pos,len).
OCT(N)
Returns a string representation of the octal value of
N, where
N is a longlong
(BIGINT) number. This is equivalent to
CONV(N,10,8).
Returns NULL if
N is NULL.
mysql> SELECT OCT(12);
-> '14'
OCTET_LENGTH(str)
OCTET_LENGTH() is a synonym for
LENGTH().
ORD(str)
If the leftmost character of the string
str is a multi-byte character,
returns the code for that character, calculated by combining the
numeric values of its constituent bytes as the digits of a base-256 number.
If the leftmost character is not a multi-byte character,
ORD() returns the same value as the
ASCII() function.
mysql> SELECT ORD('2');
-> 50
POSITION(substr IN
str)
POSITION(substr IN
str) is a synonym for
LOCATE(substr,str).
QUOTE(str)
Quotes a string to produce a result that can be used as a
properly escaped data value in an SQL statement. The string is
returned enclosed by single quotes and with each instance of
single quote (‘'’), backslash
(‘\’), ASCII
NUL, and Control-Z preceded by a backslash.
If the argument is NULL, the return value
is the word “NULL” without enclosing single
quotes.
REPEAT(str,count)
Returns a string consisting of the string
str repeated
count times. If
count is less than 1, returns an
empty string. Returns NULL if
str or
count are NULL.
REPLACE(str,from_str,to_str)
Returns the string str with all
occurrences of the string from_str
replaced by the string to_str.
REPLACE() performs a case-sensitive match
when searching for from_str.
REVERSE(str)
Returns the string str with the
order of the characters reversed.
mysql> SELECT REVERSE('abc');
-> 'cba'
This function is multi-byte safe.
RIGHT(str,len)
Returns the rightmost len
characters from the string str, or
NULL if any argument is
NULL.
mysql> SELECT RIGHT('foobarbar', 4);
-> 'rbar'
This function is multi-byte safe.
RPAD(str,len,padstr)
Returns the string str,
right-padded with the string padstr
to a length of len characters. If
str is longer than
len, the return value is shortened
to len characters.
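Padding and truncation as described for LPAD() and RPAD() can be modelled like this. The sketch assumes a non-negative length and a non-empty pad string, since the text above does not cover other inputs; the Python function names simply mirror the SQL ones.

```python
def lpad(s, length, padstr):
    """Model of LPAD(str, len, padstr) for len >= 0 and non-empty padstr."""
    if length <= len(s):
        return s[:length]                 # longer input is shortened
    fill = length - len(s)
    return (padstr * fill)[:fill] + s     # repeat the pad, then cut to size

def rpad(s, length, padstr):
    """Model of RPAD(str, len, padstr) for len >= 0 and non-empty padstr."""
    if length <= len(s):
        return s[:length]
    fill = length - len(s)
    return s + (padstr * fill)[:fill]

print(lpad("hi", 4, "??"), lpad("hi", 1, "??"))
print(rpad("hi", 5, "?"), rpad("hi", 1, "?"))
```

Both functions shorten from the right when the input is too long, so lpad("hi", 1, "??") yields "h", not "i".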
RTRIM(str)
Returns the string str with
trailing space characters removed.
mysql> SELECT RTRIM('barbar ');
-> 'barbar'
This function is multi-byte safe.
SOUNDEX(str)
Returns a soundex string from str.
Two strings that sound almost the same should have identical
soundex strings. A standard soundex string is four characters
long, but the SOUNDEX() function returns an
arbitrarily long string. You can use
SUBSTRING() on the result to get a standard
soundex string. All non-alphabetic characters in
str are ignored. All international
alphabetic characters outside the A-Z range are treated as
vowels.
Important: When using
SOUNDEX(), you should be aware of the
following limitations:
This function, as currently implemented, is intended to
work well with strings that are in the English language
only. Strings in other languages may not produce reliable
results.
This function is not guaranteed to provide consistent
results with strings that use multi-byte character sets,
including utf-8.
We hope to remove these limitations in a future release.
See Bug#22638 for more information.
Note: This function
implements the original Soundex algorithm, not the more
popular enhanced version (also described by D. Knuth). The
difference is that the original version discards vowels first and
duplicates second, whereas the enhanced version discards
duplicates first and vowels second.
expr1 SOUNDS LIKE
expr2
This is the same as
SOUNDEX(expr1) =
SOUNDEX(expr2).
SPACE(N)
Returns a string consisting of N
space characters.
mysql> SELECT SPACE(6);
-> ' '
SUBSTRING(str,pos),
SUBSTRING(str FROM
pos),
SUBSTRING(str,pos,len),
SUBSTRING(str FROM
pos FOR
len)
The forms without a len argument
return a substring from string str
starting at position pos. The forms
with a len argument return a
substring len characters long from
string str, starting at position
pos. The forms that use
FROM are standard SQL syntax. It is also
possible to use a negative value for
pos. In this case, the beginning of
the substring is pos characters
from the end of the string, rather than the beginning. A
negative value may be used for pos
in any of the forms of this function.
For all forms of SUBSTRING(), the position
of the first character in the string from which the substring
is to be extracted is reckoned as 1.
If len is less than 1, the result
is the empty string.
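The position rules (1-based counting, negative positions measured from the end) can be sketched in Python. The function name substring mirrors the SQL one for readability. Returning an empty string for position 0, or for a position outside the string, is an assumption of this sketch that the text above does not spell out.

```python
def substring(s, pos, length=None):
    """Model of SUBSTRING(str, pos[, len]) with 1-based and negative pos."""
    size = len(s)
    start = pos - 1 if pos > 0 else size + pos
    if pos == 0 or start < 0 or start >= size:
        return ""                          # assumption: out-of-range gives ''
    if length is None:
        return s[start:]
    if length < 1:
        return ""
    return s[start:start + length]

print(substring("Quadratically", 5))
print(substring("Quadratically", 5, 6))
print(substring("Sakila", -3))
print(substring("Sakila", -5, 3))
```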
SUBSTR() is a synonym for
SUBSTRING().
SUBSTRING_INDEX(str,delim,count)
Returns the substring from string
str before
count occurrences of the delimiter
delim. If
count is positive, everything to
the left of the final delimiter (counting from the left) is
returned. If count is negative,
everything to the right of the final delimiter (counting from
the right) is returned. SUBSTRING_INDEX()
performs a case-sensitive match when searching for
delim.
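Counting delimiters from either end can be modelled with a split and a join. This Python sketch follows the description above; returning an empty string for a count of 0 or an empty delimiter is an assumption made here, and delimiters that can overlap themselves (such as 'aa' in 'aaa') may be counted differently by the server.

```python
def substring_index(s, delim, count):
    """Model of SUBSTRING_INDEX(str, delim, count), case-sensitive."""
    if count == 0 or delim == "":
        return ""                              # assumption of this sketch
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])       # left of the count-th delimiter
    return delim.join(parts[count:])           # right of the count-th from the end

print(substring_index("www.mysql.com", ".", 2))
print(substring_index("www.mysql.com", ".", -2))
```

When the string contains fewer delimiters than the absolute value of count, this model returns the whole string.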
TRIM([{BOTH | LEADING | TRAILING} [remstr] FROM] str)
Returns the string str with all
remstr prefixes or suffixes
removed. If none of the specifiers BOTH,
LEADING, or TRAILING is
given, BOTH is assumed.
remstr is optional and, if not
specified, spaces are removed.
mysql> SELECT TRIM(' bar ');
-> 'bar'
mysql> SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx');
-> 'barxxx'
mysql> SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx');
-> 'bar'
mysql> SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz');
-> 'barx'
This function is multi-byte safe.
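Note that remstr is removed as a whole string, not as a set of characters. A minimal Python model makes the difference visible; the function name trim and the mode keyword are choices made for this sketch.

```python
def trim(s, remstr=" ", mode="BOTH"):
    """Model of TRIM(): strip repeated whole copies of remstr."""
    if remstr == "":
        return s
    if mode in ("BOTH", "LEADING"):
        while s.startswith(remstr):
            s = s[len(remstr):]
    if mode in ("BOTH", "TRAILING"):
        while s.endswith(remstr):
            s = s[:-len(remstr)]
    return s

print(trim("xxxbarxxx", "x", "LEADING"))
print(trim("barxxyz", "xyz", "TRAILING"))
print("barxxyz".rstrip("xyz"))   # Python's character-set stripping, for contrast
```

The last line prints bar because Python's rstrip treats its argument as a set of characters, whereas the model (like the example above) removes only the trailing xyz and leaves barx.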
UCASE(str)
UCASE() is a synonym for
UPPER().
UNHEX(str)
Performs the inverse operation of
HEX(str). That
is, it interprets each pair of hexadecimal digits in the
argument as a number and converts it to the character
represented by the number. The resulting characters are
returned as a binary string.
The characters in the argument string must be legal
hexadecimal digits: '0' ..
'9', 'A' ..
'F', 'a' ..
'f'. If UNHEX()
encounters any non-hexadecimal digits in the argument, it
returns NULL:
A NULL result can occur if the argument to
UNHEX() is a BINARY
column, because values are padded with 0x00 bytes when stored
but those bytes are not stripped on retrieval. For example,
'41' is stored into a
CHAR(3) column as
'41 ' and retrieved as
'41' (with the trailing pad space
stripped), so UNHEX() for the column value
returns 'A'. By contrast,
'41' is stored into a
BINARY(3) column as
'41\0' and retrieved as
'41\0' (with the trailing pad
0x00 byte not stripped).
'\0' is not a legal hexadecimal digit, so
UNHEX() for the column value returns
NULL.
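The effect of a trailing pad byte can be reproduced with a small Python model. This is a sketch only: None stands in for NULL, the result is a Python bytes object, and padding an odd-length argument with a leading zero is an assumption of this sketch rather than something stated above.

```python
import string

def unhex(s):
    """Model of UNHEX(str): NULL if any character is not a hex digit."""
    if s is None:
        return None
    if any(ch not in string.hexdigits for ch in s):
        return None
    if len(s) % 2:
        s = "0" + s          # assumption: odd-length input gets a leading zero
    return bytes.fromhex(s)

print(unhex("4D7953514C"))
print(unhex("41"))           # value as retrieved from a CHAR(3) column
print(unhex("41\0"))         # value as retrieved from a BINARY(3) column
```

The third call returns None because the pad byte is not a hexadecimal digit, mirroring the NULL result described above.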
UPPER(str)
Returns the string str with all
characters changed to uppercase according to the current
character set mapping. The default is
latin1 (cp1252 West European).
If a string function is given a binary string as an argument,
the resulting string is also a binary string. A number converted
to a string is treated as a binary string. This affects only
comparisons.
Normally, if any expression in a string comparison is case
sensitive, the comparison is performed in case-sensitive
fashion.
expr LIKE
pat [ESCAPE
'escape_char']
Pattern matching using SQL simple regular expression
comparison. Returns 1
(TRUE) or 0
(FALSE). If either
expr or
pat is NULL,
the result is NULL.
The pattern need not be a literal string. For example, it
can be specified as a string expression or table column.
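Per the SQL standard, LIKE performs
matching on a per-character basis, thus it can produce
results different from the = comparison
operator. For example, under the latin1_german2_ci collation,
'ä' LIKE 'ae' returns 0, whereas 'ä' = 'ae' returns 1.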
With LIKE you can use the following two
wildcard characters in the pattern:
Character    Description
%            Matches any number of characters, even zero characters
_            Matches exactly one character
mysql> SELECT 'David!' LIKE 'David_';
-> 1
mysql> SELECT 'David!' LIKE '%D%v%';
-> 1
To test for literal instances of a wildcard character,
precede it by the escape character. If you do not specify
the ESCAPE character,
‘\’ is assumed.
String       Description
\%           Matches one ‘%’ character
\_           Matches one ‘_’ character
mysql> SELECT 'David!' LIKE 'David\_';
-> 0
mysql> SELECT 'David_' LIKE 'David\_';
-> 1
To specify a different escape character, use the
ESCAPE clause:
mysql> SELECT 'David_' LIKE 'David|_' ESCAPE '|';
-> 1
The escape sequence should be empty or one character long.
As of MySQL 5.0.16, if the
NO_BACKSLASH_ESCAPES SQL mode is enabled,
the sequence cannot be empty.
The following two statements illustrate that string
comparisons are not case sensitive unless one of the
operands is a binary string:
mysql> SELECT 'abc' LIKE 'ABC';
-> 1
mysql> SELECT 'abc' LIKE BINARY 'ABC';
-> 0
In MySQL, LIKE is allowed on numeric
expressions. (This is an extension to the standard SQL
LIKE.)
mysql> SELECT 10 LIKE '1%';
-> 1
Note: Because MySQL uses C
escape syntax in strings (for example,
‘\n’ to represent a newline
character), you must double any
‘\’ that you use in
LIKE strings. For example, to search for
‘\n’, specify it as
‘\\n’. To search for
‘\’, specify it as
‘\\\\’; this is because the
backslashes are stripped once by the parser and again when
the pattern match is made, leaving a single backslash to be
matched against. (Exception: At the end of the pattern
string, backslash can be specified as
‘\\’. At the end of the
string, backslash stands for itself because there is nothing
following to escape.)
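The double stripping can be seen in a short sketch. It assumes that the NO_BACKSLASH_ESCAPES SQL mode is not enabled, so that the parser treats ‘\’ as an escape character in string literals:

```sql
-- The literal 'a\\b' is the three-character string a\b.
-- The pattern literal 'a\\\\b' reaches LIKE as a\\b, which matches a\b.
SELECT 'a\\b' LIKE 'a\\\\b';
-- -> 1
```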
expr NOT LIKE pat [ESCAPE 'escape_char']
This is the same as NOT (expr LIKE pat [ESCAPE 'escape_char']).
Note
Aggregate queries involving NOT LIKE
comparisons with columns containing
NULL may yield unexpected results. For
example, consider the following table and data:
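The table definition is not reproduced here. A minimal setup that is consistent with the results described next, with two rows that both contain NULL, might look like this (the column type is an assumption made for illustration):

```sql
CREATE TABLE foo (bar VARCHAR(10));
INSERT INTO foo VALUES (NULL), (NULL);
```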
The query SELECT COUNT(*) FROM foo WHERE bar LIKE
'%baz%'; returns 0. You might
assume that SELECT COUNT(*) FROM foo WHERE bar
NOT LIKE '%baz%'; would return
2. However, this is not the case: The
second query returns 0. This is because
NULL NOT LIKE
expr always returns
NULL, regardless of the value of
expr. The same is true for
aggregate queries involving NULL and
comparisons using NOT RLIKE or
NOT REGEXP. In such cases, you must
test explicitly for NOT NULL using
OR (and not AND), as
shown here:
SELECT COUNT(*) FROM foo WHERE bar NOT LIKE '%baz%' OR bar IS NULL;
expr NOT REGEXP pat, expr NOT RLIKE pat
This is the same as NOT (expr REGEXP pat).
expr REGEXP pat, expr RLIKE pat
Performs a pattern match of a string expression
expr against a pattern
pat. The pattern can be an
extended regular expression. The syntax for regular
expressions is discussed in Section 12.4.2, “Regular Expressions”.
Returns 1 if
expr matches
pat; otherwise it returns
0. If either
expr or
pat is NULL,
the result is NULL.
RLIKE is a synonym for
REGEXP, provided for
mSQL compatibility.
The pattern need not be a literal string. For example, it
can be specified as a string expression or table column.
Note: Because MySQL uses
the C escape syntax in strings (for example,
‘\n’ to represent the newline
character), you must double any
‘\’ that you use in your
REGEXP strings.
REGEXP is not case sensitive, except when
used with binary strings.
REGEXP and RLIKE use
the current character set when deciding the type of a
character. The default is latin1 (cp1252
West European). Warning:
These operators are not multi-byte safe.
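A few statements that exercise the points above (backslash doubling, case insensitivity, and the effect of BINARY). They are a sketch that assumes the default latin1 character set:

```sql
SELECT 'Monty!' REGEXP 'm%y%%';
-- -> 0 (LIKE wildcards have no special meaning to REGEXP)
SELECT 'Monty!' REGEXP '.*';
-- -> 1
SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
-- -> 1
SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
-- -> 1, 0
SELECT 'a' REGEXP '^[a-d]';
-- -> 1
```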
STRCMP(expr1,expr2)
STRCMP() returns 0 if
the strings are the same, -1 if the first
argument is smaller than the second according to the current
sort order, and 1 otherwise.
STRCMP() uses the current character set
when performing comparisons. This makes the default
comparison behavior case insensitive unless one or both of
the operands are binary strings.
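The three possible return values can be illustrated as follows, assuming nonbinary string literals and the default collation:

```sql
SELECT STRCMP('text', 'text2');
-- -> -1
SELECT STRCMP('text2', 'text');
-- -> 1
SELECT STRCMP('text', 'text');
-- -> 0
```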
This section is a summary, with examples, of the special
characters and constructs that can be used in MySQL for
REGEXP operations. It does not contain all
the details that can be found in Henry Spencer's
regex(7) manual page. That manual page is
included in MySQL source distributions, in the
regex.7 file under the
regex directory.
A regular expression describes a set of strings. The simplest
regular expression is one that has no special characters in it.
For example, the regular expression hello
matches hello and nothing else.
Non-trivial regular expressions use certain special constructs
so that they can match more than one string. For example, the
regular expression hello|word matches either
the string hello or the string
word.
As a more complex example, the regular expression
B[an]*s matches any of the strings
Bananas, Baaaaas,
Bs, and any other string starting with a
B, ending with an s, and
containing any number of a or
n characters in between.
A regular expression for the REGEXP operator
may use any of the following special characters and constructs:
{n} or {m,n} notation
provides a more general way of writing regular expressions
that match many occurrences of the previous atom (or
“piece”) of the pattern. m
and n are integers.
a*   Can be written as a{0,}.
a+   Can be written as a{1,}.
a?   Can be written as a{0,1}.
To be more precise, a{n} matches exactly
n instances of a.
a{n,} matches n or
more instances of a.
a{m,n} matches m
through n instances of
a, inclusive.
m and n must be in the
range from 0 to
RE_DUP_MAX (default 255), inclusive. If
both m and n are
given, m must be less than or equal to
n.
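As a short sketch of the interval notation, the following patterns require a specific number of characters from [bcd] between a and e:

```sql
SELECT 'abcde' REGEXP 'a[bcd]{2}e';
-- -> 0 (three characters lie between a and e, not two)
SELECT 'abcde' REGEXP 'a[bcd]{3}e';
-- -> 1
SELECT 'abcde' REGEXP 'a[bcd]{1,10}e';
-- -> 1
```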
[a-dX], [^a-dX]
Matches any character that is (or is not, if ^ is used)
either a, b,
c, d or
X. A - character
between two other characters forms a range that matches all
characters from the first character to the second. For
example, [0-9] matches any decimal digit.
To include a literal ] character, it must
immediately follow the opening bracket [.
To include a literal - character, it must
be written first or last. Any character that does not have a
defined special meaning inside a [] pair
matches only itself.
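The following statements sketch how bracket expressions behave with and without the ^ negation and with anchors:

```sql
SELECT 'aXbc' REGEXP '[a-dXYZ]';
-- -> 1
SELECT 'aXbc' REGEXP '^[a-dXYZ]$';
-- -> 0 (the anchored pattern allows exactly one character)
SELECT 'aXbc' REGEXP '^[a-dXYZ]+$';
-- -> 1
SELECT 'aXbc' REGEXP '^[^a-dXYZ]+$';
-- -> 0
SELECT 'gheis' REGEXP '^[^a-dXYZ]+$';
-- -> 1
SELECT 'gheisa' REGEXP '^[^a-dXYZ]+$';
-- -> 0 (the final a is in the excluded set)
```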
[.characters.]
Within a bracket expression (written using
[ and ]), matches the
sequence of characters of that collating element.
characters is either a single character
or a character name like newline.
The following table shows the allowable character names and
the characters that they match. For characters given as
numeric values, the values are represented in octal.
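A collating element can be given either as the character itself or by name. A minimal sketch using the tilde character, whose name is tilde:

```sql
SELECT '~' REGEXP '[[.~.]]';
-- -> 1
SELECT '~' REGEXP '[[.tilde.]]';
-- -> 1
```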
[=character_class=]
Within a bracket expression (written using
[ and ]),
[=character_class=] represents an
equivalence class. It matches all characters with the same
collation value, including itself. For example, if
o and (+) are the
members of an equivalence class, then
[[=o=]], [[=(+)=]],
and [o(+)] are all synonymous. An
equivalence class may not be used as an endpoint of a range.
[:character_class:]
Within a bracket expression (written using
[ and ]),
[:character_class:] represents a
character class that matches all characters belonging to
that class. The following table lists the standard class
names. These names stand for the character classes defined
in the ctype(3) manual page. A particular
locale may provide other class names. A character class may
not be used as an endpoint of a range.
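For instance, the alnum class can be used as follows:

```sql
SELECT 'justalnums' REGEXP '[[:alnum:]]+';
-- -> 1
SELECT '!!' REGEXP '[[:alnum:]]+';
-- -> 0
```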
[[:<:]], [[:>:]]
These markers stand for word boundaries. They match the
beginning and end of words, respectively. A word is a
sequence of word characters that is not preceded by or
followed by word characters. A word character is an
alphanumeric character in the alnum class
or an underscore (_).
mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]';
-> 1
mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]';
-> 0
To use a literal instance of a special character in a regular
expression, precede it by two backslash (\) characters. The
MySQL parser interprets one of the backslashes, and the regular
expression library interprets the other. For example, to match
the string 1+2 that contains the special
+ character, only the last of the following
regular expressions is the correct one:
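The three candidate patterns can be sketched like this, assuming that NO_BACKSLASH_ESCAPES is not enabled:

```sql
SELECT '1+2' REGEXP '1+2';
-- -> 0 (+ is a repetition operator here)
SELECT '1+2' REGEXP '1\+2';
-- -> 0 (the parser consumes the single backslash)
SELECT '1+2' REGEXP '1\\+2';
-- -> 1
```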
The usual arithmetic operators are available. The precision of
the result is determined according to the following rules:
Note that in the case of -,
+, and *, the result
is calculated with BIGINT (64-bit)
precision if both arguments are integers.
If one of the arguments is an unsigned integer, and the
other argument is also an integer, the result is an unsigned
integer.
If any of the operands of a +,
-, /,
*, % is a real or
string value, then the precision of the result is the
precision of the argument with the maximum precision.
In division performed with /, the number of decimal places
in the result when using two exact values is the number of decimal
places of the first argument plus the value of the
div_precision_increment system variable (which is 4 by default).
For example, the expression 5.05 / 0.0014
has a result with six decimal places
(3607.142857).
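The rule can be checked directly. This sketch assumes that div_precision_increment has its default value of 4:

```sql
-- 5.05 has two decimal places; 2 + 4 = 6 decimal places in the result.
SELECT 5.05 / 0.0014;
-- -> 3607.142857
```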
These rules are applied for each operation, such that nested
calculations imply the precision of each component. Hence,
(14620 / 9432456) / (24250 / 9432456) would
resolve first to (0.0014) / (0.0026), with
the final result having 8 decimal places
(0.57692308).
Because of these rules and the way they are applied, care
should be taken to ensure that components and sub-components of
a calculation use the appropriate level of precision. See
Section 12.9, “Cast Functions and Operators”.
+
Addition:
mysql> SELECT 3+5;
-> 8
-
Subtraction:
mysql> SELECT 3-5;
-> -2
-
Unary minus. This operator changes the sign of the argument.
mysql> SELECT - 2;
-> -2
Note: If this operator is
used with a BIGINT, the return value is
also a BIGINT. This means that you should
avoid using - on integers that may
have the value of -2^63.
*
Multiplication.
The result of an integer multiplication is incorrect when it
exceeds the 64-bit
range of BIGINT calculations. (See
Section 11.2, “Numeric Types”.)
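The overflow can be sketched with the operand 18014398509481984 (2^54). The results shown assume the MySQL 5.0 series, in which integer overflow is not reported as an error:

```sql
SELECT 3*5;
-- -> 15
-- One operand is a non-integer value, so the product is not limited to 64 bits.
SELECT 18014398509481984*18014398509481984.0;
-- -> 324518553658426726783156020576256.0
-- Both operands are integers; 2^54 * 2^54 = 2^108 does not fit in 64 bits.
SELECT 18014398509481984*18014398509481984;
-- -> 0
```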
/
Division:
mysql> SELECT 3/5;
-> 0.60
Division by zero produces a NULL result:
mysql> SELECT 102/(1-1);
-> NULL
A division is calculated with BIGINT
arithmetic only if performed in a context where its result
is converted to an integer.
DIV
Integer division. Similar to FLOOR(), but
is safe with BIGINT values.
mysql> SELECT 5 DIV 2;
-> 2
N % M
Modulo operation. Returns the remainder of
N divided by
M. For more information, see the
description for the MOD() function in
Section 12.5.2, “Mathematical Functions”.
ATAN(Y,X), ATAN2(Y,X)
Returns the arc tangent of the two variables
X and
Y. It is similar to calculating
the arc tangent of Y /
X, except that the
signs of both arguments are used to determine the quadrant
of the result.
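Two calls that illustrate how the signs of the arguments select the quadrant:

```sql
SELECT ATAN(-2, 2);
-- -> -0.78539816339745
SELECT ATAN2(PI(), 0);
-- -> 1.5707963267949
```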
For exact-value numeric arguments, the return value has an
exact-value numeric type. For string or floating-point
arguments, the return value has a floating-point type.
COS(X)
Returns the cosine of X, where
X is given in radians.
CRC32(expr)
Computes a cyclic redundancy check value and returns a
32-bit unsigned value. The result is NULL
if the argument is NULL. The argument is
expected to be a string and (if possible) is treated as one
if it is not.
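Because the check value is computed over the bytes of the string, it is sensitive to lettercase. A short sketch:

```sql
SELECT CRC32('MySQL');
-- -> 3259397556
SELECT CRC32('mysql');
-- -> 2501908538
```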
For exact-value numeric arguments, the return value has an
exact-value numeric type. For string or floating-point
arguments, the return value has a floating-point type.
FORMAT(X,D)
Formats the number X to a format
like '#,###,###.##', rounded to
D decimal places, and returns the
result as a string. For details, see
Section 12.4, “String Functions”.
LN(X)
Returns the natural logarithm of
X; that is, the
base-e logarithm of
X.
RADIANS(X)
Returns the argument X, converted
from degrees to radians. (Note that π radians equals 180
degrees.)
mysql> SELECT RADIANS(90);
-> 1.5707963267949
RAND(), RAND(N)
Returns a random floating-point value
v in the range
0 <= v <
1.0. If a constant integer argument
N is specified, it is used as the
seed value, which produces a repeatable sequence of column
values.
The effect of using a non-constant argument is undefined. As
of MySQL 5.0.13, non-constant arguments are disallowed.
To obtain a random integer R in
the range i <= R < j, use the expression
FLOOR(i + RAND() * (j - i)). For example, to
obtain a random integer in the range
7 <= R < 12, you could use the following
statement:
SELECT FLOOR(7 + (RAND() * 5));
You cannot use a column with RAND()
values in an ORDER BY clause, because
ORDER BY would evaluate the column
multiple times. However, you can retrieve rows in random
order like this:
mysql> SELECT * FROM tbl_name ORDER BY RAND();
ORDER BY RAND() combined with
LIMIT is useful for selecting a random
sample from a set of rows:
mysql> SELECT * FROM table1, table2 WHERE a=b AND c<d
-> ORDER BY RAND() LIMIT 1000;
Note that RAND() in a
WHERE clause is re-evaluated every time
the WHERE is executed.
RAND() is not meant to be a perfect
random generator, but instead is a fast way to generate
ad hoc random numbers which
is portable between platforms for the same MySQL version.
ROUND(X), ROUND(X,D)
Rounds the argument X to
D decimal places. The rounding
algorithm depends on the data type of
X. D
defaults to 0 if not specified. D
can be negative to cause D digits
left of the decimal point of the value
X to become zero.
The return type is the same type as that of the first
argument (assuming that it is integer, double, or decimal).
This means that for an integer argument, the result is an
integer (no decimal places):
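A sketch that contrasts a decimal first argument with an integer one:

```sql
-- The decimal argument keeps two decimal places; the integer argument has none.
SELECT ROUND(150.000, 2), ROUND(150, 2);
-- -> 150.00, 150
```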
Before MySQL 5.0.3, the behavior of
ROUND() when the argument is halfway
between two integers depends on the C library
implementation. Different implementations round to the
nearest even number, always up, always down, or always
toward zero. If you need one kind of rounding, you should
use a well-defined function such as
TRUNCATE() or FLOOR()
instead.
As of MySQL 5.0.3, ROUND() uses the
precision math library for exact-value arguments when the
first argument is a decimal value:
For exact-value numbers, ROUND() uses
the “round half up” or “round toward
nearest” rule: A value with a fractional part of
.5 or greater is rounded up to the next integer if
positive or down to the next integer if negative. (In
other words, it is rounded away from zero.) A value with
a fractional part less than .5 is rounded down to the
next integer if positive or up to the next integer if
negative.
For approximate-value numbers, the result depends on the
C library. On many systems, this means that
ROUND() uses the “round to nearest
even” rule: A value whose fractional part is exactly halfway
between two integers is rounded to the nearest even integer.
The following example shows how rounding differs for exact
and approximate values:
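In this sketch, 2.5 is an exact-value (decimal) literal and 25E-1 is an approximate-value (floating-point) literal. The second result depends on the C library, so the value shown assumes a platform that rounds halves to the nearest even integer:

```sql
SELECT ROUND(2.5), ROUND(25E-1);
-- -> 3, 2
```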
TRUNCATE(X,D)
Returns the number X, truncated
to D decimal places. If
D is 0, the
result has no decimal point or fractional part.
D can be negative to cause
D digits left of the decimal
point of the value X to become
zero.
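The following calls sketch truncation with positive, zero, and negative values of D:

```sql
SELECT TRUNCATE(1.223, 1);
-- -> 1.2
SELECT TRUNCATE(1.999, 1);
-- -> 1.9
SELECT TRUNCATE(1.999, 0);
-- -> 1
SELECT TRUNCATE(-1.999, 1);
-- -> -1.9
SELECT TRUNCATE(122, -2);
-- -> 100
```

All rounding is toward zero, which is why -1.999 becomes -1.9 rather than -2.0.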
This section describes the functions that can be used to
manipulate temporal values. See
Section 11.3, “Date and Time Types”, for a description of the
range of values each date and time type has and the valid formats
in which values may be specified.
Here is an example that uses date functions. The following query
selects all rows with a date_col value
from within the last 30 days:
mysql> SELECT something FROM tbl_name
-> WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= date_col;
Note that the query also selects rows with dates that lie in the
future.
Functions that expect date values usually accept datetime values
and ignore the time part. Functions that expect time values
usually accept datetime values and ignore the date part.
Functions that return the current date or time each are evaluated
only once per query at the start of query execution. This means
that multiple references to a function such as
NOW() within a single query always produce the
same result (for our purposes a single query also includes a call
to a stored routine or trigger and all sub-routines called by that
routine/trigger). This principle also applies to
CURDATE(), CURTIME(),
UTC_DATE(), UTC_TIME(),
UTC_TIMESTAMP(), and to any of their synonyms.
The CURRENT_TIMESTAMP(),
CURRENT_TIME(),
CURRENT_DATE(), and
FROM_UNIXTIME() functions return values in the
connection's current time zone, which is available as the value of
the time_zone system variable. In addition,
UNIX_TIMESTAMP() assumes that its argument is a
datetime value in the current time zone. See
Section 5.10.8, “MySQL Server Time Zone Support”.
Some date functions can be used with “zero” dates or
incomplete dates such as '2001-11-00', whereas
others cannot. Functions that extract parts of dates typically
work with incomplete dates. For example:
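A minimal sketch of part extraction from incomplete dates:

```sql
SELECT DAYOFMONTH('2001-11-00'), MONTH('2005-00-00');
-- -> 0, 0
```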
Other functions expect complete dates and return
NULL for incomplete dates. These include
functions that perform date arithmetic or that map parts of dates
to names. For example:
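A minimal sketch of functions that reject an incomplete date:

```sql
SELECT DATE_ADD('2006-05-00', INTERVAL 1 DAY);
-- -> NULL
SELECT DAYNAME('2006-05-00');
-- -> NULL
```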
ADDDATE(date,INTERVAL expr unit), ADDDATE(expr,days)
When invoked with the INTERVAL form of the
second argument, ADDDATE() is a synonym for
DATE_ADD(). The related function
SUBDATE() is a synonym for
DATE_SUB(). For information on the
INTERVAL unit
argument, see the discussion for
DATE_ADD().
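The equivalence of the two functions, and the form of ADDDATE() that takes a plain number of days, can be sketched as follows:

```sql
SELECT DATE_ADD('1998-01-02', INTERVAL 31 DAY);
-- -> '1998-02-02'
SELECT ADDDATE('1998-01-02', INTERVAL 31 DAY);
-- -> '1998-02-02'
-- An integer second argument is taken as a number of days to add.
SELECT ADDDATE('1998-01-02', 31);
-- -> '1998-02-02'
```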
CONVERT_TZ(dt,from_tz,to_tz)
CONVERT_TZ() converts a datetime value
dt from the time zone given by
from_tz to the time zone given by
to_tz and returns the resulting
value. Time zones are specified as described in
Section 5.10.8, “MySQL Server Time Zone Support”. This function returns
NULL if the arguments are invalid.
If the value falls out of the supported range of the
TIMESTAMP type when converted from
from_tz to UTC, no conversion
occurs. The TIMESTAMP range is described in
Section 11.1.2, “Overview of Date and Time Types”.
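A sketch using numeric offsets, which do not depend on the time zone tables being loaded (named zones such as 'MET' do):

```sql
SELECT CONVERT_TZ('2004-01-01 12:00:00', '+00:00', '+10:00');
-- -> '2004-01-01 22:00:00'
```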
CURRENT_DATE and
CURRENT_DATE() are synonyms for
CURDATE().
CURTIME()
Returns the current time as a value in
'HH:MM:SS' or HHMMSS
format, depending on whether the function is used in a string
or numeric context. The value is expressed in the current time
zone.
DATEDIFF(expr1,expr2)
DATEDIFF() returns
expr1 -
expr2 expressed as a value in days
from one date to the other. expr1
and expr2 are date or date-and-time
expressions. Only the date parts of the values are used in the
calculation.
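The following sketch shows that the time parts are ignored and that the sign follows the order of the arguments:

```sql
SELECT DATEDIFF('1997-12-31 23:59:59', '1997-12-30');
-- -> 1
SELECT DATEDIFF('1997-11-30 23:59:59', '1997-12-31');
-- -> -31
```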
DATE_ADD(date,INTERVAL expr unit), DATE_SUB(date,INTERVAL expr unit)
These functions perform date arithmetic.
date is a
DATETIME or DATE value
specifying the starting date. expr
is an expression specifying the interval value to be added or
subtracted from the starting date.
expr is a string; it may start with
a ‘-’ for negative intervals.
unit is a keyword indicating the
units in which the expression should be interpreted.
The INTERVAL keyword and the
unit specifier are not case
sensitive.
The following table shows the expected form of the
expr argument for each
unit value.
unit Value            Expected expr Format
MICROSECOND           MICROSECONDS
SECOND                SECONDS
MINUTE                MINUTES
HOUR                  HOURS
DAY                   DAYS
WEEK                  WEEKS
MONTH                 MONTHS
QUARTER               QUARTERS
YEAR                  YEARS
SECOND_MICROSECOND    'SECONDS.MICROSECONDS'
MINUTE_MICROSECOND    'MINUTES.MICROSECONDS'
MINUTE_SECOND         'MINUTES:SECONDS'
HOUR_MICROSECOND      'HOURS.MICROSECONDS'
HOUR_SECOND           'HOURS:MINUTES:SECONDS'
HOUR_MINUTE           'HOURS:MINUTES'
DAY_MICROSECOND       'DAYS.MICROSECONDS'
DAY_SECOND            'DAYS HOURS:MINUTES:SECONDS'
DAY_MINUTE            'DAYS HOURS:MINUTES'
DAY_HOUR              'DAYS HOURS'
YEAR_MONTH            'YEARS-MONTHS'
The values QUARTER and
WEEK are available beginning with MySQL
5.0.0.
MySQL allows any punctuation delimiter in the
expr format. Those shown in the
table are the suggested delimiters. If the
date argument is a
DATE value and your calculations involve
only YEAR, MONTH, and
DAY parts (that is, no time parts), the
result is a DATE value. Otherwise, the
result is a DATETIME value.
Date arithmetic also can be performed using
INTERVAL together with the
+ or - operator:
date + INTERVAL expr unit
date - INTERVAL expr unit
INTERVAL expr unit is allowed on either
side of the + operator if the expression on
the other side is a date or datetime value. For the
- operator, INTERVAL
expr unit is allowed only on
the right side, because it makes no sense to subtract a date
or datetime value from an interval.
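The operator forms and the function forms can be sketched side by side:

```sql
SELECT '1997-12-31 23:59:59' + INTERVAL 1 SECOND;
-- -> '1998-01-01 00:00:00'
-- With +, the interval may appear on the left side.
SELECT INTERVAL 1 DAY + '1997-12-31';
-- -> '1998-01-01'
SELECT '1998-01-01' - INTERVAL 1 SECOND;
-- -> '1997-12-31 23:59:59'
SELECT DATE_ADD('1997-12-31 23:59:59', INTERVAL '1:1' MINUTE_SECOND);
-- -> '1998-01-01 00:01:00'
SELECT DATE_SUB('1998-01-01 00:00:00', INTERVAL '1 1:1:1' DAY_SECOND);
-- -> '1997-12-30 22:58:59'
SELECT DATE_ADD('1998-01-01 00:00:00', INTERVAL '-1 10' DAY_HOUR);
-- -> '1997-12-30 14:00:00'
```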
If you specify an interval value that is too short (does not
include all the interval parts that would be expected from the
unit keyword), MySQL assumes that
you have left out the leftmost parts of the interval value.
For example, if you specify a unit
of DAY_SECOND, the value of
expr is expected to have days,
hours, minutes, and seconds parts. If you specify a value like
'1:10', MySQL assumes that the days and
hours parts are missing and the value represents minutes and
seconds. In other words, '1:10' DAY_SECOND
is interpreted in such a way that it is equivalent to
'1:10' MINUTE_SECOND. This is analogous to
the way that MySQL interprets TIME values
as representing elapsed time rather than as a time of day.
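A sketch of the short-interval rule. The results shown follow from the interpretation just described, in which '1:10' is read as one minute and ten seconds under either unit:

```sql
SELECT DATE_ADD('1998-01-01 00:00:00', INTERVAL '1:10' DAY_SECOND);
-- -> '1998-01-01 00:01:10'
SELECT DATE_ADD('1998-01-01 00:00:00', INTERVAL '1:10' MINUTE_SECOND);
-- -> '1998-01-01 00:01:10'
```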
If you add to or subtract from a date value something that
contains a time part, the result is automatically converted to
a datetime value:
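A minimal sketch of the conversion:

```sql
-- Only date parts are involved, so the result stays a DATE.
SELECT DATE_ADD('1999-01-01', INTERVAL 1 DAY);
-- -> '1999-01-02'
-- The interval has a time part, so the result becomes a DATETIME.
SELECT DATE_ADD('1999-01-01', INTERVAL 1 HOUR);
-- -> '1999-01-01 01:00:00'
```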
If you add MONTH,
YEAR_MONTH, or YEAR and
the resulting date has a day that is larger than the maximum
day for the new month, the day is adjusted to the maximum days
in the new month:
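A minimal sketch of the adjustment:

```sql
-- February 1998 has 28 days, so day 30 is adjusted to 28.
SELECT DATE_ADD('1998-01-30', INTERVAL 1 MONTH);
-- -> '1998-02-28'
```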
DATE_FORMAT(date,format)
Formats the date value according to
the format string.
The following specifiers may be used in the
format string. The
‘%’ character is required
before format specifier characters.
Specifier   Description
%a          Abbreviated weekday name (Sun..Sat)
%b          Abbreviated month name (Jan..Dec)
%c          Month, numeric (0..12)
%D          Day of the month with English suffix (0th, 1st, 2nd, 3rd, …)
%d          Day of the month, numeric (00..31)
%e          Day of the month, numeric (0..31)
%f          Microseconds (000000..999999)
%H          Hour (00..23)
%h          Hour (01..12)
%I          Hour (01..12)
%i          Minutes, numeric (00..59)
%j          Day of year (001..366)
%k          Hour (0..23)
%l          Hour (1..12)
%M          Month name (January..December)
%m          Month, numeric (00..12)
%p          AM or PM
%r          Time, 12-hour (hh:mm:ss followed by AM or PM)
%S          Seconds (00..59)
%s          Seconds (00..59)
%T          Time, 24-hour (hh:mm:ss)
%U          Week (00..53), where Sunday is the first day of the week
%u          Week (00..53), where Monday is the first day of the week
%V          Week (01..53), where Sunday is the first day of the week; used with %X
%v          Week (01..53), where Monday is the first day of the week; used with %x
%W          Weekday name (Sunday..Saturday)
%w          Day of the week (0=Sunday..6=Saturday)
%X          Year for the week where Sunday is the first day of the week, numeric, four digits; used with %V
%x          Year for the week, where Monday is the first day of the week, numeric, four digits; used with %v
%Y          Year, numeric, four digits
%y          Year, numeric (two digits)
%%          A literal ‘%’ character
%x          x, for any ‘x’ not listed above
Ranges for the month and day specifiers begin with zero due to
the fact that MySQL allows the storing of incomplete dates
such as '2004-00-00'.
As of MySQL 5.0.25, the language used for day and month names
and abbreviations is controlled by the value of the
lc_time_names system variable
(Section 5.10.9, “MySQL Server Locale Support”).
As of MySQL 5.0.36, DATE_FORMAT() returns a
string with a character set and collation given by
character_set_connection and
collation_connection so that it can return
month and weekday names containing non-ASCII characters.
Before 5.0.36, the return value is a binary string.
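A few format strings in use. The names shown assume that lc_time_names has its default English setting:

```sql
SELECT DATE_FORMAT('1997-10-04 22:23:00', '%W %M %Y');
-- -> 'Saturday October 1997'
SELECT DATE_FORMAT('1997-10-04 22:23:00', '%H:%i:%s');
-- -> '22:23:00'
SELECT DATE_FORMAT('1997-10-04 22:23:00', '%D %y %a %d %m %b %j');
-- -> '4th 97 Sat 04 10 Oct 277'
-- %X and %V belong together: January 1, 1999 falls in week 52 of 1998.
SELECT DATE_FORMAT('1999-01-01', '%X %V');
-- -> '1998 52'
```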
DAYNAME(date)
Returns the name of the weekday for
date. As of MySQL 5.0.25, the
language used for the name is controlled by the value of the
lc_time_names system variable
(Section 5.10.9, “MySQL Server Locale Support”).
DAYOFMONTH(date)
Returns the day of the month for
date, in the range
0 to 31.
mysql> SELECT DAYOFMONTH('1998-02-03');
-> 3
DAYOFWEEK(date)
Returns the weekday index for date
(1 = Sunday, 2 = Monday,
…, 7 = Saturday). These index values
correspond to the ODBC standard.
mysql> SELECT DAYOFWEEK('1998-02-03');
-> 3
DAYOFYEAR(date)
Returns the day of the year for
date, in the range
1 to 366.
mysql> SELECT DAYOFYEAR('1998-02-03');
-> 34
EXTRACT(unit FROM date)
The EXTRACT() function uses the same kinds
of unit specifiers as DATE_ADD() or
DATE_SUB(), but extracts parts from the
date rather than performing date arithmetic.
mysql> SELECT EXTRACT(YEAR FROM '1999-07-02');
-> 1999
mysql> SELECT EXTRACT(YEAR_MONTH FROM '1999-07-02 01:02:03');
-> 199907
mysql> SELECT EXTRACT(DAY_MINUTE FROM '1999-07-02 01:02:03');
-> 20102
mysql> SELECT EXTRACT(MICROSECOND
-> FROM '2003-01-02 10:30:00.000123');
-> 123
FROM_DAYS(N)
Given a day number N, returns a
DATE value.
mysql> SELECT FROM_DAYS(729669);
-> '1997-10-07'
Use FROM_DAYS() with caution on old dates.
It is not intended for use with values that precede the advent
of the Gregorian calendar (1582). See
Section 12.7, “What Calendar Is Used By MySQL?”.
FROM_UNIXTIME(unix_timestamp), FROM_UNIXTIME(unix_timestamp,format)
Returns a representation of the
unix_timestamp argument as a value
in 'YYYY-MM-DD HH:MM:SS' or
YYYYMMDDHHMMSS format, depending on whether
the function is used in a string or numeric context. The value
is expressed in the current time zone.
unix_timestamp is an internal
timestamp value such as is produced by the
UNIX_TIMESTAMP() function.
If format is given, the result is
formatted according to the format
string, which is used the same way as listed in the entry for
the DATE_FORMAT() function.
Note: If you use UNIX_TIMESTAMP() and
FROM_UNIXTIME() to convert between
TIMESTAMP values and Unix timestamp values,
the conversion is lossy because the mapping is not one-to-one
in both directions. For details, see the description of the
UNIX_TIMESTAMP() function.
GET_FORMAT(DATE|TIME|DATETIME, 'EUR'|'USA'|'JIS'|'ISO'|'INTERNAL')
Returns a format string. This function is useful in
combination with the DATE_FORMAT() and the
STR_TO_DATE() functions.
The possible values for the first and second arguments result
in several possible format strings (for the specifiers used,
see the table in the DATE_FORMAT() function
description). ISO format refers to ISO 9075, not ISO 8601.
Function Call                       Result
GET_FORMAT(DATE,'USA')              '%m.%d.%Y'
GET_FORMAT(DATE,'JIS')              '%Y-%m-%d'
GET_FORMAT(DATE,'ISO')              '%Y-%m-%d'
GET_FORMAT(DATE,'EUR')              '%d.%m.%Y'
GET_FORMAT(DATE,'INTERNAL')         '%Y%m%d'
GET_FORMAT(DATETIME,'USA')          '%Y-%m-%d %H.%i.%s'
GET_FORMAT(DATETIME,'JIS')          '%Y-%m-%d %H:%i:%s'
GET_FORMAT(DATETIME,'ISO')          '%Y-%m-%d %H:%i:%s'
GET_FORMAT(DATETIME,'EUR')          '%Y-%m-%d %H.%i.%s'
GET_FORMAT(DATETIME,'INTERNAL')     '%Y%m%d%H%i%s'
GET_FORMAT(TIME,'USA')              '%h:%i:%s %p'
GET_FORMAT(TIME,'JIS')              '%H:%i:%s'
GET_FORMAT(TIME,'ISO')              '%H:%i:%s'
GET_FORMAT(TIME,'EUR')              '%H.%i.%s'
GET_FORMAT(TIME,'INTERNAL')         '%H%i%s'
TIMESTAMP can also be used as the first
argument to GET_FORMAT(), in which case the
function returns the same values as for
DATETIME.
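The format strings returned by GET_FORMAT() can be passed directly to the formatting and parsing functions, as this sketch shows:

```sql
-- GET_FORMAT(DATE,'EUR') is '%d.%m.%Y'.
SELECT DATE_FORMAT('2003-10-03', GET_FORMAT(DATE, 'EUR'));
-- -> '03.10.2003'
-- GET_FORMAT(DATE,'USA') is '%m.%d.%Y'.
SELECT STR_TO_DATE('10.31.2003', GET_FORMAT(DATE, 'USA'));
-- -> '2003-10-31'
```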
HOUR(time)
Returns the hour for time. The
range of the return value is 0 to
23 for time-of-day values. However, the
range of TIME values actually is much
larger, so HOUR can return values greater
than 23.
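A sketch that contrasts a time-of-day value with a larger elapsed-time value:

```sql
SELECT HOUR('10:05:03');
-- -> 10
SELECT HOUR('272:59:59');
-- -> 272
```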
MINUTE(time)
Returns the minute for time, in the
range 0 to 59.
mysql> SELECT MINUTE('98-02-03 10:05:03');
-> 5
MONTH(date)
Returns the month for date, in the
range 0 to 12.
mysql> SELECT MONTH('1998-02-03');
-> 2
MONTHNAME(date)
Returns the full name of the month for
date. As of MySQL 5.0.25, the
language used for the name is controlled by the value of the
lc_time_names system variable
(Section 5.10.9, “MySQL Server Locale Support”).
NOW()
Returns the current date and time as a value in
'YYYY-MM-DD HH:MM:SS' or
YYYYMMDDHHMMSS format, depending on whether
the function is used in a string or numeric context. The value
is expressed in the current time zone.
NOW() returns a constant time that
indicates the time at which the statement began to execute.
(Within a stored routine or trigger, NOW()
returns the time at which the routine or triggering statement
began to execute.) This differs from the behavior for
SYSDATE(), which returns the exact time at
which it executes as of MySQL 5.0.13.
See the description for SYSDATE() for
additional information about the differences between the two
functions.
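The difference can be observed by pausing inside a statement. This sketch assumes MySQL 5.0.13 or later (SLEEP() itself requires 5.0.12) and a server started without the --sysdate-is-now option. No output is shown because it depends on the current time.

```sql
-- NOW() is fixed at statement start, so both columns show the same time
-- even though SLEEP(2) pauses for two seconds in between.
SELECT NOW(), SLEEP(2), NOW();

-- SYSDATE() is evaluated when it executes, so the two values
-- differ by about two seconds.
SELECT SYSDATE(), SLEEP(2), SYSDATE();
```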
PERIOD_ADD(P,N)
Adds N months to period
P (in the format
YYMM or YYYYMM). Returns
a value in the format YYYYMM. Note that the
period argument P is
not a date value.
mysql> SELECT PERIOD_ADD(9801,2);
-> 199803
PERIOD_DIFF(P1,P2)
Returns the number of months between periods
P1 and
P2. P1
and P2 should be in the format
YYMM or YYYYMM. Note
that the period arguments P1 and
P2 are not
date values.
mysql> SELECT PERIOD_DIFF(9802,199703);
-> 11
QUARTER(date)
Returns the quarter of the year for
date, in the range
1 to 4.
mysql> SELECT QUARTER('98-04-01');
-> 2
SECOND(time)
Returns the second for time, in the
range 0 to 59.
mysql> SELECT SECOND('10:05:03');
-> 3
SEC_TO_TIME(seconds)
Returns the seconds argument,
converted to hours, minutes, and seconds, as a value in
'HH:MM:SS' or HHMMSS
format, depending on whether the function is used in a string
or numeric context.
STR_TO_DATE(str,format)
This is the inverse of the DATE_FORMAT()
function. It takes a string str and
a format string format.
STR_TO_DATE() returns a
DATETIME value if the format string
contains both date and time parts, or a
DATE or TIME value if
the string contains only date or time parts.
The date, time, or datetime values contained in
str should be given in the format
indicated by format. For the
specifiers that can be used in
format, see the
DATE_FORMAT() function description. If
str contains an illegal date, time,
or datetime value, STR_TO_DATE() returns
NULL. Starting from MySQL 5.0.3, an illegal
value also produces a warning.
Range checking on the parts of date values is as described in
Section 11.3.1, “The DATETIME, DATE, and
TIMESTAMP Types”. This means, for example, that
“zero” dates or dates with part values of 0 are
allowed unless the SQL mode is set to disallow such values.
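A sketch of this relaxed range checking, assuming an SQL mode that does not disallow “zero” or otherwise invalid dates:

```sql
SELECT STR_TO_DATE('00/00/0000', '%m/%d/%Y');
-- -> '0000-00-00'
SELECT STR_TO_DATE('04/31/2004', '%m/%d/%Y');
-- -> '2004-04-31'
```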
Note: You cannot use format
"%X%V" to convert a year-week string to a
date because the combination of a year and week does not
uniquely identify a year and month if the week crosses a month
boundary. To convert a year-week to a date, you should
also specify the weekday:
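A minimal sketch, using week 42 of 2004 together with a weekday name:

```sql
SELECT STR_TO_DATE('200442 Monday', '%X%V %W');
-- -> '2004-10-18'
```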
SUBDATE(date,INTERVAL expr unit), SUBDATE(expr,days)
When invoked with the INTERVAL form of the
second argument, SUBDATE() is a synonym for
DATE_SUB(). For information on the
INTERVAL unit
argument, see the discussion for
DATE_ADD().
The second form allows the use of an integer value for
days. In such cases, it is
interpreted as the number of days to be subtracted from the
date or datetime expression expr.
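Both forms can be sketched as follows:

```sql
SELECT DATE_SUB('1998-01-02', INTERVAL 31 DAY);
-- -> '1997-12-02'
SELECT SUBDATE('1998-01-02', INTERVAL 31 DAY);
-- -> '1997-12-02'
-- Second form: an integer number of days.
SELECT SUBDATE('1998-01-02 12:00:00', 31);
-- -> '1997-12-02 12:00:00'
```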
SUBTIME(expr1,expr2)
SUBTIME() returns
expr1 -
expr2 expressed as a value in the
same format as expr1.
expr1 is a time or datetime
expression, and expr2 is a time
expression.
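A sketch with a datetime first argument and with a time first argument:

```sql
SELECT SUBTIME('1997-12-31 23:59:59.999999', '1 1:1:1.000002');
-- -> '1997-12-30 22:58:58.999997'
SELECT SUBTIME('01:00:00.999999', '02:00:00.999998');
-- -> '-00:59:59.999999'
```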
SYSDATE()
Returns the current date and time as a value in
'YYYY-MM-DD HH:MM:SS' or
YYYYMMDDHHMMSS format, depending on whether
the function is used in a string or numeric context.
As of MySQL 5.0.13, SYSDATE() returns the
time at which it executes. This differs from the behavior for
NOW(), which returns a constant time that
indicates the time at which the statement began to execute.
(Within a stored routine or trigger, NOW()
returns the time at which the routine or triggering statement
began to execute.)
In addition, the SET TIMESTAMP statement
affects the value returned by NOW() but not
by SYSDATE(). This means that timestamp
settings in the binary log have no effect on invocations of
SYSDATE().
Because SYSDATE() can return different
values even within the same statement, and is not affected by
SET TIMESTAMP, it is non-deterministic and
therefore unsafe for replication. If that is a problem, you
can start the server with the
--sysdate-is-now option to cause
SYSDATE() to be an alias for
NOW().
TIME(expr)
Extracts the time part of the time or datetime expression
expr and returns it as a string.
TIMESTAMP(expr), TIMESTAMP(expr1,expr2)
With a single argument, this function returns the date or
datetime expression expr as a
datetime value. With two arguments, it adds the time
expression expr2 to the date or
datetime expression expr1 and
returns the result as a datetime value.
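The one-argument and two-argument forms can be sketched like this:

```sql
SELECT TIMESTAMP('2003-12-31');
-- -> '2003-12-31 00:00:00'
SELECT TIMESTAMP('2003-12-31 12:00:00', '12:00:00');
-- -> '2004-01-01 00:00:00'
```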
TIMESTAMPADD(unit,interval,datetime_expr)
Adds the integer expression
interval to the date or datetime
expression datetime_expr. The unit
for interval is given by the
unit argument, which should be one
of the following values: FRAC_SECOND,
SECOND, MINUTE,
HOUR, DAY,
WEEK, MONTH,
QUARTER, or YEAR.
The unit value may be specified
using one of the keywords as shown, or with a prefix of
SQL_TSI_. For example,
DAY and SQL_TSI_DAY are
both legal.
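A short sketch with two different units:

```sql
SELECT TIMESTAMPADD(MINUTE, 1, '2003-01-02');
-- -> '2003-01-02 00:01:00'
SELECT TIMESTAMPADD(WEEK, 1, '2003-01-02');
-- -> '2003-01-09'
```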
TIMESTAMPDIFF(unit,datetime_expr1,datetime_expr2)
Returns the integer difference between the date or datetime
expressions datetime_expr1 and
datetime_expr2. The unit for the
result is given by the unit
argument. The legal values for unit
are the same as those listed in the description of the
TIMESTAMPADD() function.
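The result is positive when the second expression is later than the first and negative otherwise, as this sketch shows:

```sql
SELECT TIMESTAMPDIFF(MONTH, '2003-02-01', '2003-05-01');
-- -> 3
SELECT TIMESTAMPDIFF(YEAR, '2002-05-01', '2001-01-01');
-- -> -1
```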
TIME_FORMAT(time,format)
This is used like the DATE_FORMAT()
function, but the format string may
contain format specifiers only for hours, minutes, and
seconds. Other specifiers produce a NULL
value or 0.
If the time value contains an hour
part that is greater than 23, the
%H and %k hour format
specifiers produce a value larger than the usual range of
0..23. The other hour format specifiers
produce the hour value modulo 12.
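A sketch with an hour part of 100, which exceeds the time-of-day range:

```sql
-- %H and %k show the full hour value; %h, %I, and %l show 100 modulo 12.
SELECT TIME_FORMAT('100:00:00', '%H %k %h %I %l');
-- -> '100 100 04 04 4'
```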
TO_DAYS(date)
Given a date date, returns a day number (the number of days
since year 0).
TO_DAYS() is not intended for use with
values that precede the advent of the Gregorian calendar
(1582), because it does not take into account the days that
were lost when the calendar was changed. For dates before 1582
(and possibly a later year in other locales), results from
this function are not reliable. See
Section 12.7, “What Calendar Is Used By MySQL?”, for details.
Remember that MySQL converts two-digit year values in dates to
four-digit form using the rules in
Section 11.3, “Date and Time Types”. For example,
'1997-10-07' and
'97-10-07' are seen as identical dates:
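For instance, both spellings of the date yield the same day number; the value follows from counting days since year 0 in the proleptic Gregorian calendar:

```sql
SELECT TO_DAYS('1997-10-07'), TO_DAYS('97-10-07');   -- 729669, 729669
```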
UNIX_TIMESTAMP(),
UNIX_TIMESTAMP(date)
If called with no argument, returns a Unix timestamp (seconds
since '1970-01-01 00:00:00' UTC) as an
unsigned integer. If UNIX_TIMESTAMP() is
called with a date argument, it
returns the value of the argument as seconds since
'1970-01-01 00:00:00' UTC.
date may be a
DATE string, a DATETIME
string, a TIMESTAMP, or a number in the
format YYMMDD or
YYYYMMDD. The server interprets
date as a value in the current time
zone and converts it to an internal value in UTC. Clients can
set their time zone as described in
Section 5.10.8, “MySQL Server Time Zone Support”.
When UNIX_TIMESTAMP() is used on a
TIMESTAMP column, the function returns the
internal timestamp value directly, with no implicit
“string-to-Unix-timestamp” conversion. If you
pass an out-of-range date to
UNIX_TIMESTAMP(), it returns
0.
Note: If you use UNIX_TIMESTAMP() and
FROM_UNIXTIME() to convert between
TIMESTAMP values and Unix timestamp values,
the conversion is lossy because the mapping is not one-to-one
in both directions. For example, due to conventions for local
time zone changes, it is possible for
UNIX_TIMESTAMP() to map two different
TIMESTAMP values to the same Unix timestamp
value. FROM_UNIXTIME() will map that value
back to only one of the original TIMESTAMP
values. Here is an example, using TIMESTAMP
values in the CET time zone:
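The sketch below assumes the named zone 'CET' can be selected for the session, which requires the MySQL time zone tables to be loaded. On 27 March 2005, clocks in that zone moved from 02:00 directly to 03:00, so the local time 02:00:00 did not exist and maps to the same instant as 03:00:00:

```sql
-- Assumption: the time zone tables are loaded so that 'CET' is recognized.
SET time_zone = 'CET';

SELECT UNIX_TIMESTAMP('2005-03-27 03:00:00');   -- 1111885200
SELECT UNIX_TIMESTAMP('2005-03-27 02:00:00');   -- 1111885200

-- Converting back recovers only one of the two original values.
SELECT FROM_UNIXTIME(1111885200);               -- '2005-03-27 03:00:00'
```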
UTC_TIMESTAMP,
UTC_TIMESTAMP()
Returns the current UTC date and time as a value in
'YYYY-MM-DD HH:MM:SS' or
YYYYMMDDHHMMSS format, depending on whether
the function is used in a string or numeric context.
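To see both formats side by side, the function can be evaluated once in string context and once in numeric context. No output is shown because it depends on the current time:

```sql
-- First column: 'YYYY-MM-DD HH:MM:SS'; second column: YYYYMMDDHHMMSS.
SELECT UTC_TIMESTAMP(), UTC_TIMESTAMP() + 0;
```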
WEEK(date[,mode])
This function returns the week number for
date. The two-argument form of
WEEK() allows you to specify whether the
week starts on Sunday or Monday and whether the return value
should be in the range from 0 to
53 or from 1 to
53. If the mode
argument is omitted, the value of the
default_week_format system variable is
used. See Section 5.2.3, “System Variables”.
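The following table describes how the
mode argument works.
Mode  First Day of Week  Range  Week 1 Is the First Week ...
0     Sunday             0-53   with a Sunday in this year
1     Monday             0-53   with more than 3 days this year
2     Sunday             1-53   with a Sunday in this year
3     Monday             1-53   with more than 3 days this year
4     Sunday             0-53   with more than 3 days this year
5     Monday             0-53   with a Monday in this year
6     Sunday             1-53   with more than 3 days this year
7     Monday             1-53   with a Monday in this year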
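A few illustrative calls follow. The first assumes that default_week_format has its default value of 0. The last shows a date that falls in the final week of the previous year, for which modes with a 0-53 range return 0:

```sql
SELECT WEEK('1998-02-20');      -- 7 (assuming default_week_format = 0)
SELECT WEEK('1998-02-20',0);    -- 7
SELECT WEEK('1998-02-20',1);    -- 8
SELECT WEEK('1998-12-31',1);    -- 53

-- 2000-01-01 was a Saturday, before the first Sunday of 2000.
SELECT YEAR('2000-01-01'), WEEK('2000-01-01',0);   -- 2000, 0
```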
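Note that if a date falls in the last week of the previous year,
WEEK() returns 0 unless you use
2, 3, 6, or 7 as the optional mode argument. For example,
WEEK('2000-01-01',0) returns 0.
One might argue that MySQL should return 52
for the WEEK() function in this case, because the given
date actually occurs in the 52nd week of 1999. We decided to
return 0 instead because we want the
function to return “the week number in the given
year.” This makes use of the WEEK()
function reliable when combined with other functions that
extract a date part from a date.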
If you would prefer the result to be evaluated with respect to
the year that contains the first day of the week for the given
date, use 0, 2,
5, or 7 as the optional
mode argument.
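For example, the date 2000-01-01 falls in a week that began in 1999. A sketch of the two ways to obtain that week number:

```sql
-- Mode 2 evaluates the week with respect to the year in which it began.
SELECT WEEK('2000-01-01',2);              -- 52

-- YEARWEEK() returns the year together with the week.
SELECT YEARWEEK('2000-01-01');            -- 199952
SELECT MID(YEARWEEK('2000-01-01'),5,2);   -- '52'
```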
WEEKOFYEAR(date)
Returns the calendar week of the date as a number in the range
from 1 to 53.
WEEKOFYEAR() is a compatibility function
that is equivalent to
WEEK(date,3).
mysql> SELECT WEEKOFYEAR('1998-02-20');
-> 8
YEAR(date)
Returns the year for date, in the
range 1000 to 9999, or
0 for the “zero” date.
mysql> SELECT YEAR('98-02-03');
-> 1998
YEARWEEK(date),
YEARWEEK(date,mode)
Returns year and week for a date. The
mode argument works exactly like
the mode argument to
WEEK(). The year in the result may be
different from the year in the date argument for the first and
the last week of the year.
mysql> SELECT YEARWEEK('1987-01-01');
-> 198653
Note that the week number is different from what the
WEEK() function would return
(0) for optional arguments
0 or 1, as
WEEK() then returns the week in the context
of the given year.
12.7. What Calendar Is Used By MySQL?
MySQL uses what is known as a proleptic Gregorian
calendar.
Every country that has switched from the Julian to the Gregorian
calendar has had to discard at least ten days during the switch.
To see how this works, consider the month of October 1582, when
the first Julian-to-Gregorian switch occurred:
Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Sunday
1       2        3          4         15      16        17
18      19       20         21        22      23        24
25      26       27         28        29      30        31
There are no dates between October 4 and October 15. This
discontinuity is called the cutover. Any
dates before the cutover are Julian, and any dates following the
cutover are Gregorian. Dates during a cutover are non-existent.
A calendar applied to dates when it wasn't actually in use is
called proleptic. Thus, if we assume there
was never a cutover and Gregorian rules always rule, we have a
proleptic Gregorian calendar. This is what is used by MySQL, as is
required by standard SQL. For this reason, dates prior to the
cutover stored as MySQL DATE or
DATETIME values must be adjusted to compensate
for the difference. It is important to realize that the cutover
did not occur at the same time in all countries, and that the
later it happened, the more days were lost. For example, in Great
Britain, it took place in 1752, when Wednesday September 2 was
followed by Thursday September 14. Russia remained on the Julian
calendar until 1918, losing 13 days in the process, and what is
popularly referred to as its “October Revolution”
occurred in November according to the Gregorian calendar.
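A small sketch of what the proleptic calendar means in practice. Historically, October 15, 1582 was the day after October 4, but because MySQL applies Gregorian rules throughout, the dates in between exist and are counted:

```sql
-- One historical day apart, but eleven days apart in the
-- proleptic Gregorian calendar.
SELECT DATEDIFF('1582-10-15', '1582-10-04');            -- 11
SELECT TO_DAYS('1582-10-15') - TO_DAYS('1582-10-04');   -- 11
```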
12.8. Full-Text Search Functions
MATCH (col1,col2,...) AGAINST (expr [search_modifier])
search_modifier: { IN BOOLEAN MODE | WITH QUERY EXPANSION }
MySQL has support for full-text indexing and searching:
A full-text index in MySQL is an index of type
FULLTEXT.
Full-text indexes can be used only with
MyISAM tables, and can be created only for
CHAR, VARCHAR, or
TEXT columns.
A FULLTEXT index definition can be given in
the CREATE TABLE statement when a table is
created, or added later using ALTER TABLE
or CREATE INDEX.
For large datasets, it is much faster to load your data into a
table that has no FULLTEXT index and then
create the index after that, than to load data into a table
that has an existing FULLTEXT index.
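The bulk-loading advice above might be applied as follows. The table name articles_import and the index name ft_title_body are hypothetical, and the load step is left as a placeholder comment; this is a sketch under those assumptions rather than a prescribed procedure.

```sql
-- Hypothetical staging table, created without a FULLTEXT index.
CREATE TABLE articles_import (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200),
  body TEXT
) ENGINE=MyISAM;

-- ... load the rows here (for example with INSERT or LOAD DATA INFILE) ...

-- Build the index once, after the data is in place.
ALTER TABLE articles_import ADD FULLTEXT ft_title_body (title, body);

-- Equivalent alternative using CREATE INDEX:
-- CREATE FULLTEXT INDEX ft_title_body ON articles_import (title, body);
```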
Full-text searching is performed using MATCH() ...
AGAINST syntax. MATCH() takes a
comma-separated list that names the columns to be searched.
AGAINST takes a string to search for, and an
optional modifier that indicates what type of search to perform.
The search string must be a literal string, not a variable or a
column name. There are three types of full-text searches:
A boolean search interprets the search string using the rules
of a special query language. The string contains the words to
search for. It can also contain operators that specify
requirements such that a word must be present or absent in
matching rows, or that it should be weighted higher or lower
than usual. Common words such as “some” or
“then” are stopwords and do not match if present
in the search string. The IN BOOLEAN MODE
modifier specifies a boolean search. For more information, see
Section 12.8.1, “Boolean Full-Text Searches”.
A natural language search interprets the search string as a
phrase in natural human language (a phrase in free text).
There are no special operators. The stopword list applies. In
addition, words that are present in more than 50% of the rows
are considered common and do not match. Full-text searches are
natural language searches if no modifier is given.
A query expansion search is a modification of a natural
language search. The search string is used to perform a
natural language search. Then words from the most relevant
rows returned by the search are added to the search string and
the search is done again. The query returns the rows from the
second search. The WITH QUERY EXPANSION
modifier specifies a query expansion search. For more
information, see Section 12.8.2, “Full-Text Searches with Query Expansion”.
mysql> CREATE TABLE articles (
-> id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
-> title VARCHAR(200),
-> body TEXT,
-> FULLTEXT (title,body)
-> );
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO articles (title,body) VALUES
-> ('MySQL Tutorial','DBMS stands for DataBase ...'),
-> ('How To Use MySQL Well','After you went through a ...'),
-> ('Optimizing MySQL','In this tutorial we will show ...'),
-> ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
-> ('MySQL vs. YourSQL','In the following database comparison ...'),
-> ('MySQL Security','When configured properly, MySQL ...');
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM articles
-> WHERE MATCH (title,body) AGAINST ('database');
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)
The MATCH() function performs a natural
language search for a string against a text
collection. A collection is a set of one or more
columns included in a FULLTEXT index. The
search string is given as the argument to
AGAINST(). For each row in the table,
MATCH() returns a relevance value; that is, a
similarity measure between the search string and the text in that
row in the columns named in the MATCH() list.
By default, the search is performed in case-insensitive fashion.
However, you can perform a case-sensitive full-text search by
using a binary collation for the indexed columns. For example, a
column that uses the latin1 character set
can be assigned a collation of latin1_bin to
make it case sensitive for full-text searches.
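One way this could look in a table definition is sketched below. The table name articles_cs is hypothetical; the column layout mirrors the articles example.

```sql
-- Hypothetical table whose indexed columns use a binary collation,
-- so that full-text matching distinguishes 'MySQL' from 'mysql'.
CREATE TABLE articles_cs (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200) CHARACTER SET latin1 COLLATE latin1_bin,
  body TEXT CHARACTER SET latin1 COLLATE latin1_bin,
  FULLTEXT (title, body)
) ENGINE=MyISAM;
```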
When MATCH() is used in a
WHERE clause, as in the example shown earlier,
the rows returned are automatically sorted with the highest
relevance first. Relevance values are non-negative floating-point
numbers. Zero relevance means no similarity. Relevance is computed
based on the number of words in the row, the number of unique
words in that row, the total number of words in the collection,
and the number of documents (rows) that contain a particular word.
To simply count matches, you could use a query like this:
mysql> SELECT COUNT(*) FROM articles
-> WHERE MATCH (title,body)
-> AGAINST ('database');
+----------+
| COUNT(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)
However, you might find it quicker to rewrite the query as
follows:
mysql> SELECT
-> COUNT(IF(MATCH (title,body) AGAINST ('database'), 1, NULL))
-> AS count
-> FROM articles;
+-------+
| count |
+-------+
|     2 |
+-------+
1 row in set (0.00 sec)
The first query sorts the results by relevance whereas the second
does not. However, the second query performs a full table scan and
the first does not. The first may be faster if the search matches
few rows; otherwise, the second may be faster because it would
read many rows anyway.
For natural-language full-text searches, it is a requirement that
the columns named in the MATCH() function be
the same columns included in some FULLTEXT
index in your table. For the preceding query, note that the
columns named in the MATCH() function
(title and body) are the
same as those named in the definition of the
articles table's FULLTEXT
index. If you wanted to search the title or
body separately, you would need to create
separate FULLTEXT indexes for each column.
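A sketch of that approach, applied to the articles table from the earlier example. No output is shown; the search words are arbitrary.

```sql
-- Add single-column indexes alongside the existing (title,body) index.
ALTER TABLE articles ADD FULLTEXT (title);
ALTER TABLE articles ADD FULLTEXT (body);

-- Each MATCH() column list now corresponds to one of the indexes.
SELECT id, title FROM articles WHERE MATCH (title) AGAINST ('Tutorial');
SELECT id, title FROM articles WHERE MATCH (body) AGAINST ('database');
```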
A full-text search that uses an index can name columns only from a
single table in the MATCH() clause because an
index cannot span multiple tables. A boolean search can be done in
the absence of an index (albeit more slowly), in which case it is
possible to name columns from multiple tables.
The preceding example is a basic illustration that shows how to
use the MATCH() function where rows are
returned in order of decreasing relevance. The next example shows
how to retrieve the relevance values explicitly. Returned rows are
not ordered because the SELECT statement
includes neither WHERE nor ORDER
BY clauses:
mysql> SELECT id, MATCH (title,body) AGAINST ('Tutorial')
-> FROM articles;
+----+-----------------------------------------+
| id | MATCH (title,body) AGAINST ('Tutorial') |
+----+-----------------------------------------+
|  1 |                        0.65545833110809 |
|  2 |                                       0 |
|  3 |                        0.66266459226608 |
|  4 |                                       0 |
|  5 |                                       0 |
|  6 |                                       0 |
+----+-----------------------------------------+
6 rows in set (0.00 sec)
The following example is more complex. The query returns the
relevance values and it also sorts the rows in order of decreasing
relevance. To achieve this result, you should specify
MATCH() twice: once in the
SELECT list and once in the
WHERE clause. This causes no additional
overhead, because the MySQL optimizer notices that the two
MATCH() calls are identical and invokes the
full-text search code only once.
mysql> SELECT id, body, MATCH (title,body) AGAINST
-> ('Security implications of running MySQL as root') AS score
-> FROM articles WHERE MATCH (title,body) AGAINST
-> ('Security implications of running MySQL as root');
+----+-------------------------------------+-----------------+
| id | body                                | score           |
+----+-------------------------------------+-----------------+
|  4 | 1. Never run mysqld as root. 2. ... | 1.5219271183014 |
|  6 | When configured properly, MySQL ... | 1.3114095926285 |
+----+-------------------------------------+-----------------+
2 rows in set (0.00 sec)
The MySQL FULLTEXT implementation regards any
sequence of true word characters (letters, digits, and
underscores) as a word. That sequence may also contain apostrophes
(‘'’), but not more than one in a
row. This means that aaa'bbb is regarded as one
word, but aaa''bbb is regarded as two words.
Apostrophes at the beginning or the end of a word are stripped by
the FULLTEXT parser;
'aaa'bbb' would be parsed as
aaa'bbb.
The FULLTEXT parser determines where words
start and end by looking for certain delimiter characters; for
example, ‘ ’ (space),
‘,’ (comma), and
‘.’ (period). If words are not
separated by delimiters (as in, for example, Chinese), the
FULLTEXT parser cannot determine where a word
begins or ends. To be able to add words or other indexed terms in
such languages to a FULLTEXT index, you must
preprocess them so that they are separated by some arbitrary
delimiter such as ‘"’.
Some words are ignored in full-text searches:
Any word that is too short is ignored. The default minimum
length of words that are found by full-text searches is four
characters.
Words in the stopword list are ignored. A stopword is a word
such as “the” or “some” that is so
common that it is considered to have zero semantic value.
There is a built-in stopword list, but it can be overwritten
by a user-defined list.
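Both rules can be tried against the articles table shown earlier. This sketch assumes the default minimum word length and the built-in stopword list; under those assumptions, each search would be expected to return an empty set even though the word occurs in the data.

```sql
-- Check the minimum indexed word length currently in effect.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- 'run' occurs in row 4 but has only three characters.
SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('run');

-- 'well' occurs in the title of row 2 but is a default stopword.
SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('well');
```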
Every correct word in the collection and in the query is weighted
according to its significance in the collection or query.
Consequently, a word that is present in many documents has a lower
weight (and may even have a zero weight), because it has lower
semantic value in this particular collection. Conversely, if the
word is rare, it receives a higher weight. The weights of the
words are combined to compute the relevance of the row.
Such a technique works best with large collections (in fact, it
was carefully tuned this way). For very small tables, word
distribution does not adequately reflect their semantic value, and
this model may sometimes produce bizarre results. For example,
although the word “MySQL” is present in every row of
the articles table shown earlier, a search for
the word produces no results:
mysql> SELECT * FROM articles
-> WHERE MATCH (title,body) AGAINST ('MySQL');
Empty set (0.00 sec)
The search result is empty because the word “MySQL”
is present in at least 50% of the rows. As such, it is effectively
treated as a stopword. For large datasets, this is the most
desirable behavior: A natural language query should not return
every second row from a 1GB table. For small datasets, it may be
less desirable.
A word that matches half of the rows in a table is less likely to
locate relevant documents. In fact, it most likely finds plenty of
irrelevant documents. We all know this happens far too often when
we are trying to find something on the Internet with a search
engine. It is with this reasoning that rows containing the word
are assigned a low semantic value for the particular
dataset in which they occur. A given word may exceed
the 50% threshold in one dataset but not another.
The 50% threshold has a significant implication when you first try
full-text searching to see how it works: If you create a table and
insert only one or two rows of text into it, every word in the
text occurs in at least 50% of the rows. As a result, no search
returns any results. Be sure to insert at least three rows, and
preferably many more. Users who need to bypass the 50% limitation
can use the boolean search mode; see
Section 12.8.1, “Boolean Full-Text Searches”.
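As a sketch of the boolean-mode workaround, again using the articles table: the natural language search for “MySQL” returned nothing, but boolean mode is not subject to the 50% threshold, so this query would be expected to return every row that contains the word (all six sample rows).

```sql
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('MySQL' IN BOOLEAN MODE);
```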
12.8.1. Boolean Full-Text Searches
MySQL can perform boolean full-text searches using the
IN BOOLEAN MODE modifier:
mysql> SELECT * FROM articles WHERE MATCH (title,body)
-> AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+
| id | title                 | body                                |
+----+-----------------------+-------------------------------------+
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...        |
|  2 | How To Use MySQL Well | After you went through a ...        |
|  3 | Optimizing MySQL      | In this tutorial we will show ...   |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ... |
|  6 | MySQL Security        | When configured properly, MySQL ... |
+----+-----------------------+-------------------------------------+
The + and - operators
indicate that a word is required to be present or absent,
respectively, for a match to occur. Thus, this query retrieves
all the rows that contain the word “MySQL” but that
do not contain the word
“YourSQL”.
Boolean full-text searches have these characteristics:
They do not use the 50% threshold.
They do not automatically sort rows in order of decreasing
relevance. You can see this from the preceding query result:
The row with the highest relevance is the one that contains
“MySQL” twice, but it is listed last, not
first.
They can work even without a FULLTEXT
index, although a search executed in this fashion would be
quite slow.
The minimum and maximum word length full-text parameters
apply.
The stopword list applies.
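Because boolean searches do not sort by relevance automatically, an explicit ORDER BY can be added when ranked output is wanted. A minimal sketch using the articles table; the alias score is an arbitrary name, and no output is shown.

```sql
SELECT id, title,
       MATCH (title,body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE) AS score
FROM articles
WHERE MATCH (title,body) AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE)
ORDER BY score DESC;
```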
The boolean full-text search capability supports the following
operators:
+
A leading plus sign indicates that this word
must be present in each row that is
returned.
-
A leading minus sign indicates that this word must
not be present in any of the rows that
are returned.
Note: The - operator acts only to exclude
rows that are otherwise matched by other search terms. Thus,
a boolean-mode search that contains only terms preceded by
- returns an empty result. It does not
return “all rows except those containing any of the
excluded terms.”
(no operator)
By default (when neither + nor
- is specified) the word is optional, but
the rows that contain it are rated higher. This mimics the
behavior of MATCH() ... AGAINST() without
the IN BOOLEAN MODE modifier.
> <
These two operators are used to change a word's contribution
to the relevance value that is assigned to a row. The
> operator increases the contribution
and the < operator decreases it. See
the example following this list.
( )
Parentheses group words into subexpressions. Parenthesized
groups can be nested.
~
A leading tilde acts as a negation operator, causing the
word's contribution to the row's relevance to be negative.
This is useful for marking “noise” words. A row
containing such a word is rated lower than others, but is
not excluded altogether, as it would be with the
- operator.
*
The asterisk serves as the truncation (or wildcard)
operator. Unlike the other operators, it should be
appended to the word to be affected.
Words match if they begin with the word preceding the
* operator.
If a stopword or too-short word is specified with the
truncation operator, it will not be stripped from a boolean
query. For example, a search for '+word
+stopword*' will likely return fewer rows than a
search for '+word +stopword' because the
former query remains as is and requires
stopword* to be present in a document.
The latter query is transformed to +word.
"
A phrase that is enclosed within double quote
(‘"’) characters matches only
rows that contain the phrase literally, as it was
typed. The full-text engine splits the phrase
into words and performs a search in the
FULLTEXT index for the words. Prior to
MySQL 5.0.3, the engine then performed a substring search
for the phrase in the records that were found, so the match
had to include non-word characters in the phrase. As of MySQL
5.0.3, non-word characters need not be matched exactly:
Phrase searching requires only that matches contain exactly
the same words as the phrase and in the same order. For
example, "test phrase" matches
"test, phrase" as of MySQL 5.0.3, but not
before.
If the phrase contains no words that are in the index, the
result is empty. For example, if all words are either
stopwords or shorter than the minimum length of indexed
words, the result is empty.
The following examples demonstrate some search strings that use
boolean full-text operators:
'apple banana'
Find rows that contain at least one of the two words.
'+apple +juice'
Find rows that contain both words.
'+apple macintosh'
Find rows that contain the word “apple”, but
rank rows higher if they also contain
“macintosh”.
'+apple -macintosh'
Find rows that contain the word “apple” but not
“macintosh”.
'+apple ~macintosh'
Find rows that contain the word “apple”, but if
the row also contains the word “macintosh”,
rate it lower than if the row does not. This is
“softer” than a search for '+apple
-macintosh', for which the presence of
“macintosh” causes the row not to be returned
at all.
'+apple +(>turnover <strudel)'
Find rows that contain the words “apple” and
“turnover”, or “apple” and
“strudel” (in any order), but rank “apple
turnover” higher than “apple strudel”.
'apple*'
Find rows that contain words such as “apple”,
“apples”, “applesauce”, or
“applet”.
'"some words"'
Find rows that contain the exact phrase “some
words” (for example, rows that contain “some
words of wisdom” but not “some noise
words”). Note that the
‘"’ characters that enclose
the phrase are operator characters that delimit the phrase.
They are not the quotes that enclose the search string
itself.
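The search strings above are used as the argument to AGAINST(). The following sketch embeds a few operator combinations in complete statements against the articles table from the earlier example; the search words are arbitrary and no output is shown.

```sql
-- Both words are required.
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('+mysql +tutorial' IN BOOLEAN MODE);

-- Only an excluded term: per the note on the - operator, expect no rows.
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('-yoursql' IN BOOLEAN MODE);

-- Truncation: matches words that begin with 'optimi'.
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('optimi*' IN BOOLEAN MODE);

-- Phrase search: the inner double quotes delimit the phrase,
-- the outer single quotes delimit the SQL string.
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('"database comparison"' IN BOOLEAN MODE);
```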
12.8.2. Full-Text Searches with Query Expansion
Full-text search supports query expansion (and in particular,
its variant “blind query expansion”). This is
generally useful when a search phrase is too short, which often
means that the user is relying on implied knowledge that the
full-text search engine lacks. For example, a user searching for
“database” may really mean that
“MySQL”, “Oracle”, “DB2”,
and “RDBMS” all are phrases that should match
“databases” and should be returned, too. This is
implied knowledge.
Blind query expansion (also known as automatic relevance
feedback) is enabled by adding WITH QUERY
EXPANSION following the search phrase. It works by
performing the search twice, where the search phrase for the
second search is the original search phrase concatenated with
the few most highly relevant documents from the first search.
Thus, if one of these documents contains the word
“databases” and the word “MySQL”, the
second search finds the documents that contain the word
“MySQL” even if they do not contain the word
“database”. The following example shows this
difference:
mysql> SELECT * FROM articles
-> WHERE MATCH (title,body) AGAINST ('database');
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)
mysql> SELECT * FROM articles
-> WHERE MATCH (title,body)
-> AGAINST ('database' WITH QUERY EXPANSION);
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
|  3 | Optimizing MySQL  | In this tutorial we will show ...        |
+----+-------------------+------------------------------------------+
3 rows in set (0.00 sec)
Another example could be searching for books by Georges Simenon
about Maigret, when a user is not sure how to spell
“Maigret”. A search for “Megre and the
reluctant witnesses” finds only “Maigret and the
Reluctant Witnesses” without query expansion. A search
with query expansion finds all books with the word
“Maigret” on the second pass.
Note: Because blind query
expansion tends to increase noise significantly by returning
non-relevant documents, it is meaningful to use only when a
search phrase is rather short.
12.8.3. Full-Text Stopwords
The following table shows the default list of full-text
stopwords.
a's
able
about
above
according
accordingly
across
actually
after
afterwards
again
against
ain't
all
allow
allows
almost
alone
along
already
also
although
always
am
among
amongst
an
and
another
any
anybody
anyhow
anyone
anything
anyway
anyways
anywhere
apart
appear
appreciate
appropriate
are
aren't
around
as
aside
ask
asking
associated
at
available
away
awfully
be
became
because
become
becomes
becoming
been
before
beforehand
behind
being
believe
below
beside
besides
best
better
between
beyond
both
brief
but
by
c'mon
c's
came
can
can't
cannot
cant
cause
causes
certain
certainly
changes
clearly
co
com
come
comes
concerning
consequently
consider
considering
contain
containing
contains
corresponding
could
couldn't
course
currently
definitely
described
despite
did
didn't
different
do
does
doesn't
doing
don't
done
down
downwards
during
each
edu
eg
eight
either
else
elsewhere
enough
entirely
especially
et
etc
even
ever
every
everybody
everyone
everything
everywhere
ex
exactly
example
except
far
few
fifth
first
five
followed
following
follows
for
former
formerly
forth
four
from
further
furthermore
get
gets
getting
given
gives
go
goes
going
gone
got
gotten
greetings
had
hadn't
happens
hardly
has
hasn't
have
haven't
having
he
he's
hello
help
hence
her
here
here's
hereafter
hereby
herein
hereupon
hers
herself
hi
him
himself
his
hither
hopefully
how
howbeit
however
i'd
i'll
i'm
i've
ie
if
ignored
immediate
in
inasmuch
inc
indeed
indicate
indicated
indicates
inner
insofar
instead
into
inward
is
isn't
it
it'd
it'll
it's
its
itself
just
keep
keeps
kept
know
knows
known
last
lately
later
latter
latterly
least
less
lest
let
let's
like
liked
likely
little
look
looking
looks
ltd
mainly
many
may
maybe
me
mean
meanwhile
merely
might
more
moreover
most
mostly
much
must
my
myself
name
namely
nd
near
nearly
necessary
need
needs
neither
never
nevertheless
new
next
nine
no
nobody
non
none
noone
nor
normally
not
nothing
novel
now
nowhere
obviously
of
off
often
oh
ok
okay
old
on
once
one
ones
only
onto
or
other
others
otherwise
ought
our
ours
ourselves
out
outside
over
overall
own
particular
particularly
per
perhaps
placed
please
plus
possible
presumably
probably
provides
que
quite
qv
rather
rd
re
really
reasonably
regarding
regardless
regards
relatively
respectively
right
said
same
saw
say
saying
says
second
secondly
see
seeing
seem
seemed
seeming
seems
seen
self
selves
sensible
sent
serious
seriously
seven
several
shall
she
should
shouldn't
since
six
so
some
somebody
somehow
someone
something
sometime
sometimes
somewhat
somewhere
soon
sorry
specified
specify
specifying
still
sub
such
sup
sure
t's
take
taken
tell
tends
th
than
thank
thanks
thanx
that
that's
thats
the
their
theirs
them
themselves
then
thence
there
there's
thereafter
thereby
therefore
therein
theres
thereupon
these
they
they'd
they'll
they're
they've
think
third
this
thorough
thoroughly
those
though
three
through
throughout
thru
thus
to
together
too
took
toward
towards
tried
tries
truly
try
trying
twice
two
un
under
unfortunately
unless
unlikely
until
unto
up
upon
us
use
used
useful
uses
using
usually
value
various
very
via
viz
vs
want
wants
was
wasn't
way
we
we'd
we'll
we're
we've
welcome
well
went
were
weren't
what
what's
whatever
when
whence
whenever
where
where's
whereafter
whereas
whereby
wherein
whereupon
wherever
whether
which
while
whither
who
who's
whoever
whole
whom
whose
why
will
willing
wish
with
within
without
won't
wonder
would
would
wouldn't
yes
yet
you
you'd
you'll
you're
you've
your
yours
yourself
yourselves
zero
12.8.4. Full-Text Restrictions
Full-text searches are supported for
MyISAM tables only.
Full-text searches can be used with most multi-byte
character sets. The exception is that for Unicode, the
utf8 character set can be used, but not
the ucs2 character set.
Ideographic languages such as Chinese and Japanese do not
have word delimiters. Therefore, the
FULLTEXT parser cannot
determine where words begin and end in these and other such
languages. The implications of this and some
workarounds for the problem are described in
Section 12.8, “Full-Text Search Functions”.
Although the use of multiple character sets within a single
table is supported, all columns in a
FULLTEXT index must use the same
character set and collation.
The MATCH() column list must match
exactly the column list in some FULLTEXT
index definition for the table, unless this
MATCH() is IN BOOLEAN
MODE. Boolean-mode searches can be done on
non-indexed columns, although they are likely to be slow.
The argument to AGAINST() must be a
constant string.
12.8.5. Fine-Tuning MySQL Full-Text Search
MySQL's full-text search capability has few user-tunable
parameters. You can exert more control over full-text searching
behavior if you have a MySQL source distribution because some
changes require source code modifications. See
Section 2.4.14, “MySQL Installation Using a Source Distribution”.
Note that full-text search is carefully tuned for maximum
effectiveness. Modifying the default behavior in most cases can
actually decrease effectiveness. Do not alter the
MySQL sources unless you know what you are doing.
Most full-text variables described in this section must be set
at server startup time. A server restart is required to change
them; they cannot be modified while the server is running.
Some variable changes require that you rebuild the
FULLTEXT indexes in your tables. Instructions
for doing this are given at the end of this section.
The minimum and maximum lengths of words to be indexed are
defined by the ft_min_word_len and
ft_max_word_len system variables. (See
Section 5.2.3, “System Variables”.) The default
minimum value is four characters; the default maximum is
version dependent. If you change either value, you must
rebuild your FULLTEXT indexes. For
example, if you want three-character words to be searchable,
you can set the ft_min_word_len variable
by putting the following lines in an option file:
[mysqld]
ft_min_word_len=3
Then you must restart the server and rebuild your
FULLTEXT indexes. Note particularly the
remarks regarding myisamchk in the
instructions following this list.
To override the default stopword list, set the
ft_stopword_file system variable. (See
Section 5.2.3, “System Variables”.) The variable
value should be the pathname of the file containing the
stopword list, or the empty string to disable stopword
filtering. After changing the value of this variable or the
contents of the stopword file, restart the server and
rebuild your FULLTEXT indexes.
The stopword list is free-form. That is, you may use any
non-alphanumeric character such as newline, space, or comma
to separate stopwords. Exceptions are the underscore
character (‘_’) and a single
apostrophe (‘'’) which are
treated as part of a word. The character set of the stopword
list is the server's default character set; see
Section 10.3.1, “Server Character Set and Collation”.
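The free-form rule described above can be sketched in Python. This is only
an illustration, not the server's parser: the helper name parse_stopwords
is hypothetical, and the sketch assumes ASCII stopword text for simplicity.

```python
import re

# Separators: any run of characters other than ASCII letters, digits,
# underscore, or apostrophe (assumption: ASCII-only stopword text).
_SEPARATORS = re.compile(r"[^0-9A-Za-z_']+")

def parse_stopwords(text):
    """Return the stopwords found in free-form text (hypothetical helper)."""
    return [word for word in _SEPARATORS.split(text) if word]

# Newline, space, comma, and semicolon all act as separators here, while
# the apostrophe and underscore stay inside their words.
print(parse_stopwords("the, and\nisn't  user_id;of"))
```

Because the underscore and apostrophe are treated as word characters,
entries such as isn't and user_id survive as single stopwords.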
The 50% threshold for natural language searches is
determined by the particular weighting scheme chosen. To
disable it, look for the following line in
myisam/ftdefs.h:
#define GWS_IN_USE GWS_PROB
Change that line to this:
#define GWS_IN_USE GWS_FREQ
Then recompile MySQL. There is no need to rebuild the
indexes in this case. Note:
By making this change, you severely
decrease MySQL's ability to provide adequate relevance
values for the MATCH() function. If you
really need to search for such common words, it would be
better to search using IN BOOLEAN MODE
instead, which does not observe the 50% threshold.
To change the operators used for boolean full-text searches,
set the ft_boolean_syntax system
variable. This variable can be changed while the server is
running, but you must have the SUPER
privilege to do so. No rebuilding of indexes is necessary in
this case. See Section 5.2.3, “System Variables”,
which describes the rules governing how to set this
variable.
If you want to change the set of characters that are
considered word characters, you can do so in two ways.
Suppose that you want to treat the hyphen character ('-') as
a word character. Use either of these methods:
Modify the MySQL source: In
myisam/ftdefs.h, see the
true_word_char() and
misc_word_char() macros. Add
'-' to one of those macros and
recompile MySQL.
Modify a character set file: This requires no
recompilation. The true_word_char()
macro uses a “character type” table to
distinguish letters and numbers from other characters.
You can edit the
<ctype><map> contents in
one of the character set XML files to specify that
'-' is a “letter.” Then
use the given character set for your
FULLTEXT indexes.
After making the modification, you must rebuild the indexes
for each table that contains any FULLTEXT
indexes.
If you modify full-text variables that affect indexing
(ft_min_word_len,
ft_max_word_len, or
ft_stopword_file), or if you change the
stopword file itself, you must rebuild your
FULLTEXT indexes after making the changes and
restarting the server. To rebuild the indexes in this case, it
is sufficient to do a QUICK repair operation:
mysql> REPAIR TABLE tbl_name QUICK;
Each table that contains any FULLTEXT index
must be repaired as just shown. Otherwise, queries for the table
may yield incorrect results, and modifications to the table will
cause the server to see the table as corrupt and in need of
repair.
Note that if you use myisamchk to perform an
operation that modifies table indexes (such as repair or
analyze), the FULLTEXT indexes are rebuilt
using the default full-text parameter
values for minimum word length, maximum word length, and
stopword file unless you specify otherwise. This can result in
queries failing.
The problem occurs because these parameters are known only by
the server. They are not stored in MyISAM
index files. To avoid the problem if you have modified the
minimum or maximum word length or stopword file values used by
the server, specify the same ft_min_word_len,
ft_max_word_len, and
ft_stopword_file values to
myisamchk that you use for
mysqld. For example, if you have set the
minimum word length to 3, you can repair a table with
myisamchk like this:
shell> myisamchk --recover --ft_min_word_len=3 tbl_name.MYI
To ensure that myisamchk and the server use
the same values for full-text parameters, place each one in both
the [mysqld] and
[myisamchk] sections of an option file:
[mysqld]
ft_min_word_len=3
[myisamchk]
ft_min_word_len=3
An alternative to using myisamchk is to use
the REPAIR TABLE, ANALYZE
TABLE, OPTIMIZE TABLE, or
ALTER TABLE statements. These statements are
performed by the server, which knows the proper full-text
parameter values to use.
The BINARY operator casts the string
following it to a binary string. This is an easy way to force
a column comparison to be done byte by byte rather than
character by character. This causes the comparison to be case
sensitive even if the column isn't defined as
BINARY or BLOB.
BINARY also causes trailing spaces to be
significant.
In a comparison, BINARY affects the entire
operation; it can be given before either operand with the same
result.
BINARY str is
shorthand for CAST(str AS
BINARY).
Note that in some contexts, if you cast an indexed column to
BINARY, MySQL is not able to use the index
efficiently.
CAST(expr AS
type),
CONVERT(expr,type),
CONVERT(expr USING
transcoding_name)
The CAST() and CONVERT()
functions take a value of one type and produce a value of
another type.
The type can be one of the
following values:
BINARY[(N)]
CHAR[(N)]
DATE
DATETIME
DECIMAL
SIGNED [INTEGER]
TIME
UNSIGNED [INTEGER]
BINARY produces a string with the
BINARY data type. See
Section 11.4.2, “The BINARY and VARBINARY Types” for a description of how
this affects comparisons. If the optional length
N is given,
BINARY(N) causes
the cast to use no more than N
bytes of the argument. As of MySQL 5.0.17, values shorter than
N bytes are padded with
0x00 bytes to a length of
N.
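The truncation and padding rule for BINARY(N) can be modeled with a few
lines of Python. This is a sketch of the behavior described above for
MySQL 5.0.17 and later, not server code; the function name cast_to_binary
is hypothetical.

```python
def cast_to_binary(value: bytes, n: int) -> bytes:
    """Model CAST(value AS BINARY(n)) as described in the text (sketch)."""
    truncated = value[:n]               # use no more than n bytes of the argument
    return truncated.ljust(n, b"\x00")  # pad shorter values with 0x00 bytes

print(cast_to_binary(b"abcdef", 4))  # longer input is cut to 4 bytes
print(cast_to_binary(b"ab", 4))      # shorter input is padded to 4 bytes
```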
CHAR(N) causes
the cast to use no more than N
characters of the argument.
The DECIMAL type is available as of MySQL
5.0.8.
CAST() and CONVERT(... USING
...) are standard SQL syntax. The
non-USING form of
CONVERT() is ODBC syntax.
CONVERT() with USING is
used to convert data between different character sets. In
MySQL, transcoding names are the same as the corresponding
character set names. For example, this statement converts the
string 'abc' in the default character set
to the corresponding string in the utf8
character set:
SELECT CONVERT('abc' USING utf8);
Normally, you cannot compare a BLOB value or
other binary string in case-insensitive fashion because binary
strings have no character set, and thus no concept of lettercase.
To perform a case-insensitive comparison, use the
CONVERT() function to convert the value to a
non-binary string. If the character set of the result has a
case-insensitive collation, the LIKE operation
is not case sensitive:
SELECT 'A' LIKE CONVERT(blob_col USING latin1) FROM tbl_name;
To use a different character set, substitute its name for
latin1 in the preceding statement. To ensure
that a case-insensitive collation is used, specify a
COLLATE clause following the
CONVERT() call.
CONVERT() can be used more generally for
comparing strings that are represented in different character
sets.
The cast functions are useful when you want to create a column
with a specific type in a CREATE ... SELECT
statement:
CREATE TABLE new_table SELECT CAST('2000-01-01' AS DATE);
The functions also can be useful for sorting
ENUM columns in lexical order. Normally,
sorting of ENUM columns occurs using the
internal numeric values. Casting the values to
CHAR results in a lexical sort:
SELECT enum_col FROM tbl_name ORDER BY CAST(enum_col AS CHAR);
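The difference between the two sort orders can be illustrated outside the
server. In this Python sketch, the ENUM definition ('small', 'medium',
'large') and the stored values are invented for the example; sorting by
position in the definition stands in for the internal numeric values, and
sorting the strings stands in for the CAST(... AS CHAR) result.

```python
# Hypothetical ENUM definition order: 'small' = 1, 'medium' = 2, 'large' = 3.
enum_values = ["small", "medium", "large"]

# Hypothetical column contents.
rows = ["large", "small", "medium"]

# Default ENUM sort: by internal numeric value (position in the definition).
by_internal_value = sorted(rows, key=enum_values.index)

# Sort after casting to CHAR: plain lexical order of the strings.
by_lexical_order = sorted(rows)

print(by_internal_value)
print(by_lexical_order)
```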
CAST(str AS BINARY)
is the same thing as BINARY
str.
CAST(expr AS CHAR)
treats the expression as a string with the default character set.
CAST() also changes the result if you use it as
part of a more complex expression such as CONCAT('Date:
',CAST(NOW() AS DATE)).
You should not use CAST() to extract data in
different formats but instead use string functions like
LEFT() or EXTRACT(). See
Section 12.6, “Date and Time Functions”.
To cast a string to a numeric value in numeric context, you
normally do not have to do anything other than to use the string
value as though it were a number:
mysql> SELECT 1+'1';
-> 2
If you use a number in string context, the number is automatically
converted to a BINARY string.
mysql> SELECT CONCAT('hello you ',2);
-> 'hello you 2'
MySQL supports arithmetic with both signed and unsigned 64-bit
values. If you are using numeric operators (such as
+ or -) and one of the
operands is an unsigned integer, the result is unsigned. You can
override this by using the SIGNED and
UNSIGNED cast operators to cast the operation
to a signed or unsigned 64-bit integer, respectively.
mysql> SELECT CAST(1-2 AS UNSIGNED);
-> 18446744073709551615
mysql> SELECT CAST(CAST(1-2 AS UNSIGNED) AS SIGNED);
-> -1
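The two results above follow from reinterpreting the same 64 bits as an
unsigned or a signed (two's complement) integer. The Python sketch below
reproduces that arithmetic; the helper names as_unsigned and as_signed are
hypothetical and are not MySQL functions.

```python
MASK_64 = (1 << 64) - 1

def as_unsigned(value: int) -> int:
    """Reinterpret an integer as an unsigned 64-bit value."""
    return value & MASK_64

def as_signed(value: int) -> int:
    """Reinterpret the low 64 bits of an integer as a signed value."""
    value &= MASK_64
    return value - (1 << 64) if value >= (1 << 63) else value

wrapped = as_unsigned(1 - 2)
print(wrapped)             # same bits as -1, read as unsigned
print(as_signed(wrapped))  # read back as signed
```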
Note that if either operand is a floating-point value, the result
is a floating-point value and is not affected by the preceding
rule. (In this context, DECIMAL column values
are regarded as floating-point values.)
mysql> SELECT CAST(1 AS UNSIGNED) - 2.0;
-> -1.0
If you are using a string in an arithmetic operation, this is
converted to a floating-point number.
If you convert a “zero” date string to a date,
CONVERT() and CAST() return
NULL when the NO_ZERO_DATE
SQL mode is enabled. As of MySQL 5.0.4, they also produce a
warning.
Note: The encryption and
compression functions return binary strings. For many of these
functions, the result might contain arbitrary byte values. If
you want to store these results, use a BLOB
column rather than a CHAR or (before MySQL
5.0.3) VARCHAR column to avoid potential
problems with trailing space removal that would change data
values.
Note: Exploits for the MD5 and
SHA-1 algorithms have become known. You may wish to consider
using one of the other encryption functions described in this
section instead.
AES_ENCRYPT(str,key_str),
AES_DECRYPT(crypt_str,key_str)
These functions allow encryption and decryption of data
using the official AES (Advanced Encryption Standard)
algorithm, previously known as “Rijndael.”
Encoding with a 128-bit key length is used, but you can
extend it up to 256 bits by modifying the source. We chose
128 bits because it is much faster and it is secure enough
for most purposes.
AES_ENCRYPT() encrypts a string and
returns a binary string. AES_DECRYPT()
decrypts the encrypted string and returns the original
string. The input arguments may be any length. If either
argument is NULL, the result of this
function is also NULL.
Because AES is a block-level algorithm, padding is used to
encode uneven length strings and so the result string length
may be calculated using this formula:
16 × (trunc(string_length / 16) + 1)
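As a worked instance of the formula, the following Python sketch computes
the result length for a few input lengths. The function name
aes_encrypted_length is hypothetical; the arithmetic is exactly the
formula above, with integer division standing in for trunc().

```python
def aes_encrypted_length(string_length: int) -> int:
    """Result length in bytes, per the formula 16 * (trunc(len / 16) + 1)."""
    return 16 * (string_length // 16 + 1)

for length in (0, 4, 15, 16, 17):
    print(length, aes_encrypted_length(length))
```

Note that an input whose length is already a multiple of 16 still grows
by a full 16-byte block.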
If AES_DECRYPT() detects invalid data or
incorrect padding, it returns NULL.
However, it is possible for AES_DECRYPT()
to return a non-NULL value (possibly
garbage) if the input data or the key is invalid.
You can use the AES functions to store data in an encrypted
form by modifying your queries:
INSERT INTO t VALUES (1,AES_ENCRYPT('text','password'));
AES_ENCRYPT() and
AES_DECRYPT() can be considered the most
cryptographically secure encryption functions currently
available in MySQL.
COMPRESS(string_to_compress)
Compresses a string and returns the result as a binary
string. This function requires MySQL to have been compiled
with a compression library such as zlib.
Otherwise, the return value is always
NULL. The compressed string can be
uncompressed with UNCOMPRESS().
The compressed string contents are stored the following way:
Empty strings are stored as empty strings.
Non-empty strings are stored as a four-byte length of
the uncompressed string (low byte first), followed by
the compressed string. If the string ends with space, an
extra ‘.’ character is
added to avoid problems with endspace trimming should
the result be stored in a CHAR or
VARCHAR column. (Use of
CHAR or VARCHAR to
store compressed strings is not recommended. It is
better to use a BLOB column instead.)
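The storage layout just described can be sketched in Python. This is an
approximation under stated assumptions, not the server's implementation:
it assumes the compressed part is a standard zlib stream, and the function
names compress_like_mysql and uncompress_like_mysql are hypothetical.

```python
import struct
import zlib

def compress_like_mysql(data: bytes) -> bytes:
    """Build a value with the layout described for COMPRESS() (sketch)."""
    if not data:
        return b""  # empty strings are stored as empty strings
    # Four-byte uncompressed length, low byte first, then the compressed data.
    packed = struct.pack("<I", len(data)) + zlib.compress(data)
    if packed.endswith(b" "):
        packed += b"."  # guard against trailing-space trimming
    return packed

def uncompress_like_mysql(packed: bytes) -> bytes:
    """Reverse compress_like_mysql(), checking the stored length."""
    if not packed:
        return b""
    (length,) = struct.unpack("<I", packed[:4])
    # decompressobj() stops at the end of the zlib stream, so a trailing
    # '.' guard byte is left in unused_data instead of raising an error.
    data = zlib.decompressobj().decompress(packed[4:])
    if len(data) != length:
        raise ValueError("length prefix does not match decompressed data")
    return data

packed = compress_like_mysql(b"hello hello hello")
print(struct.unpack("<I", packed[:4])[0])
print(uncompress_like_mysql(packed))
```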
DECODE(crypt_str,pass_str)
Decrypts the encrypted string
crypt_str using
pass_str as the password.
crypt_str should be a string
returned from ENCODE().
ENCODE(str,pass_str)
Encrypt str using
pass_str as the password. To
decrypt the result, use DECODE().
The result is a binary string of the same length as
str.
The strength of the encryption is based on how good the
random generator is. It should suffice for short strings.
DES_DECRYPT(crypt_str[,key_str])
Decrypts a string encrypted with
DES_ENCRYPT(). If an error occurs, this
function returns NULL.
If no key_str argument is given,
DES_DECRYPT() examines the first byte of
the encrypted string to determine the DES key number that
was used to encrypt the original string, and then reads the
key from the DES key file to decrypt the message. For this
to work, the user must have the SUPER
privilege. The key file can be specified with the
--des-key-file server option.
If you pass this function a
key_str argument, that string is
used as the key for decrypting the message.
If the crypt_str argument does
not appear to be an encrypted string, MySQL returns the
given crypt_str.
DES_ENCRYPT(str[,{key_num|key_str}])
Encrypts the string with the given key using the Triple-DES
algorithm.
The encryption key to use is chosen based on the second
argument to DES_ENCRYPT(), if one was
given:
Argument      Description
No argument   The first key from the DES key file is used.
key_num       The given key number (0-9) from the DES key file is used.
key_str       The given key string is used to encrypt str.
The key file can be specified with the
--des-key-file server option.
The return string is a binary string where the first
character is CHAR(128 |
key_num). If an error
occurs, DES_ENCRYPT() returns
NULL.
The 128 is added to make it easier to recognize an encrypted
key. If you use a string key,
key_num is 127.
The string length for the result is given by this formula:
new_len = orig_len + (8 - (orig_len % 8)) + 1
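The length formula and the leading byte described above can be checked
with a short Python sketch. The function names are hypothetical; the
arithmetic is taken directly from the text, including the key number 127
that is used for string keys.

```python
def des_encrypted_length(orig_len: int) -> int:
    """new_len = orig_len + (8 - (orig_len % 8)) + 1, as given in the text."""
    return orig_len + (8 - (orig_len % 8)) + 1

def des_first_byte(key_num: int = 127) -> int:
    """Code of the first result character, CHAR(128 | key_num).

    The default of 127 models the case where a string key is used.
    """
    return 128 | key_num

for length in (0, 7, 8, 9):
    print(length, des_encrypted_length(length))
print(des_first_byte(0), des_first_byte(9), des_first_byte())
```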
Each line in the DES key file has the following format:
key_num des_key_str
Each key_num value must be a
number in the range from 0 to
9. Lines in the file may be in any order.
des_key_str is the string that is
used to encrypt the message. There should be at least one
space between the number and the key. The first key is the
default key that is used if you do not specify any key
argument to DES_ENCRYPT().
You can tell MySQL to read new key values from the key file
with the FLUSH DES_KEY_FILE statement.
This requires the RELOAD privilege.
One benefit of having a set of default keys is that it gives
applications a way to check for the existence of encrypted
column values, without giving the end user the right to
decrypt those values.
mysql> SELECT customer_address FROM customer_table
-> WHERE crypted_credit_card = DES_ENCRYPT('credit_card_number');
ENCRYPT(str[,salt])
Encrypts str using the Unix
crypt() system call and returns a binary
string. The salt argument should
be a string with at least two characters. If no
salt argument is given, a random
value is used.
ENCRYPT() ignores all but the first eight
characters of str, at least on
some systems. This behavior is determined by the
implementation of the underlying crypt()
system call.
The use of ENCRYPT() with multi-byte
character sets other than utf8 is not
recommended because the system call expects a string
terminated by a zero byte.
If crypt() is not available on your
system (as is the case with Windows),
ENCRYPT() always returns
NULL.
MD5(str)
Calculates an MD5 128-bit checksum for the string. The value
is returned as a binary string of 32 hex digits, or
NULL if the argument was
NULL. The return value can, for example,
be used as a hash key.
This is the “RSA Data Security, Inc. MD5
Message-Digest Algorithm.”
If you want to convert the value to uppercase, see the
description of binary string conversion given in the entry
for the BINARY operator in
Section 12.9, “Cast Functions and Operators”.
See the note regarding the MD5 algorithm at the beginning of
this section.
OLD_PASSWORD(str)
OLD_PASSWORD() was added to MySQL when
the implementation of PASSWORD() was
changed to improve security.
OLD_PASSWORD() returns the value of the
old (pre-4.1) implementation of
PASSWORD() as a binary string, and is
intended to permit you to reset passwords for any pre-4.1
clients that need to connect to your version
5.0 MySQL server without locking them out. See
Section 5.7.9, “Password Hashing as of MySQL 4.1”.
PASSWORD(str)
Calculates and returns a password string from the plaintext
password str and returns a binary
string, or NULL if the argument was
NULL. This is the function that is used
for encrypting MySQL passwords for storage in the
Password column of the
user grant table.
PASSWORD() encryption is one-way (not
reversible).
PASSWORD() does not perform password
encryption in the same way that Unix passwords are
encrypted. See ENCRYPT().
Note: The
PASSWORD() function is used by the
authentication system in MySQL Server; you should
not use it in your own applications.
For that purpose, consider MD5() or
SHA1() instead. Also see
RFC 2195, section 2
(Challenge-Response Authentication Mechanism
(CRAM)), for more information about handling
passwords and authentication securely in your applications.
SHA1(str),
SHA(str)
Calculates an SHA-1 160-bit checksum for the string, as
described in RFC 3174 (Secure Hash Algorithm). The value is
returned as a binary string of 40 hex digits, or
NULL if the argument was
NULL. One of the possible uses for this
function is as a hash key. You can also use it as a
cryptographic function for storing passwords.
SHA() is synonymous with
SHA1().
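The digest sizes quoted for MD5() and SHA1() can be confirmed with Python's
standard hashlib module, which implements the same MD5 and SHA-1
algorithms. This sketch only illustrates the output format (hex digits);
it does not involve the MySQL server, and the input strings are arbitrary.

```python
import hashlib

md5_hex = hashlib.md5(b"testing").hexdigest()
sha1_hex = hashlib.sha1(b"abc").hexdigest()

# 128 bits -> 32 hex digits; 160 bits -> 40 hex digits.
print(len(md5_hex), md5_hex)
print(len(sha1_hex), sha1_hex)
```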
SHA1() can be considered a
cryptographically more secure equivalent of
MD5(). However, see the note regarding
the MD5 and SHA-1 algorithms at the beginning of this section.
UNCOMPRESS(string_to_uncompress)
Uncompresses a string compressed by the
COMPRESS() function. If the argument is
not a compressed value, the result is
NULL. This function requires MySQL to
have been compiled with a compression library such as
zlib. Otherwise, the return value is
always NULL.
BENCHMARK(count,expr)
The BENCHMARK() function executes the
expression expr repeatedly
count times. It may be used to
time how quickly MySQL processes the expression. The result
value is always 0. The intended use is
from within the mysql client, which
reports query execution times:
mysql> SELECT BENCHMARK(1000000,ENCODE('hello','goodbye'));
+----------------------------------------------+
| BENCHMARK(1000000,ENCODE('hello','goodbye')) |
+----------------------------------------------+
| 0 |
+----------------------------------------------+
1 row in set (4.74 sec)
The time reported is elapsed time on the client end, not CPU
time on the server end. It is advisable to execute
BENCHMARK() several times, and to
interpret the result with regard to how heavily loaded the
server machine is.
BENCHMARK() is intended for measuring the
runtime performance of scalar expressions, which has some
significant implications for the way that you use it and
interpret the results:
Only scalar expressions can be used. Although the
expression can be a subquery, it must return a single
column and at most a single row. For example,
BENCHMARK(10, (SELECT * FROM t)) will
fail if the table t has more than one
column or more than one row.
Executing a SELECT
expr statement
N times differs from
executing SELECT
BENCHMARK(N,
expr) in terms of
the amount of overhead involved. The two have very
different execution profiles and you should not expect
them to take the same amount of time. The former
involves the parser, optimizer, table locking, and
runtime evaluation N times
each. The latter involves only runtime evaluation
N times, and all the other
components just once. Memory structures already
allocated are reused, and runtime optimizations such as
local caching of results already evaluated for aggregate
functions can alter the results. Use of
BENCHMARK() thus measures performance
of the runtime component by giving more weight to that
component and removing the “noise”
introduced by the network, parser, optimizer, and so
forth.
CONNECTION_ID()
Returns the connection ID (thread ID) for the connection.
Every connection has an ID that is unique among the set of
currently connected clients.
mysql> SELECT CONNECTION_ID();
-> 23786
CURRENT_USER,
CURRENT_USER()
Returns the username and hostname combination for the MySQL
account that the server used to authenticate the current
client. This account determines your access privileges. As
of MySQL 5.0.10, within a stored routine that is defined
with the SQL SECURITY DEFINER
characteristic, CURRENT_USER() returns
the creator of the routine. The return value is a string in
the utf8 character set.
The value of CURRENT_USER() can differ
from the value of USER().
mysql> SELECT USER();
-> 'davida@localhost'
mysql> SELECT * FROM mysql.user;
ERROR 1044: Access denied for user ''@'localhost' to
database 'mysql'
mysql> SELECT CURRENT_USER();
-> '@localhost'
The example illustrates that although the client specified a
username of davida (as indicated by the
value of the USER() function), the server
authenticated the client using an anonymous user account (as
seen by the empty username part of the
CURRENT_USER() value). One way this might
occur is that there is no account listed in the grant tables
for davida.
DATABASE()
Returns the default (current) database name as a string in
the utf8 character set. If there is no
default database, DATABASE() returns
NULL. Within a stored routine, the
default database is the database that the routine is
associated with, which is not necessarily the same as the
database that is the default in the calling context.
mysql> SELECT DATABASE();
-> 'test'
FOUND_ROWS()
A SELECT statement may include a
LIMIT clause to restrict the number of
rows the server returns to the client. In some cases, it is
desirable to know how many rows the statement would have
returned without the LIMIT, but without
running the statement again. To obtain this row count,
include a SQL_CALC_FOUND_ROWS option in
the SELECT statement, and then invoke
FOUND_ROWS() afterward:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
The second SELECT returns a number
indicating how many rows the first SELECT
would have returned had it been written without the
LIMIT clause.
In the absence of the SQL_CALC_FOUND_ROWS
option in the most recent SELECT
statement, FOUND_ROWS() returns the
number of rows in the result set returned by that statement.
The row count available through
FOUND_ROWS() is transient and not
intended to be available past the statement following the
SELECT SQL_CALC_FOUND_ROWS statement. If
you need to refer to the value later, save it:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM ... ;
mysql> SET @rows = FOUND_ROWS();
If you are using SELECT
SQL_CALC_FOUND_ROWS, MySQL must calculate how many
rows are in the full result set. However, this is faster
than running the query again without
LIMIT, because the result set need not be
sent to the client.
SQL_CALC_FOUND_ROWS and
FOUND_ROWS() can be useful in situations
when you want to restrict the number of rows that a query
returns, but also determine the number of rows in the full
result set without running the query again. An example is a
Web script that presents a paged display containing links to
the pages that show other sections of a search result. Using
FOUND_ROWS() allows you to determine how
many other pages are needed for the rest of the result.
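For the paged-display case, the page count follows directly from the value
returned by FOUND_ROWS(). The Python sketch below shows the arithmetic a
Web script might use; the function name total_pages and the sample numbers
are hypothetical.

```python
def total_pages(found_rows: int, page_size: int) -> int:
    """Number of pages needed to show found_rows rows, page_size per page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Integer ceiling division: a partly filled last page still counts.
    return (found_rows + page_size - 1) // page_size

# Suppose FOUND_ROWS() reported 95 rows and the query used LIMIT 10.
print(total_pages(95, 10))
```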
The use of SQL_CALC_FOUND_ROWS and
FOUND_ROWS() is more complex for
UNION statements than for simple
SELECT statements, because
LIMIT may occur at multiple places in a
UNION. It may be applied to individual
SELECT statements in the
UNION, or global to the
UNION result as a whole.
The intent of SQL_CALC_FOUND_ROWS for
UNION is that it should return the row
count that would be returned without a global
LIMIT. The conditions for use of
SQL_CALC_FOUND_ROWS with
UNION are:
The SQL_CALC_FOUND_ROWS keyword must
appear in the first SELECT of the
UNION.
The value of FOUND_ROWS() is exact
only if UNION ALL is used. If
UNION without ALL
is used, duplicate removal occurs and the value of
FOUND_ROWS() is only approximate.
If no LIMIT is present in the
UNION,
SQL_CALC_FOUND_ROWS is ignored and
FOUND_ROWS() returns the number of rows in the temporary
table that is created to process the UNION.
LAST_INSERT_ID(),
LAST_INSERT_ID(expr)
LAST_INSERT_ID() (with no argument)
returns the first automatically
generated value that was set for an
AUTO_INCREMENT column by the
most recently executed INSERT statement to affect such a column.
For example, after inserting a row that generates an
AUTO_INCREMENT value, you can get the
value like this:
mysql> SELECT LAST_INSERT_ID();
-> 195
The currently executing statement does not affect the value
of LAST_INSERT_ID(). Suppose that you
generate an AUTO_INCREMENT value with one
statement, and then refer to
LAST_INSERT_ID() in a multiple-row
INSERT statement that inserts rows into a
table with its own AUTO_INCREMENT column.
The value of LAST_INSERT_ID() will remain
stable in the second statement; its value for the second and
later rows is not affected by the earlier row insertions.
(However, if you mix references to
LAST_INSERT_ID() and
LAST_INSERT_ID(expr),
the effect is undefined.)
If the previous statement returned an error, the value of
LAST_INSERT_ID() is undefined. For
transactional tables, if the statement is rolled back due to
an error, the value of LAST_INSERT_ID()
is left undefined. For manual ROLLBACK,
the value of LAST_INSERT_ID() is not
restored to that before the transaction; it remains as it
was at the point of the ROLLBACK.
Within the body of a stored routine (procedure or function)
or a trigger, the value of
LAST_INSERT_ID() changes the same way as
for statements executed outside the body of these kinds of
objects. The effect of a stored routine or trigger upon the
value of LAST_INSERT_ID() that is seen by
following statements depends on the kind of routine:
If a stored procedure executes statements that change
the value of LAST_INSERT_ID(), the
changed value will be seen by statements that follow the
procedure call.
For stored functions and triggers that change the value,
the value is restored when the function or trigger ends,
so following statements will not see a changed value.
The ID that was generated is maintained in the server on a
per-connection basis. This means that
the value returned by the function to a given client is the
first AUTO_INCREMENT value generated for
the most recent statement affecting an
AUTO_INCREMENT column by that
client. This value cannot be affected by other
clients, even if they generate
AUTO_INCREMENT values of their own. This
behavior ensures that each client can retrieve its own ID
without concern for the activity of other clients, and
without the need for locks or transactions.
The value of LAST_INSERT_ID() is not
changed if you set the AUTO_INCREMENT
column of a row to a non-“magic” value (that
is, a value that is not NULL and not
0).
Important: If you insert
multiple rows using a single INSERT
statement, LAST_INSERT_ID() returns the
value generated for the first inserted
row only. The reason for this is to
make it possible to reproduce easily the same
INSERT statement against some other
server.
For example:
mysql> USE test;
Database changed
mysql> CREATE TABLE t (
-> id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
-> name VARCHAR(10) NOT NULL
-> );
Query OK, 0 rows affected (0.09 sec)
mysql> INSERT INTO t VALUES (NULL, 'Bob');
Query OK, 1 row affected (0.01 sec)
mysql> SELECT * FROM t;
+----+------+
| id | name |
+----+------+
| 1 | Bob |
+----+------+
1 row in set (0.01 sec)
mysql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
| 1 |
+------------------+
1 row in set (0.00 sec)
mysql> INSERT INTO t VALUES
-> (NULL, 'Mary'), (NULL, 'Jane'), (NULL, 'Lisa');
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM t;
+----+------+
| id | name |
+----+------+
| 1 | Bob |
| 2 | Mary |
| 3 | Jane |
| 4 | Lisa |
+----+------+
4 rows in set (0.01 sec)
mysql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
| 2 |
+------------------+
1 row in set (0.00 sec)
Although the second INSERT statement
inserted three new rows into t, the ID
generated for the first of these rows was
2, and it is this value that is returned
by LAST_INSERT_ID() for the following
SELECT statement.
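The behavior shown in the session above can be summarized with a toy model.
The Python class below is a deliberately simplified sketch (one connection,
one table, no errors or explicit ID values); the class and method names are
hypothetical and do not correspond to any MySQL API.

```python
class ToyConnection:
    """Toy model of one connection inserting into one AUTO_INCREMENT table."""

    def __init__(self):
        self.rows = []           # (id, name) pairs
        self.next_id = 1         # next AUTO_INCREMENT value
        self.last_insert_id = 0  # value LAST_INSERT_ID() would return

    def insert(self, *names):
        """Model a single INSERT statement that adds one row per name."""
        first_id = self.next_id
        for name in names:
            self.rows.append((self.next_id, name))
            self.next_id += 1
        if names:
            # Only the first generated value of the statement is remembered.
            self.last_insert_id = first_id

conn = ToyConnection()
conn.insert("Bob")
print(conn.last_insert_id)
conn.insert("Mary", "Jane", "Lisa")
print(conn.last_insert_id, len(conn.rows))
```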
If you use INSERT IGNORE and the row is
ignored, the AUTO_INCREMENT counter is
not incremented and LAST_INSERT_ID()
returns 0, which reflects that no row was
inserted.
If expr is given as an argument
to LAST_INSERT_ID(), the value of the
argument is returned by the function and is remembered as
the next value to be returned by
LAST_INSERT_ID(). This can be used to
simulate sequences:
Create a table to hold the sequence counter and
initialize it:
mysql> CREATE TABLE sequence (id INT NOT NULL);
mysql> INSERT INTO sequence VALUES (0);
Use the table to generate sequence numbers like this:
mysql> UPDATE sequence SET id=LAST_INSERT_ID(id+1);
mysql> SELECT LAST_INSERT_ID();
The UPDATE statement increments the
sequence counter and causes the next call to
LAST_INSERT_ID() to return the
updated value. The SELECT statement
retrieves that value. The
mysql_insert_id() C API function can
also be used to get the value. See
Section 22.2.3.37, “mysql_insert_id()”.
You can generate sequences without calling
LAST_INSERT_ID(), but the utility of
using the function this way is that the ID value is
maintained in the server as the last automatically generated
value. It is multi-user safe because multiple clients can
issue the UPDATE statement and get their
own sequence value with the SELECT
statement (or mysql_insert_id()), without
affecting or being affected by other clients that generate
their own sequence values.
Note that mysql_insert_id() is only
updated after INSERT and
UPDATE statements, so you cannot use the
C API function to retrieve the value for
LAST_INSERT_ID(expr)
after executing other SQL statements like
SELECT or SET.
ROW_COUNT()
ROW_COUNT() returns the number of rows
updated, inserted, or deleted by the preceding statement.
This is the same as the row count that the
mysql client displays and the value from
the mysql_affected_rows() C API function.
mysql> INSERT INTO t VALUES(1),(2),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> SELECT ROW_COUNT();
+-------------+
| ROW_COUNT() |
+-------------+
| 3 |
+-------------+
1 row in set (0.00 sec)
mysql> DELETE FROM t WHERE i IN(1,2);
Query OK, 2 rows affected (0.00 sec)
mysql> SELECT ROW_COUNT();
+-------------+
| ROW_COUNT() |
+-------------+
| 2 |
+-------------+
1 row in set (0.00 sec)
ROW_COUNT() was added in MySQL 5.0.1.
SCHEMA()
This function is a synonym for
DATABASE(). It was added in MySQL 5.0.2.
SESSION_USER()
SESSION_USER() is a synonym for
USER().
SYSTEM_USER()
SYSTEM_USER() is a synonym for
USER().
USER()
Returns the current MySQL username and hostname as a string
in the utf8 character set.
mysql> SELECT USER();
-> 'davida@localhost'
The value indicates the username you specified when
connecting to the server, and the client host from which you
connected. The value can be different from that of
CURRENT_USER().
DEFAULT(col_name)
Returns the default value for a table column. Starting with
MySQL 5.0.2, an error results if the column has no default
value.
mysql> UPDATE t SET i = DEFAULT(i)+1 WHERE id < 100;
FORMAT(X,D)
Formats the number X to a format
like '#,###,###.##', rounded to
D decimal places, and returns the
result as a string. For details, see
Section 12.4, “String Functions”.
GET_LOCK(str,timeout)
Tries to obtain a lock with a name given by the string
str, using a timeout of
timeout seconds. Returns
1 if the lock was obtained successfully,
0 if the attempt timed out (for example,
because another client has previously locked the name), or
NULL if an error occurred (such as
running out of memory or the thread was killed with
mysqladmin kill). If you have a lock
obtained with GET_LOCK(), it is released
when you execute RELEASE_LOCK(), execute
a new GET_LOCK(), or your connection
terminates (either normally or abnormally). Locks obtained
with GET_LOCK() do not interact with
transactions. That is, committing a transaction does not
release any such locks obtained during the transaction.
This function can be used to implement application locks or
to simulate record locks. Names are locked on a server-wide
basis. If a name has been locked by one client,
GET_LOCK() blocks any request by another
client for a lock with the same name. This allows clients
that agree on a given lock name to use the name to perform
cooperative advisory locking. But be aware that it also
allows a client that is not among the set of cooperating
clients to lock a name, either inadvertently or
deliberately, and thus prevent any of the cooperating
clients from locking that name. One way to reduce the
likelihood of this is to use lock names that are
database-specific or application-specific. For example, use
lock names of the form
db_name.str or
app_name.str.
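The release rules above can be seen in a short session. This is a sketch that assumes the MySQL 5.0 behavior described here (a new GET_LOCK() releases the lock the connection already holds) and that no other client holds either name; the lock names 'lock1' and 'lock2' are arbitrary.

```sql
SELECT GET_LOCK('lock1',10);    -- returns 1: lock1 acquired
SELECT IS_FREE_LOCK('lock2');   -- returns 1: nobody holds lock2
SELECT GET_LOCK('lock2',10);    -- returns 1: lock2 acquired, lock1 released implicitly
SELECT RELEASE_LOCK('lock2');   -- returns 1: lock2 released
SELECT RELEASE_LOCK('lock1');   -- returns NULL: lock1 no longer exists
```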
The second RELEASE_LOCK() call returns
NULL because the lock
'lock1' was automatically released by the
second GET_LOCK() call.
Note: If a client attempts to acquire a lock that is already
held by another client, it blocks according to the
timeout argument. If the blocked
client terminates, its thread does not die until the lock
request times out. This is a known bug.
INET_ATON(expr)
Given the dotted-quad representation of a network address as
a string, returns an integer that represents the numeric
value of the address. Addresses may be 4- or 8-byte
addresses.
Note: When storing values
generated by INET_ATON(), it is
recommended that you use an INT UNSIGNED
column. If you use a (signed) INT column,
values corresponding to IP addresses for which the first
octet is greater than 127 cannot be stored correctly. See
Section 11.2, “Numeric Types”.
INET_NTOA(expr)
Given a numeric network address (4 or 8 byte), returns the
dotted-quad representation of the address as a string.
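The conversion can be checked by hand: for a 4-byte address a.b.c.d the integer is a×256^3 + b×256^2 + c×256 + d. A minimal round trip, using an arbitrary example address:

```sql
SELECT INET_ATON('209.207.224.40');
-- returns 3520061480, that is 209*256*256*256 + 207*256*256 + 224*256 + 40

SELECT INET_NTOA(3520061480);
-- returns '209.207.224.40'
```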
IS_FREE_LOCK(str)
Checks whether the lock named str
is free to use (that is, not locked). Returns
1 if the lock is free (no one is using
the lock), 0 if the lock is in use, and
NULL if an error occurs (such as an
incorrect argument).
IS_USED_LOCK(str)
Checks whether the lock named str
is in use (that is, locked). If so, it returns the
connection identifier of the client that holds the lock.
Otherwise, it returns NULL.
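The lock inspection functions can be combined with GET_LOCK() to see who holds a name. The following is a sketch only: 'myapp.nightly_report' is a hypothetical application-specific lock name, and the results in the comments assume that no other client is using it.

```sql
SELECT GET_LOCK('myapp.nightly_report', 5);     -- 1 if acquired within 5 seconds
SELECT IS_FREE_LOCK('myapp.nightly_report');    -- 0 while the lock is held
SELECT IS_USED_LOCK('myapp.nightly_report') = CONNECTION_ID();
                                                -- 1 if this connection is the holder
SELECT RELEASE_LOCK('myapp.nightly_report');    -- 1 because this connection held it
SELECT IS_USED_LOCK('myapp.nightly_report');    -- NULL once nobody holds the lock
```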
MASTER_POS_WAIT(log_name,log_pos[,timeout])
This function is useful for control of master/slave
synchronization. It blocks until the slave has read and
applied all updates up to the specified position in the
master log. The return value is the number of log events the
slave had to wait for to advance to the specified position.
The function returns NULL if the slave
SQL thread is not started, the slave's master information is
not initialized, the arguments are incorrect, or an error
occurs. It returns -1 if the timeout has
been exceeded. If the slave SQL thread stops while
MASTER_POS_WAIT() is waiting, the
function returns NULL. If the slave is
past the specified position, the function returns
immediately.
If a timeout value is specified,
MASTER_POS_WAIT() stops waiting when
timeout seconds have elapsed.
timeout must be greater than 0; a
zero or negative timeout means no
timeout.
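A call might look like the following sketch. The log name 'master-bin.000003' and position 1250 are hypothetical coordinates (for example, values read from SHOW MASTER STATUS on the master), and the statement is meant to be run on the slave.

```sql
SELECT MASTER_POS_WAIT('master-bin.000003', 1250, 30);
-- >= 0 : number of log events the slave had to wait for
-- -1   : the 30-second timeout elapsed first
-- NULL : slave SQL thread not running, incorrect arguments, or another error
```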
NAME_CONST(name,value)
Returns the given value. When used to produce a result set
column, NAME_CONST() causes the column to
have the given name.
This function was added in MySQL 5.0.12. It is for internal
use only. The server uses it when writing statements from
stored routines that contain references to local routine
variables, as described in
Section 17.4, “Binary Logging of Stored Routines and Triggers”. You might see
this function in the output from
mysqlbinlog.
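Although the function is intended for internal use, its effect is easy to observe. In this minimal example the name 'myname' and the value 14 are arbitrary:

```sql
SELECT NAME_CONST('myname', 14);
-- The single result column is named myname and contains the value 14.
```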
RELEASE_LOCK(str)
Releases the lock named by the string
str that was obtained with
GET_LOCK(). Returns 1
if the lock was released, 0 if the lock
was not established by this thread (in which case the lock
is not released), and NULL if the named
lock did not exist. The lock does not exist if it was never
obtained by a call to GET_LOCK() or if it
has previously been released.
SLEEP(duration)
Sleeps (pauses) for the number of seconds given by the
duration argument, then returns
0. If SLEEP() is interrupted, it returns
1. The duration may have a fractional part given in
microseconds. This function was added in MySQL 5.0.12.
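A minimal illustration with an arbitrary fractional duration:

```sql
SELECT SLEEP(0.5);   -- pauses for about half a second, then returns 0
```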
UUID()
Returns a Universal Unique Identifier (UUID) generated
according to “DCE 1.1: Remote Procedure Call”
(Appendix A) CAE (Common Applications Environment)
Specifications published by The Open Group in October 1997
(Document Number C706,
http://www.opengroup.org/public/pubs/catalog/c706.htm).
A UUID is designed as a number that is globally unique in
space and time. Two calls to UUID() are
expected to generate two different values, even if these
calls are performed on two separate computers that are not
connected to each other.
A UUID is a 128-bit number represented by a string of five
hexadecimal numbers in
aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
format:
The first three numbers are generated from a timestamp.
The fourth number preserves temporal uniqueness in case
the timestamp value loses monotonicity (for example, due
to daylight saving time).
The fifth number is an IEEE 802 node number that
provides spatial uniqueness. A random number is
substituted if the latter is not available (for example,
because the host computer has no Ethernet card, or we do
not know how to find the hardware address of an
interface on your operating system). In this case,
spatial uniqueness cannot be guaranteed. Nevertheless, a
collision should have very low
probability.
Currently, the MAC address of an interface is taken into
account only on FreeBSD and Linux. On other operating
systems, MySQL uses a randomly generated 48-bit number.
Note that UUID() does not yet work with
replication.
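The generated value itself differs on every call, but its shape can be checked. A small sketch:

```sql
SELECT UUID();               -- a different value on every call
SELECT CHAR_LENGTH(UUID());  -- returns 36: 32 hexadecimal digits plus 4 hyphens
```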
VALUES(col_name)
In an INSERT ... ON DUPLICATE KEY UPDATE
statement, you can use the
VALUES(col_name)
function in the UPDATE clause to refer to
column values from the INSERT portion of
the statement. In other words,
VALUES(col_name)
in the UPDATE clause refers to the value
of col_name that would be
inserted, had no duplicate-key conflict occurred. This
function is especially useful in multiple-row inserts. The
VALUES() function is meaningful only in
INSERT ... ON DUPLICATE KEY UPDATE
statements and returns NULL otherwise.
See Section 13.2.4.3, “INSERT ... ON DUPLICATE KEY UPDATE Syntax”.
mysql> INSERT INTO tbl_name (a,b,c) VALUES (1,2,3),(4,5,6)
-> ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
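To see what VALUES() contributes, the multiple-row statement can be unrolled. Assuming a hypothetical table t with a unique key on column a, it behaves like the following two single-row statements, because VALUES(a)+VALUES(b) evaluates to 1+2 for the first row and 4+5 for the second:

```sql
INSERT INTO t (a,b,c) VALUES (1,2,3)
  ON DUPLICATE KEY UPDATE c=3;
INSERT INTO t (a,b,c) VALUES (4,5,6)
  ON DUPLICATE KEY UPDATE c=9;
```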
12.11. Functions and Modifiers for Use with GROUP BY Clauses
This section describes group (aggregate) functions that operate
on sets of values. Unless otherwise stated, group functions
ignore NULL values.
If you use a group function in a statement containing no
GROUP BY clause, it is equivalent to grouping
on all rows.
For numeric arguments, the variance and standard deviation
functions return a DOUBLE value. The
SUM() and AVG() functions
return a DECIMAL value for exact-value
arguments (integer or DECIMAL), and a
DOUBLE value for approximate-value arguments
(FLOAT or DOUBLE). (Before
MySQL 5.0.3, SUM() and
AVG() return DOUBLE for
all numeric arguments.)
The SUM() and AVG()
aggregate functions do not work with temporal values. (They
convert the values to numbers, losing everything after the first
non-numeric character.) To work around this problem, you can
convert to numeric units, perform the aggregate operation, and
convert back to a temporal value. Examples:
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col))) FROM tbl_name;
SELECT FROM_DAYS(SUM(TO_DAYS(date_col))) FROM tbl_name;
AVG([DISTINCT] expr)
Returns the average value of
expr. The
DISTINCT option can be used as of MySQL
5.0.3 to return the average of the distinct values of
expr.
AVG() returns NULL if
there were no matching rows.
mysql> SELECT student_name, AVG(test_score)
-> FROM student
-> GROUP BY student_name;
BIT_AND(expr)
Returns the bitwise AND of all bits in
expr. The calculation is
performed with 64-bit (BIGINT) precision.
This function returns
18446744073709551615 if there were no
matching rows. (This is the value of an unsigned
BIGINT value with all bits set to 1.)
BIT_OR(expr)
Returns the bitwise OR of all bits in
expr. The calculation is
performed with 64-bit (BIGINT) precision.
This function returns 0 if there were no
matching rows.
BIT_XOR(expr)
Returns the bitwise XOR of all bits in
expr. The calculation is
performed with 64-bit (BIGINT) precision.
This function returns 0 if there were no
matching rows.
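The three bitwise aggregates can be compared on a tiny data set. This is a sketch; bits_demo is a hypothetical table created only for the illustration.

```sql
CREATE TABLE bits_demo (flags INT UNSIGNED);
INSERT INTO bits_demo VALUES (5), (3);   -- binary 101 and 011

SELECT BIT_AND(flags), BIT_OR(flags), BIT_XOR(flags) FROM bits_demo;
-- BIT_AND: 101 & 011 = 001 -> 1
-- BIT_OR : 101 | 011 = 111 -> 7
-- BIT_XOR: 101 ^ 011 = 110 -> 6

SELECT BIT_AND(flags), BIT_OR(flags) FROM bits_demo WHERE flags > 100;
-- no matching rows: BIT_AND returns 18446744073709551615, BIT_OR returns 0
```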
COUNT(expr)
Returns a count of the number of non-NULL
values of expr in the rows
retrieved by a SELECT statement. The
result is a BIGINT value.
COUNT() returns 0 if
there were no matching rows.
mysql> SELECT student.student_name,COUNT(*)
-> FROM student,course
-> WHERE student.student_id=course.student_id
-> GROUP BY student_name;
COUNT(*) is somewhat different in that it
returns a count of the number of rows retrieved, whether or
not they contain NULL values.
COUNT(*) is optimized to return very
quickly if the SELECT retrieves from one
table, no other columns are retrieved, and there is no
WHERE clause. For example:
mysql> SELECT COUNT(*) FROM student;
This optimization applies only to MyISAM
tables, because an exact row count is stored for this
storage engine and can be accessed very quickly. For
transactional storage engines such as
InnoDB and BDB,
storing an exact row count is more problematic because
multiple transactions may be occurring, each of which may
affect the count.
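The difference between COUNT(*) and COUNT(expr) shows up as soon as a column contains NULL. A minimal sketch, using a hypothetical table count_demo:

```sql
CREATE TABLE count_demo (score INT);
INSERT INTO count_demo VALUES (10), (NULL), (30);

SELECT COUNT(*), COUNT(score) FROM count_demo;
-- COUNT(*) returns 3 (all rows); COUNT(score) returns 2 (the NULL is skipped)
```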
COUNT(DISTINCT expr,[expr...])
Returns a count of the number of different
non-NULL values.
COUNT(DISTINCT) returns
0 if there were no matching rows.
mysql> SELECT COUNT(DISTINCT results) FROM student;
In MySQL, you can obtain the number of distinct expression
combinations that do not contain NULL by
giving a list of expressions. In standard SQL, you would
have to do a concatenation of all expressions inside
COUNT(DISTINCT ...).
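A short sketch of the multiple-expression form, using a hypothetical table pair_demo:

```sql
CREATE TABLE pair_demo (a INT, b INT);
INSERT INTO pair_demo VALUES (1,1), (1,1), (1,2), (NULL,2);

SELECT COUNT(DISTINCT a, b) FROM pair_demo;
-- returns 2: the combinations (1,1) and (1,2); (NULL,2) contains NULL and is ignored
```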
GROUP_CONCAT(expr)
This function returns a string result with the concatenated
non-NULL values from a group. It returns
NULL if there are no
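non-NULL values. The full syntax is as
follows:
GROUP_CONCAT([DISTINCT] expr [,expr ...]
             [ORDER BY {unsigned_integer | col_name | expr}
                 [ASC | DESC] [,col_name ...]]
             [SEPARATOR str_val])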
mysql> SELECT student_name,
-> GROUP_CONCAT(test_score)
-> FROM student
-> GROUP BY student_name;
Or:
mysql> SELECT student_name,
-> GROUP_CONCAT(DISTINCT test_score
-> ORDER BY test_score DESC SEPARATOR ' ')
-> FROM student
-> GROUP BY student_name;
In MySQL, you can get the concatenated values of expression
combinations. You can eliminate duplicate values by using
DISTINCT. To sort the values in
the result, use the ORDER BY
clause. To sort in reverse order, add the
DESC (descending) keyword to the name of
the column you are sorting by in the ORDER
BY clause. The default is ascending order; this
may be specified explicitly using the ASC
keyword. SEPARATOR is followed by the
string value that should be inserted between the values in
the result. The default is a comma
(‘,’). You can eliminate the
separator altogether by specifying SEPARATOR
''.
The result is truncated to the maximum length that is given
by the group_concat_max_len system
variable, which has a default value of 1024. The value can
be set higher, although the maximum effective length of the
return value is constrained by the value of
max_allowed_packet. The syntax to change
the value of group_concat_max_len at
runtime is as follows, where val
is an unsigned integer:
SET [SESSION | GLOBAL] group_concat_max_len = val;
Beginning with MySQL 5.0.19, the type returned by
GROUP_CONCAT() is always
VARCHAR unless
group_concat_max_len is greater than 512,
in which case, it returns a BLOB.
(Previously, it returned a BLOB with
group_concat_max_len greater than 512
only if the query included an ORDER BY
clause.)
MIN([DISTINCT] expr), MAX([DISTINCT] expr)
Returns the minimum or maximum value of
expr. MIN()
and MAX() may take a string argument; in
such cases they return the minimum or maximum string value.
See Section 7.4.5, “How MySQL Uses Indexes”. The
DISTINCT keyword can be used to find the
minimum or maximum of the distinct values of
expr; however, this produces the
same result as omitting DISTINCT.
MIN() and MAX() return
NULL if there were no matching rows.
mysql> SELECT student_name, MIN(test_score), MAX(test_score)
-> FROM student
-> GROUP BY student_name;
For MIN(), MAX(), and
other aggregate functions, MySQL currently compares
ENUM and SET columns
by their string value rather than by the string's relative
position in the set. This differs from how ORDER
BY compares them. This is expected to be rectified
in a future MySQL release.
STD(expr), STDDEV(expr)
Returns the population standard deviation of
expr. This is an extension to
standard SQL. The STDDEV() form of this
function is provided for compatibility with Oracle. As of
MySQL 5.0.3, the standard SQL function
STDDEV_POP() can be used instead.
These functions return NULL if there were
no matching rows.
STDDEV_POP(expr)
Returns the population standard deviation of
expr (the square root of
VAR_POP()). This function was added in
MySQL 5.0.3. Before 5.0.3, you can use
STD() or STDDEV(),
which are equivalent but not standard SQL.
STDDEV_POP() returns
NULL if there were no matching rows.
STDDEV_SAMP(expr)
Returns the sample standard deviation of
expr (the square root of
VAR_SAMP()). This function was added in
MySQL 5.0.3.
STDDEV_SAMP() returns
NULL if there were no matching rows.
SUM([DISTINCT] expr)
Returns the sum of expr. The DISTINCT
keyword can be used in MySQL 5.0 to sum only
the distinct values of expr.
SUM() returns NULL if
there were no matching rows.
VAR_POP(expr)
Returns the population standard variance of
expr. It considers rows as the
whole population, not as a sample, so it has the number of
rows as the denominator. This function was added in MySQL
5.0.3. Before 5.0.3, you can use
VARIANCE(), which is equivalent but is
not standard SQL.
VAR_POP() returns NULL
if there were no matching rows.
VAR_SAMP(expr)
Returns the sample variance of
expr. That is, the denominator is
the number of rows minus one. This function was added in
MySQL 5.0.3.
VAR_SAMP() returns
NULL if there were no matching rows.
VARIANCE(expr)
Returns the population standard variance of
expr. This is an extension to
standard SQL. As of MySQL 5.0.3, the standard SQL function
VAR_POP() can be used instead.
VARIANCE() returns
NULL if there were no matching rows.
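The population and sample forms differ only in the denominator, which a small worked example makes concrete. This is a sketch assuming MySQL 5.0.3 or later; var_demo is a hypothetical table, and the comments give the mathematical values rather than the exact display format.

```sql
CREATE TABLE var_demo (x INT);
INSERT INTO var_demo VALUES (2),(4),(4),(4),(5),(5),(7),(9);

SELECT VAR_POP(x), STDDEV_POP(x), VAR_SAMP(x), STDDEV_SAMP(x) FROM var_demo;
-- mean = 40/8 = 5, sum of squared deviations = 32
-- VAR_POP     = 32/8 = 4
-- STDDEV_POP  = sqrt(4) = 2
-- VAR_SAMP    = 32/7, approximately 4.571
-- STDDEV_SAMP = sqrt(32/7), approximately 2.138
```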
12.11.2. GROUP BY Modifiers
The GROUP BY clause allows a WITH
ROLLUP modifier that causes extra rows to be added to
the summary output. These rows represent higher-level (or
super-aggregate) summary operations. ROLLUP
thus allows you to answer questions at multiple levels of
analysis with a single query. It can be used, for example, to
provide support for OLAP (Online Analytical Processing)
operations.
Suppose that a table named sales has
year, country,
product, and profit
columns for recording sales profitability:
CREATE TABLE sales
(
year INT NOT NULL,
country VARCHAR(20) NOT NULL,
product VARCHAR(32) NOT NULL,
profit INT
);
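To try the queries in this section, the table needs some rows. The following data is an assumption, not part of the original example: it uses one row per (year, country, product) combination, chosen so that the totals match the output shown in this section. The real table could hold several rows per combination.

```sql
INSERT INTO sales (year, country, product, profit) VALUES
  (2000, 'Finland', 'Computer',   1500),
  (2000, 'Finland', 'Phone',       100),
  (2000, 'India',   'Calculator',  150),
  (2000, 'India',   'Computer',   1200),
  (2000, 'USA',     'Calculator',   75),
  (2000, 'USA',     'Computer',   1500),
  (2001, 'Finland', 'Phone',        10),
  (2001, 'USA',     'Calculator',   50),
  (2001, 'USA',     'Computer',   2700),
  (2001, 'USA',     'TV',          250);
```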
The table's contents can be summarized per year with a simple
GROUP BY like this:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 |        4525 |
| 2001 |        3010 |
+------+-------------+
This output shows the total profit for each year, but if you
also want to determine the total profit summed over all years,
you must add up the individual values yourself or run an
additional query.
Or you can use ROLLUP, which provides both
levels of analysis with a single query. Adding a WITH
ROLLUP modifier to the GROUP BY
clause causes the query to produce another row that shows the
grand total over all year values:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year WITH ROLLUP;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 |        4525 |
| 2001 |        3010 |
| NULL |        7535 |
+------+-------------+
The grand total super-aggregate line is identified by the value
NULL in the year column.
ROLLUP has a more complex effect when there
are multiple GROUP BY columns. In this case,
each time there is a “break” (change in value) in
any but the last grouping column, the query produces an extra
super-aggregate summary row.
For example, without ROLLUP, a summary on the
sales table based on year,
country, and product might
look like this:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product;
+------+---------+------------+-------------+
| year | country | product    | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer   |        1500 |
| 2000 | Finland | Phone      |         100 |
| 2000 | India   | Calculator |         150 |
| 2000 | India   | Computer   |        1200 |
| 2000 | USA     | Calculator |          75 |
| 2000 | USA     | Computer   |        1500 |
| 2001 | Finland | Phone      |          10 |
| 2001 | USA     | Calculator |          50 |
| 2001 | USA     | Computer   |        2700 |
| 2001 | USA     | TV         |         250 |
+------+---------+------------+-------------+
The output indicates summary values only at the
year/country/product level of analysis. When
ROLLUP is added, the query produces several
extra rows:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+
| year | country | product    | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer   |        1500 |
| 2000 | Finland | Phone      |         100 |
| 2000 | Finland | NULL       |        1600 |
| 2000 | India   | Calculator |         150 |
| 2000 | India   | Computer   |        1200 |
| 2000 | India   | NULL       |        1350 |
| 2000 | USA     | Calculator |          75 |
| 2000 | USA     | Computer   |        1500 |
| 2000 | USA     | NULL       |        1575 |
| 2000 | NULL    | NULL       |        4525 |
| 2001 | Finland | Phone      |          10 |
| 2001 | Finland | NULL       |          10 |
| 2001 | USA     | Calculator |          50 |
| 2001 | USA     | Computer   |        2700 |
| 2001 | USA     | TV         |         250 |
| 2001 | USA     | NULL       |        3000 |
| 2001 | NULL    | NULL       |        3010 |
| NULL | NULL    | NULL       |        7535 |
+------+---------+------------+-------------+
For this query, adding ROLLUP causes the
output to include summary information at four levels of
analysis, not just one. Here's how to interpret the
ROLLUP output:
Following each set of product rows for a given year and
country, an extra summary row is produced showing the total
for all products. These rows have the
product column set to
NULL.
Following each set of rows for a given year, an extra
summary row is produced showing the total for all countries
and products. These rows have the country
and product columns set to
NULL.
Finally, following all other rows, an extra summary row is
produced showing the grand total for all years, countries,
and products. This row has the year,
country, and product
columns set to NULL.
Other Considerations When Using ROLLUP
The following items list some behaviors specific to the MySQL
implementation of ROLLUP:
When you use ROLLUP, you cannot also use an
ORDER BY clause to sort the results. In other
words, ROLLUP and ORDER BY
are mutually exclusive. However, you still have some control
over sort order. GROUP BY in MySQL sorts
results, and you can use explicit ASC and
DESC keywords with columns named in the
GROUP BY list to specify sort order for
individual columns. (The higher-level summary rows added by
ROLLUP still appear after the rows from which
they are calculated, regardless of the sort order.)
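For example, a descending sort on the grouping column can be requested in the GROUP BY list itself. A short sketch using the sales table from this section, with the totals taken from the output shown earlier:

```sql
SELECT year, SUM(profit) FROM sales GROUP BY year DESC WITH ROLLUP;
-- Rows come back as 2001 (3010), then 2000 (4525), then the
-- NULL grand-total row (7535), which still appears last.
```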
LIMIT can be used to restrict the number of
rows returned to the client. LIMIT is applied
after ROLLUP, so the limit applies against
the extra rows added by ROLLUP. For example:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product WITH ROLLUP
-> LIMIT 5;
+------+---------+------------+-------------+
| year | country | product    | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer   |        1500 |
| 2000 | Finland | Phone      |         100 |
| 2000 | Finland | NULL       |        1600 |
| 2000 | India   | Calculator |         150 |
| 2000 | India   | Computer   |        1200 |
+------+---------+------------+-------------+
Using LIMIT with ROLLUP
may produce results that are more difficult to interpret,
because you have less context for understanding the
super-aggregate rows.
The NULL indicators in each super-aggregate
row are produced when the row is sent to the client. The server
looks at the columns named in the GROUP BY
clause following the leftmost one that has changed value. For
any column in the result set with a name that is a lexical match
to any of those names, its value is set to
NULL. (If you specify grouping columns by
column number, the server identifies which columns to set to
NULL by number.)
Because the NULL values in the
super-aggregate rows are placed into the result set at such a
late stage in query processing, you cannot test them as
NULL values within the query itself. For
example, you cannot add HAVING product IS
NULL to the query to eliminate from the output all but
the super-aggregate rows.
On the other hand, the NULL values do appear
as NULL on the client side and can be tested
as such using any MySQL client programming interface.
12.11.3. GROUP BY and HAVING with Hidden Fields
MySQL extends the use of GROUP BY so that you
can use non-aggregated columns or calculations in the
SELECT list that do not appear in the
GROUP BY clause. You can use this feature to
get better performance by avoiding unnecessary column sorting
and grouping. For example, you do not need to group on
customer.name in the following query:
SELECT `order`.custid, customer.name, MAX(payments)
FROM `order`, customer
WHERE `order`.custid = customer.custid
GROUP BY `order`.custid;
In standard SQL, you would have to add
customer.name to the GROUP
BY clause. In MySQL, the name is redundant.
Do not use this feature if the columns you
omit from the GROUP BY part are not constant
in the group. The server is free to return any value from the
group, so the results are indeterminate unless all values are
the same.
A similar MySQL extension applies to the
HAVING clause. The SQL standard does not
allow the HAVING clause to name any column
that is not found in the GROUP BY clause if
it is not enclosed in an aggregate function. MySQL allows the
use of such columns to simplify calculations. This extension
assumes that the non-grouped columns will have the same
group-wise values. Otherwise, the result is indeterminate.
If the ONLY_FULL_GROUP_BY SQL mode is
enabled, the MySQL extension to GROUP BY does
not apply. That is, columns not named in the GROUP
BY clause cannot be used in the
SELECT list or HAVING
clause if not used in an aggregate function.
The select list extension also applies to ORDER
BY. That is, you can use non-aggregated columns or
calculations in the ORDER BY clause that do
not appear in the GROUP BY clause. This
extension does not apply if the
ONLY_FULL_GROUP_BY SQL mode is enabled.
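With ONLY_FULL_GROUP_BY enabled, a query in this style has to name every non-aggregated select-list column in GROUP BY. The following sketch reuses the table and column names from the example earlier in this section; order is quoted with backticks because it is a reserved word.

```sql
-- Note: assigning sql_mode replaces any modes that were already set.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- Standard-conforming form: customer.name is listed in GROUP BY.
SELECT `order`.custid, customer.name, MAX(payments)
FROM `order`, customer
WHERE `order`.custid = customer.custid
GROUP BY `order`.custid, customer.name;
```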
In some cases, you can use MIN() and
MAX() to obtain a specific column value even
if it isn't unique. The following gives the value of
column from the row containing the smallest
value in the sort column:
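One way to write such an expression is sketched below. The names sort_col, val_col, and tbl_name are placeholders, and the width 6 assumes that no sort_col value is longer than six characters. The sort value is padded to a fixed width and placed in front of the other column, MIN() picks the smallest combined string, and SUBSTR() strips the padded prefix again.

```sql
-- sort_col, val_col and tbl_name are hypothetical names; 6 is an assumed
-- maximum length for sort_col values.
SELECT SUBSTR(MIN(CONCAT(RPAD(sort_col, 6, ' '), val_col)), 7) AS val_of_min
FROM tbl_name;
```

The comparison is made on strings, so numeric sort values would need zero-padding on the left (for example with LPAD()) to compare in numeric order.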
Note that if you are trying to follow standard SQL, you can't
use expressions in GROUP BY clauses. You can
work around this limitation by using an alias for the
expression:
SELECT id,FLOOR(value/100) AS val
FROM tbl_name
GROUP BY id, val;
MySQL does allow expressions in GROUP BY
clauses. For example:
SELECT id,FLOOR(value/100)
FROM tbl_name
GROUP BY id, FLOOR(value/100);