COLLATE
Clause in Various Parts of an SQL Query
COLLATE
Clause Precedence
BINARY
Operator
Every connection has connection character sets and connection collations, which may not be null. There are actually two connection character sets, which we will call ``connection/literals'' and ``connection/results'' when it is necessary to distinguish them.
Consider what a ``connection'' is: It's what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets, over the connection back to the client. This leads to several questions, such as: (a) what character set is the query in when it leaves the client? (b) what character set should the server translate a query to after receiving it? (c) what character set should the server translate to before shipping result sets or error messages back to the client? You can fine-tune the setting for these things, or you can depend on the defaults (in which case, you can skip this section).
There are two statements that affect the connection character sets:
SET NAMES character_set_name SET CHARACTER SET character_set_name
SET NAMES
indicates what is in the SQL statement that the client
sends. Thus, SET NAMES cp1251
tells the server ``future
incoming messages from this client will be in character set cp1251
''
and the server is free to translate to its own character set, if
appropriate.
SET CHARACTER SET
indicates what is in the SQL statement that
the client sends, and also what is in the result set that the
server sends back to the client. Thus, SET CHARACTER SET
includes
SET NAMES
, and also specifies what character set the column values
will have if, for example, you use a SELECT
statement.
EXAMPLE: Suppose that column1
is defined as CHAR(5)
CHARACTER SET latin2
. If you do not say SET CHARACTER SET
,
then for SELECT column1 FROM t
the server will send back all
the values for column1
using character set latin2
. If on
the other hand you say SET CHARACTER SET latin1
then the server
will, just before sending back, convert the latin2
values to
latin1
. Such conversion is slow and may be lossy.
When you execute SET NAMES
or SET CHARACTER SET
, you are also
changing the ``connection collation''. However, the connection
collation exists for consistency only. Usually its value doesn't
matter.
With the mysql
client, it is not necessary to
execute SET NAMES
every time you start up. You can add the
--default-character-set-name
option
setting to your mysql
statement line, or in your option file.
For example, the following option file setting will change the connection
character set each time you run mysql
:
[mysql] default-character-set-name=character_set_name