determines the suggested default collation? If you are performing a fresh install, that is, SQL Server has never been installed previously on this computer, the most appropriate collation based on the Windows regional settings will be suggested.
Figure 7.1 Selecting a Default Server Collation during Installation
A collation defines the bit pattern that represents each character in the data set. Usually, the collation name starts with the language or character set, for example Latin1_, Thai_100_, or Arabic_100_. The collation also defines the following rules regarding data comparison and sorting:
■ Case sensitivity This option defines whether the comparison or sort is case sensitive. For example, in a case-sensitive comparison ‘Banana’ = ‘banana’ will return false. When sorted, ‘banana’ will always come before ‘Banana’.
■ Accent sensitivity This option defines whether the comparison is accent sensitive. For example, in an accent-sensitive comparison ‘Valentine’ will not be equal to ‘Válentine’.
■ Kanatype sensitivity This option defines whether the comparison is sensitive to the type of Japanese kana characters used. Two types of kana characters are available: Hiragana and Katakana. When a comparison is kana-insensitive, SQL Server treats equivalent Hiragana and Katakana characters as equal when comparing and sorting.
■ Width sensitivity This option defines whether the comparison distinguishes a character represented as a single byte (half-width) from the same character represented as a double byte (full-width). In a width-insensitive comparison, the two representations are treated as equal.
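The sensitivity options above can be checked directly by applying a COLLATE clause to a comparison. The following is a minimal sketch, not taken from the original text: the literals are illustrative, and it assumes the standard Latin1_General Windows collations are available on the instance.

```sql
-- Case sensitivity: the same two literals compared under a
-- case-sensitive (CS) and a case-insensitive (CI) collation.
SELECT
    CASE WHEN N'Banana' = N'banana' COLLATE Latin1_General_CS_AS
         THEN 'equal' ELSE 'not equal' END AS CaseSensitive,    -- not equal
    CASE WHEN N'Banana' = N'banana' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'not equal' END AS CaseInsensitive;  -- equal

-- Accent sensitivity: the same two literals compared under an
-- accent-sensitive (AS) and an accent-insensitive (AI) collation.
SELECT
    CASE WHEN N'Valentine' = N'Válentine' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'not equal' END AS AccentSensitive,   -- not equal
    CASE WHEN N'Valentine' = N'Válentine' COLLATE Latin1_General_CI_AI
         THEN 'equal' ELSE 'not equal' END AS AccentInsensitive; -- equal
```

The explicit COLLATE clause on one operand takes precedence over the default collation of the literals, so each comparison is evaluated under the collation named in that expression.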
The suffixes _CS, _AS, _KS, and _WS are used in collation names to show that the collation is case, accent, kanatype, or width sensitive. The suffixes _CI and _AI mark case-insensitive and accent-insensitive collations. Kanatype and width insensitivity have no suffix of their own in collation names: unless _KS or _WS is explicitly specified, kanatype insensitivity and width insensitivity are assumed. For example, the collation Latin1_General_CI_AS_KS_WS is case-insensitive, accent-sensitive, kanatype-sensitive, and width-sensitive. As another example, the collation SQL_Latin1_General_CP1_CS_AS is case-sensitive, accent-sensitive, kanatype-insensitive, and width-insensitive. Figure 7.2 shows the collation options when specifying a server collation.
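To see how SQL Server itself decodes these suffixes, you can look up the two example collations in the fn_helpcollations() system function, which returns a name and a textual description for each supported collation. This is a small sketch that assumes both collations are installed on your instance.

```sql
-- Return the description SQL Server stores for each example collation.
-- The description column spells out the sensitivity options encoded
-- in the name suffixes.
SELECT name, description
FROM fn_helpcollations()
WHERE name IN (N'Latin1_General_CI_AS_KS_WS',
               N'SQL_Latin1_General_CP1_CS_AS');
```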
Figure 7.2 Customizing Collation Options
We can see from Figure 7.2 that we can choose either a Windows collation or a SQL collation. The Windows collation is based on the language and rules specified by the Windows locale. This is defined in the Regional Settings Control Panel applet. One key attribute of Windows collations is that Unicode and non-Unicode data will be sorted the same way for the language specified. This avoids inconsistencies in query results, specifically if you are sorting the same data in varchar and nvarchar type columns. You may receive different results here if you are not using a Windows collation.
Exam Warning
Always use the default Windows collation, unless you have a very specific reason not to. For example, if you need to store non-Unicode data in a single language that is different from the language of your computer, you will have to select a collation for that language. Another exception is when you need to run distributed queries or replicate with an instance of SQL Server that has a different collation. You should maintain consistency between multiple interoperating systems as far as possible.
SQL Server Collations
SQL Server collations provide comparing and sorting compatibility with earlier versions of SQL Server, specifically SQL Server 6.5 and SQL Server 7.0. If you are intending to interoperate with these versions of SQL Server, you need to pick an appropriate SQL Server collation. SQL Server collations have differing rules for Unicode and non-Unicode data. This means that if you are sorting the same data in varchar and nvarchar type columns, you are likely to receive different results. Additionally, the sort order of non-Unicode data in the context of a SQL collation is likely to be different from the sort order of non-Unicode data performed by the Windows operating system. Names of SQL Server collations are prefixed with SQL_.
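The difference between the Unicode and non-Unicode rules of a SQL collation can be explored with a small experiment. The sketch below is illustrative only: the table variable, its column names, and the sample values are hypothetical, and the hyphen is chosen because punctuation is a common place where the two rule sets diverge. Whether the two queries return different orders should be verified on your own instance.

```sql
-- Hypothetical sample data: the same strings stored as varchar and nvarchar,
-- both under the same SQL collation.
DECLARE @Samples TABLE (
    AsVarchar  varchar(10)  COLLATE SQL_Latin1_General_CP1_CI_AS,
    AsNvarchar nvarchar(10) COLLATE SQL_Latin1_General_CP1_CI_AS
);

INSERT INTO @Samples (AsVarchar, AsNvarchar)
VALUES ('a-c', N'a-c'),
       ('ab',  N'ab');

-- Sorted using the collation's non-Unicode rules.
SELECT AsVarchar FROM @Samples ORDER BY AsVarchar;

-- Sorted using the collation's Unicode rules; compare the row order
-- with the previous result.
SELECT AsNvarchar FROM @Samples ORDER BY AsNvarchar;
```

Repeating the experiment with a Windows collation such as Latin1_General_CI_AS on both columns is a simple way to confirm that varchar and nvarchar data then sort consistently.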
Binary Collations
Binary collations are collations that sort and compare data based on the binary values representing each character. Binary collations are intrinsically case sensitive. Performing binary comparison and sort operations is simpler than performing the same operation using a nonbinary collation. This means that you can use binary collations to improve the performance of your queries if they suit your requirements. Binary collations have a suffix of _BIN or _BIN2. The _BIN2 binary collation is known as binary-code point collation. This type of collation is newer and should be used whenever a binary collation is required. The _BIN collation should be used for backwards compatibility only.
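The effect of a binary collation is easy to observe with an expression-level COLLATE clause. The following minimal sketch uses illustrative literals and assumes the Latin1_General_BIN2 collation is available; the derived table name Words is hypothetical.

```sql
-- A binary comparison is always case sensitive, because 'B' and 'b'
-- have different code points.
SELECT
    CASE WHEN N'Banana' = N'banana' COLLATE Latin1_General_BIN2
         THEN 'equal' ELSE 'not equal' END AS BinaryComparison;  -- not equal

-- A binary sort orders by code point, so uppercase Latin letters
-- sort before all lowercase ones: Banana, apple, banana.
SELECT Word
FROM (VALUES (N'banana'), (N'Banana'), (N'apple')) AS Words(Word)
ORDER BY Word COLLATE Latin1_General_BIN2;
```

Note that the binary order places ‘Banana’ before ‘apple’, which is rarely what a user expects from an alphabetical list. This is why binary collations suit identifiers and codes better than display text.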
New & Noteworthy…
New Collation Features of SQL Server 2008
SQL Server 2008 introduces 80 new collations that are aligned with the new Windows Server 2008 language features. Windows Server 2008 is the latest server operating system from Microsoft, and it includes many enhancements to support multiple languages. Most of the new collations in SQL Server 2008 provide more accurate ways of sorting data for specific cultures. Specifically, the new features include the following:
■ Unicode 5.0 case tables
■ Weighting for some previously nonweighted characters
■ Chinese minority scripts
■ Support for linguistically correct surrogates
■ Support for new East Asian government standards
Additionally, some collations have been deprecated, such as Macedonian_CI_AS, SQL_ALTDiction_CP1253_CS_AS, Hindi_CI_AS, and several others. These collations are still supported, but only for backwards compatibility purposes. These collations are not displayed in the list of available collations when you install SQL Server 2008, nor are they returned by the ::fn_helpcollations() system function.
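You can list the collations available on an instance with the fn_helpcollations() function mentioned above. The sketch below filters on the _100_ naming pattern, on the assumption that the collations introduced with SQL Server 2008 carry that version marker in their names. Because the function returns one row per combination of sensitivity options, the row count will not match the number of new collation families.

```sql
-- List collations whose names contain the _100_ version marker.
-- The underscore is a LIKE wildcard, so it is escaped with [_].
SELECT name, description
FROM fn_helpcollations()
WHERE name LIKE N'%[_]100[_]%'
ORDER BY name;
```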
Using Collations
You have already learned that when SQL Server 2008 is installed, you are required to select a server-level default collation. As well as the server-level default, a collation can be specified at the following levels:
■ Database-level collation You can specify a collation for each individual database. If you don’t explicitly specify a database-level collation, the server-level default collation will be used. The database collation is used as a default collation for new columns in the database tables. Additionally, the database collation is used when running queries against the database.
■ Column-level collation You can specify a collation for columns of type char, varchar, text, nchar, nvarchar, and ntext. If you don’t explicitly specify a column-level collation, the database-level default collation will be used. Example 7.1 demonstrates the use of column-level collations.
■ Expression-level collation Sometimes you wish to use a collation different from the database default in your query when sorting or comparing data. You can specify the collation to be used using the COLLATE clause.
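The collation in effect at each of these levels can be inspected with built-in metadata functions and catalog views. The following is a minimal sketch that runs against whichever database is current; the column aliases and the derived table name Words are illustrative.

```sql
-- Server-level default collation.
SELECT SERVERPROPERTY('Collation') AS ServerCollation;

-- Collation of the current database.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation;

-- Collation of every character column in the current database.
SELECT OBJECT_NAME(object_id) AS TableName,
       name                   AS ColumnName,
       collation_name
FROM sys.columns
WHERE collation_name IS NOT NULL;

-- Expression-level collation: this comparison is case-insensitive
-- regardless of the database default, so both rows are returned.
SELECT Word
FROM (VALUES (N'banana'), (N'Banana')) AS Words(Word)
WHERE Word = N'BANANA' COLLATE Latin1_General_CI_AS;
```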
Database-level collations and column-level collations are specified using the COLLATE clause for the CREATE DATABASE and CREATE TABLE statements. You can also select a collation when creating a new database or table using SQL Server Management Studio. Expression-level collation is specified using the COLLATE clause placed after the expression it applies to, which is often at the end of the statement. Example 7.1 demonstrates the creation of a new database, table, and column as well as the effect of collation on query results.
Example 7.1 Demonstrating Database-Level and Column-Level Collations
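-- Create a database whose default collation is case-sensitive and accent-sensitive
CREATE DATABASE ExampleDB2
COLLATE Latin1_General_CS_AS
GO
USE ExampleDB2
GO
-- MemberName overrides the database default with a
-- case-insensitive, accent-insensitive collation
CREATE TABLE TeamMembers
(MemberID int PRIMARY KEY IDENTITY,
MemberName nvarchar(50) COLLATE Latin1_General_CI_AI)
GO
INSERT INTO TeamMembers(MemberName)
VALUES
(N'Valentine'),
(N'Peter'),
(N'Matthéw'),
(N'valentine'),
(N'Matthew')
GO
-- The sort uses the column-level collation, so differences in
-- case and accent do not affect the order
SELECT * FROM TeamMembers ORDER BY MemberName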
Results:
MemberID MemberName
-------- ----------
3 Matthéw
5 Matthew