300 likes | 425 Views
Globalization. Objectives. After completing this lesson, you should be able to: Determine a correct database character set that meets your business requirements Obtain globalization support configuration information
E N D
Objectives • After completing this lesson, you should be able to: • Determine a correct database character set that meets your business requirements • Obtain globalization support configuration information • Customize language-dependent behavior for the database and individual sessions • Specify different linguistic sorts for queries • Retrieve data that matches a search string ignoring case or accent differences
Globalization Support Features • Language support • Territory support • Character set support • Linguistic sorting • Message support • Date and time formats • Numeric formats • Monetary formats French data Japanese data
What Every DBA Needs to Know • What is a character set? • How are character sets used? • Problems to avoid • Choosing your character set • Obtaining character set information • Specifying language-dependent behavior • Using linguistic searching and sorting • Using data conversion
What Is a Character Set? • The Oracle database supports different classes of character-encoding schemes: • Single-byte character sets • 7-bit • 8-bit • Multibyte character sets, including Unicode
Understanding Unicode AL32UTF8 AL16UTF16 c 63 0063 á C3 91 00E1 Supplementary characters t 74 0074 A89E EE AA 9E D834 DD1E F0 9D 84 9E d 0064 64 ö 00F6 C3 B6 0424 D0 A4 Encoding: Representing characters with byte sequences
How Are Character Sets Used? • Oracle Net compares the client NLS_LANG setting to the character set on the server. • If needed, conversion occurs automatically and transparently. NLS_LANG Oracle Net Client Server
Problems to Avoid • Example: • No conversion occurs, because it does not seem to be required. • Issue: Invalid data are entered into the database. NLS_LANG: AL32UTF8 Oracle Net Client Windows English Code page: WE8MSWIN1252 Server Database character set: AL32UTF8
Another Sample Problem CREATE DATABASE ... CHARACTER SET US7ASCII NATIONAL CHARACTER SET UTF8 ... % export NLS_LANG=‘SIMPLIFIED CHINESE_HONG KONG.ZHS16GBK’
Choosing Your Character Set • Trade-offs to consider • Choosing the correct character set that meets your business requirements now and in the future • Specifying the character set • Changing the character set after database creation
Obtaining Character Set Information SQL> SELECT parameter, value 2 FROM nls_database_parameters 3 WHERE parameter LIKE '%CHARACTERSET%'; PARAMETER VALUE ----------------------- ------------- NLS_CHARACTERSET WE8ISO8859P1 NLS_NCHAR_CHARACTERSET AL16UTF16 2 rows selected.
Specifying Language-Dependent Behavior Initialization parameters for the database server Environment variables for the clients ALTER SESSION command SQL function SELECT sysdate FROM dual;
Specifying Language-Dependent Behavior for the Session • Specify the locale behavior with the NLS_LANG environment variable: • Language • Territory • Character set • Set other NLS environment variables to: • Override database initialization parameter settings for all sessions • Customize the locale behavior • Change the default location of the NLS library files NLS_LANG=FRENCH_CANADA.WE8ISO8859P1
Specifying Language-Dependent Behavior • Using NLS parameters in SQL functions: ALTER SESSION SET NLS_DATE_FORMAT='DD.MM.YYYY'; DBMS_SESSION.SET_NLS('NLS_DATE_FORMAT', '''DD.MM.YYYY''') ; SELECT TO_CHAR(hire_date,'DD.Mon.YYYY', 'NLS_DATE_LANGUAGE=FRENCH') FROM employees WHERE hire_date > '01-JAN-2000';
Linguistic Searching and Sorting • Sort order can be affected by: • Case-sensitivity • Diacritics or accent characters • Combination of characters that is treated as a single character • Phonetics or character appearance • Cultural preferences
Linguistic Searching and Sorting • Three types of sorting: • Binary sorting • Sorted according to the binary values of the encoded characters • Monolingual linguistic sorting • A two-pass sort based on a character’s assigned major and minor values • Multilingual linguistic sorting • Based on the ISO standard (ISO 14651), and the Unicode 3.2 Standard for multilingual collation • Ordered by the number of strokes, PinYin, or radicals for Chinese characters
Using Linguistic Searching and Sorting • You can specify the type of sort used for character data with the: • NLS_SORT parameter • Default value derived from the NLS_LANG environment variable, if set • Can be specified for the session, client, or server • NLSSORT function • Defines the sorting method at the query level
Case-Insensitive and Accent-InsensitiveSearch and Sort • Specify the linguistic name: • Examples: • Specify the sort action for WHERE clauses and PL/SQL blocks: • Useful for migrated databases NLS_SORT = <NLS_sort_name>[_AI | _CI] NLS_SORT = FRENCH_M_AI NLS_SORT = XGERMAN_CI NLS_COMP = BINARY | ANSI
Support in SQL and Functions • The following SQL clauses support NLS_SORT and NLS_COMP settings: • WHERE • ORDER BY • START WITH • HAVING • IN/NOT IN • BETWEEN • CASE-WHEN • The NLSSORT() function supports the case-insensitive and accent-insensitive functionality.
Linguistic Index Support • Create an index on linguistically sorted values. • Rapidly query data without having to specify ORDER BY clause and NLSSORT: • Set the NLS_SORT parameter to match the linguistic definition that you want to use for the linguistic sort when creating the index. CREATE INDEX list_word ON list (NLSSORT(word, 'NLS_SORT=French_M')); SELECT word FROM list;
Customizing Linguistic Searching and Sorting • You can customize linguistic sorting for: • Ignorable characters • Contracting or expanding characters • Special combination letters or special letters • Expanding characters or special letters • Special uppercase and lowercase letters • Context-sensitive characters • Reverse secondary sorting • Canonical equivalence
Implicit Conversion Between CLOB and NCLOB • Transparent implicit conversion is supported in: • SQL IN and OUT bind variables for query and DML • PL/SQL functions and procedure parameter passing • PL/SQL variable assignment
NLS Data Conversion with Oracle Utilities • Multiple data conversions can take place when data is exported from one database and imported into another if the same character sets are not used. • External tables use the NLS settings on the server for determining the data character set. • SQL*Loader: • Conventional path: Data is converted into the session character set specified by NLS_LANG. • Direct path: Data is converted using client-side directives.
NLS Data Conversion with Data Pump • Data Pump Export always saves data in the same character set as the database from which the data originates. • Data Pump Import converts the data to the character set of the target database, if needed. • The Data Pump log file is written in the language specified by NLS_LANG for the session that started Data Pump.
Language and Character Set File Scanner (LCSSCAN) Character set
Setting the Database Time Zone • The current time zone in the database is determined by the following: • The SET TIME_ZONE clause of the CREATE DATABASE statement • The time zone of the operating system on the database server host • The time zone specified by the ALTER SESSION SETTIME_ZONE command CREATE DATABASE ... SET TIME_ZONE='-04:00';
Summary • In this lesson, you should have learned how to: • Determine a correct database character set that meets your business requirements • Obtain globalization support configuration information • Customize language-dependent behavior for the database and individual sessions • Specify different linguistic sorts for queries • Retrieve data that matches a search string ignoring case or accent differences
Practice 20 Overview: Using Globalization Support • This practice covers the following topics: • Determining the database character set • Setting the NLS_SORT variable