PWB 518: Build International Applications With PowerBuilder 10
Jin-You Zhu, Sr. Software Engineer, jyzhu@sybase.com
August 15-19, 2004
The Enterprise. Unwired. Industry and Cross Platform Solutions Manage Information Unwire Information Unwire People • Adaptive Server Enterprise • Adaptive Server Anywhere • Sybase IQ • Dynamic Archive • Dynamic ODS • Replication Server • OpenSwitch • Mirror Activator • PowerDesigner • Connectivity Options • EAServer • Industry Warehouse Studio • Unwired Accelerator • Unwired Orchestrator • Unwired Toolkit • Enterprise Portal • Real Time Data Services • SQL Anywhere Studio • M-Business Anywhere • Pylon Family (Mobile Email) • Mobile Sales • XcelleNet Frontline Solutions • PocketBuilder • PowerBuilder Family • AvantGo Sybase Workspace
What is Unicode?
• Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.
• 2 Unicode character sets
  • UCS-2: uses one 16-bit unit (2 bytes) to represent a character (up to 65,536 code values).
  • UCS-4: uses one 32-bit unit (4 bytes) to represent a character. UCS-4 is a superset of UCS-2 and includes more characters.
• 3 popular Unicode Transformation Formats (UTF)
  • UTF-8: uses 1 to 4 bytes to represent one Unicode character. ASCII characters are encoded exactly as in ASCII. UCS-2 characters need 1 to 3 bytes; UCS-4 characters need 1 to 4 bytes.
  • UTF-16: uses one 16-bit unit for a UCS-2 character, or two 16-bit units (a surrogate pair) for a UCS-4 character outside the UCS-2 range.
  • UTF-32: uses one 32-bit unit to represent one Unicode character.
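The size rules above can be checked from PowerScript itself. The following is a minimal sketch, assuming PowerBuilder 10's Blob() function with an encoding argument (covered later in this deck) and assuming that Len() on a Blob reports its size in bytes. The sample characters are illustrative. The UTF-8 rules predict 1, 2, and 3 bytes for them; whether Blob() adds anything beyond the encoded characters is not stated in these slides, so treat the displayed numbers as something to verify rather than a guarantee.

```powerscript
// Sketch: compare the encoded size of a few sample characters.
// Assumption: Len() on a Blob returns the number of bytes it holds.
String  ls_samples[] = {"A", "é", "中"}   // illustrative characters only
String  ls_report
Integer li_i

FOR li_i = 1 TO UpperBound(ls_samples)
   ls_report += ls_samples[li_i] &
      + ": UTF-8 = "  + String(Len(Blob(ls_samples[li_i], EncodingUTF8!))) &
      + " bytes, UTF-16 = " + String(Len(Blob(ls_samples[li_i], EncodingUTF16LE!))) &
      + " bytes~r~n"
NEXT

MessageBox("Encoded sizes", ls_report)
```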
Why use Unicode?
• Unicode allows a program or website to be targeted at multiple platforms, languages, and countries.
• It defines codes for all characters used in all major languages today.
• It is able to encode multilingual text.
• Unicode is the official way to implement ISO/IEC 10646.
• It is being adopted by many of the industry leaders.
• It allows data transfer between different systems without data corruption.
Benefits & Pitfalls of Using Unicode
• Benefits
  • Unicode can handle text in any language or any combination of languages.
  • You can process and show characters from multiple languages in a single form.
  • One application can serve all languages.
  • Conversion is needed only for incoming and outgoing data, and it does not corrupt the data.
  • No data is lost when converting from any code page to Unicode.
  • It simplifies operations on text because there is no longer a need to keep track of which encoding scheme is being used.
• Disadvantages
  • Because one character in Unicode takes 2 bytes, text consumes more memory than in a single-byte code page.
PowerBuilder 10: Unicode Enabling
• PB10 uses Unicode internally. It can process and display Unicode characters, which enables multilanguage support in your applications.
• Database: supports DBCS and Unicode databases.
• PowerScript: PB10 has more PowerScript functions for processing Unicode strings and ANSI (DBCS) strings.
• PowerScript: manipulation of ANSI and Unicode files.
• DW/XmlDW: Select/Insert/Update of multilanguage data is supported.
• PBNI: two sets of interfaces are implemented. Users can choose the Unicode API or the ANSI API.
• ORCA: two sets of interfaces are implemented. Users can choose the Unicode API or the ANSI API.
• External functions: ANSI and Unicode parameters are supported.
• A migration tool has been developed to help solve migration issues.
PB10 Supports ANSI & Unicode Databases (1)
• ANSI/DBCS database
  • A database that uses ANSI (or a DBCS code page) as its character set, such as CP1252 for European languages, CP932 for Japanese, or CP936 for Simplified Chinese.
• Unicode database
  • A database whose character set is a Unicode format, such as UTF-8 or UTF-16.
  • All data in the database is in Unicode format, and any data saved to the database must be converted to Unicode implicitly or explicitly.
• Unicode column
  • A database that uses ANSI (or DBCS) as its character set may use special data types to store Unicode data: NCHAR and NVARCHAR / NVARCHAR2. Columns with these data types can store Unicode data. Any data saved into such a column must be converted to Unicode explicitly.
PB10 Supports ANSI & Unicode Databases (2)
• In PB10, most DB interfaces support ANSI and Unicode databases.
• (*) A patch is needed for EAServer 4.2.3/5.1.
DB interface: SYC & SYJ
• A new dbparm ("UTF8") is defined for SYC/SYJ.
• "UTF8" can be 1 or 0. The default value is 0.
• If it is set to 0, the DB driver converts the data to the client machine's locale, and the client then converts it to Unicode.
• If it is set to 1, the DB driver returns the data in Unicode for multilanguage support. In this case, the ASE server needs to be specially configured:
    sp_configure "enable unicode conversions", 2
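As a minimal sketch of how the UTF8 dbparm might be set in application code, the script below fills in the default transaction object before connecting. The server name, database name, and login are hypothetical placeholders; only the DBParm value "UTF8=1" comes from the slide above.

```powerscript
// Sketch: connect through the SYC interface with the UTF8 dbparm enabled.
// ServerName, Database, LogId and LogPass are hypothetical placeholders.
SQLCA.DBMS       = "SYC Adaptive Server Enterprise"
SQLCA.ServerName = "MYASE"
SQLCA.Database   = "mydb"
SQLCA.LogId      = "appuser"
SQLCA.LogPass    = "secret"
SQLCA.DBParm     = "UTF8=1"    // ask the driver to return data as Unicode

CONNECT USING SQLCA;

IF SQLCA.SQLCode <> 0 THEN
   MessageBox("Connect failed", SQLCA.SQLErrText)
END IF
```

Leaving DBParm without the UTF8 setting keeps the default behavior (UTF8=0), where data passes through the client machine's locale first.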
DB interface: ODBC
• For client/server applications: PB10 can consume data from ANSI and Unicode databases. No special setting is needed.
• For n-tier applications: if you use ODBC to connect to an ASA Unicode database through a connection cache, you need a special patch for EAServer 4.2.3/5.1, which adds a new connection cache type called "ODBCU". With this new connection handle, a PB component can access Unicode data from a Unicode ASA database.
DB interface: O90
• For client/server applications: PB10 can consume data from ANSI and Unicode databases via O90/O84. No special setting is needed.
• For n-tier applications: if you use O90 to connect to an Oracle 9 database through a connection cache, you need a special patch for EAServer 4.2.3/5.1, which adds a new connection cache type called "OCI_9U". With this new connection handle, a PB component can access Unicode data from Oracle.
DB interface: JDBC/OLE DB/ADO.NET
• PB10 can consume data from ANSI and Unicode databases. No special setting is needed.
DB interface: Informix Native
• PB10 can consume data from Informix ANSI databases. No special setting is needed.
• Note: Informix Unicode databases are not supported in PB10.
PowerScript
• Data types
• Functions to manipulate ANSI & Unicode strings
• Functions to process Unicode files
PowerScript Data Types
• String
  • A String is always a Unicode string. All data in a String is Unicode. There is no ANSI String any more.
  • Characters from multiple languages can be stored in one PB string.
• Blob
  • Blob remains a binary data type. It can store binary data, ANSI characters, or Unicode characters.
  • How?
Conversion between String and Blob in PowerScript
• Conversion from Blob to String: String ( blob, {Encoding} )
  • Converts a Blob to a String.
  • Encoding can be EncodingANSI!, EncodingUTF8!, EncodingUTF16LE!, or EncodingUTF16BE!. The default is EncodingUTF16LE!.
• Conversion from String to Blob: Blob ( string, {Encoding} )
  • Converts a String to a Blob.
  • Encoding can be EncodingANSI!, EncodingUTF8!, EncodingUTF16LE!, or EncodingUTF16BE!. The default is EncodingUTF16LE!.
• Other conversion functions
  • FromANSI()/ToANSI()/FromUnicode()/ToUnicode() are still supported in PB10 but are obsolete. We encourage users to move to the String/Blob functions.
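A minimal sketch of a round trip through the two conversion functions above, using the UTF-8 encoding. The variable names and sample text are illustrative.

```powerscript
// Sketch: String -> Blob (UTF-8) -> String, then compare with the original.
String ls_original, ls_roundtrip
Blob   lbl_utf8

ls_original  = "PowerBuilder 10"
lbl_utf8     = Blob(ls_original, EncodingUTF8!)    // encode the Unicode string as UTF-8 bytes
ls_roundtrip = String(lbl_utf8, EncodingUTF8!)     // decode with the same encoding

IF ls_roundtrip = ls_original THEN
   MessageBox("Round trip", "The text survived the conversion.")
ELSE
   MessageBox("Round trip", "The text changed during conversion.")
END IF
```

The important design point is that the same encoding is named on both sides; decoding UTF-8 bytes with the default EncodingUTF16LE! would misinterpret them.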
PowerScript Functions to Manipulate ANSI & Unicode Strings
• Len/Left/Mid/Right/…
  • These functions are Unicode character based.
• LenW/LeftW/MidW/RightW/…
  • All "W" functions are also Unicode character based.
  • They behave the same as Len/Left/Mid/Right.
• LenA/LeftA/MidA/RightA/…
  • A new set of "A" functions is added for string manipulation by byte. PB converts the PB String (Unicode) to DBCS (based on the machine's locale), then applies the operation.
• The migration tool helps identify and replace these functions.
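The difference between the character-based and byte-based functions can be seen with a short sketch. The sample string is illustrative. The value returned by LenA depends on the machine's locale, as the slide notes, so no specific byte count is claimed here; on a machine whose code page stores these characters in two bytes each, LenA would be expected to exceed Len.

```powerscript
// Sketch: character count versus byte count for the same string.
String ls_text
Long   ll_chars, ll_bytes

ls_text  = "日本語"           // illustrative sample text
ll_chars = Len(ls_text)       // counts Unicode characters
ll_bytes = LenA(ls_text)      // counts bytes after conversion to the machine's code page

MessageBox("Lengths", "Len = " + String(ll_chars) + ", LenA = " + String(ll_bytes))
```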
PowerScript Functions to Process Unicode Files
• File types
  • ANSI/DBCS files
  • Unicode (UTF-16/UTF-8) files (new)
  • Binary files
• File operation functions
  • FileEncoding(filename)
  • FileOpen(filename {,filemode {,fileaccess {,filelock {,writemode {,encoding}}}}})
    • Filemode: LineMode!, StreamMode!, TextMode!
  • FileRead/FileWrite: read/write in chunks of up to 32,765 bytes
  • FileReadEx/FileWriteEx: read/write a whole file
  • FileSeek/FileSeek64
• Encoding
  • EncodingANSI!, EncodingUTF8!, EncodingUTF16LE!, and EncodingUTF16BE!
• Conversion
  • On read/write, conversion takes place if needed.
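The functions listed above can be combined so that a script does not need to know a file's encoding in advance. This is a minimal sketch, assuming a hypothetical file name; it detects the encoding with FileEncoding, passes it as the sixth FileOpen argument, and reads the whole file with FileReadEx. The filelock and writemode values shown are the usual defaults, written out only so that the encoding lands in the right position.

```powerscript
// Sketch: detect a text file's encoding, then read the whole file.
String   ls_file = "EmployeeU.txt"    // hypothetical file name
String   ls_contents
Integer  li_file
Long     ll_read
Encoding le_enc

le_enc = FileEncoding(ls_file)
IF IsNull(le_enc) THEN
   MessageBox("Error", "Cannot determine the encoding of " + ls_file)
   RETURN
END IF

li_file = FileOpen(ls_file, TextMode!, Read!, LockReadWrite!, Append!, le_enc)
IF li_file = -1 THEN
   MessageBox("Error", "Cannot open " + ls_file)
   RETURN
END IF

ll_read = FileReadEx(li_file, ls_contents)   // conversion to Unicode happens on read
FileClose(li_file)
```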
PowerScript: Examples
• Read an ANSI file
    Integer li_FileNum
    String s_rec
    li_FileNum = FileOpen("Employee.txt")
    // or: li_FileNum = FileOpen("Employee.txt", TextMode!)
    FileRead(li_FileNum, s_rec)
• Read a Unicode file
    Integer li_FileNum
    String s_rec
    // Encoding is the sixth argument, so filelock and writemode are passed explicitly
    li_FileNum = FileOpen("EmployeeU.txt", TextMode!, Read!, LockReadWrite!, Append!, EncodingUTF16LE!)
    FileRead(li_FileNum, s_rec)
• Read a binary file
    Integer li_FileNum
    Blob bal_rec
    li_FileNum = FileOpen("Employee.imp", StreamMode!, Read!)
    FileRead(li_FileNum, bal_rec)
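The examples above only read files. For completeness, here is a minimal sketch of the write direction, assuming a hypothetical output file name and illustrative text. It opens the file for writing with UTF-8 encoding and writes the string in one call with FileWriteEx.

```powerscript
// Sketch: write a Unicode string to a UTF-8 text file.
String  ls_file = "EmployeeOut.txt"   // hypothetical file name
String  ls_text = "Zhu, Jin-You"      // illustrative content
Integer li_file

// Replace! overwrites any existing file; encoding is the sixth argument.
li_file = FileOpen(ls_file, TextMode!, Write!, LockWrite!, Replace!, EncodingUTF8!)
IF li_file = -1 THEN
   MessageBox("Error", "Cannot open " + ls_file + " for writing")
   RETURN
END IF

FileWriteEx(li_file, ls_text)   // the Unicode string is converted to UTF-8 on write
FileClose(li_file)
```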
DataWindow
• The DataWindow supports multilanguage display and manipulation in PB10.
• DW string-related functions are changed to be consistent with the PowerScript functions.
• DW file manipulation functions are extended to Unicode files as well.
JSP Authoring & Web Services
• JSP
  • The JSP authoring tool is also Unicode enabled in PB10.
  • Users can choose to save JSP files in different formats (Unicode, UTF-8, and ANSI are supported).
  • A JSP page can process ANSI and Unicode requests.
• Web Services
  • In PB10, a Web Services client can handle international characters.
XML DataWindow
• Tips
  • <%@ page contentType="text/html; charset=UTF-8" %>
  • request.setCharacterEncoding("UTF8");
• XML DataWindow
  • Processes ANSI & Unicode data.
PBNI & ORCA APIs
• PBNI
  • In PB10, PBNI offers 2 sets of APIs: one for ANSI, the other for Unicode. Users can develop PB extensions using an ANSI build or a Unicode build as they like.
  • PBNI provides templates for users.
• ORCA
  • PB10 offers 2 sets of APIs (ANSI & Unicode) for C functions that extract objects from a PBL or construct a PBL from object files.
External Function Call
• Purpose
  • You can define PB global or local functions that map to external function calls in system or third-party DLLs.
• Change of the syntax
  • In PB9 and before, the syntax is:
      FUNCTION int MessageBoxA(int handle, string content, string title, int showtype) LIBRARY "user32.dll"
  • In PB10:
      // use the ANSI version of the system function
      FUNCTION int MessageBox(int handle, string content, string title, int showtype) LIBRARY "user32.dll" ALIAS FOR "MessageBoxA;ansi"
      // use the Unicode version of the system function
      FUNCTION int MessageBox(int handle, string content, string title, int showtype) LIBRARY "user32.dll" ALIAS FOR "MessageBoxW"
• The migration tool helps identify and replace the external function calls in your existing application.
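To show the two PB10 forms side by side, the sketch below declares both versions under distinct local names and calls each one. The names ShowBoxA and ShowBoxW are hypothetical, chosen so that neither declaration clashes with the other or with the built-in MessageBox function; the parameter types (ulong for the window handle and style, long for the return value) are an assumption about how the 32-bit Win32 types map, not something stated in the slides.

```powerscript
// Local external function declarations (hypothetical local names).
FUNCTION long ShowBoxA(ulong hwnd, string content, string title, ulong showtype) &
   LIBRARY "user32.dll" ALIAS FOR "MessageBoxA;ansi"
FUNCTION long ShowBoxW(ulong hwnd, string content, string title, ulong showtype) &
   LIBRARY "user32.dll" ALIAS FOR "MessageBoxW"

// In a script of the same object:
// The ";ansi" keyword asks PowerBuilder to pass the strings as ANSI.
ShowBoxA(0, "Passed to the ANSI entry point.", "MessageBoxA", 0)
// Without ";ansi", the Unicode strings are passed as they are.
ShowBoxW(0, "Passed to the Unicode entry point.", "MessageBoxW", 0)
```

Text containing characters outside the machine's code page would be expected to display correctly only through the Unicode declaration.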
Summary
• PB10 supports multilanguage processing natively.
• PB10 supports both ANSI and Unicode databases.
• PowerScript can handle ANSI, Unicode, and binary files.
• DataWindow and XML DataWindow can process multilanguage data as well.
• JSP supports multilanguage editing and deployment.
• Through PBNI, users have the flexibility to develop ANSI extensions or Unicode extensions.
• PB10 can integrate both ANSI and Unicode DLLs.
Q & A ?