Making Existing SQL Databases Unicode-capable

DataFlex supports using Unicode data in your applications. For developers running prior revisions of DataFlex (19.1 and eariler) on SQL databases, and want to make their application Unicode-ready, some steps may need to be taken to make those databases able to appropriately handle Unicode data.

We outline below the basic actions to prepare a DataFlex database environment to work with Unicode. Note that once databases are set to be Unicode-capable and Unicode data is stored in them, these changes cannot be reversed without the risk of losing information.

Preparing for a Unicode-capable Environment

To migrate an existing SQL database to become Unicode capable, take the following steps:

Note: The DataFlex SQL drivers fully support both non-Unicode and Unicode databases. Migration is not a requirement. Leaving databases as non-Unicode capable just means it cannot store Unicode data.

Analyze and Migrate OEM Tables

Existing SQL databases created by DataFlex versions prior to DataFlex 2021 may be in either ANSI or OEM format. ANSI tables support Unicode data, however, OEM tables do not. OEM tables are only supported in DataFlex 2021 and higher for backward compatibility.

In order to make your database Unicode-capable, you will need to migrate any OEM tables to ANSI. Read how to handle OEM data in DataFlex so you can adjust your applications accordingly.

See Also

Table Character Format in DataFlex

Convert Databases to Unicode-capable

The process of converting databases to Unicode-capable will depend on the type of SQL database backend being used. Before going through details on the conversion, it will be interesting to review what different database may require to properly handle Unicode data.

SQL Databases and Unicode

The various available SQL database systems can handle Unicode data in different ways. In general, there are two main methods:

  1. Special Unicode Data Types - This is the case in SQL Server, for example. Non-Unicode data types, like char and varchar, cannot store Unicode data. The N data types, like nchar and nvarchar, are capable of storing Unicode data.
  2. Unicode Code Page/Character Set/Collating Settings - These settings can usually be configured to use a Unicode capable value (utf-8, utf-16) or a non-Unicode value (ANSI, OEM, Latin).

In MySQL, for example, a database can be created with character set latin1 or utf8mb4. In a latin1 database, char/varchar columns will be stored as latin1 (ANSI) and cannot store Unicode characters. In a utf8mb4 database, char/varchar columns will be stored as utf8 and can store Unicode data.

Unicode Support in SQL Database Systems

Each database system will have their way to work with Unicode data. Here is how some of the most popular databases enable the use of Unicode:

SQL Server

Special Unicode data types - the N types (nchar,nvarchar).

In SQL Server 2019, Unicode data can also be stored in char/varchar columns by setting the collating of the column to a utf8 collating.

MySQL

Character set setting

PostgreSQL

Encoding setting

DB2

Has both special Unicode types (Graphic types) and codeset setting.

The following links include an overview on setting up a database for the different database environments:

 

Making SQL Server Databases Unicode-capable

To make SQL Server databases Unicode-capable all char/varchar columns should be changed to nchar/nvarchar.

Columns of the text data type should be changed to nvarchar(max) since Microsoft already indicated that text/ntext data types will be removed in a future version of SQL Server.

Before changing the data type, make sure all OEM tables have been migrated to ANSI.

Once all tables are ANSI, you will have a few options on how to change char/varchar columns to nchar/nvarchar:

 

The following program illustrates how to convert all char/varchar columns to nchar/nvarchar for all tables in a workspace:

    Procedure ConvertAllTablesToUnicodeNTypes

        Handle hTable

        

        Repeat

            Get_Attribute DF_FILE_NEXT_USED of hTable to hTable

            If (hTable > 0 and (hTable <> 50)) Begin

                Send ConvertTableToUnicodeNTypes hTable

            End

        Until (hTable = 0)

        

    End_Procedure

    Procedure ConvertTableToUnicodeNTypes Handle hTable

    

        Integer iNumColumns iColumn iDFType iNativeType

        

        String sTableName

        String sLogicalName

        String sDriverName

        Boolean bTableOpenError

        String  sColumnName sNativeTypeName

        // trap open table error so it's not fatal

        Move False to Err

        Send Ignore_All to Error_Object_Id        

        Open hTable

        Move Err to bTableOpenError

        Send Trap_All of Error_Object_Id

        

        If (bTableOpenError = False) Begin

            Get_Attribute DF_FILE_ROOT_NAME of hTable to sTableName

            Get_Attribute DF_FILE_LOGICAL_NAME of hTable to sLogicalName

            Get_Attribute DF_FILE_DRIVER of hTable to sDriverName

        

            If (sDriverName = 'MSSQLDRV') Begin

                Structure_Start hTable

                

                Get_Attribute DF_FILE_NUMBER_FIELDS of hTable to iNumColumns

                For iColumn from 1 to iNumColumns

                    

                    Get_Attribute DF_FIELD_NAME of hTable iColumn to sColumnName

                    Get_Attribute DF_FIELD_TYPE of hTable iColumn to iDfType

                    

                    Get_Attribute DF_FIELD_NATIVE_TYPE of hTable iColumn to iNativeType

                    Get_Attribute DF_FIELD_NATIVE_TYPE_Name of hTable iColumn to sNativeTypeName

        

                    If (iDFType = DF_ASCII) Begin

                        If (iNativeType = SQL_CHAR) Begin

                            // Convert char to nchar     

                            Set_Attribute DF_FIELD_NATIVE_TYPE of hTable iColumn to SQL_WCHAR

                        End

                        

                        If (iNativeType = SQL_VARCHAR) Begin

                            // Convert varchar to nvarchar             

                            Set_Attribute DF_FIELD_NATIVE_TYPE of hTable iColumn to SQL_WVARCHAR

                        End

                    End

        

                    If (iDFType = DF_TEXT) Begin

                        If (iNativeType = SQL_LONGVARCHAR) Begin

                            // Convert text to ntext

                            Set_Attribute DF_FIELD_NATIVE_TYPE of hTable iColumn to SQL_WLONGVARCHAR

                        End

                        If (iNativeType = SQL_VARCHARMAX) Begin

                            // Convert varcharmax to nvarcharmax

                            Set_Attribute DF_FIELD_NATIVE_TYPE of hTable iColumn to SQL_WVARCHARMAX

                        End

                    End

                Loop

            End            

        End

    End_Procedure

 

Making Other Databases Unicode-capable

Database systems without special Unicode data types use code set/collating settings to define a database as Unicode or Non-Unicode database. In MySQL, for example, a database can be created with character set latin1 or utf8mb4:

Making a non-Unicode database Unicode-capable in this case means that the code set or the collating of the database must be changed.

It is important to note that DataFlex SQL drivers will work with both Unicode and non-Unicode databases. The drivers will take care of character set conversions as needed.

In order to make a non-Unicode database Unicode-capable, you may use one of the following methods:

This varies with database system (see the Further Information on Other Databases and Unicode section below). Often this can be done on a database, table, or column level.

Be aware that changing the collating of a database does NOT change the collating of the existing data. Existing data must still be converted. You may accomplish that by exporting data in the current collating and importing it once the new collating is set, or using another technique available in the database system – for example, executing an ALTER TABLE [table] CONVERT TO CHARACTER SET [charset] COLLATE [collate] statement in MySQL.

 

Further Information on Other Databases and Unicode

Here are some useful links with information about the various database systems that may be used with DataFlex 20 and their Unicode handling capability.

SQL Server 2019 UTF8 Columns

SQL server 2019 supports Unicode data in char/varchar columns by setting the collating to an utf8 collating.

The links below contain information on how to change the collation of char/varchar columns to an utf8 collating:

Further information on data types and character set in SQL Server:

MySQL

The links below contain information on how to convert a MySQL ANSI (Latin1) database to Unicode (utf8mb4) database:

Further information on character set and collating in MySQL:

MariaDB

The links below contain information on character set and collating in MariaDB:

Oracle

The links below contain information on character set and migration in Oracle:

PostgreSQL

The links below contain information on character set in PostgreSQL:

 

See Also

Working with DataFlex 2021