See Also: String Functions
Returns a normalized copy of the passed String.
NormalizeString( {StringVal}[, {NormalizationForm} ])
Where:
{StringVal} is the string value that needs to be normalized
{NormalizationForm} specifies the Unicode normalization form to use (normNFC / normNFD / normNFKC / normNFKD)
Unicode identifies each character by a unique number, called its code point. Some characters consist of multiple code points that together are displayed as a single character. In many cases, a character built from multiple code points is also available as a single code point. Some characters are even built from more than two code points, which can appear in different orders.
Characters that can be represented in different ways cause problems when searching through source code (using Pos, for example) or when using them as unique keys in your database. Normalizing solves this by converting a string so that each character always uses the same representation.
Unicode specifies four different normalization forms:
normNFC
Canonical Decomposition, followed by Canonical Composition (the default if not specified).
normNFD
Canonical Decomposition.
normNFKC
Compatibility Decomposition, followed by Canonical Composition.
normNFKD
Compatibility Decomposition.
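The difference between the canonical and the compatibility forms can be illustrated with a compatibility character. The sketch below is an assumption-based illustration, not verified runtime output: it assumes that Character accepts code point 64257 (U+FB01, ‘Latin Small Ligature Fi’) and that Pos matches code points exactly. The variable names are hypothetical, and the expected values follow from the Unicode decomposition data for U+FB01.

```dataflex
String sLigature
Integer iPos

// U+FB01 'Latin Small Ligature Fi' has only a compatibility decomposition ("f" + "i")
Move (Character(64257)) to sLigature

// Canonical forms should leave the ligature untouched, so "fi" is not found
Move (Pos("fi", NormalizeString(sLigature, normNFC))) to iPos  // Expected: 0

// Compatibility forms should replace the ligature with the two letters "f" and "i"
Move (Pos("fi", NormalizeString(sLigature, normNFKC))) to iPos // Expected: 1
```

Because the compatibility forms discard this kind of distinction, they are better suited to searching and matching than to storing text that has to keep its original appearance.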
The following example demonstrates normalization using two strings. The sNorm string contains the single code point version of ‘Latin Small Letter N with Tilde’, whereas sComposite builds the same character from the code points ‘Latin Small Letter N’ and ‘Combining Tilde’. A plain Pos call does not find one notation inside the other, but it does once NormalizeString has been applied. To be safe, make sure that both strings are normalized to the same form.
String sNorm sComposite
Integer iPos
// ñ as a single code point (‘Latin Small Letter N with Tilde’)
Move (Character(241)) to sNorm
// ñ as two code points (‘Latin Small Letter N’ + ‘Combining Tilde’)
Move (Character(110) + Character(771)) to sComposite
Move (Pos(sNorm, sComposite)) to iPos // Results in 0 (not found)
Move (Pos(sNorm, NormalizeString(sComposite))) to iPos // Results in 1 (sComposite composed to NFC)
Move (Pos(NormalizeString(sNorm, normNFD), sComposite)) to iPos // Results in 1 (sNorm decomposed to NFD)
Move (Pos(NormalizeString(sNorm), NormalizeString(sComposite))) to iPos // Results in 1 (both normalized to NFC)
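The advice to normalize both strings to the same form could be wrapped in a small helper. The function below is a minimal sketch under that assumption: NormalizedPos is a hypothetical name and not part of the runtime. It relies on the default form (normNFC) for both arguments.

```dataflex
// Hypothetical helper: normalizes both strings to the default form (normNFC) before searching
Function NormalizedPos Global String sSearch String sHost Returns Integer
    Function_Return (Pos(NormalizeString(sSearch), NormalizeString(sHost)))
End_Function

// Usage with the strings from the example above
Move (NormalizedPos(sNorm, sComposite)) to iPos
```

Note that the returned position refers to the normalized copy of the host string, which can differ from the position in the original string when normalization changes the number of code points.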