Skip to Content

mvSQL String Literals

The SQL standard defines two character string types, each with its own literal syntax.

  • ANSI
  • National

ANSI character strings represent ASCII and other single-byte character sets, such as EBCDIC on DB2/i.

National character strings represent multi-byte character sets such as Unicode.

The SQL standard defines a separate literal syntax for each character string type. The syntax for an ANSI string literal is:

'Aristotle''s quotes'

Note that single-quotes embedded within the string literal are escaped by doubling the single quote.

The syntax for a National string literal is:

N'Aristotle''s quotes'

mvSQL supports the SQL Standard syntax for ANSI and National character string literals.

Note

Using the wrong SQL literal syntax can result in significant performance degradation. Given a column that stores data in an ANSI character set, a search condition containing a National string literal will not take advantage of any indexes and will force a conversion of column values.

Some database engines, like Oracle, do not support implicit conversions between character sets. Using the appropriate character string literal syntax will avoid errors and obviate the need for explicit conversions.

Available in Vinyl 2.1. Improved in Vinyl 2.3.

Vendor Support

Unless otherwise documented, ANSI and National character string literals are mapped directly to the native vendor equivalent.

MySQL

MySQL does not define a separate syntax for ANSI and National character string literals. In MySQL, the string literal is assumed to be encoded using the connection's character set. This means that the the encoding can vary based on the connection settings. If the connection uses an ANSI encoding (such as the default latin1 encoding), then all strings are assumed to be ANSI.

Note

Vinyl defaults MySQL connections to the UTF-8 character set. Use the CharSet parameter to change the MySQL connection character set. For instance, to use the MySQL default ASCII character set, add CharSet=latin1 to the Advanced settings.

MySQL supports character set introducers. The "introducer" is a prefix added to the beginning of the string literal value The prefix indicates the character set of the string. Vinyl uses character set introducers to ensure that Unicode data can be encoded in string literals, regardless of the connection settings. Conversely, if the connection uses the UTF-8 character set, the character set introducer ensures that ASCII data is passed to the server as ASCII, avoiding problematic conversions.

In practice, this means that an mvSQL ANSI character string literal will be translated to:

_latin1'Aristotle''s quotes'

An mvSQL National character string literal will be translated into MySQL's syntax as:

_utf8'Aristotle''s quotes'

MySQL does not define separate storage data types for ANSI and National character types. Instead, all string types are stored in a common data type. The character set determines the how the data is stored. Vinyl maps MySQL character columns with a Unicode character set (utf8) to the corresponding, vendor-neutral National character string type (e.g. NCHAR or NVARCHAR).

DB2/i

Like MySQL, DB2/i does not have separate data types for ANSI and National character strings. Instead, the CCSID determines the encoding. The CCSID is a unisgned, 16-bit integer. It's similar in nature to the Windows LCID.

As with MySQL, Vinyl inspects the CCSID to properly map from a DB2/i character type (e.g. VARCHAR) to a Vinyl, vendor-neutral data type (e.g. NVARCHAR). The following CCSIDs will be mapped to the NVARCHAR data type:

  • 1200 - UTF-16
  • 1208 - UTF-8
  • 13488 - UCS-2

Conversely, when creating columns with a National character data type, Vinyl explicitly sets the CCSID. Specifically, Vinyl sets the CCSID to 1208 (UTF-8).

Finally, DB2/i does not have to have a separate syntax for ANSI and National character strings. Therefore, the same syntax is used for both.

Non-RDBMS data sources

mvSQL National string literals are not supported for non-RDBMS data sources such as REST and Salesforce, etc.

Further Reading

https://learn.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support

https://docs.microsoft.com/en-us/sql/t-sql/data-types/constants-transact-sql

https://dev.mysql.com/doc/refman/8.0/en/charset-introducer.html