Videotex character set

The character sets used by Videotex are based, to a greater or lesser extent, on ISO/IEC 2022. Three Data Syntax systems are defined by ITU-T T.101, corresponding to the Videotex systems of different countries.

Data Syntax 1

Data Syntax 1 is defined in Annex B of T.101:1994. It is based on the CAPTAIN system used in Japan. Its graphical sets include JIS X 0201 and JIS X 0208.

The following G-sets are available through ISO/IEC 2022-based designation escapes:
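As a rough illustration of what an ISO/IEC 2022 designation escape looks like, the sketch below builds the sequence that designates a single-byte 94-character set to one of G0 through G3. The helper name `designate_94` is hypothetical. The final bytes in the usage note are the ones registered for the two halves of JIS X 0201 and are used only as familiar examples; the final bytes actually assigned to each Data Syntax 1 set are not stated here.

```python
ESC = 0x1B

# Intermediate byte selecting the target G-set for a 94-character set
# in ISO/IEC 2022: '(' -> G0, ')' -> G1, '*' -> G2, '+' -> G3.
_INTERMEDIATE_94 = {0: 0x28, 1: 0x29, 2: 0x2A, 3: 0x2B}


def designate_94(g: int, final: bytes) -> bytes:
    """Return ESC <intermediate> <final> for a 94-character set.

    `g` is the G-set index (0..3); `final` is the registered final
    byte of the set being designated (supplied by the caller).
    """
    if g not in _INTERMEDIATE_94:
        raise ValueError("g must be 0, 1, 2 or 3")
    if len(final) != 1 or not 0x30 <= final[0] <= 0x7E:
        raise ValueError("final must be one byte in 0x30..0x7E")
    return bytes([ESC, _INTERMEDIATE_94[g]]) + final
```

For example, `designate_94(0, b"J")` yields `ESC ( J` (JIS X 0201 Roman to G0) and `designate_94(1, b"I")` yields `ESC ) I` (JIS X 0201 Katakana to G1).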

Mosaic sets for Data Syntax 1

The mosaic sets supply characters for use in semigraphics.

Data Syntax 2

Data Syntax 2 is defined in Annex C of T.101:1994. It corresponds to some European Videotex systems such as CEPT T/CD 06-01. The graphical character coding of Data Syntax 2 is based on T.51.

The default G2 set of Data Syntax 2 is based on an older version of T.51. It lacks the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, but adds a dialytika tonos (΅; its combining form is U+0344) at the beginning of the row of diacritical marks, for combination with codes from a Greek primary set. An umlaut diacritic code distinct from the diaeresis code, as found in some versions of T.61, is also sometimes included.

The default G1 set is the second mosaic set, corresponding roughly to the second mosaic set of Data Syntax 1. The default G3 set is the third mosaic set, matching the first mosaic set of Data Syntax 1 for 0x60 through 0x6D and 0x70 through 0x7D, and otherwise differing. The first mosaic set matches the second except for 0x40 through 0x5E: 0x40 through 0x5A follow ASCII (supplying uppercase letters), whereas the remainder are national variant characters; the displaced full block is placed at 0x7F.
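The code ranges described above can be restated as a small classifier. This is only a sketch of the prose: the function names and region labels are hypothetical, and no actual glyph mapping is attempted.

```python
def first_mosaic_region(code: int) -> str:
    """Classify a code of the Data Syntax 2 first mosaic set.

    The labels are illustrative names for the regions described in
    the text, not terms taken from T.101.
    """
    if not 0x20 <= code <= 0x7F:
        raise ValueError("code must be in 0x20..0x7F")
    if 0x40 <= code <= 0x5A:
        return "ascii-uppercase"        # follows ASCII
    if 0x5B <= code <= 0x5E:
        return "national-variant"       # national variant characters
    if code == 0x7F:
        return "full-block"             # displaced full block
    return "same-as-second-mosaic-set"


def g3_matches_ds1_first_mosaic(code: int) -> bool:
    """True where the third mosaic set (default G3) is described as
    matching the first mosaic set of Data Syntax 1."""
    return 0x60 <= code <= 0x6D or 0x70 <= code <= 0x7D
```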

  • Representation of 0x5B through 0x5E is not guaranteed in international communication; these codes may be replaced by national application-oriented variants.
  • 0x5F may be displayed either as ⌗ (square) or _ (lower bar) to represent the terminator function required by Videotex services.

Data Syntax 3

Data Syntax 3 is defined in Annex D of T.101:1994. The graphical character coding of Data Syntax 3 is based on T.51.

The supplementary set for Data Syntax 3 is based on an older version of T.51, lacking the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version. It also allocates non-spacing marks for a "vector overbar" and a solidus, together with several semigraphic characters, to space left unallocated in that set.

See the comments in the T.51 article for caveats about the combining mark Unicode mappings shown below. Unlike Unicode combining characters, T.51 diacritic codes precede the base character.
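Because the diacritic code comes first in T.51 and the combining mark comes last in Unicode, a converter has to hold the mark back until the base character arrives. The sketch below shows that reordering under several assumptions: the 8-bit diacritic positions are an illustrative subset following the usual T.51 layout rather than values taken from this article, bytes below 0x80 are treated as ASCII, and the names `T51_DIACRITICS` and `decode_with_diacritics` are hypothetical.

```python
import unicodedata

# Illustrative subset only (assumed 8-bit positions, not from this article).
T51_DIACRITICS = {
    0xC1: "\u0300",  # combining grave accent
    0xC2: "\u0301",  # combining acute accent
    0xC3: "\u0302",  # combining circumflex accent
    0xC8: "\u0308",  # combining diaeresis
}


def decode_with_diacritics(data: bytes) -> str:
    """Move each leading diacritic code behind its base character."""
    out = []
    pending = ""  # combining mark waiting for its base character
    for byte in data:
        mark = T51_DIACRITICS.get(byte)
        if mark is not None:
            if pending:
                raise ValueError("two diacritic codes in a row")
            pending = mark
            continue
        if byte >= 0x80:
            raise ValueError(f"byte 0x{byte:02X} not handled in this sketch")
        # Assumption: the primary set is treated as ASCII here.
        out.append(chr(byte) + pending)
        pending = ""
    if pending:
        raise ValueError("diacritic code with no base character")
    # NFC composes base + mark into a precomposed character where one exists.
    return unicodedata.normalize("NFC", "".join(out))
```

With these assumptions, `decode_with_diacritics(b"caf\xc2e")` returns `"café"`.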

C0 control codes

C0 control codes for Videotex differ from ASCII as shown in the table below. Several other codes, including SO (LS1) and SI (LS0), are also available in some or all data syntaxes, with no change in name or semantics from ASCII.
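The two locking shifts can be illustrated with a minimal decoder state machine. In ISO/IEC 2022, LS1 (0x0E) invokes G1 into the 7-bit graphic area and LS0 (0x0F) restores G0. The function name `tag_with_gset` is hypothetical, and the sketch ignores every other control code, including escape sequences and single shifts.

```python
LS1 = 0x0E  # locking shift 1: invoke G1
LS0 = 0x0F  # locking shift 0: invoke G0


def tag_with_gset(data: bytes) -> list[tuple[str, int]]:
    """Pair each graphic byte with the G-set active when it was read."""
    active = "G0"  # assumed initial state
    tagged = []
    for byte in data:
        if byte == LS1:
            active = "G1"
        elif byte == LS0:
            active = "G0"
        elif 0x21 <= byte <= 0x7E:
            tagged.append((active, byte))
        # Other bytes (remaining controls, space, DEL) are skipped here.
    return tagged
```

For instance, in the byte string `b"A\x0eB\x0fC"` the `A` and `C` are read from G0, while the `B` between the shifts is read from G1.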

C1 control codes

The following specialised C1 control codes are used in Videotex. There are four registered sets, with some differences between them.
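In a 7-bit environment, ISO/IEC 2022 represents a C1 control code as ESC followed by a byte in the range 0x40 through 0x5F, which is the 8-bit code minus 0x40. The sketch below shows that general correspondence; the helper names are hypothetical, and it says nothing about which functions the individual Videotex C1 sets assign to each code.

```python
ESC = 0x1B


def c1_to_7bit(code: int) -> bytes:
    """Return the two-byte ESC form of an 8-bit C1 code (0x80..0x9F)."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("C1 codes lie in 0x80..0x9F")
    return bytes([ESC, code - 0x40])


def c1_from_7bit(seq: bytes) -> int:
    """Return the 8-bit C1 code for a two-byte ESC form."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("expected ESC followed by a byte in 0x40..0x5F")
    return seq[1] + 0x40
```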
