DIN 91379

The DIN standard DIN 91379: "Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM" defines a normative subset of Unicode Latin characters, sequences of base characters and diacritic signs, and special characters for use in names of persons, legal entities, products, addresses etc. The standard defines a normative mapping of Latin letters to base letters A-Z as an extension of the recommendations of ICAO.

In the informative part of the standard, a set of extended characters is defined, which includes Greek and Cyrillic letters as well as other special characters for names of legal entities and product names.

Languages and scripts supported

The subset supports all official languages of European Union countries as well as the official languages of Iceland, Liechtenstein, Norway, Switzerland, and also the German minority languages.

To support other languages that do not use the Latin writing system, the set of normative letters contains all combinations of Latin letters with diacritical marks that are necessary for the transliteration of names into the Latin writing system according to the ISO standards relevant at the time of publication.

The standard supports the characters required for entries in civil status registers. According to the Law on the Convention of September 13, 1973 on the recording of surnames and forenames in civil status registers, information in Latin characters is to be taken over true to the letter, with all diacritic marks, and information in other characters is to be reproduced by transliteration, if possible in accordance with ISO standards.

This support is not complete for non-European languages that use the Latin script. Vietnamese, for example, is supported, but the following are not: the Togo national languages Ewe (ɖ, ɛ, ƒ, ɣ, ɔ, ʋ are missing) and Kabiye (ɖ, ɛ, ɣ, ɩ, ɔ, ʊ are missing), the South African official language Tshivenda (ḓ, ḽ, ṋ, ṱ are missing), the Namibian national language Khoekhoegowab (the click sound letters ǀ, ǁ, ǂ, ǃ are missing), and Tongan (the fakauʻa is missing). Although the characters mentioned in brackets appear in personal names in the respective countries, the standard does not mention any transliteration rules or mapping rules for writing these names in basic Latin letters.

In addition to the normative characters, the standard defines subsets of extended characters that contain modern Greek letters for Greece and Cyprus, Cyrillic letters for Bulgaria, and special characters for names of products and legal entities.

Conforming applications may support additional characters. For interface agreements or registers, however, it may be appropriate to support only a fixed subset of characters and sequences based on this standard.

The text of the predecessor, DIN SPEC 91379, together with explanations and lists of characters and sequences as Excel and XML files, is available from the Koordinierungsstelle für IT-Standards (KoSIT). This reference also contains an XML schema file with patterns for checking whether text conforms to the subsets defined in this standard. Lists of the characters and sequences of DIN SPEC 91379 and DIN 91379 as plain text files are available via GitHub in DIN 91379 Characters and Sequences. Compared to the DIN SPEC, the DIN standard contains a few additional characters and sequences.

Application of the standard

From 1 November 2024, all IT procedures used for data exchange within and between the German federal and state governments, or for data exchange with citizens and companies, must comply with DIN 91379.

The architecture guideline for German federal IT requires the use of the predecessor, DIN SPEC 91379, in the version from July 2022.

Continuous text and historic letters are outside the scope of this standard.

Structure of the standard

The DIN standard consists of a normative and an informative part.

The requirements in the normative part are binding for all compliant systems. In the normative part, the letters for processing names with basic Latin letters and diacritics are specified. All compliant systems must support these letters. Furthermore, a mapping of the normative letters to the basic Latin letters A-Z is defined.

A compliant system may support further letters beyond the normative ones.

The recommendations in the informative part are not binding for compliant systems. The informative part specifies a Unicode subset of extended letters, e.g. for legal entities, product names and data exchange in the EU. In addition, the informative part defines data types that can be used for checking data fields.

Normative part

Compliance

To comply with this standard, a system must

support all normative letters and sequences at all processing stages,
use the encoding UTF-8 at interfaces, and
normalize the characters according to Unicode normalization form C (NFC).
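The last two requirements can be illustrated with a minimal Python sketch. It is not part of the standard; the function names `to_interface_bytes` and `from_interface_bytes` are hypothetical, and rejecting non-NFC input on the receiving side is only one possible policy (a system could also normalize silently).

```python
import unicodedata


def to_interface_bytes(text: str) -> bytes:
    """Normalize text to NFC, then encode it as UTF-8 for an interface."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


def from_interface_bytes(data: bytes) -> str:
    """Decode UTF-8 input and reject text that is not in NFC (assumed policy)."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    if not unicodedata.is_normalized("NFC", text):
        raise ValueError("text is not in normalization form C")
    return text


# "e" followed by COMBINING DIAERESIS is composed to the single letter "ë".
decomposed = "Zoe\u0308"
print(to_interface_bytes(decomposed) == "Zo\u00eb".encode("utf-8"))  # True
```

Sequences for which Unicode has no precomposed character, such as M followed by a combining circumflex, remain two code points after NFC normalization, so a system still has to handle combining sequences.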

Normative letters

Any conforming IT system must be able to process the normative letters in all name fields. This includes the collection, storage, transmission, display, and printout.

The normative character groups are given below. The associated characters can also be found in DIN 91379 Characters and Sequences for machine processing. The following tables of characters were generated from the XML file chars.xml in the DIN appendix.

Latin letters (bll)

These letters must be supported to represent names, especially personal names.

Non-letters N1 (bnlreq)

These characters must be supported to represent names, especially personal names.

Non-letters N2 (bnl)

These characters must be supported to represent names in a broader sense, e.g. place names, street names, house numbers, legal entity names, and product names. They are not suitable for personal names.

Non-letters N3 (bnlopt)

These characters are included for backwards compatibility with the standard Latin characters in Unicode, version 1.1.1.

They are not relevant for personal names or other names, only for legal entity names and product names.

Non-letters N4 (bnlnot)

These whitespace characters are unsuitable for representing names, but they must be processed.

The character NO-BREAK SPACE is necessary to prevent a line break in special names that could change the meaning. The other characters are included for backwards compatibility with the standard Latin characters in Unicode, version 1.1.1.

Deprecated letters

Existing documents and register entries contain deprecated letters that are no longer used today. These letters must be supported by compliant IT systems. When creating new entries, deprecated letters should not be used.

Normative mapping of Latin letters to basic letters (search form)

A normative mapping of all normative letters to the basic Latin letters A–Z is given below. This mapping is required, for example, for the machine-readable zone of passports. Another application is the creation of search forms, so that names can be found even if they are spelled differently or entered without diacritics.

The following table is based on table 9 of DIN 91379 and chapter 6, table A of the ICAO specifications for machine-readable travel documents. The table was created with the information from the XML file chars.xml in the DIN 91379 appendix.

Entries that appear in the ICAO specification and in table 9 of DIN are marked with ICAO in the Mapping column, additional entries in table 9 of the DIN are marked with EXT. In the Type column, ID is specified for entries that describe an identity mapping, and MAP for other mappings.
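How such a mapping can produce a search form is sketched below in Python. This is an illustration under stated assumptions, not the normative table: the dictionary `ILLUSTRATIVE_MAP` holds a few hypothetical placeholder entries, and the fallback of removing combining marks after canonical decomposition is a generic technique that does not necessarily agree with table 9 for every letter. A real implementation would load the complete mapping from the data published with the standard.

```python
import unicodedata

# Hypothetical excerpt for illustration only; the authoritative entries
# are those of table 9 of DIN 91379.
ILLUSTRATIVE_MAP = {
    "ß": "SS",
    "Æ": "AE", "æ": "AE",
    "Þ": "TH", "þ": "TH",
}


def search_form(name: str) -> str:
    """Map a name to upper-case basic Latin letters (simplified sketch)."""
    out = []
    for ch in unicodedata.normalize("NFC", name):
        if ch in ILLUSTRATIVE_MAP:
            out.append(ILLUSTRATIVE_MAP[ch])
            continue
        # Assumed fallback: decompose the character and drop combining marks.
        for part in unicodedata.normalize("NFD", ch):
            if not unicodedata.combining(part):
                out.append(part.upper())
    return "".join(out)


print(search_form("René"))    # RENE
print(search_form("Strauß"))  # STRAUSS
print(search_form("Þór"))     # THOR
```

Comparing the search forms of a stored name and a query lets "Rene" match "René", which is the search use case described above.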

Informative part

Extended letters

Each conforming IT system should be able to handle the extended letters for all name fields. This includes the collection, storage, transmission, display, and printout.

Greek letters (gl)

For cross-border data exchange, every IT system should support Greek letters in name fields.

Cyrillic letters (cl)

For cross-border data exchange, every IT system should support Cyrillic letters in name fields for Bulgarian names.

Non-letters E1 (enl)

These characters should be supported for legal entity names and product names.

Technical data types (informative)

For information, technical data types are defined as subsets of the letters defined in the standard. These can be used for interface agreements, for technical checks or as a basis for creating your own data types. An implementation as an XML schema type is included in the din-91379-datatypes.xsd file attached to the standard. This implementation is also freely available under the CC BY-ND license as part of the XOEV library.
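A technical check of a data field against such a subset can be sketched in Python as follows. The sketch does not use the XML schema patterns of the standard; it assumes that the allowed characters and sequences have already been loaded into a set, and the set `SAMPLE_ALLOWED`, the function name `is_conformant`, and the parameter `max_len` are hypothetical. Greedy longest-match splitting is an assumption that works when combining marks are allowed only as part of listed sequences.

```python
import unicodedata

# Hypothetical sample subset; a real check would load the complete list of
# characters and sequences for the chosen data type.
SAMPLE_ALLOWED = {"A", "M", "a", "m", " ", "M\u0302", "m\u0302"}


def is_conformant(text: str, allowed: set, max_len: int = 3) -> bool:
    """Return True if text is in NFC and splits into allowed entries."""
    if unicodedata.normalize("NFC", text) != text:
        return False
    i = 0
    while i < len(text):
        # Try the longest candidate first so that a base letter plus its
        # combining marks is matched as one sequence.
        for n in range(max_len, 0, -1):
            if text[i:i + n] in allowed:
                i += n
                break
        else:
            return False  # no allowed character or sequence starts here
    return True


print(is_conformant("M\u0302a", SAMPLE_ALLOWED))  # True
print(is_conformant("M\u0303", SAMPLE_ALLOWED))   # False: sequence not listed
```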

Added letters

Compared to DIN SPEC 91379, some additional letters have been included; only two of these letters are not deprecated.

Current state

The standardization process has so far produced the specification DIN SPEC 91379 in March 2019 and the final DIN standard in August 2022. The CEN/TC 224/WG 19 working group is developing this standard further into the European standard EN 00224284 in project 04301181. According to AFNOR norminfo, the project started in December 2024 with a design phase, a public inquiry is scheduled to start in April 2026, and publication of the standard is planned for November 2027.

Open-source software supporting DIN 91379

OpenPDF: free Java library for creating and editing PDF files
Apache FOP: free converter from XSL formatting objects to PDF

Free Fonts for DIN 91379
Arimo
Noto Latin, Greek, Cyrillic, see also issue "Combining comma above right" at wrong position
Sudo coding font

Related standards

Keyboard standard DIN 2137

The German keyboard layouts E1 and E2 standardized in the DIN 2137-1 standard enable the entry of all characters listed in DIN 91379 except Cyrillic letters without recourse to their Unicode value or their decimal code. Achieving this was one of the main reasons for revising these keyboard layouts compared to the previous version DIN 2137-1:2018-12.

Character naming and spelling standard DIN 5009

DIN 5009:2022-06, “Word and information processing for office applications – Announcing and dictating of text and characters”, published in May 2022, together with its supplement "Announcing, naming and keyboard input of special letters and characters", contains German-language names, spelling rules and spelling words for all characters listed in DIN 91379 (except some outdated characters and the Greek and Cyrillic letters). This ensures that the characters can be reproduced correctly in oral communication (e.g. on the telephone).

External links

Adobe Glyph List