Data Validation

Data validation is the process of ensuring that the structure of the data conforms to a defined format. This is especially useful for developers who are adding support for a data format to their application, because they can test that the data they produce is correct and conforms to the format.

In the world of XML, validation involves checking the names of elements, their nesting, the names of attributes, and the types of values in elements and attributes. Several XML schema languages exist for formally defining an XML structure. For the OpenLyrics data format, RelaxNG was chosen and used to create the format definition.

Tools and Libraries

The OpenLyrics RelaxNG XML schema can be used from any programming language that has a library with RelaxNG support. It is also possible, to some extent, to convert RelaxNG schemas to other schema languages, such as DTD or W3C XML Schema.

A list of other software for RelaxNG can be found on the RelaxNG site.

Validation examples

CLI validation using Libxml2

To validate an OpenLyrics XML file, use the following command:

xmllint --noout --relaxng openlyrics-0.9.rng xmlfile.xml

Here, xmlfile.xml is the OpenLyrics file to validate and openlyrics-0.9.rng contains the OpenLyrics RelaxNG XML schema. xmllint is part of the libxml2 library and can be installed on Debian-based systems with apt-get install libxml2-utils.
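
When the xmllint check needs to run from a test suite or build step, the same command can be wrapped in a few lines of Python. The following is only a sketch: the function names are made up for illustration, and it assumes that xmllint is on the PATH and that an exit status of 0 means the file validated. Any other status is reported as "not valid", which also covers problems such as a missing file or a schema that fails to load.

```python
import shutil
import subprocess


def xmllint_command(schema_path, xml_path):
    """Build the argument list for the xmllint call shown above."""
    return ["xmllint", "--noout", "--relaxng", schema_path, xml_path]


def is_valid_with_xmllint(schema_path, xml_path):
    """Run xmllint and return (ok, messages).

    ok is True only when xmllint exits with status 0; messages holds
    whatever xmllint wrote to standard error.
    """
    if shutil.which("xmllint") is None:
        raise RuntimeError("xmllint was not found on PATH")
    result = subprocess.run(
        xmllint_command(schema_path, xml_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr
```

A call such as is_valid_with_xmllint("openlyrics-0.9.rng", "xmlfile.xml") would then return a boolean together with the diagnostic text.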

Validating using Python


#!/usr/bin/env python3

from lxml import etree

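# Load the RelaxNG schema and parse the file to check.
xml_validator = etree.RelaxNG(file="openlyrics-0.9.rng")
xml_file = etree.parse("xmlfile.xml")

# validate() returns True or False instead of raising an exception.
is_valid = xml_validator.validate(xml_file)

print(f'xmlfile.xml is{"" if is_valid else " not"} valid')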

As before, xmlfile.xml is the OpenLyrics file to validate and openlyrics-0.9.rng contains the OpenLyrics RelaxNG XML schema.
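
A yes/no answer is often not enough when a file fails validation. lxml keeps the reasons in the validator's error_log, and the sketch below collects them into readable messages. To stay self-contained it uses a tiny stand-in schema (TOY_SCHEMA) that is not the OpenLyrics schema; that schema and the helper name validation_errors are assumptions made for this example.

```python
from io import BytesIO

from lxml import etree

# Stand-in schema for illustration only (NOT the OpenLyrics schema):
# a <song> element with a required "version" attribute and text content.
TOY_SCHEMA = b"""
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="song">
      <attribute name="version"/>
      <text/>
    </element>
  </start>
</grammar>
"""


def validation_errors(validator, xml_bytes):
    """Return 'line N: message' strings; the list is empty for a valid document."""
    document = etree.parse(BytesIO(xml_bytes))
    if validator.validate(document):
        return []
    return [f"line {entry.line}: {entry.message}" for entry in validator.error_log]


validator = etree.RelaxNG(etree.parse(BytesIO(TOY_SCHEMA)))

# The first document has the required attribute, the second one lacks it.
print(validation_errors(validator, b'<song version="0.9">text</song>'))
print(validation_errors(validator, b"<song>text</song>"))
```

For real files, the validator would be created with etree.RelaxNG(file="openlyrics-0.9.rng") as in the script above, and the helper would parse the file instead of an in-memory byte string.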

Bundled CLI Script

To execute the included script, use the following command:

python3 tools/ openlyrics-0.9.rng xmlfile.xml

As before, xmlfile.xml is the OpenLyrics file to validate and openlyrics-0.9.rng contains the OpenLyrics RelaxNG XML schema. Python 3.6 or later and lxml are required.

RelaxNG XML schema

The following RelaxNG XML schema was created:

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns=""

  <!-- TOP LEVEL -->

    <element name="song">
      <ref name="songAttributes"/>
      <ref name="properties"/>
        <ref name="format"/>
      <ref name="lyrics"/>

  <define name="properties">
    <element name="properties">
      <interleave> <!-- allow the items to occur in any order -->
        <!-- at least one title is always required -->
        <ref name="titles"/>
        <!-- other properties items are optional -->
          <ref name="authors"/>
          <ref name="copyright"/>
          <ref name="ccliNo"/>
          <ref name="released"/>
        <!-- Music Info -->
          <ref name="transposition"/>
          <ref name="tempo"/>
          <ref name="key"/>
          <ref name="timeSignature"/>
        <!-- Other Info -->
          <ref name="variant"/>
          <ref name="publisher"/>
          <ref name="version"/>
          <ref name="keywords"/>
          <ref name="verseOrder"/>
          <ref name="songbooks"/>
          <ref name="themes"/>
          <ref name="comments"/>

  <define name="format">
    <element name="format">
      <ref name="formatTags"/>

  <define name="lyrics">
    <element name="lyrics">
        <!-- at least one verse is required -->
          <ref name="verse"/>
          <ref name="instrument"/>

  <!-- PROPERTIES -->

  <define name="titles">
    <element name="titles">
        <element name="title">
          <ref name="nonEmptyContent"/>
            <ref name="langAttribute"/>
              <ref name="translitAttribute"/>
            <attribute name="original">
              <data type="boolean"/>

  <!-- AUTHOR info -->

  <define name="authors">
    <element name="authors">
        <element name="author">
          <ref name="nonEmptyContent"/>
              <attribute name="type">
              <!-- when attrib 'type' has the value 'translation', an attribute 'lang' is also allowed.
                   'xml:lang' can't be used: xml:lang states the language of an
                   element's content, and that is not the case here. -->
                <attribute name="type">
                  <ref name="langAttribute"/>

  <define name="copyright">
    <element name="copyright">
      <ref name="nonEmptyContent"/>

  <define name="ccliNo">
    <element name="ccliNo">
      <data type="positiveInteger"/>

  <define name="released">
    <element name="released">
      <!-- allowed values
           1779-12-31T13:15:30+01:00 -->
        <data type="gYear"/>
        <data type="gYearMonth"/>
        <data type="date"/>
        <data type="dateTime"/>

  <!-- MUSIC INFO -->

  <define name="transposition">
    <element name="transposition">
      <data type="integer">
        <param name="minInclusive">-11</param>
        <param name="maxInclusive">11</param>

  <define name="tempo">
    <element name="tempo">
        <!-- attrib 'type' value 'bpm' - beats per minute required -->
          <data type="positiveInteger">
            <param name="minInclusive">30</param>
            <param name="maxInclusive">250</param>
          <attribute name="type">
        <!-- attrib 'type' value 'text' - any text -->
          <ref name="nonEmptyContent"/>
          <attribute name="type">

  <define name="key">
    <element name="key">
      <ref name="keyNote"/>

  <define name="timeSignature">
    <element name="timeSignature">
      <data type="token">
        <param name="minLength">3</param>
        <param name="maxLength">5</param>
        <!-- number between 1 and 63 + "/" + numbers: 1, 2, 4, 8, 16, 32, 64 -->
        <param name="pattern">(6[0-3]|[1-5][0-9]|[1-9])/(64|32|16|8|4|2|1)</param>

  <!-- OTHER INFO -->

  <define name="variant">
    <element name="variant">
      <ref name="nonEmptyContent"/>

  <define name="publisher">
    <element name="publisher">
      <ref name="nonEmptyContent"/>

  <define name="version">
    <element name="version">
      <ref name="nonEmptyContent"/>

  <define name="keywords">
    <element name="keywords">
      <ref name="nonEmptyContent"/>

  <define name="verseOrder">
    <element name="verseOrder">
            <ref name="verseNameType"/>
            <ref name="instrumentNameType"/>

  <define name="songbooks">
    <element name="songbooks">
        <element name="songbook">
          <attribute name="name">
            <ref name="nonEmptyContent"/>
            <!-- 'entry' is like a song number, but it does not always
                 have to be an integer and it can contain letters.
                 examples: '153c' or '023', etc. -->
            <attribute name="entry">
              <ref name="nonEmptyContent"/>

  <define name="themes">
    <element name="themes">
        <element name="theme">
          <ref name="nonEmptyContent"/>
            <ref name="langAttribute"/>
              <ref name="translitAttribute"/>

  <define name="comments">
    <element name="comments">
        <element name="comment">
          <ref name="nonEmptyContent"/>

  <!-- FORMAT -->

  <define name="formatTags">
    <!-- Allow only one set of formatting tags for lyrics -->
    <element name="tags">
      <attribute name="application">
        <ref name="nonEmptyContent"/>
        <ref name="formatTagsTag"/>

  <define name="formatTagsTag">
    <element name="tag">
      <attribute name="name">
        <ref name="nonEmptyContent"/>
      <element name="open">
        <ref name="nonEmptyContent"/>
      <!-- Close element is optional. Formatting without text may be present.
           e.g. <br/> -->
        <element name="close">
          <ref name="nonEmptyContent"/>

 <!-- LYRICS -->

  <define name="verse">
    <element name="verse">
      <ref name="verseAttributes"/>
        <ref name="langAttribute"/>
          <ref name="translitAttribute"/>
        <ref name="lines"/>

  <define name="lines">
    <element name="lines">
        <attribute name="part">
          <ref name="nonEmptyContent"/>
        <attribute name="break">
        <attribute name="repeat">
          <data type="integer">
            <param name="minInclusive">2</param>
        <ref name="linesContent"/>
      <ref name="linesContent"/>

  <define name="chord">
    <element name="chord">
      <attribute name="root">
        <ref name="musicalNote"/>
        <attribute name="bass">
          <ref name="musicalNote"/>
        <attribute name="structure">
          <ref name="chords"/>
        <attribute name="upbeat">
          <data type="boolean"/>
          <ref name="linesContent"/>

  <define name="keyNote">
      <!-- theoretical keys -->
      <!-- 10♯ -->
      <!-- 9♯ -->
      <!-- 8♯ -->
      <!-- /theoretical keys -->
      <!-- 7♯ -->
      <!-- 6♯ -->
      <!-- 5♯ -->
      <!-- 4♯ -->
      <!-- 3♯ -->
      <!-- 2♯ -->
      <!-- 1♯ -->
      <!-- 0  -->
      <!-- 1♭ -->
      <!-- 2♭ -->
      <!-- 3♭ -->
      <!-- 4♭ -->
      <!-- 5♭ -->
      <!-- 6♭ -->
      <!-- 7♭ -->

  <define name="musicalNote">
    <!-- Only English notation is allowed -->
      <!-- chromatic notes -->
      <!-- supporting theoretical keys -->
      <value>E#</value><!-- supporting major F# scale (6#) -->
      <value>B#</value><!-- supporting major C# scale (7#) -->
      <value>Fx</value><!-- supporting major G# scale (8#) -->
      <value>Cx</value><!-- supporting major D# scale (9#) -->
      <value>Gx</value><!-- supporting major A# scale (10#) -->
      <value>Cb</value><!-- supporting major Gb scale (6b) -->
      <value>Fb</value><!-- supporting major Cb scale (7b) -->
      <!-- /supporting theoretical keys -->

  <define name="chords">
      <!-- ** 2 note chords -->
      <!-- perfect 5th; power chord -->

      <!-- *** 3 note chords -->
      <!-- major -->
      <!-- minor -->
      <!-- augmented -->
      <!-- diminished -->

      <!-- **** 4 note chords -->
      <!-- dominant 7th -->
      <!-- major 7th -->
      <!-- minor 7th -->
      <!-- diminished 7th -->
      <!-- half-diminished 7th -->
      <!-- minor major 7th -->
      <!-- augmented major 7th -->
      <!-- dominant 7th flat 5 -->
      <!-- dominant 7th sharp 5; augmented 7th -->
      <!-- diminished major 7th -->
      <!-- major 7th flat 5 -->
      <!-- major 6th -->
      <!-- (major minor 6th) -->
      <!-- minor 6th -->
      <!-- (minor minor 6th) -->

      <!-- ***** 5 note chords -->
      <!-- (dominant) 9th -->
      <!-- dominant minor 9th -->
      <!-- major 9th -->
      <!-- minor (dominant) 9th -->
      <!-- minor major 9th -->
      <!-- augmented major 9th -->
      <!-- augmented (dominant) 9th -->
      <!-- half-diminished 9th -->
      <!-- half-diminished minor 9th -->
      <!-- diminished 9th -->
      <!-- diminished minor 9th -->
      <!-- dominant flat 10 -->

      <!-- ****** 6 note chords -->
      <!-- (dominant) 11th -->
      <!-- major 11th -->
      <!-- minor (dominant) 11th -->
      <!-- minor major 11th -->
      <!-- acoustic (dominant) 11th -->
      <!-- acoustic major 11th -->
      <!-- acoustic minor (dominant) 11th -->
      <!-- acoustic minor major 11th -->
      <!-- augmented major 11th -->
      <!-- augmented (dominant) 11th -->
      <!-- half-diminished 11th -->
      <!-- diminished 11th -->

      <!-- 7 note chords -->
      <!-- (dominant) 13th -->
      <!-- major 13th -->
      <!-- minor (dominant) 13th -->
      <!-- minor major 13th -->
      <!-- (dominant) 13th -->
      <!-- major 13th -->
      <!-- minor (dominant) 13th -->
      <!-- minor major 13th -->
      <!-- augmented major 13th -->
      <!-- augmented (dominant) 13th -->
      <!-- half-diminished 13th -->

      <!-- *** Figured 3 note chords -->
      <!-- major/minor suspended 4th -->
      <!-- major/minor suspended 2nd -->

      <!-- **** Figured 4 note chords -->
      <!-- dominant (7th) major 6th -->
      <!-- major 6th 9th -->
      <!-- major added 9th -->
      <!-- minor added 9th -->
      <!-- augmented added 9th -->
      <!-- major 6th suspended 4th -->
      <!-- major 6th suspended 2nd -->
      <!-- minor 6th suspended 4th -->
      <!-- minor 6th suspended 2nd -->
      <!-- dominant/minor 7th suspended 4th -->
      <!-- dominant/minor 7th suspended 2nd -->
      <!-- (minor) major 7th suspended 4th -->
      <!-- (minor) major 7th suspended 2nd -->
      <!-- augmented major 7th suspended 4th -->
      <!-- augmented major 7th suspended 2nd -->
      <!-- half-diminished 7th suspended 4th; dominant 7th flat 5 suspended 4th -->
      <!-- half-diminished 7th suspended 2nd; dominant 7th flat 5 suspended 2nd -->
      <!-- diminished 7th suspended 4th -->
      <!-- diminished 7th suspended 2nd -->
      <!-- diminished major 7th suspended 4th; major 7th flat 5 suspended 4th -->
      <!-- diminished major 7th suspended 2nd; major 7th flat 5 suspended 2nd -->
      <!-- dominant (7th) major 6th suspended 4th -->
      <!-- dominant (7th) major 6th suspended 4th -->

      <!-- ***** Figured 5 note chords -->
      <!-- (dominant) 9th suspended 4th -->
      <!-- dominant minor 9th suspended 4th -->
      <!-- major 9th suspended 4th -->
      <!-- augmented major 9th suspended 4th -->
      <!-- augmented (dominant) 9th suspended 4th -->

      <data type="token">
        <param name="minLength">1</param>
        - major      - [not marked]
        - minor      - m
        - perfect    - [not marked]
        - diminished - d
        - augmented  - a

        Music intervals:
        | Int. | Dim. | Min. | Per. | Maj. | Aug. |
        |  1   |  d1  |      |  1   |      |  a1  |
        |  2   |  d2  |  m2  |      |  2   |  a2  |
        |  3   |  d3  |  m3  |      |  3   |  a3  |
        |  4   |  d4  |      |  4   |      |  a4  |
        |  5   |  d5  |      |  5   |      |  a5  |
        |  6   |  d6  |  m6  |      |  6   |  a6  |
        |  7   |  d7  |  m7  |      |  7   |  a7  |
        |  8   |  d8  |      |  8   |      |  a8  |
        |  9   |  d9  |  m9  |      |  9   |  a9  |
        | 10   | d10  | m10  |      |  10  | a10  |
        | 11   | d11  |      |  11  |      | a11  |
        | 12   | d12  |      |  12  |      | a12  |
        | 13   | d13  | m13  |      |  13  | a13  |
        | 14   | d14  | m14  |      |  14  | a14  |
        | 15   | d15  |      |  15  |      | a15  |

        1. The root note (1) should not be included
        2. All other intervals (see the table above) can be included once
        3. Intervals must be separated by a hyphen character (-)
        4. Intervals should be in ascending order (see the table above)
        5. At least 1, but no more than 12, intervals can be specified

        - Harmonics possible: (a?2)?-?(m?3)?-?((d|a)?4)?-?((d|a)?5)?-?(m?6)?-?((d|m)?7)?-?(d?8)?-?((m|a)?9)?-?(m?10)?-?((d|a)?11)?-?(m?13)?
          Covers all variants from standardized 69 chords + a2, d4, a4, d8, a9, m13 (one per step)
        - Logically possible: (d1)?-?(a1)?-?(d2)?-?(m2)?-?2?-?(a2)?-?(d3)?-?(m3)?-?3?-?(a3)?-?(d4)?-?4?-?(a4)?-?(d5)?-?5?-?(a5)?-?(d6)?-?(m6)?-?6?-?(a6)?-?(d7)?-?(m7)?-?7?-?(a7)?-?(d8)?-?8?-?(a8)?-?(d9)?-?(m9)?-?9?-?(a9)?-?(d10)?-?(m10)?-?(10)?-?(a10)?-?(d11)?-?(11)?-?(a11)?-?(d12)?-?(12)?-?(a12)?-?(d13)?-?(m13)?-?(13)?-?(a13)?-?(d14)?-?(m14)?-?(14)?-?(a14)?-?(d15)?-?(15)?-?(a15)?
          Covers all possible variants (52) in ascending order. Covers format aspects of expectations: 1, 2, 3, 4.
        - Logically possible (max. 12 notes): (([2-9]|1[0-5]|m([23679]|10|13|14)|(d|a)(1[0-5]|[1-9]))-){0,11}([2-9]|1[0-5]|m([23679]|10|13|14)|(d|a)(1[0-5]|[1-9]))
          Covers all variants (52), but only 12 segments (in any order). Covers format aspects of expectations: 1, 3, 5.
        We need to use both logical regular expressions to cover all 5 aspects of expectations.
        <param name="pattern">(d1)?-?(a1)?-?(d2)?-?(m2)?-?2?-?(a2)?-?(d3)?-?(m3)?-?3?-?(a3)?-?(d4)?-?4?-?(a4)?-?(d5)?-?5?-?(a5)?-?(d6)?-?(m6)?-?6?-?(a6)?-?(d7)?-?(m7)?-?7?-?(a7)?-?(d8)?-?8?-?(a8)?-?(d9)?-?(m9)?-?9?-?(a9)?-?(d10)?-?(m10)?-?(10)?-?(a10)?-?(d11)?-?(11)?-?(a11)?-?(d12)?-?(12)?-?(a12)?-?(d13)?-?(m13)?-?(13)?-?(a13)?-?(d14)?-?(m14)?-?(14)?-?(a14)?-?(d15)?-?(15)?-?(a15)?</param>
        <param name="pattern">(([2-9]|1[0-5]|m([23679]|10|13|14)|(d|a)(1[0-5]|[1-9]))-){0,11}([2-9]|1[0-5]|m([23679]|10|13|14)|(d|a)(1[0-5]|[1-9]))</param>

  <define name="tag">
    <element name="tag">
      <attribute name="name">
        <ref name="nonEmptyContent"/>
      <!-- allow using more formatting tags for text -->
      <!-- e.g. <tag name="bold"><tag name="red">my text</tag></tag> -->
          <ref name="linesContent"/>
        <!-- Allow empty tag. Formatting without text may be present.
             e.g. <tag name="br"/> -->

  <define name="verseAttributes">
    <attribute name="name">
      <ref name="verseNameType"/>

  <define name="songAttributes">
    <!-- by default: value of type string is required in attr -->
    <attribute name="version">
      <data type="NMTOKEN"> <!-- one word value -->
        <!-- allow only values like: '0.1' '11.2' '13.14.15'
        <param name="pattern">[0-9]+\.[0-9]+(\.[0-9]+)?</param> -->
        <!-- RelaxNG xml schema is specific for openlyrics version -->
        <param name="pattern">0\.9</param>
      <attribute name="xml:lang">
        <data type="language"/>
      <attribute name="createdIn">
        <ref name="nonEmptyContent"/>
      <attribute name="modifiedIn">
        <ref name="nonEmptyContent"/>
      <attribute name="modifiedDate">
        <!-- date format: ISO 8601 -->
        <data type="dateTime"/>
      <attribute name="chordNotation">

  <define name="verseNameType">
    <data type="NMTOKEN">
      <param name="minLength">1</param>
      <!-- 3 part: [verse][verse_number][verse_part]
           verse      -        v1, v2, v1a, …
           chorus     - c, ca, c1, c2, c1a, …
           pre-chorus - p, pa, p1, p2, p1a, …
           bridge     - b, ba, b1, b2, b1a, …
           other      - o, oa, o1, o2, o1a, …
           intro      - i, ia, i1, i2, i1a, …
           ending     - e, ea, e1, e2, e1a, … -->
      <param name="pattern">(v[1-9]\d*[a-z]?)|([cpboie][1-9]\d?[a-z]?)|([cpboie][a-z]?)</param>

  <define name="instrumentNameType">
    <data type="NMTOKEN">
      <param name="minLength">1</param>
      <!-- 3 part: [verse][verse_number][verse_part]
           intro  - i, ia, i1, i2, i1a, …
           ending - e, ea, e1, e2, e1a, …
           solo   - s, sa, s1, s2, s1a, …
           middle - m, ma, m1, m2, m1a, … -->
      <param name="pattern">([iesm][1-9]\d?[a-z]?)|([iesm][a-z]?)</param>

  <define name="langAttribute">
    <attribute name="lang">
      <data type="language"/>

  <!-- transliteration -->
  <define name="translitAttribute">
    <attribute name="translit">
      <data type="language"/>

  <define name="nonEmptyContent">
    <data type="string">
      <param name="minLength">1</param>

  <define name="linesContent">
    <!-- allow tag 'tag' inside regular text - mixed content -->
      <ref name="tag"/>
    <!-- allow tag 'comment' inside regular text - mixed content -->
      <element name="comment">
        <ref name="nonEmptyContent"/>
    <!-- allow tag 'chord' inside regular text - mixed content -->
      <ref name="chord"/>
      <element name="br">

 <!-- INSTRUMENT -->

  <define name="instrument">
    <element name="instrument">
      <attribute name="name">
        <ref name="instrumentNameType"/>
        <element name="lines">
            <attribute name="repeat">
              <data type="integer">
                <param name="minInclusive">2</param>
              <ref name="beat"/>
              <ref name="chord"/>

  <define name="beat">
    <element name="beat">
        <ref name="chord"/>
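
The name and time signature patterns in the schema above can also be checked without an XML validator, for example before writing a value into a file. The sketch below copies the patterns for verseNameType, instrumentNameType and timeSignature into Python. It assumes that Python's re syntax behaves like the XML Schema regular expression dialect for these particular patterns, and it uses re.fullmatch() because schema patterns always have to match the whole value. The helper names are made up for illustration.

```python
import re

# Patterns copied from the schema listing above.
VERSE_NAME = re.compile(
    r"(v[1-9]\d*[a-z]?)|([cpboie][1-9]\d?[a-z]?)|([cpboie][a-z]?)"
)
INSTRUMENT_NAME = re.compile(r"([iesm][1-9]\d?[a-z]?)|([iesm][a-z]?)")
# Every string this pattern accepts is 3 to 5 characters long, so the
# minLength/maxLength limits of the schema need no separate check here.
TIME_SIGNATURE = re.compile(r"(6[0-3]|[1-5][0-9]|[1-9])/(64|32|16|8|4|2|1)")


def is_verse_name(value):
    """Check a verse name such as 'v1', 'c' or 'b2a'."""
    return VERSE_NAME.fullmatch(value) is not None


def is_instrument_name(value):
    """Check an instrumental part name such as 'i', 's1' or 'm2a'."""
    return INSTRUMENT_NAME.fullmatch(value) is not None


def is_time_signature(value):
    """Check a time signature such as '4/4' or '12/8'."""
    return TIME_SIGNATURE.fullmatch(value) is not None


for name in ["v1", "v12b", "c", "c1a", "v", "v0", "x1"]:
    print(name, is_verse_name(name))

for signature in ["4/4", "12/8", "64/4", "4/3"]:
    print(signature, is_time_signature(signature))
```

Note that a bare "v" is rejected: unlike the other verse types, a verse always needs a number.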

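
The chord structure rules in the schema are enforced by two patterns that both have to match: one fixes the ascending order of the intervals, the other fixes the hyphen-separated format and the limit of 12 intervals. The sketch below shows how the two interact. Rather than copying the long first pattern, it rebuilds an equivalent one from the interval table; that reconstruction, the helper names, and the use of Python's re module in place of a RelaxNG validator are assumptions of this example.

```python
import re

# Intervals that have no minor/major form (the "Per." column of the table).
PERFECT = {1, 4, 5, 8, 11, 12, 15}


def interval_tokens():
    """List the 52 interval tokens in the ascending order used by the schema."""
    tokens = []
    for number in range(1, 16):
        qualities = ["d", "", "a"] if number in PERFECT else ["d", "m", "", "a"]
        for quality in qualities:
            if number == 1 and quality == "":
                continue  # rule 1: the root note itself is never listed
            tokens.append(f"{quality}{number}")
    return tokens


# Equivalent of the first pattern: every token is optional, but the order
# of the tokens is fixed, so each can occur once and only in ascending order.
ASCENDING = re.compile("-?".join(f"({token})?" for token in interval_tokens()))

# Second pattern, copied from the schema: 1 to 12 hyphen-separated tokens.
ONE = r"([2-9]|1[0-5]|m([23679]|10|13|14)|(d|a)(1[0-5]|[1-9]))"
SEGMENTS = re.compile(f"({ONE}-){{0,11}}{ONE}")


def is_chord_structure(value):
    """True only if the value satisfies both patterns."""
    return bool(ASCENDING.fullmatch(value)) and bool(SEGMENTS.fullmatch(value))


for value in ["3-5", "m3-5-m7", "5-3", "3--5", "3-3", ""]:
    print(repr(value), is_chord_structure(value))
```

Neither pattern is sufficient alone: "3--5" and the empty string pass the ordering pattern, while "5-3" and "3-3" pass the format pattern.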