Tagging
When a SimpleTag is nested within another SimpleTag, the nested SimpleTag becomes an attribute of its parent SimpleTag.
For instance, if you wanted to store the date on which a singer became the lead performer,
then your SimpleTag tree would look something like this:
- Targets
  - TagTrackUID = {track UID of tagged content}
- ARTIST = “Pet Shop Boys”
  - LEAD_PERFORMER = “Neil Tennant”
    - DATE_STARTED = “1981-08”
This corresponds to this layout of EBML elements:
<Tags>
  <Tag>
    <Targets>
      <TagTrackUID>{track UID of tagged content}</TagTrackUID>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Pet Shop Boys</TagString>
      <!-- sub tag(s) about the ARTIST -->
      <SimpleTag>
        <TagName>LEAD_PERFORMER</TagName>
        <TagString>Neil Tennant</TagString>
        <!-- sub tag(s) about the LEAD_PERFORMER -->
        <SimpleTag>
          <TagName>DATE_STARTED</TagName>
          <TagString>1981-08</TagString>
        </SimpleTag>
      </SimpleTag>
    </SimpleTag>
  </Tag>
</Tags>
In this way, any SimpleTag can be stored as an attribute of another SimpleTag.
Multiple items SHOULD NOT be stored as a list in a single TagString. If there is more
than one tag value with the same name to be stored, then more than one SimpleTag SHOULD be used.
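As an illustration of the nesting described above, the following Python sketch builds the same ARTIST / LEAD_PERFORMER / DATE_STARTED tree with the standard xml.etree.ElementTree module. The simple_tag helper and the “_EXAMPLE” tag are hypothetical names introduced for this example, and the XML form only mirrors the element layout shown in this section; it is not the binary EBML encoding.

```python
import xml.etree.ElementTree as ET


def simple_tag(parent, name, value):
    """Append a SimpleTag holding a TagName and a TagString to parent (hypothetical helper)."""
    tag = ET.SubElement(parent, "SimpleTag")
    ET.SubElement(tag, "TagName").text = name
    ET.SubElement(tag, "TagString").text = value
    return tag


tags = ET.Element("Tags")
tag = ET.SubElement(tags, "Tag")
ET.SubElement(tag, "Targets")

# Each nested SimpleTag is an attribute of the SimpleTag that contains it.
artist = simple_tag(tag, "ARTIST", "Pet Shop Boys")
lead = simple_tag(artist, "LEAD_PERFORMER", "Neil Tennant")
simple_tag(lead, "DATE_STARTED", "1981-08")

# More than one value for the same name: one SimpleTag per value,
# rather than a list packed into a single TagString.
# "_EXAMPLE" and its values are placeholders, not official tags.
for value in ("first value", "second value"):
    simple_tag(tag, "_EXAMPLE", value)

xml_text = ET.tostring(tags, encoding="unicode")
```

Keeping one value per SimpleTag lets a reader count, sort, or display the values without guessing which separator the writer used.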
Why Official Tags Matter
There is a debate between people who think all tags should be free and those who think all tags should be strict. Our recommendations are in between.
Advanced-user applications might let you put any tag in your file, while most other applications offer a basic list of tags you can use. Both approaches have their uses. However, it is usually a bad idea to use custom or exotic tags, because you will probably be the only person to use this information even though everyone else could benefit from it. So hopefully, when someone wants to put information in their file, they will find an official tag that fits their need and use it. If no such tag is in the list, this person can try to get a new tag added to the Matroska Tags Names registry ((#matroska-tags-names-registry)). This registry is not meant to cover every possible piece of information in a file. Matroska files are not meant to become a whole database of the people who made costumes for a film; a website would be better for that. It is hard to define what belongs in a file and what does not; thus, each request needs to weigh whether the information makes sense to carry in a file for storage and/or sharing, or whether it doesn’t belong there.
We also need an official list so that developers can display relevant information in their own design. If they choose to support a list of meta-information, they should know which tag carries the intended meaning, so that other applications understand the same meaning.
Tag Formatting
TagName Formatting
Official TagName values MUST consist of UTF-8 capital letters, numbers and the underscore character ‘_’.
Official TagName values MUST NOT contain any space.
Official TagName values MUST NOT start with the underscore character ‘_’; see (#why-official-tags-matter).
It is RECOMMENDED to start a tag name with the underscore character ‘_’ for non-official tags that are not meant to make it to the list of official tags.
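The three rules for official TagName values can be checked mechanically. The sketch below is one possible reading, not a normative validator: it assumes that “capital letters” and “numbers” mean ASCII A-Z and 0-9, and the function names are hypothetical.

```python
import re

# Assumption: "capital letters" and "numbers" are read as ASCII A-Z and 0-9.
# The first character may not be '_', the rest may be any allowed character.
OFFICIAL_TAG_NAME = re.compile(r"[A-Z0-9][A-Z0-9_]*")


def is_official_tag_name(name: str) -> bool:
    """Return True if name satisfies the rules for official TagName values."""
    return OFFICIAL_TAG_NAME.fullmatch(name) is not None


def is_private_tag_name(name: str) -> bool:
    """Return True if name uses the recommended '_' prefix for non-official tags."""
    return name.startswith("_")
```

For example, is_official_tag_name("LEAD_PERFORMER") is true, while a name containing a space or starting with ‘_’ is rejected.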
TagString Formatting
Although tags are metadata mostly used for reading, there are cases where the string value could be used for sorting, categorization, etc. For this reason, when possible, strict formatting of the value should be used so everyone can agree on how to use the value.
Due to preexisting files where these formatting rules were not explicit, they are usually presented as rules that SHOULD be applied when possible, rather than MUST be applied at all times. It is RECOMMENDED to use strict formatting when writing new tag values.
Date Tags Formatting
TagString fields with dates SHOULD have the following format: “YYYY-MM-DD hh:mm:ss.mss”.
This is similar to the ISO8601 date and time format defined in [@RFC3339, appendix A],
without the “T” separator, without the time offset and with the addition of the milliseconds “mss”.
The dates and times represented are in Coordinated Universal Time (UTC).
Dates and times are usually not precise to a particular millisecond. To store less accurate dates, parts of the date string are removed starting from the right. For instance, to store only the year, one would use “2004”. To store a specific day such as May 1st, 2003, one would use “2003-05-01”.
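Because a date string may be truncated from the right, a reader has to accept every prefix of the full format that ends on a complete field. The following sketch shows one way to do that with a nested optional regular expression; parse_date_tag is a hypothetical name, and the sketch checks only the shape of the string, not whether the month, day, or time values are in range.

```python
import re

# Each optional group can only appear if the field to its left is present,
# which matches "parts are removed starting from the right".
DATE_TAG = re.compile(
    r"(?P<year>\d{4})"
    r"(?:-(?P<month>\d{2})"
    r"(?:-(?P<day>\d{2})"
    r"(?: (?P<hour>\d{2})"
    r"(?::(?P<minute>\d{2})"
    r"(?::(?P<second>\d{2})"
    r"(?:\.(?P<millisecond>\d{3}))?)?)?)?)?)?"
)


def parse_date_tag(value):
    """Return the fields present in value as integers, or None if malformed."""
    match = DATE_TAG.fullmatch(value)
    if match is None:
        return None
    return {key: int(text) for key, text in match.groupdict().items() if text is not None}
```

With this sketch, “2004” yields only a year, “2003-05-01” yields year, month and day, and a string using the “T” separator is rejected.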
Number Tags Formatting
TagString fields that require a floating-point number SHOULD use the “.” mark instead of the “,” mark.
Only ASCII digits “0” to “9” and the “.” character SHOULD be used.
The “.” separator represents the boundary between the integer and the decimal parts.
If the string doesn’t contain the “.” separator, the value is an integer value.
Thousands separators SHOULD NOT be used.
To display the value differently for another locale, applications SHOULD support automatic replacement on display.
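A minimal sketch of these number rules follows. The function names are hypothetical, and requiring at least one digit on each side of the “.” is a stricter choice made for this example rather than something the rules above state.

```python
import re

# Digits only, with at most one "." that has digits on both sides (assumption).
NUMBER_TAG = re.compile(r"[0-9]+(?:\.[0-9]+)?")


def parse_number_tag(value):
    """Return an int when there is no '.', a float otherwise, None if malformed."""
    if NUMBER_TAG.fullmatch(value) is None:
        return None
    return float(value) if "." in value else int(value)


def format_for_display(value, decimal_mark=","):
    """Swap the stored '.' for a locale-specific mark at display time only."""
    return value.replace(".", decimal_mark)
```

The stored string keeps the “.” mark; only the text shown to the user changes, so files stay comparable across locales.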
Country Code Tags Formatting
TagString fields that use a Country Code SHOULD use the Matroska countries form defined in [@!RFC9559, section 13],
i.e. [@!RFC5646] two-letter region subtags, without the UK exception.
Target Types
The TargetTypeValue element allows tagging of different parts that are inside or outside a
given file. For example, in an audio file with one song you could have information about
the album or the CD set it comes from, even if that album or set is not found in the file.
For applications to know which level (CD title or track title) a kind of information (e.g., “TITLE”) relates to,
we also need a set of official TargetTypeValue values and TargetType names.
That also means the same tag name can
have different meanings depending on its TargetTypeValue; otherwise, we would end up with 7 “TITLE_” tag names.
For human readability, a TargetType string can be added next to the corresponding TargetTypeValue.
Audio and video have different TargetType values.
The following tables summarize the TargetType values found in [@!RFC9559, section 5.1.8.1.1.2]
for audio and video content:
TargetTypeValue | Audio TargetType                | Comment
----------------|:--------------------------------|:-------
70              | COLLECTION                      | the high hierarchy consisting of many different lower items
60              | EDITION / ISSUE / VOLUME / OPUS | a list of lower levels grouped together
50              | ALBUM / OPERA / CONCERT         | the most common grouping level of music (e.g., an album)
40              | PART / SESSION                  | when an album has different logical parts
30              | TRACK / SONG                    | the common parts of an album
20              | SUBTRACK / PART / MOVEMENT      | corresponds to parts of a track for audio (e.g., a movement)
10              | -                               | the lowest hierarchy found in music
Table: TargetTypeValue Values Audio Semantic Description

TargetTypeValue | Video TargetType          | Comment
----------------|:--------------------------|:-------
70              | COLLECTION                | the high hierarchy consisting of many different lower items
60              | SEASON / SEQUEL / VOLUME  | a list of lower levels grouped together
50              | MOVIE / EPISODE / CONCERT | the most common grouping level of video (e.g., an episode for TV series)
40              | PART / SESSION            | when an episode has different logical parts
30              | CHAPTER                   | the common parts of a movie or episode
20              | SCENE                     | a sequence of continuous action in a film or video
10              | SHOT                      | the lowest hierarchy found in movies
Table: TargetTypeValue Values Video Semantic Description
Tags from a TargetTypeValue apply to all lower TargetTypeValues. This means that if a CD has the same
artist for all tracks, you just need to set the “ARTIST” tag at TargetTypeValue 50 (ALBUM) and not
at each TargetTypeValue 30 (TRACK), but you can also repeat the value for each track.
If some tracks of that CD have no known
“ARTIST”, the value MUST be set to nothing, a void string “” as detailed in [@!RFC9559, section 24.2],
so that the album “ARTIST” doesn’t apply.
If a tag with a given TagName is found at a TargetTypeValue,
only the values given for that TagName at that level are valid at that TargetTypeValue level.
In other words, the TagName values from upper TargetTypeValue levels don’t apply at that level.
Multiple SimpleTag elements with the same TagName can be used at a given TargetTypeValue level when each SimpleTag contains a TagString.
For example, this can be useful to find a single “ARTIST” even when they are found in a collaboration.
The concatenation of each TagString represents the value for the TagName at this level.
The presentation, for instance with a separator, is up to the application.
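The inheritance rules above can be sketched as a lookup that starts at the requested level and walks upward until it finds the tag name. The data model (a dictionary from TargetTypeValue to tag names and their TagString values), the function names, and the placeholder values are all assumptions made for this illustration.

```python
def resolve_tag(levels, name, start_level):
    """Return the TagString values of name that apply at start_level.

    levels maps a TargetTypeValue to {TagName: [TagString, ...]} (hypothetical model).
    """
    for level in sorted(levels):
        if level < start_level:
            continue  # lower levels never apply upward
        values = levels[level].get(name)
        if values is not None:
            # The nearest level defining the name wins; upper levels are ignored.
            # A void string means "no value", which blocks the upper-level value.
            return [value for value in values if value != ""]
    return []


def display_value(values, separator=", "):
    """Join several values for presentation; the separator is the application's choice."""
    return separator.join(values)


levels = {
    50: {"ARTIST": ["Album Artist"], "TITLE": ["Album Title"]},
    30: {"TITLE": ["Track Title"]},
}
track_artist = resolve_tag(levels, "ARTIST", 30)  # inherited from level 50
track_title = resolve_tag(levels, "TITLE", 30)    # defined at level 30
```

A track that sets “ARTIST” to the void string resolves to no value at all, so the album artist is not shown for it.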
Target Types Parts
There are three organizational tags defined in (#organization-information):
- TOTAL_PARTS
- PART_NUMBER
- PART_OFFSET
These tags allow specifying the ordering of some tags within another group of tags.
For example, suppose you have an album with 10 tracks and you want to tag the second track from it.
You set “TOTAL_PARTS” to “10” at TargetTypeValue 50 (ALBUM). It means the “ALBUM” contains 10 lower parts.
The lower part in question is the first lower TargetTypeValue that is specified in the file.
So, if it’s TargetTypeValue = 30 (TRACK), then that means the album contains 10 tracks.
If TargetTypeValue is 20 (MOVEMENT), that means the album contains 10 movements, etc.
And since it’s the second track within the album, the “PART_NUMBER” at TargetTypeValue 30 (TRACK) is set to “2”.
If the parts are split into multiple logical entities, you can also use “PART_OFFSET”.
For example, if you are tagging the third track of the second CD of a double CD album with a total of 10 tracks,
the “TOTAL_PARTS” at TargetTypeValue 50 (ALBUM) is “10”,
the “PART_NUMBER” at TargetTypeValue 30 (TRACK) is “3”,
and the “PART_OFFSET” at TargetTypeValue 30 (TRACK) is “5”, which is the number of tracks on the first CD.
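One way to read the double CD example is that the offset and the part number add up to the position of the part within the whole group. The sketch below works through that arithmetic; the function name is hypothetical, and the formula is an interpretation of the example above rather than a rule stated in this section.

```python
def position_in_group(total_parts, part_number, part_offset=0):
    """Return (position, total) of a part within its parent group.

    Assumption of this sketch: PART_OFFSET counts the parts that come before
    the current logical entity, so position = PART_OFFSET + PART_NUMBER.
    """
    position = part_offset + part_number
    if not 1 <= position <= total_parts:
        raise ValueError("part position falls outside TOTAL_PARTS")
    return position, total_parts


# Third track of the second CD; the first CD holds 5 of the 10 tracks.
double_cd = position_in_group(total_parts=10, part_number=3, part_offset=5)
# Second track of a 10-track album, no offset needed.
single_cd = position_in_group(total_parts=10, part_number=2)
```

Under this reading, the third track of the second CD is the eighth of the ten tracks of the album.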
When a TargetTypeValue level doesn’t exist, it MUST NOT be specified in the files, so that the “TOTAL_PARTS”
and “PART_NUMBER” elements match the same levels.
Here is an example of an audio record with 2 tracks in a single file, corresponding to [@?DaFunk].
There is one Tag element for the record, and one Tag element per track on the record.
Each track is identified by a chapter.
The Tag for the record:
- Targets
  - TargetTypeValue = 50
- ARTIST = “Daft Punk”
- TITLE = “Da Funk”
- TOTAL_PARTS = “2”
The Tag for the first track:
- Targets
  - TargetTypeValue = 30
  - TagChapterUID = 12345
- TITLE = “Da Funk”
- PART_NUMBER = “1”
The Tag for the second track:
- Targets
  - TargetTypeValue = 30
  - TagChapterUID = 67890
- TITLE = “Rollin’ & Scratchin’”
- PART_NUMBER = “2”
This corresponds to this layout of EBML elements:
<Tags>
  <!-- description of the whole file/record -->
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Daft Punk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Da Funk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>TOTAL_PARTS</TagName>
      <TagString>2</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the first track/chapter -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <TagChapterUID>12345</TagChapterUID>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Da Funk</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>1</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the second track/chapter -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <TagChapterUID>67890</TagChapterUID>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Rollin' &amp; Scratchin'</TagString>
    </SimpleTag>
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>2</TagString>
    </SimpleTag>
  </Tag>
</Tags>
Here is an example using the “PART_OFFSET” tag. It corresponds to a file that contains the third track on the second CD of the 2-CD album “The Orb’s Adventures Beyond The Ultraworld” [@?OrbUltraworld]:
The Tag for the album:
- Targets
  - TargetTypeValue = 50
- ARTIST = “Orb”
  - SORT_WITH = “Orb, The”
- TITLE = “The Orb’s Adventures Beyond The Ultraworld”
- TOTAL_PARTS = “10”
The Tag for the third track of the second CD:
- Targets
  - TargetTypeValue = 30
- TITLE = “Outlands”
- PART_NUMBER = “3”
- PART_OFFSET = “5”
This corresponds to this layout of EBML elements:
<Tags>
  <!-- description of the whole album -->
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <SimpleTag>
      <TagName>ARTIST</TagName>
      <TagString>Orb</TagString>
      <SimpleTag>
        <TagName>SORT_WITH</TagName>
        <TagString>Orb, The</TagString>
      </SimpleTag>
    </SimpleTag>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>The Orb's Adventures Beyond The Ultraworld</TagString>
    </SimpleTag>
    <!-- the number of sub elements in this album (10 tracks) -->
    <SimpleTag>
      <TagName>TOTAL_PARTS</TagName>
      <TagString>10</TagString>
    </SimpleTag>
  </Tag>
  <!-- description of the third track of the second CD -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
    </Targets>
    <SimpleTag>
      <TagName>TITLE</TagName>
      <TagString>Outlands</TagString>
    </SimpleTag>
    <!-- This is the third track of the second CD -->
    <SimpleTag>
      <TagName>PART_NUMBER</TagName>
      <TagString>3</TagString>
    </SimpleTag>
    <!-- The first CD contains 5 tracks -->
    <SimpleTag>
      <TagName>PART_OFFSET</TagName>
      <TagString>5</TagString>
    </SimpleTag>
  </Tag>
</Tags>
Multiple Targets UID
A Tag element has a single Targets element with a single TargetTypeValue element.
But it can contain various TagTrackUID, TagEditionUID, TagChapterUID and TagAttachmentUID elements.
When multiple values are found using the same Tag UID element (e.g., TagTrackUID),
a logical OR is applied on these elements.
In other words, the tags apply to each entity defined by a UID.
This is the list of UIDs the tags apply to (e.g., list of TagTrackUID).
Such a list may contain a single UID element.
When different lists of Tag UID elements are found (e.g., a list of TagTrackUID and a list of TagChapterUID),
a logical AND is applied between those lists.
In other words, the tags apply only to the entities matching a UID in each list of Tag UID elements.
These operations allow factorizing tags that would otherwise need to be repeated multiple times.
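The OR and AND rules can be sketched as a small matching function. The dictionary shapes, the function name, and the UID values below are made up for this illustration; how an application decides which track, chapter, edition, or attachment a piece of content belongs to is outside the scope of the sketch.

```python
def tag_applies(targets, entity):
    """Return True if a Tag with these Targets applies to the entity.

    targets maps a Tag UID element name to the set of UIDs listed for it.
    entity maps the same names to the single UID the entity has for that kind.
    Both shapes are hypothetical, chosen for this sketch.
    """
    for kind, uids in targets.items():
        # Logical OR inside one list: any listed UID of this kind matches.
        # Logical AND between lists: every list present has to match.
        if entity.get(kind) not in uids:
            return False
    return True


# Made-up UIDs: the tag targets chapter 1 or chapter 2, but only on track 10.
targets = {"TagChapterUID": {1, 2}, "TagTrackUID": {10}}
```

With these targets, chapter 1 on track 10 and chapter 2 on track 10 both match, while chapter 1 on another track does not.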
Here is an example of a Tag applying to 2 chapters, using the same [@?DaFunk] example as in (#target-types-parts):
- Targets
  - TargetTypeValue = 30
  - TagChapterUID = 12345
  - TagChapterUID = 67890
- WRITTEN_BY = “Thomas Bangalter”
- WRITTEN_BY = “Guy-Manuel de Homem-Christo”
- PRODUCER = “Thomas Bangalter”
- PRODUCER = “Guy-Manuel de Homem-Christo”
This corresponds to this layout of EBML elements:
<Tags>
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <!-- chapter with Da Funk -->
      <TagChapterUID>12345</TagChapterUID>
      <!-- chapter with Rollin' & Scratchin' -->
      <TagChapterUID>67890</TagChapterUID>
    </Targets>
    <!-- first writer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>WRITTEN_BY</TagName>
      <TagString>Thomas Bangalter</TagString>
    </SimpleTag>
    <!-- second writer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>WRITTEN_BY</TagName>
      <TagString>Guy-Manuel de Homem-Christo</TagString>
    </SimpleTag>
    <!-- first producer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>PRODUCER</TagName>
      <TagString>Thomas Bangalter</TagString>
    </SimpleTag>
    <!-- second producer of Da Funk and Rollin' & Scratchin' -->
    <SimpleTag>
      <TagName>PRODUCER</TagName>
      <TagString>Guy-Manuel de Homem-Christo</TagString>
    </SimpleTag>
  </Tag>
</Tags>
Some combinations of different Tag UID elements are not possible.
A TagChapterUID and a TagAttachmentUID can’t be mixed because there is no overlap
between a Chapter and an Attachment that would make sense.
An attachment applies to the whole segment and can be tied to tracks,
via \Segment\Tracks\TrackEntry\AttachmentLink
as defined in [@!RFC9559, section 5.1.4.1.24], but not to chapters.
Mixing TagEditionUID and TagChapterUID elements is also of no use because each Chapter UID
would need to be in one of the Chapter Edition UIDs.
That would be the same as not using the list of TagEditionUID at all.
The following table shows the allowed combinations between lists of Tag UID elements:
UID elements | Track                        | Edition | Chapter | Attachment
:------------|:-----------------------------|:--------|:--------|:----------
Track        | YES                          | YES     | YES     | with matching AttachmentLink
Edition      | YES                          | YES     | NO      | YES
Chapter      | YES                          | NO      | YES     | NO
Attachment   | with matching AttachmentLink | YES     | NO      | YES
Table: Tag UID elements allowed combinations{#taguid-combinations}
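A writer or validator could encode the “NO” cells of the table as a set of forbidden pairs. The sketch below does only that; the names are hypothetical, and the “with matching AttachmentLink” condition for tracks and attachments is not checked because it depends on the Tracks element of the file.

```python
# Pairs of Tag UID element kinds that the table marks as "NO".
FORBIDDEN_PAIRS = {
    frozenset({"TagEditionUID", "TagChapterUID"}),
    frozenset({"TagChapterUID", "TagAttachmentUID"}),
}


def invalid_combinations(kinds):
    """Return the forbidden pairs present among the given Tag UID element kinds."""
    present = set(kinds)
    return sorted(tuple(sorted(pair)) for pair in FORBIDDEN_PAIRS if pair <= present)
```

An empty result means the Targets element mixes only kinds that the table allows, subject to the AttachmentLink condition noted above.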
Here is an example of a Tag applying to a single track and a single chapter.
It represents the composer of the music in a part of a movie.
The file may contain a second audio track with audio commentary not including that music,
so we only tag the track with the music.
- Targets
  - TargetTypeValue = 30
  - TagTrackUID = 123
  - TagChapterUID = 987654321
- COMPOSER = “Hans Zimmer”
This corresponds to this layout of EBML elements:
<Tags>
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
      <!-- track with the music -->
      <TagTrackUID>123</TagTrackUID>
      <!-- chapter with the music -->
      <TagChapterUID>987654321</TagChapterUID>
    </Targets>
    <!-- composer of the music in that chapter when using that audio track -->
    <SimpleTag>
      <TagName>COMPOSER</TagName>
      <TagString>Hans Zimmer</TagString>
    </SimpleTag>
  </Tag>
</Tags>
Official Tags
The following is a complete list of the supported Matroska Tags. While it is possible to use Tag names that are not listed below, this is NOT RECOMMENDED as compatibility will be compromised. If you find that there is a Tag missing that you would like to use, then please contact the persons mentioned in the IANA Matroska Tags Registry for its inclusion; see (#matroska-tags-names-registry).