An m-TAGS container is a sequence of tag sets. Each tag set specifies the metadata for a media resource (or part thereof) as a set of tags. Each tag is a pair of one name and one or more values. Each value is a string of characters. A special "locator" tag (@) identifies the encapsulated media resource.

The metadata present in the media resource, if any, shall be ignored when an m-TAGS value is loaded by an application. This rule does not extend to the technical information pertaining to the media resource, which shall be passed on to the application transparently.

An m-TAGS enabled application should handle an m-TAGS value exactly as it handles any other regularly supported media resource. The separation between metadata and media should be transparent to the application, and all metadata manipulation should be confined to the m-TAGS container. This implies that an m-TAGS enabled application should always handle media resources as "read-only".

Formally, the syntax for the m-TAGS container is as follows:

m-TAGS = 
	[ tag-sets ]
	
tag-sets =
	tag-set
	tag-set , tag-sets 

tag-set = 
	{}
	{ locator }
	{ tags }
	{ tags locator }
	{ locator tags }
	{ tags locator tags }
	
tags = 
	tag
	tag , tags

locator = 
	"@" : "location"

location = 
	URI
	local-reference

tag = 
	name : array
	name : value

name = 
	string

array = 
	[]
	[ values ]
	
values =
	value
	value , values

value = 
	string
	
string =
	""
	"string-chars"

local-reference = 
	absolute-path
	relative-path
	
absolute-path = 
	/ segments

relative-path =
	first-segment
	first-segment / segments
	
first-segment =
	base-char
	base-char base-chars
	
segments =
	segment
	segment / segments

segment = 
	segment-char
	segment-char segment-chars
	
string-chars =
	string-char
	string-char string-chars
	
string-char =
	/
	segment-char
	
segment-char =
	:
	base-char
	
base-char = 
	any-Unicode-character-except-"-or-/-or-:-or-\-or-control-character
	\"
	\\
	\/
	\b
	\f
	\n
	\r
	\t
	\u four-hex-digits
	
four-hex-digits =
	hex-digit hex-digit hex-digit hex-digit
	
hex-digit =
	0
	1
	2
	3
	4
	5
	6
	7
	8
	9
	A
	B
	C
	D
	E
	F
	a
	b
	c
	d
	e
	f
	
       

The definition of m-TAGS is fully compatible with that of a JSON array. This means that an m-TAGS file contains a valid JSON value (of course, not all JSON values are valid m-TAGS values). Although an m-TAGS value can be obtained in a variety of ways, if the value is contained in a file, then the file should have a ".tags" or ".TAGS" extension.
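Because every m-TAGS value is a JSON value, a standard JSON parser can do the lexical work, leaving only a check of the shape to the application. The following is a minimal sketch in Python under that assumption; the function name load_mtags is hypothetical and the check covers only the structure (an array of objects whose members are strings or arrays of strings), not the path syntax of the locator.

```python
import json

def load_mtags(text):
    """Parse m-TAGS text and check that it has the expected overall shape."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("an m-TAGS value must be a JSON array")
    for tag_set in data:
        if not isinstance(tag_set, dict):
            raise ValueError("each tag set must be a JSON object")
        for name, value in tag_set.items():
            # A tag holds either a single string or an array of strings.
            values = value if isinstance(value, list) else [value]
            if not all(isinstance(v, str) for v in values):
                raise ValueError(f"tag {name!r} must hold strings only")
            # The locator is always a single string, never an array.
            if name == "@" and not isinstance(value, str):
                raise ValueError("the locator must be a single string")
    return data
```

A JSON value such as {"artist": "x"} or [1, 2] parses as JSON but is rejected by this check, which illustrates that not all JSON values are valid m-TAGS values.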

URIs and authorities

In the above syntax specification, URI is defined as in RFC 3986 (Uniform Resource Identifier (URI): Generic Syntax). Which URI schemes are supported, and how individual URIs are interpreted, depends on the specific implementation of the m-TAGS specification.

Local references

Local references identify a media resource on the local file system. This does not mean that they can only identify a resource that corresponds exactly to a local file. Local references may also identify a resource contained within an archive file (e.g. a ".zip" or ".rar" file) or a resource identified within the contents of a file (e.g. a track in a ".cue" file, or a specific chapter in a movie), etc. They may also identify resources on a remote file system that has been "mounted" on the local file system.

In general, the interpretation of local references depends on the file system and the implementation. Therefore, they may not be portable across file systems or even across different applications using the same file system. For example, only certain audio or video players may be able to support local references to media resources contained within archives.

The definition of local reference above (local-reference) is similar to that of relative-ref in RFC 3986. The difference is that a local-reference has no query or fragment part, because it is confined to the local file system. On the other hand, a local-reference may contain characters not allowed in a relative-ref, as it is neither a URI reference nor part of a URI, but rather the representation of a containment path in the local file system.

The validity of a local reference is restricted by the characteristics of the local file system. However, as a syntactic rule that is independent of the local file system, a local reference is absolute if it starts with a forward slash ("/") and relative otherwise. A relative reference is relative to the location of the m-TAGS file containing it. The "." and ".." segments should have the meaning common to most file systems, which is also their meaning in a URI relative reference. However, support for these particular segments is not mandated, nor is the use of other mechanisms for identifying the local container and/or the parent container excluded. Note that a relative local reference cannot start with a segment containing the ":" character, because in that case the path would be interpreted as a URI and not as a local reference. When the first segment would otherwise contain a ":" character, the "current container" identifier ("." on most file systems) can be prepended as the first segment.
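The syntactic rules above can be sketched as follows. This is an illustration only: the function names are hypothetical, the location is assumed to be the string obtained after JSON decoding, and the sketch assumes a file system where "." and ".." have their usual meaning (support for which is not mandated).

```python
import posixpath

def classify_location(location):
    """Return 'absolute', 'uri' or 'relative' using only the syntactic rules."""
    if location.startswith("/"):
        return "absolute"
    # A ":" in the first segment means the location is read as a URI.
    first_segment = location.split("/", 1)[0]
    if ":" in first_segment:
        return "uri"
    return "relative"

def resolve_location(location, mtags_path):
    """Resolve a relative local reference against the m-TAGS file location."""
    if classify_location(location) != "relative":
        return location
    base = posixpath.dirname(mtags_path)
    # Assumption: "." and ".." segments are collapsed the usual way.
    return posixpath.normpath(posixpath.join(base, location))
```

For example, a reference written as "./c:d/a.mp3" is classified as relative because its first segment is ".", whereas "c:d/a.mp3" would be classified as a URI.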

A local reference in a tag set may identify, as its media resource, a tag set in another m-TAGS file. In this case, the media resource identified by the locator of the second file's tag set shall be the media resource encapsulated by both files. The tags in the second tag set, however, shall not implicitly become part of the first tag set, and shall be ignored when the first tag set is loaded by an application.

a.tag:

[
  {
    "@" : "b.tag",
    "artist" : "The Beatles",
    ...
  }
]


b.tag:

[
  {
    "@" : "/c:/music/The Beatles/Abbey Road/01. Come together.mp3", <- media resource encapsulated
                                                                       by both "a.tag" and "b.tag"
                                                                       
    "style" : ["british pop", "classic rock"],                      <- not included if "a.tag" is loaded
    ...
  }
]
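The behaviour shown in this example can be sketched as a loop that follows locators until one points outside the known m-TAGS files. This is a minimal illustration under several assumptions that the text does not state: "files" is a hypothetical mapping from file name to already parsed m-TAGS content, a reference to another m-TAGS file is taken to mean its first tag set, every tag set has a locator, and a cycle of references is treated as an error.

```python
def resolve_media(files, name):
    """Return (tags, media_location) for the first tag set of file `name`."""
    # Only the tags of the file being loaded are kept.
    tags = {k: v for k, v in files[name][0].items() if k != "@"}
    seen = set()
    current = name
    # Follow locators while they point at another known m-TAGS file.
    while current in files:
        if current in seen:
            raise ValueError("locator cycle detected at " + current)
        seen.add(current)
        current = files[current][0]["@"]
    return tags, current

files = {
    "a.tag": [{"@": "b.tag", "artist": "The Beatles"}],
    "b.tag": [{"@": "/c:/music/The Beatles/Abbey Road/01. Come together.mp3",
               "style": ["british pop", "classic rock"]}],
}
tags, media = resolve_media(files, "a.tag")
```

Here "tags" holds only the "artist" tag from "a.tag", and "media" is the path from the locator in "b.tag"; the "style" tag is not carried over.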

Tag names and values

Tag names may contain any Unicode character, except for control characters. The use of tag names is not restricted to a specific set, nor is support mandated for any set of tag names. This is because tags can be defined for any media type, and different media types may have different subsets of significant tags, or different meanings and value restrictions for the same tag. It is advisable, however, that sub-specifications be created for at least the main classes of media (audio, video, photo, etc.), so that a minimal set of tag names to be supported is identified.

Like tag names, tag values may contain any Unicode character, except for control characters. A tag whose value is the empty value ("[ ]") is explicitly excluded from a tag set. This specification does not restrict the syntax of tag values, nor does it specify their cardinality (i.e. which tags can be multi-valued and which ones can only be single-valued). Again, sub-specifications should be created if values should be restricted to a certain syntax and/or a certain cardinality.

Tag names (unlike JSON object member names) are to be treated as case-insensitive, so, for example, "genre", "Genre" and "GENRE" are the same tag name. If two or more tags in the same tag set have the same name, then all but one shall be ignored. Because the tag set is unordered, it is not possible to define a rule for picking the one valid tag.
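One way an application might apply the case-insensitivity rule is to fold every tag name to a common form when a tag set is loaded. The sketch below is an assumption, not a mandated procedure: the function name is hypothetical, and keeping the first tag encountered among duplicates is an arbitrary choice, since no rule for picking the surviving tag is defined.

```python
def normalize_names(tag_set):
    """Fold tag names to one case; duplicates collapse to a single tag."""
    normalized = {}
    for name, value in tag_set.items():
        key = name.casefold()
        # Arbitrary choice: the first tag seen under a given name is kept.
        normalized.setdefault(key, value)
    return normalized
```

An application that needs to preserve the original spelling for display could store it alongside the folded key instead of discarding it.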

When an m-TAGS file contains more than one tag set, a shorthand notation can be used, eliminating the need to repeat certain tags across several tag sets. If a tag is common to two or more consecutive tag sets, then that tag only needs to appear in the first tag set of the sequence. The equivalent rule for scanning an m-TAGS container with two or more tag sets is that if a tag set does not contain a tag, then the value of that tag is the same as the value of the same tag in the closest preceding tag set, if any. The empty value ("[ ]") shall be used if a tag is to be excluded from a given tag set even though it is present in one of the previous tag sets. Empty valued tags also cascade to the subsequent tag sets, so that if a sequence of tag sets must not contain a certain tag, then only the first tag set of the sequence needs to explicitly contain the empty valued tag.
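The scanning rule can be sketched as a single pass that carries the most recent value of every tag forward. This is a minimal illustration with a hypothetical function name; it assumes tag names have already been brought to a common case, and it treats the locator like any other tag, which the text does not confirm.

```python
def expand_tag_sets(tag_sets):
    """Expand the shorthand notation into fully specified tag sets."""
    expanded = []
    inherited = {}
    for tag_set in tag_sets:
        # Tags in this set override inherited ones; empty values are
        # remembered too, so that the exclusion cascades to later sets.
        inherited = {**inherited, **tag_set}
        # A tag whose current value is the empty value is left out.
        expanded.append({n: v for n, v in inherited.items() if v != []})
    return expanded

tag_sets = [
    {"@": "1.mp3", "album": "Abbey Road", "genre": "rock"},
    {"@": "2.mp3", "genre": []},
    {"@": "3.mp3"},
    {"@": "4.mp3", "genre": "pop"},
]
expanded = expand_tag_sets(tag_sets)
```

In this example the "album" tag reaches all four tag sets, while "genre" is absent from the second and third tag sets and reappears with a new value in the fourth.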


(c) Luigi Mercurio