The two biggest design goals were to be able to implement ID3v2 without disturbing old software too much and that ID3v2 should be expandable.
The first criterion is met by the simple fact that the MPEG decoding software uses a syncsignal, embedded in the audiostream, to 'lock on to' the audio. Since the ID3v2 tag doesn't contain a valid syncsignal, no software will attempt to play the tag. If, for any reason, coincidence makes a syncsignal appear within the tag, it will be taken care of by the 'unsynchronisation scheme' described in section 5.
The second criterion has made a more noticeable impact on the design of the ID3v2 tag. It is constructed as a container for several information blocks, called frames, whose format need not be known to the software that encounters them. At the start of every frame there is an identifier that explains the frame's format and content, and a size descriptor that allows software to skip unknown frames.
If a total revision of the ID3v2 tag should be needed, there is a version number and a size descriptor in the ID3v2 header.
The ID3 tag described in this document is mainly targeted to files encoded with MPEG-2 layer I, MPEG-2 layer II, MPEG-2 layer III and MPEG-2.5, but may work with other types of encoded audio.
The bitorder in ID3v2 is most significant bit first (MSB). The byteorder in multibyte numbers is most significant byte first (e.g. $12345678 would be encoded $12 34 56 78).
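The byte order rule above can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for this illustration; the example value is the one used in the text.

```python
def encode_u32(value: int) -> bytes:
    """Encode a 32-bit number most significant byte first."""
    return value.to_bytes(4, byteorder="big")


def decode_u32(data: bytes) -> int:
    """Decode a 4-byte, most-significant-byte-first number."""
    return int.from_bytes(data, byteorder="big")


# $12345678 is stored as the four bytes $12 34 56 78.
encoded = encode_u32(0x12345678)
print(encoded.hex(" "))
```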
It is permitted to include padding after the final frame (at the end of the ID3 tag), making the size of all the frames together smaller than the size given in the head of the tag. A possible purpose of this padding is to allow for adding a few additional frames or enlarging existing frames within the tag without having to rewrite the entire file. The value of the padding bytes must be $00.
Padding is good as it increases the write speed when there is already a tag present in a file. If the new tag is one byte longer than the previous tag, then the extra byte can be taken from the padding, instead of having to shift the entire file one byte. Padding is of course bad in that it increases the size of the file, but if the amount of padding is wisely chosen (with clustersize in mind), the impact on filesystems will be virtually none. As the contents are $00, it is also easy for modems and other transmission devices/protocols to compress the padding. Having a $00 filled padding also increases the ability to recover erroneous tags.
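As a rough illustration of how padding lets a tag be rewritten in place, the following Python sketch checks whether new frame data fits inside the existing tag and fills the remainder with $00. The helper names `fits_in_place` and `pad_frames` are assumptions made for this example, not part of the standard.

```python
def fits_in_place(new_frames_len: int, old_tag_size: int) -> bool:
    """True if the new frames fit inside the space the old tag occupied.

    old_tag_size is the size stored in the header (frames plus padding).
    """
    return new_frames_len <= old_tag_size


def pad_frames(frames: bytes, tag_size: int) -> bytes:
    """Fill the space after the final frame with $00 up to tag_size bytes."""
    if len(frames) > tag_size:
        raise ValueError("frames do not fit; the file would have to be rewritten")
    return frames + b"\x00" * (tag_size - len(frames))
```

If `fits_in_place` returns false, the audio data has to be shifted, which is the slow case that padding is meant to avoid.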
The ID3v2 tag header, which should be the first information in the file, is 10 bytes as follows:
ID3/file identifier | "ID3"
ID3 version | $02 00
ID3 flags | %xx000000
ID3 size | 4 * %0xxxxxxx
The first three bytes of the tag are always "ID3" to indicate that this is an ID3 tag, directly followed by the two version bytes. The first byte of ID3 version is its major version, while the second byte is its revision number. All revisions are backwards compatible while major versions are not. If software with ID3v2 and below support should encounter version three or higher it should simply ignore the whole tag. Version and revision will never be $FF.
In the first draft of ID3v2 the identifier was "TAG", just as in ID3v1. It was later changed to "MP3" as I thought of the ID3v2 as the fileheader MP3 had always been missing. When it became apparent that ID3v2 was going towards a general purpose audio header the identifier was changed to "ID3".
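A minimal Python sketch of the version rule: read the two version bytes and decide whether the tag should be parsed or ignored. The constant `SUPPORTED_MAJOR` and the function names are hypothetical; the sketch assumes software that supports major version 2 and below.

```python
from typing import Tuple

SUPPORTED_MAJOR = 2  # assumption: this software understands ID3v2 major version 2 and below


def read_version(header: bytes) -> Tuple[int, int]:
    """Return (major, revision) from the start of an ID3v2 header."""
    if len(header) < 5 or header[:3] != b"ID3":
        raise ValueError("not an ID3v2 header")
    return header[3], header[4]


def can_parse(header: bytes) -> bool:
    """False if the whole tag should be ignored because of its version."""
    major, revision = read_version(header)
    if major == 0xFF or revision == 0xFF:
        return False  # version and revision will never be $FF
    # Revisions are backwards compatible; major versions are not.
    return major <= SUPPORTED_MAJOR
```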
The first bit (bit 7) in the 'ID3 flags' indicates whether or not unsynchronisation is used; a set bit indicates usage.
The second bit (bit 6) indicates whether or not compression is used; a set bit indicates usage. Since no compression scheme has been decided yet, the ID3 decoder (for now) should just ignore the entire tag if the compression bit is set.
Currently, zlib compression is being considered for the compression, in an effort to stay out of the all-too-common marsh of patent trouble. Have a look at the additions draft for the latest developments.
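The two flag bits described above could be read as in the following Python sketch. The names are illustrative assumptions; only bit 7 and bit 6 are interpreted, as no other flag bits are defined here.

```python
UNSYNCHRONISATION_BIT = 0x80  # bit 7 of the flags byte
COMPRESSION_BIT = 0x40        # bit 6 of the flags byte


def uses_unsynchronisation(flags: int) -> bool:
    """True if the unsynchronisation bit (bit 7) is set."""
    return bool(flags & UNSYNCHRONISATION_BIT)


def uses_compression(flags: int) -> bool:
    """True if the compression bit (bit 6) is set."""
    return bool(flags & COMPRESSION_BIT)


def should_ignore_tag(flags: int) -> bool:
    """No compression scheme is decided yet, so a compressed tag is ignored."""
    return uses_compression(flags)
```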
The ID3 tag size is encoded with four bytes where the first bit (bit 7) is set to zero in every byte, making a total of 28 bits. The zeroed bits are ignored, so a 257-byte tag is represented as $00 00 02 01.
We really gave it a second thought several times before we introduced these awkward size descriptions. The reason is that we thought it would be even worse to have a file header with no set size (as we wanted to unsynchronise the header if there were any false synchronisations in it). An easy way of calculating the tag size is A*2^21+B*2^14+C*2^7+D = A*2097152+B*16384+C*128+D, where A is the first byte, B the second, C the third and D the fourth byte.
The ID3 tag size is the size of the complete tag after unsynchronisation, including padding, excluding the header (total tag size - 10). The reason to use 28 bits (representing up to 256MB) for size description is that we don't want to run out of space here.
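The size calculation A*2^21+B*2^14+C*2^7+D can be sketched in Python as follows, together with the reverse operation. The function names `decode_size` and `encode_size` are assumptions for this illustration.

```python
def decode_size(size_bytes: bytes) -> int:
    """Decode the four 7-bit size bytes (A, B, C, D) into a tag size."""
    if len(size_bytes) != 4:
        raise ValueError("the size field is exactly four bytes")
    if any(b & 0x80 for b in size_bytes):
        raise ValueError("bit 7 must be zero in every size byte")
    a, b, c, d = size_bytes
    return (a << 21) | (b << 14) | (c << 7) | d


def encode_size(size: int) -> bytes:
    """Encode a tag size (header excluded) as four bytes with bit 7 cleared."""
    if not 0 <= size < (1 << 28):
        raise ValueError("size does not fit in 28 bits")
    return bytes((size >> shift) & 0x7F for shift in (21, 14, 7, 0))


# The 257-byte example from the text: $00 00 02 01.
print(decode_size(b"\x00\x00\x02\x01"))
```

Because the stored size excludes the 10-byte header, the total length of the tag in the file would be `decode_size(...) + 10`.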
An ID3v2 tag can be detected with the following pattern:
$49 44 33 yy yy xx zz zz zz zz
Where yy is less than $FF, xx is the 'flags' byte and zz is less than $80.
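The detection pattern above translates fairly directly into a check on the first ten bytes of a file. The following Python sketch uses a hypothetical function name, `looks_like_id3v2`, and tests only what the pattern states.

```python
def looks_like_id3v2(header: bytes) -> bool:
    """Check the first 10 bytes against $49 44 33 yy yy xx zz zz zz zz."""
    if len(header) < 10:
        return False
    if header[0:3] != b"ID3":  # $49 44 33
        return False
    if header[3] == 0xFF or header[4] == 0xFF:  # yy is less than $FF
        return False
    # header[5] is the flags byte (xx); the pattern accepts any value here.
    return all(z < 0x80 for z in header[6:10])  # zz is less than $80
```

A caller would typically read the first ten bytes of the file and pass them in unchanged.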