Decoding the TRAKTOR4 field
Native Instruments' Traktor line of software is somewhat infamous for its library management.
It only reads a handful of standard ID3 fields and it encodes its own metadata inside a non-standard private field. As one could expect, the format is binary and proprietary.
Luckily, some clever folks already figured it out back in 2010 and it seems that the format hasn't changed since then. They also provided a parser implemented in Perl, but it wasn't archived.
But since we know the format, we can now write our own parser using some Modern Tools™.
Format
Data is structured as an n-ary tree of frames. Each frame consists of a header followed by its data content. In the tables below, the columns are offset, length (both in bytes) and description.
0x00 0x04 Frame identifier (reversed for some reason)
0x04 0x04 Frame's length in bytes (=LEN)
0x08 0x04 Number of child frames
0x0c LEN Data content of the frame
If the number of child frames is zero, the data content is the actual payload. Its type can be string, long, float or date, plus some special cases such as the cue points (see below). Some fields can also contain image data of an arbitrary format, and some fields are still unidentified.
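To make the frame layout a bit more concrete, here is a minimal sketch of a recursive frame reader in Python. It is not the parser used on this page. It assumes that all header fields are little-endian (which would also explain the "reversed" identifier), that LEN covers the whole data content, and that child frames are packed back to back inside the parent's data content. The names Frame, parse_frame and make_frame are made up for this illustration.

```python
import struct
from dataclasses import dataclass, field

# Header: 4-byte identifier, 4-byte length, 4-byte child count.
# Assumption: all fields are little-endian.
HEADER = struct.Struct("<4sII")


@dataclass
class Frame:
    ident: str
    payload: bytes = b""                      # raw data of a leaf frame
    children: list = field(default_factory=list)


def parse_frame(buf: bytes, offset: int = 0):
    """Parse one frame at `offset`; return (frame, offset just past it)."""
    raw_id, length, n_children = HEADER.unpack_from(buf, offset)
    ident = raw_id[::-1].decode("ascii")      # identifier is stored reversed
    start = offset + HEADER.size
    end = start + length
    frame = Frame(ident)
    if n_children == 0:
        frame.payload = buf[start:end]        # leaf: data content is the payload
    else:
        pos = start
        for _ in range(n_children):           # assumption: children are contiguous
            child, pos = parse_frame(buf, pos)
            frame.children.append(child)
    return frame, end


def make_frame(ident: str, data: bytes = b"", n_children: int = 0) -> bytes:
    """Hypothetical helper that builds a frame, only used for the demo below."""
    return HEADER.pack(ident[::-1].encode("ascii"), len(data), n_children) + data


# Demo: a DATA frame with a single BITR child holding a 32-bit integer.
leaf = make_frame("BITR", struct.pack("<i", 249000))
root, _ = parse_frame(make_frame("DATA", leaf, 1))
print(root.ident, [child.ident for child in root.children])
```

The payload of a leaf frame is left as raw bytes here, because the type of each field is not stored in the header and has to be looked up by frame identifier.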
String
0x00 0x04 Length of the string (=LN)
0x04 LN×2 String data. Each character is followed by \0
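Strings are prefixed with a 4-byte number indicating the length in characters. Each character is represented by two bytes, so it is not UTF-8. The layout, with each ASCII character followed by \0, is consistent with UTF-16 little-endian, but I haven't verified this for non-ASCII text. (More investigation needed.)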
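A string payload could be decoded along these lines. This is only a sketch: it assumes that the length prefix is a little-endian character count and that the two-byte characters are UTF-16LE, which matches the "character followed by \0" pattern for ASCII text but is unconfirmed for anything else. decode_string is a name made up for this example.

```python
import struct


def decode_string(payload: bytes) -> str:
    """Decode a length-prefixed string with two bytes per character."""
    (n_chars,) = struct.unpack_from("<I", payload, 0)   # assumption: little-endian
    raw = payload[4:4 + n_chars * 2]
    return raw.decode("utf-16-le")                      # assumption: UTF-16LE


# Demo: "Title" stored as T\0i\0t\0l\0e\0 after a 4-byte length of 5.
sample = struct.pack("<I", 5) + b"T\x00i\x00t\x00l\x00e\x00"
print(decode_string(sample))  # Title
```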
Long/float
0x00 0x04 The value
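For the 4-byte numeric payloads, Python's struct module is enough. The sketch below assumes little-endian byte order, a signed 32-bit integer for "long" (the -1 used for cue point repeats hints at a signed type) and an IEEE 754 single-precision value for "float". The function names are made up for this example.

```python
import struct


def decode_long(payload: bytes) -> int:
    """Assumption: signed 32-bit little-endian integer."""
    return struct.unpack_from("<i", payload, 0)[0]


def decode_float(payload: bytes) -> float:
    """Assumption: IEEE 754 single precision, little-endian."""
    return struct.unpack_from("<f", payload, 0)[0]


print(decode_long(struct.pack("<i", 249000)))  # 249000
print(decode_long(b"\xff\xff\xff\xff"))        # -1
```

Single-precision values widened to a double pick up extra digits when printed, which would explain values such as 113.23372650146484 in the example output.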
Date
0x00 0x01 Day
0x01 0x01 Month
0x02 0x02 Year
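The date layout maps directly onto a struct format string. As before, this is a sketch that assumes the 2-byte year is little-endian. decode_date is a made-up name.

```python
import struct
from datetime import date


def decode_date(payload: bytes) -> date:
    """Decode day (1 byte), month (1 byte), year (2 bytes, assumed little-endian)."""
    day, month, year = struct.unpack_from("<BBH", payload, 0)
    return date(year, month, day)


# Demo: day 6, month 5, year 2021 (0x07E5 stored as E5 07).
print(decode_date(bytes([6, 5, 0xE5, 0x07])))  # 2021-05-06
```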
Cue points
0x00 0x04 Number of cue points (=N)
N×[
  0x00 0x04 Unknown, seems to default to 1
  0x04 0x04 Cue point name length (=LN)
  0x08 LN×2 Cue point name. Each character is followed by \0
  … 0x04 Display order (long)
  … 0x04 Type (enumeration)
      0 = CUE
      1 = IN
      2 = OUT
      3 = LOAD
      4 = GRID
      5 = LOOP
  … 0x08 Start (double)
  … 0x08 Length (double)
  … 0x04 Repeats (long)
      -1 = No repeat
      ? = ?
  … 0x04 Hot cue (long)
]
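The cue point payload is the only structure with variable-length records, so here is a sketch of how it could be walked. It assumes little-endian fields throughout, UTF-16LE names and signed 32-bit longs. CueType, CuePoint and decode_cue_points are names invented for this example, not part of any existing library.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum


class CueType(IntEnum):
    CUE = 0
    IN = 1
    OUT = 2
    LOAD = 3
    GRID = 4
    LOOP = 5


@dataclass
class CuePoint:
    name: str
    display_order: int
    type: CueType
    start: float
    length: float
    repeats: int
    hot_cue: int


# Fixed-size tail after the name: display order, type, start, length,
# repeats, hot cue. Assumption: little-endian, no padding (32 bytes).
TAIL = struct.Struct("<iiddii")


def decode_cue_points(payload: bytes) -> list:
    (count,) = struct.unpack_from("<I", payload, 0)
    pos = 4
    cues = []
    for _ in range(count):
        _unknown, name_len = struct.unpack_from("<II", payload, pos)
        pos += 8
        name = payload[pos:pos + name_len * 2].decode("utf-16-le")
        pos += name_len * 2
        order, kind, start, length, repeats, hot_cue = TAIL.unpack_from(payload, pos)
        pos += TAIL.size
        cues.append(CuePoint(name, order, CueType(kind), start, length, repeats, hot_cue))
    return cues


# Demo: one GRID cue point named "AutoGrid", built by hand.
name = "AutoGrid"
sample = (
    struct.pack("<I", 1)                       # number of cue points
    + struct.pack("<II", 1, len(name))         # unknown field, name length
    + name.encode("utf-16-le")
    + TAIL.pack(0, CueType.GRID, 1406.629, 0.0, -1, 0)
)
print(decode_cue_points(sample))
```

Note that CueType(kind) raises a ValueError for values outside 0 to 5, so a real parser would probably want to fall back to the raw integer for unknown types.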
Parsing
Instead of writing the parser from scratch, I've used a meta-parser for binary data called Kaitai Struct. With it, you describe a format in a YAML-based markup language and then generate parsers for many different languages automatically.
It can also generate JavaScript, so we can use the parser directly in the browser. Click the button below to process an MP3 file with Traktor metadata.
Note: The file is not uploaded anywhere, everything is processed in the browser.
Example output:
TRMD
  HDR
    CHKS 1835099
    FMOD 2021-5-6
    VRSN 7
  DATA
    ANDB 112,153,60,65
    BITR 249000
    BPMQ 1
    COM2 Comment2
    COMM Comment
    CTLG Cat. No.
    CUEP 8
      0: name: AutoGrid, display_order: 0, type: GRID, start: 1406.629, length: 0.000, repeats: -1, hot cue: 0
      1: name: n.n., display_order: 0, type: CUE, start: 7765.158, length: 0.000, repeats: -1, hot cue: 1
      2: name: n.n., display_order: 0, type: CUE, start: 14123.688, length: 0.000, repeats: -1, hot cue: 2
      3: name: n.n., display_order: 0, type: CUE, start: 20482.217, length: 0.000, repeats: -1, hot cue: 3
      4: name: n.n., display_order: 0, type: CUE, start: 40087.684, length: 0.000, repeats: -1, hot cue: 4
      5: name: n.n., display_order: 0, type: CUE, start: 143943.667, length: 0.000, repeats: -1, hot cue: 5
      6: name: n.n., display_order: 0, type: CUE, start: 182094.844, length: 0.000, repeats: -1, hot cue: 6
      7: name: n.n., display_order: 0, type: CUE, start: 232963.081, length: 0.000, repeats: -1, hot cue: 7
    FLGS 12
    HBPM 113.23372650146484
    IPDT 2021-5-6
    LABL Label
    MKEY 12
    PCDB 11.787460327148438
    PKDB 7.761477947235107
    PROD Producer
    RANK 255
    RLDT 1900-1-1
    SYNC
      MATY 3
    TALB Release
    TCON Genre
    TIT2 Title
    TKEY 10m
    TLEN 245
    TMIX Mix
    TPE1 Artist
    TPE4 Remixer
    TRCK 1
    USLT Lyrics
Writing
As you might have noticed, there is a checksum (the CHKS frame in the header). And sure enough, changing even a single byte causes Traktor Scratch to completely ignore the data. So far, I haven't been able to find the function that computes it.