JUKEBOX.SI

Not to be confused with JUKEBOXW.SI

JUKEBOX.SI is an Interleaf File containing 60 audio tracks of music and radio dialogue. With a few exceptions[citation needed], all of LEGO Island's music is stored in JUKEBOX.SI.

Details

As an Interleaf file, JUKEBOX.SI is a container for a large number of asset files. Most of these are background and radio music (including the radio voices) stored as Microsoft WAV audio; however, there are also 4 FLIC video files for each of the building rooms (dune buggy, jetski, helicopter, and race car). These FLIC files are the small instructional videos played on the screen and are interleaved together as "movies" of equal length (so that they loop together).

All audio in JUKEBOX.SI (like most of LEGO Island's audio assets) is mono uncompressed PCM. While the majority of tracks are sampled at 11025 Hz/16-bit, a handful are sampled at 22050 Hz/8-bit. The two can be told apart by each track's WAV header (the "fmt " section, if you're familiar with WAV headers), which is left intact in the first chunk of its stream.

When replacing a track, the WAV "fmt " header can be transplanted directly over the existing data in that first chunk. All WAV formats are compatible. The PCM data can be transplanted too, but it must be interleaved into chunks.

Technical Information

Each music track appears to begin with an MxDa identifier and is split into MxCh chunks. The MxDa header contains information about the PCM audio in the MxCh chunks. The first MxCh appears to hold information about the remaining chunks in the MxDa structure.

All multi-byte integers are little-endian, as is normal for RIFF-based files.
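As a small illustration of what little-endian storage means when reading size fields, the sketch below decodes 16-bit and 32-bit unsigned integers from raw bytes. The helper names `read_u16` and `read_u32` are hypothetical and exist only for this example.

```python
import struct


def read_u16(buf: bytes, offset: int) -> int:
    """Read a 2-byte unsigned little-endian integer at the given offset."""
    return struct.unpack_from("<H", buf, offset)[0]


def read_u32(buf: bytes, offset: int) -> int:
    """Read a 4-byte unsigned little-endian integer at the given offset."""
    return struct.unpack_from("<I", buf, offset)[0]


# The least significant byte comes first: 10 27 00 00 is 0x00002710.
example = bytes([0x10, 0x27, 0x00, 0x00])
value = read_u32(example, 0)
```

The `<` prefix in the format string selects little-endian byte order regardless of the machine running the script.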

Extracting Audio

  • Audio streams can be located in JUKEBOX.SI by searching for " WAV" (note the leading space).
    • A few bytes before the " WAV" will be the original filename of the WAV file prior to being imported into the SI file if you wish to retrieve that too.
  • A few bytes later will be "LIST", which appears to specify an array (or "list") of chunks that make up one audio track. The next 4 bytes will be a 32-bit integer giving the total size of this "LIST", in other words the number of upcoming bytes of the SI file that belong to this particular audio track.
  • The first MxCh after the "LIST" will contain WAV-compatible header data, most of which can be transplanted directly into a WAV file (see below for details).
  • Every MxCh after this one will contain PCM audio data (formatted according to the header data in the first MxCh). Each MxCh has a 22-byte header that must be stripped out when extracting. After the 4-byte "MxCh" identifier, the header contains a 4-byte integer giving the size of the chunk in bytes (not counting the 8 bytes taken by the "MxCh" identifier and this size field). All data after the 22-byte header is PCM audio that will be exactly "chunk size - 14" bytes long (14 being the size of the 22-byte header minus its first 8 bytes).
  • Each MxCh's data can be dumped until you reach the end of the "LIST" size extracted above. At that point the end of the track has been reached and the process must be repeated to extract the next track.
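The steps above can be sketched in Python roughly as follows. This is a minimal sketch rather than a verified extractor: the function names `find_tracks` and `extract_track` are hypothetical, and the code assumes that the "LIST" size counts the bytes following the size field, that an "MxDa" identifier comes directly after it, and that chunks are padded to an even length as in ordinary RIFF files. None of these assumptions is confirmed by the description above.

```python
import struct

MXCH_HEADER_SIZE = 22  # "MxCh" id (4) + size field (4) + 14 further header bytes


def find_tracks(si: bytes):
    """Yield the offset of the "LIST" that follows each " WAV" marker."""
    pos = si.find(b" WAV")
    while pos != -1:
        list_pos = si.find(b"LIST", pos)
        if list_pos == -1 or list_pos + 8 > len(si):
            return
        (list_size,) = struct.unpack_from("<I", si, list_pos + 4)
        yield list_pos
        # Resume the search after this LIST so PCM bytes are never scanned.
        pos = si.find(b" WAV", list_pos + 8 + list_size)


def extract_track(si: bytes, list_pos: int):
    """Return (first_chunk_payload, pcm_bytes) for the LIST at list_pos."""
    if si[list_pos:list_pos + 4] != b"LIST":
        raise ValueError("no LIST identifier at this offset")
    (list_size,) = struct.unpack_from("<I", si, list_pos + 4)
    end = min(list_pos + 8 + list_size, len(si))
    pos = list_pos + 8
    if si[pos:pos + 4] == b"MxDa":  # assumed to follow the LIST size directly
        pos += 4

    payloads = []
    while pos + 8 <= end:
        chunk_id = si[pos:pos + 4]
        (chunk_size,) = struct.unpack_from("<I", si, pos + 4)
        if chunk_id == b"MxCh":
            # Drop the 22-byte header; the rest is chunk_size - 14 bytes.
            payloads.append(si[pos + MXCH_HEADER_SIZE:pos + 8 + chunk_size])
        # Assumption: RIFF-style padding to an even chunk length.
        pos += 8 + chunk_size + (chunk_size & 1)
    if not payloads:
        raise ValueError("no MxCh chunks found in this LIST")
    # The first MxCh carries the header data, the others carry PCM audio.
    return payloads[0], b"".join(payloads[1:])
```

To try it, read the whole of JUKEBOX.SI into a `bytes` object and call `extract_track(data, offset)` for each offset yielded by `find_tracks(data)`. If the padding assumption turns out to be wrong for this file, the chunk walk will lose alignment and should be adjusted.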

Header

NOTE: This information is incomplete and requires more research.

As mentioned above, the first MxCh in a "LIST" contains solely header data. Most of this data is completely compatible with the specification for WAV.

Field                        Offset  Description
MxDa                         0       Identifier
MxCh                         4       Chunk Header
Chunk Size                   8       4-byte Integer
Sub-Chunk Size               22      4-byte Integer - The remaining size of this chunk after this value
Audio Format                 26      2-byte Integer - 1 = PCM, others indicate some form of compression
Number of Channels           28      2-byte Integer - 1 = Mono, 2 = Stereo
Sample Rate                  30      4-byte Integer
Byte Rate                    34      4-byte Integer - is equal to Sample Rate * Number of Channels * BitsPerSample/8
Bytes per Sample             36      2-byte Integer - is equal to Number of Channels * BitsPerSample/8
Bits per Sample per Channel  38      2-byte Integer - 8 = 8-bit, 16 = 16-bit, etc.
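As an illustration, the six fields from offset 26 to 38 can be decoded with a single `struct` call. This is only a sketch: the names `WaveFormat` and `parse_format` are hypothetical, and the offsets are taken to be relative to the "MxDa" identifier, as the table suggests.

```python
import struct
from typing import NamedTuple

FORMAT_OFFSET = 26  # offset of "Audio Format", counted from the "MxDa" identifier


class WaveFormat(NamedTuple):
    audio_format: int      # 1 = PCM
    channels: int          # 1 = mono, 2 = stereo
    sample_rate: int       # samples per second
    byte_rate: int         # sample_rate * channels * bits_per_sample / 8
    bytes_per_sample: int  # channels * bits_per_sample / 8
    bits_per_sample: int   # 8 or 16


def parse_format(mxda: bytes) -> WaveFormat:
    """Decode the WAV-compatible fields; mxda must start at the "MxDa" identifier."""
    # "<HHIIHH" = little-endian 2+2+4+4+2+2 bytes, 16 bytes in total.
    return WaveFormat(*struct.unpack_from("<HHIIHH", mxda, FORMAT_OFFSET))
```

Checking that `byte_rate` equals `sample_rate * channels * bits_per_sample // 8` is a cheap way to confirm that the offsets line up for a given track.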

Transplanting the Header

Comparing this layout against the WAV file format header specification shows that the 16 bytes from "Audio Format" to "Bits per Sample per Channel" are identical. These bytes make up most of the WAV header data (apart from the file and chunk sizes, which cannot be determined from here) and can be transplanted directly to make extraction easier and to ensure the sample rate and sample size of the extracted file are correct.

Note that the MxCh header contains a few more bytes after "Bits per Sample per Channel", so its "Sub-Chunk Size" is larger than that of a typical WAV file. These extra bytes should be ignored and not transplanted; if they are transplanted anyway, the "Sub-Chunk Size" should be carried over too (or at least increased to accommodate them).
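A minimal sketch of the transplant is shown below. It wraps the 16 transplanted bytes and the extracted PCM data in a standard RIFF/WAVE container and fills in the file and chunk sizes that cannot be taken from the SI file. The function name `build_wav` is hypothetical, and the extra bytes mentioned above are deliberately left out.

```python
import struct


def build_wav(fmt16: bytes, pcm: bytes) -> bytes:
    """Wrap 16 transplanted format bytes and raw PCM data in a WAV file."""
    if len(fmt16) != 16:
        raise ValueError("expected exactly 16 bytes of format data")
    # "fmt " chunk: the standard PCM sub-chunk size is 16.
    fmt_chunk = b"fmt " + struct.pack("<I", 16) + fmt16
    # "data" chunk: the size field counts the PCM bytes only.
    data_chunk = b"data" + struct.pack("<I", len(pcm)) + pcm
    if len(pcm) % 2:
        data_chunk += b"\x00"  # RIFF chunks are padded to an even length
    # RIFF size covers everything after the 8-byte "RIFF" + size prefix.
    riff_size = 4 + len(fmt_chunk) + len(data_chunk)
    return b"RIFF" + struct.pack("<I", riff_size) + b"WAVE" + fmt_chunk + data_chunk
```

Writing the returned bytes to a file with a .wav extension should give a file that ordinary audio tools can open, provided the transplanted bytes describe uncompressed PCM.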