The Prophecy - Project Nemesis


The Workings of FR-08's Sound System - Part 2

by KB / Farbrausch

Part 2: Why are SMF files smaller?

2.1: MIDI files smaller? wtf...

If you know a bit about MIDI communication and .mid files (which are, as said, just a log of all events with timestamps), you might ask how I can claim that those files end up smaller than modules after compression.

Let's first assume that we don't need to compress the music data ourselves. The executable will be compressed anyway, most probably with an RLE or LZ variant (and perhaps with some entropy coding). So there's nothing wrong with one megabyte of zeroes; it will shrink to something like four bytes after compression. Anything that repeats or closely resembles earlier data is just as welcome.

A module consists of patterns, which is in itself a real size advantage, as you can repeat them. Problem: under the above assumption, this doesn't help at all. If you simply write all pattern data out one after another, a good LZ algorithm will notice that the structures are repetitive and simply emit a reference to the previous occurrence of the pattern. So the whole order table and all the code processing it are a plain waste of space.
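A quick way to check this claim is to compress one pattern and then the same pattern written out many times. The following is only a sketch: it uses Python's `zlib` (DEFLATE, i.e. LZ77 plus Huffman coding) as a stand-in for an executable compressor, and the 1 KB pseudo-random `pattern` is a made-up placeholder for real pattern data.

```python
import random
import zlib

rng = random.Random(0)

# One hypothetical 1 KB pattern; pseudo-random bytes stand in for real pattern data.
pattern = bytes(rng.randrange(256) for _ in range(1024))

# "Unrolled" song: the same pattern written out 16 times, with no order table at all.
song = pattern * 16

packed_pattern = zlib.compress(pattern, 9)
packed_song = zlib.compress(song, 9)

print(len(pattern), "->", len(packed_pattern))
print(len(song), "->", len(packed_song))
```

The 16 repeats should cost only a little more than a single copy, because every repeat turns into a handful of back-references.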

The second problem is that all current module players (the only exception was Digitrakker, IMHO) store the data "per row", not "per channel". This means that data that is likely to repeat (e.g. your bass drum on channels 1, 2 and 3 :) is interleaved, or "scattered", with data that will change (melody, chords, etc.). An LZ packer will still find something, but it will only be able to match small chunks of data, which is quite sub-optimal.
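The effect of the two layouts can be tried out on synthetic data. This sketch assumes three drum channels that repeat with a fixed period and one channel of ever-changing bytes; the note number 36, the periods and the use of `zlib` as the packer are all illustration choices, not anything taken from a real module format.

```python
import random
import zlib

rng = random.Random(1)
ROWS = 512

def drum(period):
    """A channel that plays note 36 every `period` rows and is silent otherwise."""
    return bytes(36 if row % period == 0 else 0 for row in range(ROWS))

channels = [
    drum(4),
    drum(8),
    drum(16),
    bytes(rng.randrange(256) for _ in range(ROWS)),  # stand-in for melody data
]

# Per-row layout: row 0 of every channel, then row 1 of every channel, ...
per_row = bytes(channels[ch][row] for row in range(ROWS) for ch in range(len(channels)))

# Per-channel layout: all of channel 0, then all of channel 1, ...
per_channel = b"".join(channels)

packed_per_row = zlib.compress(per_row, 9)
packed_per_channel = zlib.compress(per_channel, 9)
print("per row:    ", len(packed_per_row))
print("per channel:", len(packed_per_channel))
```

Both layouts hold exactly the same bytes; only the order differs. In the per-channel layout the drum channels become long runs that the packer can cover with a few long matches, while in the per-row layout every match is cut short by the next melody byte.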

Ok, a standard MIDI file is much worse. SMF is a really "compact" format which focuses on not wasting a single byte. That's cool for the uncompressed file size, but is likely to trip up any standard compression algorithm. It doesn't help either that the content of such a file is a single stream of data which is not even separated into channels (that's true for format-0 files; format-1 files can have an arbitrary number of tracks, but that will become irrelevant soon anyway).

2.2: How to please an LZ compressor

But we can do something about that. The trick is to re-sort the MIDI data and group similar events together. In fact, I split the MIDI stream by the following criteria:

  • MIDI channel
  • Type of event (Note, Control Change, Program Change, Pitch Bend, Channel Pressure, Rest)
  • for Controller events: Number of Controller

So, after the split, I had a few hundred single streams (some of them of course empty) which carried data like "All changes of controller 2 on channel 10". You see that in our bass drum example, a good LZ compressor will process the "notes on channel 1" stream, think something like "oh, the same event every quarter note throughout the song" and replace all those notes with one word: "tekkno".
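The split by channel, event type and controller number can be sketched in a few lines. The event tuple layout `(time, channel, kind, data1, data2)` and the function name `split_streams` are assumptions made up for this example; the article does not describe the actual converter's data structures.

```python
from collections import defaultdict

# Hypothetical event record: (time, channel, kind, data1, data2).
# For "cc" events data1 is the controller number and data2 its value;
# for "note" events data1 is the note number and data2 the velocity.
events = [
    (0,   1, "note", 36, 100),
    (0,   2, "note", 60, 90),
    (10, 10, "cc",    2, 0),
    (20, 10, "cc",    2, 1),
    (24,  1, "note", 36, 100),
    (30, 10, "cc",    7, 64),
    (48,  1, "note", 36, 100),
]

def split_streams(events):
    """Group events by (channel, kind, controller number or None)."""
    streams = defaultdict(list)
    for time, channel, kind, data1, data2 in events:
        if kind == "cc":
            # The controller number moves into the key, so only the value remains.
            streams[(channel, kind, data1)].append((time, data2))
        else:
            streams[(channel, kind, None)].append((time, data1, data2))
    return dict(streams)

streams = split_streams(events)
for key, stream in streams.items():
    print(key, stream)
```

The stream under the key `(1, "note", None)` now holds nothing but the bass drum hits, and `(10, "cc", 2)` holds "all changes of controller 2 on channel 10".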

But, of course, that's not enough. If your song contains more than a bass drum, it will most probably contain sequences which get transposed to other pitches. Normally, an LZ packer won't recognize such transposed patterns, and it won't recognize a continuous controller slide from 0 to 127, either. Therefore, apply delta coding to everything... the timestamps, the note numbers, velocity information, controller value, simply everything. A controller slide from 0 to 127 will become something like {always_the_same_time_delta, 1} per event, and all sequences of notes will become "decoupled" from their base note, as only the pitch distances between the notes are encoded, and an LZ compressor will recognize all those slight repetitions as such.
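Delta coding itself is tiny. The sketch below assumes byte-sized values and wraps differences modulo 256; the helper names and the example riff are invented for illustration, and `zlib` again stands in for the real packer.

```python
import zlib

def delta_encode(values):
    """Replace each byte-sized value by its difference to the previous one (mod 256)."""
    previous = 0
    out = []
    for value in values:
        out.append((value - previous) & 0xFF)
        previous = value
    return out

def delta_decode(deltas):
    """Inverse of delta_encode: accumulate the differences again (mod 256)."""
    current = 0
    out = []
    for delta in deltas:
        current = (current + delta) & 0xFF
        out.append(current)
    return out

slide = list(range(128))                   # controller slide from 0 to 127
riff = [48, 51, 55, 51, 48, 55]            # made-up note numbers
transposed = [note + 5 for note in riff]   # the same riff, five semitones up

print(delta_encode(slide)[:6])             # [0, 1, 1, 1, 1, 1]
print(delta_encode(riff))                  # [48, 3, 4, 252, 253, 7]
print(delta_encode(transposed))            # [53, 3, 4, 252, 253, 7]

raw_size = len(zlib.compress(bytes(slide), 9))
delta_size = len(zlib.compress(bytes(delta_encode(slide)), 9))
print(raw_size, "vs", delta_size)
```

After the first value, the riff and its transposed copy produce identical bytes, which is exactly what an LZ packer needs in order to spot the repetition.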

Then, almost all executable compressors compress their data byte-wise. Make sure that the compressor you choose likes your stream structures. I won't comment on that further (keeping a small advantage for oneself rocks), but you'll find out what I mean if you think about it, promised :)

You might (or hopefully will) also find out that writing a .mid player that has to keep about 200 delta-encoded streams and their timestamps in "mind" and schedule every event at the right time isn't too trivial a task. In fact, it makes the player code somewhat bigger... a simple .mid player should be possible in 500 bytes, while the player for my converted format is about 1.5K uncompressed in not-at-all-optimized C++. Still, this pays off. The 11-minute main tune for FR-08 is about 120K in size in .mid format (PKZIP makes something along the lines of 20K of it). After conversion and adding a few hundred bytes of sound bank data, the ready-for-playing file is about 180K in size... and after applying the executable compressor, only 4K of it is left. That's a compression ratio of 1:30 relative to the original .mid, which is, well, quite cool IMHO (and definitely better than the 1:10 ratio I would've got with standard .mid files).
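To give an idea of what such a player has to do, here is one possible way to merge many delta-timed streams back into playback order, using a priority queue. This is a Python sketch under assumed data structures (a dict of `(time_delta, payload)` lists, a hypothetical `play` generator), not the 1.5K C++ player described above.

```python
import heapq

def play(streams):
    """Yield (time, stream_id, payload) in playback order.

    `streams` maps a stream id to a list of (time_delta, payload) pairs,
    where each time_delta is relative to the previous event of the same stream.
    """
    heap = []
    for stream_id, stream in streams.items():
        if stream:
            first_delta, _ = stream[0]
            # Heap entry: (absolute time of next event, stream id, index into stream).
            heapq.heappush(heap, (first_delta, stream_id, 0))

    while heap:
        time, stream_id, index = heapq.heappop(heap)
        stream = streams[stream_id]
        yield time, stream_id, stream[index][1]
        if index + 1 < len(stream):
            next_delta = stream[index + 1][0]
            heapq.heappush(heap, (time + next_delta, stream_id, index + 1))

streams = {
    "ch1 notes": [(0, "kick"), (24, "kick"), (24, "kick")],
    "ch10 cc2":  [(10, 0), (10, 1), (10, 2)],
}
for event in play(streams):
    print(event)
```

Each stream only needs to remember its position and the absolute time of its next event; the heap then always hands back whichever stream is due first.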

So, we have a concept, we have a file format for the music... let's get some sound out of the PC :)