Building a Better Bitmap (font system)

Intro

There is a common need in the computer world to display text on dot-addressable displays. As with everything else, there are many ways to accomplish the same task. For computers with sufficient memory and CPU power, a popular solution has been to use vector fonts such as TrueType. This system uses a bit of math (Bezier curves) to define scalable characters, and it can also include kerning information. Another challenge is how to represent the text - e.g. ASCII or UTF-8. TrueType supports the Unicode character set, which is composed of many thousands of possible symbols. TrueType font files are usually in the range of 30-200K bytes in size. The data size and math don't represent a big challenge for mobile or desktop level systems, but how can you display nice looking fonts on a microcontroller with only 1K of RAM?

A Clever Workaround

Anyone who works with computers long enough will be familiar with the workaround - a way of getting something done in a sometimes sub-optimal way. In this case, when faced with insufficient memory or computing power, a way to draw text on a graphics display is to use small pre-rendered bitmaps of the characters. In the early days, these were usually 8x8 pixel, 1-bit bitmaps that were very easy to copy to the display. The next level up in difficulty is a system of proportionally spaced character bitmaps. There have been many versions of this idea published on many systems, but a popular one of the last few years, designed to run on humble microcontrollers, has been Adafruit's. They created a data structure to describe the individual character dimensions and offsets, along with a handy tool to convert TrueType files into their format.

Adafruit_GFX

Their font system consists of two main data structures - a font structure and, within it, a set of per-character glyph structures. The bitmap data is stored as a one-dimensional array of bits, one per pixel. So, for example, if the character image is 6 pixels wide and 11 pixels tall, it will consist of 66 bits of image data, occupying 9 bytes of space (66/8 rounded up, because the next character starts on a byte boundary). The glyph structure is shown below:

/// Font data stored PER GLYPH
typedef struct {
    uint16_t bitmapOffset; ///< Pointer into GFXfont->bitmap
    uint8_t width;         ///< Bitmap dimensions in pixels
    uint8_t height;        ///< Bitmap dimensions in pixels
    uint8_t xAdvance;      ///< Distance to advance cursor (x axis)
    int8_t xOffset;        ///< X dist from cursor pos to UL corner
    int8_t yOffset;        ///< Y dist from cursor pos to UL corner
} GFXglyph;
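To make the bit packing concrete, here is a minimal sketch of the size calculation from the 6x11 example and of reading one pixel back out. The helper names are hypothetical (they are not part of Adafruit_GFX), and the layout is assumed to be row by row, most significant bit first, with no padding between rows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for one glyph at 1 bit per pixel, rounded up so that
   the next glyph starts on a byte boundary. */
static size_t glyph_bitmap_bytes(uint8_t width, uint8_t height)
{
    size_t bits = (size_t)width * height;
    return (bits + 7) / 8;
}

/* Read pixel (x, y) of a glyph. Assumed layout: bits packed row by
   row, MSB first, rows NOT padded to a byte boundary. */
static int glyph_pixel(const uint8_t *bits, uint8_t width,
                       uint8_t x, uint8_t y)
{
    assert(x < width);
    size_t index = (size_t)y * width + x; /* bit index within the glyph */
    return (bits[index >> 3] >> (7 - (index & 7))) & 1;
}
```

With these helpers, glyph_bitmap_bytes(6, 11) gives the 9 bytes mentioned above.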

This scheme does a reasonable job of storing variable sized characters and is relatively easy to unpack and draw. It works for small character sizes, but things start to go wrong when you need characters with sizes and offsets that don't fit in 8-bit values (unsigned for the width, height and advance; signed for the offsets). In general, this works for fonts up to about 58 points (a typography term). The font convert tool doesn't tell you that the glyph values overflowed; the text just looks wrong when you try to draw it. 58 points may sound gigantic, but a simple example of needing an even larger font is using a humble QVGA (320x240) LCD to display a full screen clock. A display of this size needs a font larger than 60 points to fill the screen with the 4 digits of the time. On higher resolution displays like e-paper, the limit becomes even more of a problem. The second downside of this scheme is that with large fonts, the bitmap data becomes quite large too. A medium to large size font can easily pass 64K bytes in size.
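Since the overflow is silent, a range check before the values are stored would catch it. The sketch below is only an illustration: RawMetrics and glyph_fits_8bit are hypothetical names, and Adafruit's fontconvert tool does not contain this check as far as the text above describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for glyph metrics as full-width integers,
   before they are narrowed into the 8-bit GFXglyph fields. */
typedef struct {
    int width, height, xAdvance; /* stored as uint8_t in GFXglyph */
    int xOffset, yOffset;        /* stored as int8_t in GFXglyph  */
} RawMetrics;

/* Returns true only if every metric survives the narrowing intact. */
static bool glyph_fits_8bit(const RawMetrics *m)
{
    assert(m != NULL);
    return m->width    >= 0 && m->width    <= UINT8_MAX &&
           m->height   >= 0 && m->height   <= UINT8_MAX &&
           m->xAdvance >= 0 && m->xAdvance <= UINT8_MAX &&
           m->xOffset  >= INT8_MIN && m->xOffset <= INT8_MAX &&
           m->yOffset  >= INT8_MIN && m->yOffset <= INT8_MAX;
}
```

For example, a tall glyph with a yOffset of -140 fails the check; stored in an int8_t it would typically wrap around to +116, which is why the rendered text ends up in the wrong place.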

One other challenge of using the Adafruit GFX font format is that due to the font file structure (a C structure with an array of sub-structures with relative offsets), fonts aren't easy to load at run time. This means that most people who use them are stuck working with fonts that are pre-compiled into their code.

For those people who use MicroPython on MCUs, its font system is very similar to Adafruit's. Large point sizes become huge data and it's challenging to load them at run time.

A Better Bitmap Font

Lately I have been using a lot of high resolution E Ink displays (controlled by microcontrollers). My own display libraries were using the Adafruit font format, but with my optimized font draw code. It is still practical to use bitmap fonts even with larger point sizes, compared to compiling a whole TrueType renderer into your project. To use very large fonts, I modified the Adafruit glyph structure to use 16-bit values, but the data size of these large fonts was starting to become a burden. I saw that some other developers had implemented an RLE (run-length encoding) bitmap font compression scheme, but it wasn't very effective at compressing the bitmap data (maybe 3:1 at best) and the run-time font loading problem still hadn't been resolved. I have worked with image compression algorithms for many years and I'm a big fan of the CCITT T.6 (aka Group4) FAX compression algorithm. I knew it would work well to compress font images, but it comes with the challenge of needing relatively large Huffman lookup tables in the decoder. Again, large is a relative term here; it is relative to my hypothetical target system which has only 1K of RAM and limited FLASH memory. There are really two main uses of Huffman tables in 2D FAX compression - the horizontal run length encoding and the line-to-line deltas. The RLE part is the one that requires large tables, while the line deltas table is quite small. With this in mind, I created a modified version of Group4 compression - Group5! My version uses the same basic compression scheme, but for the horizontal mode (RLE part), I use a simple short/long direct encoding of the lengths instead of a statistical model. This change makes the compression slightly less effective, but unburdens the decoder quite a bit. Now the decoder only needs a single 128-byte lookup table (for line-to-line deltas) and much simpler logic for the RLE part.
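The text above doesn't spell out the Group5 bit layout, so the following is only an illustration of what a "short/long direct encoding" of run lengths can look like, not the actual Group5 format. The field widths (a 1-bit flag followed by either 3 or 12 bits) and all of the names are assumptions chosen for the example. The point is that the decoder needs no lookup table for this part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHORT_BITS 3  /* hypothetical: runs 0..7 use the short form    */
#define LONG_BITS 12  /* hypothetical: runs up to 4095 use the long form */

typedef struct {
    uint8_t *buf;  /* must be zero-filled before writing */
    size_t bitpos; /* next bit to write or read, MSB first */
} BitStream;

static void put_bits(BitStream *bs, uint32_t value, int count)
{
    assert(count > 0 && count <= 32);
    for (int i = count - 1; i >= 0; i--) {
        if ((value >> i) & 1)
            bs->buf[bs->bitpos >> 3] |= (uint8_t)(0x80 >> (bs->bitpos & 7));
        bs->bitpos++;
    }
}

static uint32_t get_bits(BitStream *bs, int count)
{
    uint32_t value = 0;
    while (count-- > 0) {
        int bit = (bs->buf[bs->bitpos >> 3] >> (7 - (bs->bitpos & 7))) & 1;
        value = (value << 1) | (uint32_t)bit;
        bs->bitpos++;
    }
    return value;
}

/* Flag bit 0 = short run follows, flag bit 1 = long run follows. */
static void put_run(BitStream *bs, uint32_t run)
{
    assert(run < (1u << LONG_BITS));
    if (run < (1u << SHORT_BITS)) {
        put_bits(bs, 0, 1);
        put_bits(bs, run, SHORT_BITS);
    } else {
        put_bits(bs, 1, 1);
        put_bits(bs, run, LONG_BITS);
    }
}

/* Decoding is one flag test and one fixed-width read: no tables. */
static uint32_t get_run(BitStream *bs)
{
    return get_bits(bs, 1) ? get_bits(bs, LONG_BITS)
                           : get_bits(bs, SHORT_BITS);
}
```

In this made-up layout a run of 5 costs 4 bits and a run of 300 costs 13 bits. A Huffman table would do better on typical data, which matches the trade-off described above: slightly worse compression in exchange for a much smaller decoder.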

The Origin of Group5

Some might question my efforts to modify such an elegant and effective image compression algorithm as Group4 to create an inferior one (Group5), but for the purposes of font compression on MCUs, my changes open up new use cases for nice looking fonts on highly constrained systems and don't hurt performance on larger ones. Last year I was looking into packing more functionality into the ULP (ultra-low power) co-processors inside of Espressif's ESP32-S3 and C6 MCUs. These low speed RISC-V processors are constrained to use the RTC RAM for both their code and data space; the ULP cannot access any other memory of the ESP32. On the ESP32-S3, this is only 8K. Fitting useful code along with nice looking fonts into that space is quite challenging. If I were to use the Group4 algorithm to compress the font data, the decompressor code would overflow the available space. This was the original motivation for creating Group5. With my modified image compression algorithm, I was able to fit my program code, font data and the decompressor in that tiny space. Below is a photo of the first test of that code - a LilyGo ESP32-S3 mini e-paper board driving the display from within the ULP co-processor of the ESP32-S3.

Compression Effectiveness

The G4/G5 image compression works by comparing the current line to the line above. If a color change (black->white or white->black) occurs within +/-3 pixels of the corresponding change on the line above, it can be encoded very efficiently. If not, the encoder switches to "horizontal mode" and encodes the pair of black/white or white/black run lengths explicitly (with Huffman codes in G4, or with the simpler short/long encoding in G5). Fonts tend to have few color changes per row, and as the font grows in size, the color change deltas shrink towards 0 (the most efficient encoding of G4/G5 - a single bit). So... as the font point size grows, so does the effectiveness of the data compression. At very large point sizes, the G5 compressor can get in excess of 10:1 lossless compression of the font image data. As an example, when encoding the Roboto-Black font to 80pt output (character codes 32 to 127), Adafruit_GFX will produce a 24K file, while my G5 font file needs only 9K. As the point size gets larger, the file doesn't grow as much as you might think. For example, a 160pt Roboto-Black G5 file is 17K, but remember that the font images are actually 4x larger (2x in both dimensions), so the data size grew by less than 2x while the font images grew by 4x. Adafruit's system can't produce a 160pt font file, but if it could, it would be 92K.
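The +/-3 pixel test can be sketched as follows. This is a simplified illustration and not a real encoder: it only counts how many color changes on the current line land within 3 pixels of the matching change on the line above, and it leaves out pass mode and the actual bit output. Rows are stored one pixel per byte (0 = white, 1 = black) for readability, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* First position after a0 where the row changes away from a0_color.
   The imaginary pixel before the start of a row is white (0).
   Returns width if there is no such change. */
static int find_change(const uint8_t *row, int width, int a0, int a0_color)
{
    for (int x = a0 + 1; x < width; x++) {
        int prev = (x == 0) ? 0 : row[x - 1];
        if (row[x] != a0_color && prev == a0_color)
            return x;
    }
    return width;
}

/* Count the color changes on `cur` that lie within +/-3 pixels of the
   corresponding change on `ref` (the line above). */
static int count_vertical_codable(const uint8_t *ref, const uint8_t *cur,
                                  int width)
{
    assert(width > 0);
    int a0 = -1, color = 0, hits = 0;
    for (;;) {
        int a1 = find_change(cur, width, a0, color); /* current line */
        int b1 = find_change(ref, width, a0, color); /* line above   */
        if (a1 >= width)
            break; /* no more changes on this line */
        int delta = a1 - b1;
        if (delta >= -3 && delta <= 3)
            hits++;
        a0 = a1;    /* continue from the change just examined */
        color ^= 1; /* the run color flips at every change    */
    }
    return hits;
}
```

On a large glyph, neighboring rows of a stroke differ by a pixel or less at each edge, so nearly every change falls into the cheap case.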

Anti-Aliased Text Support?

One feature of more advanced text rendering systems is the ability to draw antialiased characters. What this means is that the sharp edges of the characters can be softened by drawing them in various shades of gray/white or whatever color you're using. Below is a good example of the difference it can make.

Normally, 1-bit bitmaps (the right side of the example above) can't be turned into the left side image. Drawing antialiased characters usually requires vector fonts (like TrueType) or bitmap fonts stored in a 2 or 4-bit per pixel format (e.g. the way LVGL does it). For comparison, LVGL defaults to generating 4-bit antialiased fonts that aren't compressed. This results in huge files (generally 4x larger than Adafruit's). I wanted to stay with the benefits of using 1-bit compressed fonts, yet offer an antialias option, so I created a workaround :). My idea is to use a font at twice the desired size (whose compressed data is less than 2x as large) and convert each 2x2 pixel block into a gray/alpha value which varies based on the number of set bits. This "scale to gray" scheme, as I call it, actually works quite well. Below is an example from my bb_spi_lcd (color LCD library) showing how it works. It's hard to photograph close-up shots of LCDs without them looking a bit strange, but the text is actually much smoother looking with the anti-alias option.
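Here is a minimal sketch of the 2x2 "scale to gray" idea. It is not the code from bb_spi_lcd. It assumes the double-size glyph has already been decompressed into a 1-bit buffer with byte-aligned rows (pitch bytes per row, MSB first), and the five output levels are an assumed linear mapping of 0 to 4 set bits onto 0 to 255.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one pixel from a 1-bit bitmap with byte-aligned rows, MSB first. */
static int pixel_at(const uint8_t *bitmap, int pitch, int x, int y)
{
    return (bitmap[(size_t)y * pitch + (x >> 3)] >> (7 - (x & 7))) & 1;
}

/* Gray/alpha value for output pixel (x, y), taken from the 2x2 block
   at (2x, 2y) in the double-size source bitmap. */
static uint8_t scale_to_gray(const uint8_t *bitmap, int pitch, int x, int y)
{
    /* Assumed mapping from "number of set bits" to an 8-bit level. */
    static const uint8_t level[5] = { 0, 64, 128, 191, 255 };
    assert(pitch > 0 && x >= 0 && y >= 0);
    int sx = x * 2, sy = y * 2;
    int count = pixel_at(bitmap, pitch, sx,     sy)
              + pixel_at(bitmap, pitch, sx + 1, sy)
              + pixel_at(bitmap, pitch, sx,     sy + 1)
              + pixel_at(bitmap, pitch, sx + 1, sy + 1);
    return level[count];
}
```

The returned value can then be used as an alpha to blend the text color over the background, so fully covered blocks stay solid and edge blocks get intermediate shades.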

Unicode? How?

If you work with Unicode characters, you know that supporting the whole set (thousands and thousands of unique characters) is a big challenge. Some compromises usually need to be made when using international characters on constrained devices. For my implementation, I only support the Latin alphabet (plus accented characters) and a few symbols. The group of characters I chose to support isn't random; it's the same set that Microsoft decided to support many years ago with their "Code Page 1252". It packs a variety of characters into 8 bits (actually 224 possible characters) by including the 96 standard ASCII characters in slots 32 to 127 and then a collection of symbols and accented characters in 128 to 255. This 8-bit encoding isn't used by modern desktop or mobile software, so I added a UTF-8 to CP1252 translation layer in my display libraries. This allows you to use UTF-8 text directly in your projects and pass those strings to my display libraries and they "just work". Internally, the libraries convert the UTF-8 text into CP1252 and use the 8-bit output to index the character bitmaps. At the font creation side, my fontconvert tool maps the extended character codes 128 to 255 (if you want to use them) to the appropriate Unicode values from within the TrueType font. Below you can see an example running on a common SSD1306 128x64 OLED display.
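A translation layer of this kind can be sketched as below. This is not the code from the display libraries; the function names are hypothetical and the error handling (substituting '?') is an assumption. The table holds the standard Unicode values for CP1252 bytes 0x80 to 0x9F; bytes 0xA0 to 0xFF map directly to the same Unicode code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unicode code points for CP1252 bytes 0x80..0x9F (0 = unassigned). */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

/* Map one code point to a CP1252 byte, or '?' if it has none. */
static uint8_t codepoint_to_cp1252(uint32_t cp)
{
    if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF))
        return (uint8_t)cp;
    for (int i = 0; i < 32; i++)
        if (cp1252_high[i] != 0 && cp1252_high[i] == cp)
            return (uint8_t)(0x80 + i);
    return '?';
}

/* Convert a NUL-terminated UTF-8 string to CP1252. Handles 1- to
   3-byte sequences; unsupported or malformed bytes each become '?'.
   Overlong encodings are not rejected. Returns the number of bytes
   written, not counting the terminator. */
static size_t utf8_to_cp1252(const char *src, uint8_t *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    const uint8_t *s = (const uint8_t *)src;
    size_t n = 0;
    if (dst_size == 0)
        return 0;
    while (*s && n + 1 < dst_size) {
        uint32_t cp;
        if (s[0] < 0x80) {
            cp = s[0];
            s += 1;
        } else if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
            cp = ((uint32_t)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
            s += 2;
        } else if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 &&
                   (s[2] & 0xC0) == 0x80) {
            cp = ((uint32_t)(s[0] & 0x0F) << 12) |
                 ((uint32_t)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
            s += 3;
        } else {
            cp = 0xFFFD; /* replacement character, becomes '?' */
            s += 1;
        }
        dst[n++] = codepoint_to_cp1252(cp);
    }
    dst[n] = 0;
    return n;
}
```

The output bytes can index the glyph table directly: subtracting the font's first character code (32 in the examples above) gives the glyph index.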

Does it still run on our hypothetical low-end MCU?

The humble AVR MCU (aka Arduino UNO and Leonardo) is a good stand-in for the hypothetical constrained MCU. When using the Arduino IDE to generate code for it, it has less than 2K of RAM and less than 29K of FLASH available. A quick test revealed that it does still run on the Arduino Pro Micro!

Summary

So, after all of this extra code and complexity, what benefits does this font system bring?

A consistent API and data compatibility across all four of my display libraries (OneBitDisplay, bb_spi_lcd, bb_epaper, and FastEPD)
Smaller data size compared to uncompressed bitmap fonts
A convenient command line tool for creating the compressed font data from any TrueType font
Support for European (Latin alphabet) Unicode characters from standard UTF-8 text
Fast rendering speed
Effectively no upper limit on the font size
Reasonable anti-aliased output from 1-bit data
Easy dynamic loading of font data from external media (e.g. SD card or downloaded from the web)
The font decompressor and rendering code work on humble MCUs with very little RAM or FLASH space

When can I try it?

Soon... I'm updating a lot of code, examples and documentation. It will take a bit of time to get it all done. I'll post here when it's ready to try. The earlier version of the compressed font code has been part of my FastEPD and bb_epaper libraries for a few months. Now I'm bringing it to bb_spi_lcd and OneBitDisplay with a few changes.

What if I want to use it in a commercial product?

Excellent question! I like to use the Apache 2.0 license for maximum freedom of use, but if you would like to use this code in a commercial product, please contact me for reasonable licensing terms.
