Optimized font rendering on SPI LCDs

The Problem
The open source community has a few large contributors (e.g. Arduino, Google, Adafruit, Microsoft), and their contributions usually set the standard for certain projects and code libraries. I contribute to open source as a way of paying back what I've learned from it. It's not a knock on those large contributors to say that performance is usually not their number one priority, while it usually is mine. Which brings us to the reason for this blog post: drawing fonts on inexpensive LCD displays. This has been almost exclusively the domain of Adafruit's GFX library, but that code is quite slow, especially on slow MCUs like the Arduino Uno.

The Solution
Adafruit created a good system for converting TrueType fonts into a simpler bitmap format that is easy to draw on low-power MCUs. The vector data is converted into an array of small bitmaps, with an index that records the start and size of each bitmap. The performance problem comes from the way characters are drawn on the display. As I demonstrated in a recent blog post, getting the best performance out of serial displays means sending as much data as possible in each transaction. Here's the code I'm referring to:

for (yy = 0; yy < h; yy++) {
  for (xx = 0; xx < w; xx++) {
    if (!(bit++ & 7)) { // every 8 pixels, fetch the next bitmap byte
      bits = pgm_read_byte(&bitmap[bo++]);
    }
    if (bits & 0x80) { // foreground pixel
      if (size_x == 1 && size_y == 1) {
        writePixel(x + xo + xx, y + yo + yy, color);
      } else {
        writeFillRect(x + (xo16 + xx) * size_x, y + (yo16 + yy) * size_y, size_x, size_y, color);
      }
    }
    bits <<= 1; // next pixel
  }
}

The code above loops over the pixels of the character bitmap and draws any foreground pixels (set bits) that it finds. For each foreground pixel, it calls writePixel() or writeFillRect(), which test boundaries, set the LCD write window and then finally write the pixel(s). In other words, the loop is mighty slow. Here's a video of it in action on an ATmega328 (Uno equivalent). It's drawing text over the same area with different colors:

The method I use in my imaging library sets the character's bounding box as a "memory window" on the LCD once, then writes the data one scan line at a time (it also skips over empty areas more quickly).

for (j=0; j<iLen; j++) {
  if (uc == 0) { // need to read more font data
    j += bits;
    while (bits > 0) {
      *d++ = usBGColor; // draw any remaining 0 bits
      bits--; // consume each padded bit so the loop terminates
    }
    uc = pgm_read_byte(&s[iBitOff>>3]); // get more font bitmap data
    bits = 8 - (iBitOff & 7); // we might not be on a byte boundary
    iBitOff += bits; // because of a clipped line
    uc <<= (8-bits);
    k = (int)(d-u16Temp); // number of words in output buffer
    if (k >= TEMP_HIGHWATER) { // time to write it
      myspiWrite((uint8_t *)u16Temp, k*sizeof(uint16_t), MODE_DATA, 1);
      d = &u16Temp[0];
    }
  } // if we ran out of bits
  *d++ = (uc & 0x80) ? usFGColor : usBGColor;
  bits--; // next bit
  uc <<= 1;
} // for j
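
A rough way to see where the time goes is to count what each approach puts on the SPI bus for a single glyph. The sketch below is a host-side cost model, not code from either library. It assumes a MIPI-DCS style controller (such as the ILI9341 or ST7789) where setting the write window takes a column address command with 4 data bytes, a row address command with 4 data bytes, and a memory write command, and it assumes 16-bit RGB565 pixels. The names SpiCost, cost_per_pixel, cost_windowed and kGlyphA are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed overhead of one address-window setup on a MIPI-DCS style
   controller: CASET (1 cmd + 4 data), RASET (1 cmd + 4 data), RAMWR (1 cmd). */
enum {
    WINDOW_SETUP_BYTES = 11,
    WINDOW_SETUP_TRANSFERS = 5, /* cmd, data, cmd, data, cmd */
    BYTES_PER_PIXEL = 2         /* RGB565 */
};

typedef struct {
    uint32_t bytes;     /* total bytes clocked out over SPI */
    uint32_t transfers; /* separate command/data phases */
} SpiCost;

/* Count set bits in a packed 1-bpp glyph (MSB first, no row padding). */
static uint32_t count_foreground(const uint8_t *bitmap, uint32_t w, uint32_t h)
{
    uint32_t n = 0;
    assert(bitmap != 0);
    for (uint32_t i = 0; i < w * h; i++) {
        if (bitmap[i >> 3] & (0x80u >> (i & 7u))) {
            n++;
        }
    }
    return n;
}

/* Per-pixel drawing: every foreground pixel pays for its own window setup. */
static SpiCost cost_per_pixel(const uint8_t *bitmap, uint32_t w, uint32_t h)
{
    uint32_t fg = count_foreground(bitmap, w, h);
    SpiCost c;
    c.bytes = fg * (WINDOW_SETUP_BYTES + BYTES_PER_PIXEL);
    c.transfers = fg * (WINDOW_SETUP_TRANSFERS + 1);
    return c;
}

/* Windowed drawing: one setup, then the whole bounding box
   (foreground and background) sent in buffer-sized chunks. */
static SpiCost cost_windowed(uint32_t w, uint32_t h, uint32_t buffer_pixels)
{
    uint32_t pixels = w * h;
    SpiCost c;
    assert(buffer_pixels > 0);
    c.bytes = WINDOW_SETUP_BYTES + pixels * BYTES_PER_PIXEL;
    c.transfers = WINDOW_SETUP_TRANSFERS
                + (pixels + buffer_pixels - 1) / buffer_pixels;
    return c;
}

/* Hypothetical 8x8 test glyph (a rough 'A') with 18 foreground pixels. */
static const uint8_t kGlyphA[8] = {0x18, 0x24, 0x42, 0x7E, 0x42, 0x42, 0x42, 0x00};
```

For this test glyph, cost_per_pixel(kGlyphA, 8, 8) comes to 234 bytes in 108 separate transfers, while cost_windowed(8, 8, 64) comes to 139 bytes in 6 transfers. The byte counts alone understate the gap, because the model ignores the fixed cost that every small transfer carries on real hardware (pin toggling, function calls, bounds checks). Note also that the two approaches are not pixel-for-pixel equivalent: the per-pixel path leaves the background untouched, while the windowed path overwrites the whole bounding box.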

The following video shows that the same MCU doing the same operations on the same LCD can run much faster when the code works with (instead of against) the properties of the display. On faster MCUs the difference between the slower code and mine becomes more pronounced. On the ESP32, the more efficient code is 30-50x faster:

