Advances in font technology and GTK text rendering

At this year's GUADEC in Denver, Colorado, Behdad Esfahbod and Matthias Clasen presented a two-part talk on a topic that's deeply important to desktop environments: fonts. Esfahbod covered advances in font technology that are making their way to becoming standards, and Clasen briefly discussed improvements in GTK text rendering. The talk presented some fascinating insights into the problems around accurately rendering writing systems on the desktop, and where font technologies may be going in the near future.

Esfahbod, a GNOME contributor for nearly 20 years and creator of the HarfBuzz text-shaping engine, started things off with his part of the talk "Better-Engineered Font-Formats", subtitled "yet another update". He noted that the talk was following up on a much longer online presentation of the same name from 2021. (Slides are available here, and there is also a summary of the talk prepared by Simon Cozens.) That talk was based on his ideas for future font formats and technologies, from ways to improve existing technologies like OpenType to more ambitious things like a future with fully programmable fonts. Esfahbod said that there has been work to push some of those ideas through the ISO standardization process since the presentation.

Boring expansion

He described the work being done to make changes to OpenType as a "boring expansion": making incremental improvements and addressing decades-long limitations of the format. He also complained about the pace of making changes through the ISO standards body, and noted that "we want to make an effort to change things at a faster pace than OpenType has previously seen". Most of the ideas that are being pursued now were already present in 2015, but "every time we wanted to make changes, other stakeholders would push back because 'this will take forever to roll out'". But if it had been done in 2015, it would be available for users today.

Esfahbod said that there are four pieces that are being pushed through the standardization process. The first that he discussed is the axis variations table version 2 or avar2. This involves a better organization of variable fonts, which allow users to specify things like the weight and slant of fonts rather than only choosing from fonts that have preset parameters. He said that avar2 was originally shipped as an experimental feature in HarfBuzz and FreeType, but it was then enabled by default after Apple shipped it last year "without announcing it or telling us".

Avar2 has multiple use cases, he said, but "I'm just going to show one of them" and put up a slide about parametric fonts. Each font has four inputs: its weight, width, optical size, and grade. Typically, font designers have to specify all of those parameters, and that requires shipping "like 81" masters of a font. With avar2, the design space can be converted into modular axes that only require specifying the minimum and maximum of each axis. "So we can reduce something from 81 masters to eight or nine masters", which has produced savings in file sizes of 70% for the Amstelvar font and 80% for the Roboto Flex font.
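The arithmetic behind those master counts can be sketched in a few lines of Python. This is only an illustration under an assumption that the talk did not spell out: that the figure of 81 comes from drawing a master at three positions (minimum, default, maximum) on each of the four axes, and that the modular layout needs one default master plus a minimum and a maximum per axis. The function names are made up for the example.

```python
from itertools import product

AXES = ["weight", "width", "optical_size", "grade"]


def full_grid_masters(axes, positions_per_axis=3):
    """Assumed traditional layout: one master for every combination of
    positions (min, default, max) across all axes."""
    return list(product(range(positions_per_axis), repeat=len(axes)))


def modular_masters(axes):
    """Assumed modular layout: one default master, plus one master at the
    minimum and one at the maximum of each axis."""
    masters = [("default", None)]
    for axis in axes:
        masters.append((axis, "min"))
        masters.append((axis, "max"))
    return masters


grid_count = len(full_grid_masters(AXES))
modular_count = len(modular_masters(AXES))
print(f"full grid: {grid_count} masters, modular: {modular_count} masters")
```

Under those assumptions the counts come out to 81 and nine, which lines up with the numbers quoted in the talk; the file-size savings depend on the font and are not something this sketch can predict.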

His next topic was cubic glyf outlines. The glyf data table for fonts stores outline data for the font glyphs: a representation of each character. Historically, he said, TrueType fonts have only used quadratic Bézier curve segments. By moving to cubic Bézier curves, it would be possible to save space (between 3% and 10% depending on how curve-heavy the font's design is) and reduce conversion errors when compiling fonts. He did not go into great detail about the differences between quadratic and cubic outlines, but Fábio Duarte Martins has an excellent post that illustrates the differences.
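For readers who want a concrete picture of the two curve types, here is a minimal Python sketch (not taken from the talk or from any font tool). It evaluates both kinds of segment and shows that a quadratic segment can always be rewritten exactly as a cubic one by degree elevation. The reverse direction generally has no exact answer and has to be approximated, which is where conversion error can creep in.

```python
def quad_point(p0, p1, p2, t):
    """Point at parameter t on a quadratic Bezier segment."""
    u = 1 - t
    return tuple(u * u * a + 2 * u * t * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))


def cubic_point(p0, c1, c2, p3, t):
    """Point at parameter t on a cubic Bezier segment."""
    u = 1 - t
    return tuple(u ** 3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t ** 3 * d
                 for a, b, c, d in zip(p0, c1, c2, p3))


def quad_to_cubic(p0, p1, p2):
    """Degree elevation: the cubic control points that trace the same curve."""
    c1 = tuple(a + 2 / 3 * (b - a) for a, b in zip(p0, p1))
    c2 = tuple(c + 2 / 3 * (b - c) for b, c in zip(p1, p2))
    return p0, c1, c2, p2


# An arbitrary quadratic segment, chosen only for illustration.
quad = ((0.0, 0.0), (50.0, 100.0), (100.0, 0.0))
cubic = quad_to_cubic(*quad)

max_error = max(
    abs(q - c)
    for i in range(11)
    for q, c in zip(quad_point(*quad, i / 10), cubic_point(*cubic, i / 10))
)
print("largest difference between the two curves:", max_error)
```

A cubic segment has two off-curve control points instead of one, so a single cubic can often describe a shape that would otherwise need several quadratics.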

The third "boring" topic was variable composites / components (VarComposites), also called "smart components". Traditionally, font designers have been able to reuse partial glyph shapes from fonts to create new characters, but the components are static. Moving to VarComposites makes it possible to reference the same glyph but adds the ability to vary rotation, scale, shear, and so forth. This technique is particularly useful for designing Chinese, Japanese, and Korean (CJK) fonts, he said. A Han variable font with two weights and about 44,000 ideographs using VarComposites realized a 67% decrease in file size, and a Hangul font with two weights and about 11,000 syllables had a decrease in file size of 92%.
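The idea of reusing one shape with different placement parameters can be illustrated with a short Python sketch. It is not the VarComposites table format, which the talk did not describe; the function names, the parameter set, and the order in which the operations are applied are all assumptions made for the example.

```python
import math


def make_transform(scale_x=1.0, scale_y=1.0, rotation_deg=0.0,
                   skew_x_deg=0.0, dx=0.0, dy=0.0):
    """Build a function that applies scale, then horizontal shear, then
    rotation, then translation to a point (an assumed order)."""
    angle = math.radians(rotation_deg)
    shear = math.tan(math.radians(skew_x_deg))
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    def apply(point):
        x, y = point
        x, y = x * scale_x, y * scale_y                      # scale
        x = x + shear * y                                    # shear
        x, y = x * cos_a - y * sin_a, x * sin_a + y * cos_a  # rotate
        return (x + dx, y + dy)                              # translate

    return apply


def place_component(outline, **params):
    """Return a transformed copy of a shared outline."""
    transform = make_transform(**params)
    return [transform(point) for point in outline]


# A hypothetical horizontal stroke, stored once and reused twice.
STROKE = [(0, 0), (100, 0), (100, 20), (0, 20)]

wide_stroke = place_component(STROKE, scale_x=2.0)
vertical_stroke = place_component(STROKE, rotation_deg=90, dx=50)
print(wide_stroke)
print(vertical_stroke)
```

The point of the sketch is that each reuse costs only a handful of numbers rather than a second copy of the outline, which is one plausible reason the savings are so large for scripts built from many repeated parts.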

Beyond 64K

The final expansion topic Esfahbod talked about is the proposal to enable fonts with more than 64K glyphs. While 64K (65,535 to be precise) glyphs may sound like a lot, he noted that there are CJK fonts that exceed that limit. The Noto CJK and Source Han fonts are at the maximum glyph count and are missing many CJK characters. Some pan-Unicode fonts (those that include multiple writing systems), such as the Noto Sans family, require more than 100 separate files due to this limitation. "And then you open LibreOffice and you see, like, Noto Sans Arabic, Noto Sans Bengali, you see 100 families listed, which is an extremely bad user experience."

The beyond-64K proposal would allow fonts to have 24-bit glyph indices, or up to 16,777,216 glyphs. He said that they had created an experimental merged Noto Sans font that covered all scripts except CJK and contained more than 100,000 glyphs. Next, he wanted to talk about ergonomics for developers.
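The limits involved are simple powers of two, as this small Python sketch shows. The helper names are invented for the example, and the file count it computes is only a lower bound based on glyph totals; real font families are split along script boundaries, so the actual number of files is higher.

```python
def index_values(index_bits):
    """Number of distinct values an index of the given width can take."""
    return 2 ** index_bits


def min_files_needed(glyph_count, index_bits):
    """Smallest number of font files that could hold glyph_count glyphs
    if each file is limited by the index width (ceiling division)."""
    return -(-glyph_count // index_values(index_bits))


# 65,535, the figure quoted above, is the largest value that an
# unsigned 16-bit field can hold.
largest_16_bit_value = index_values(16) - 1

files_16 = min_files_needed(100_000, 16)
files_24 = min_files_needed(100_000, 24)
print(largest_16_bit_value, index_values(24), files_16, files_24)
```

A merged font of a little over 100,000 glyphs therefore cannot fit in a single file with 16-bit indices, but is nowhere near the ceiling with 24-bit indices.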

Rewrite all the things

Currently, he said, we have FreeType for font rasterization and HarfBuzz for text shaping. FreeType is a C codebase from the 1990s, and HarfBuzz is a C++ codebase from the 2000s. In addition, there is fontmake, a tool that compiles fonts from source to binary formats, which is written in Python. "So, what the Google Fonts team is doing is to rewrite everything in Rust in a unified code base."

The motivation is "mainly security", he said, but it will also have the benefit of speeding up compilation ("because Python is just so slow") and it will provide a single place to implement features instead of three. Esfahbod said that three or four people were actively working on this, so "expect to see it in Chrome very soon, and Android and other places later".

Even with all of the improvements Esfahbod had discussed so far, he pointed out that there were still things, especially with Arabic typesetting, that cannot be expressed using OpenType. He wanted to skip solving the problems using OpenType, which could take ten years or more, and encode the font-shaping logic using WebAssembly (Wasm) embedded in fonts instead.

For example, he said, there are shaping engines such as Graphite that do work well with Arabic scripts. Unfortunately, fonts that are built for Graphite or other systems require specific shaping engines that only work on certain platforms or with certain programs like Firefox or LibreOffice. But if a font embeds the shaping engine in a "wasm" table, "everything will work".

He put up examples of Arabic script as rendered with OpenType and as rendered by a custom-built Wasm engine created by Cozens. The OpenType renderings positioned several glyphs in the script incorrectly, in some cases rendering part of the script on top of other glyphs so that separate elements were merged. The Wasm versions, on the other hand, were rendered correctly according to Esfahbod. He also demonstrated rendering Egyptian hieroglyphs and some dynamic fonts. All of the demos can be seen in the recording of the talk, starting at about 10:34.

Beyond the desktop

The big question, he said, is "why?" As in, why do we need to explore using Wasm for fonts when there are already existing solutions for things like Arabic type? The reason is that "we don't want to specifically encode one Arabic solution" because things are constantly being changed and improved. Some users might object to having plugins that do font shaping in their browser or on their system, but "shaping doesn't only happen on the web or desktop". There is also print typesetting, where it is "highly desirable" to be able to customize fonts.

There is precedent for moving from static, data-driven representations to fully programmable approaches, Esfahbod said. In graphics, there are the OpenGL and WebGPU shading languages. "The web itself moved from being static to using JavaScript and WebAssembly." Typesetting, he said, had early programmable systems such as Metafont and TeX. LuaTeX even embeds Lua in the typesetter. "So you have full programmability at the layout level, but not the font level." Instead of encoding shaping logic in ways that OpenType understands, "we just want to ship the code in there".

Esfahbod closed out his portion of the talk with a few "goofy" examples of fonts with advanced capabilities built in. One example, called llama.ttf, can do text completion. It embeds the Llama large language model (LLM) and can be used with HarfBuzz, if it has the optional Wasm support built in. Another font, called translate.ttf, does what its name suggests: it translates text typed in one language to another language. He then thanked the Google Fonts team for supporting his work, and recommended reading his state of text rendering 2024 paper from July.

Fonts and GTK

Clasen is a GTK maintainer and has been involved in GNOME for more than 20 years. He started his, shorter, portion of the talk with a diagram of the text-rendering stack in GNOME. Currently, GTK uses Pango as the "core of text and font handling". In turn, Pango uses Cairo (which uses FreeType), HarfBuzz, and Fontconfig. The problem with that arrangement is that when new features turn up in HarfBuzz, "we always have to figure out how to plumb them through to Pango and reach GTK eventually". That can be a hassle so, long-term, GNOME would like to get Pango out of the way and use HarfBuzz to replace the other pieces.

The good news is that, in terms of text rendering, "nothing happened in Pango basically in the last year [...] all the action was on the GTK side, actually". GTK has introduced new renderers in the last year that are better at anti-aliasing than before. (Clasen had spoken about GTK's new renderers the day before, and that video is here.)

He showed a slide (around 22:20 in the fonts talk video) that demonstrated why anti-aliasing is important. The image on the left of the slide demonstrated a "simple-minded" renderer drawing diagonal lines of various widths on a pixel grid, with obvious gaps in the line for narrower versions of the line. The "simple-minded" renderer would only draw pixels "where the shape happens to cover the center of the pixel". That works if a shape is larger than the size of the pixel grid, he said, otherwise "you end up just randomly losing pixels". The more sophisticated renderer, demonstrated by the image on the right, provided more even coverage even for smaller lines.
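The difference between the two approaches can be reproduced with a toy rasterizer in Python. This is a sketch of the general technique, not GTK's renderer: the line, its thickness, the grid size, and the 4x4 supersampling used to estimate coverage are all choices made for the example.

```python
WIDTH, HEIGHT = 8, 4
THICKNESS = 0.3  # measured vertically, well under one pixel


def inside(x, y):
    """True if (x, y) lies within a thin band around y = 0.4x + 0.2."""
    return abs(y - (0.4 * x + 0.2)) <= THICKNESS / 2


def center_sampled():
    """'Simple-minded' renderer: a pixel is on only if its center is covered."""
    return [[1.0 if inside(col + 0.5, row + 0.5) else 0.0
             for col in range(WIDTH)] for row in range(HEIGHT)]


def coverage_sampled(n=4):
    """Estimate the covered fraction of each pixel from an n-by-n grid of
    sample points inside it."""
    def coverage(col, row):
        hits = sum(inside(col + (i + 0.5) / n, row + (j + 0.5) / n)
                   for i in range(n) for j in range(n))
        return hits / (n * n)
    return [[coverage(col, row) for col in range(WIDTH)]
            for row in range(HEIGHT)]


def columns_with_ink(image):
    """Set of pixel columns in which at least one pixel is not empty."""
    return {col for row in image for col, value in enumerate(row) if value > 0}


center_columns = columns_with_ink(center_sampled())
coverage_columns = columns_with_ink(coverage_sampled())
print("center sampling draws in columns:", sorted(center_columns))
print("coverage sampling draws in columns:", sorted(coverage_columns))
```

With center sampling, some columns that the line passes through receive no ink at all, which is the "randomly losing pixels" effect. The coverage-based version puts a partial value in every column the line crosses.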

Things become a bit more complicated when trying to do fractional scaling. Traditionally, GNOME's HiDPI rendering simply switches to a 200% scale where the application still expects the same pixel grid, but each pixel is actually mapped to four pixels on the device. "That is easy and works well, because the device pixel grid and the application pixel grid remain aligned." Fractional scaling, however, becomes much more complicated. At 125%, for example, "you have five device pixels for four application pixels, so the grids are no longer perfectly aligned" and GTK has to figure out how to map that to the device pixels.
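The grid alignment that Clasen described can be checked with exact fractions. The following Python sketch uses made-up helper names and simply maps the edges of application pixels onto the device grid to see which ones land on a device-pixel boundary.

```python
from fractions import Fraction


def device_edges(scale, app_pixels):
    """Positions, in device pixels, of the edges of the first app_pixels
    application pixels."""
    return [Fraction(i) * scale for i in range(app_pixels + 1)]


def aligned_edges(scale, app_pixels):
    """Indices of application-pixel edges that fall exactly on a
    device-pixel boundary."""
    return [i for i, edge in enumerate(device_edges(scale, app_pixels))
            if edge.denominator == 1]


at_200 = aligned_edges(Fraction(2), 4)
at_125 = aligned_edges(Fraction(5, 4), 4)
print("200%:", at_200)
print("125%:", at_125)
```

At 200%, every application-pixel edge is also a device-pixel edge. At 125%, the two grids only meet every four application pixels (five device pixels), so anything positioned in between straddles device pixels.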

If an application is trying to render a "B", GTK tries to position it on the application grid so that it "nicely lines up with the pixels". When using fractional scaling, though, it means that the application grid does not line up with the device grid, "and our nice and sharp vertical stem is blurred out, which is not a great result". What GTK's new renderers do is to use font hinting on the device pixel grid instead "and I hope it makes real fonts sharper and makes everything look better, and that's all I had to show".
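A one-dimensional Python sketch shows why the stem blurs and what snapping to the device grid buys. The snapping rule here is a crude stand-in for hinting, chosen for illustration; it is not how GTK or any real hinter works, and the stem position is an arbitrary example.

```python
import math

SCALE = 1.25


def column_coverage(left, right, columns):
    """Fraction of each device-pixel column covered by a vertical stem
    that spans [left, right] in device coordinates."""
    return [max(0.0, min(right, col + 1) - max(left, col))
            for col in range(columns)]


def snap_stem(left, right):
    """Illustrative snapping: move the left edge to the nearest device-pixel
    boundary and round the width to a whole number of pixels (at least one)."""
    width = max(1, math.floor(right - left + 0.5))
    snapped_left = math.floor(left + 0.5)
    return snapped_left, snapped_left + width


# A stem aligned to the application grid, covering application pixel 1.
app_left, app_right = 1, 2
dev_left, dev_right = app_left * SCALE, app_right * SCALE

blurred = column_coverage(dev_left, dev_right, 4)
sharp = column_coverage(*snap_stem(dev_left, dev_right), 4)
print("positioned on the application grid:", blurred)
print("snapped to the device grid:        ", sharp)
```

The unsnapped stem is spread across two partially covered device pixels, while the snapped one fills exactly one. The cost of snapping is that the stem is no longer exactly where, or as wide as, the application asked for.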

With that, he opened the floor to audience questions. The first question was about what to do "to make it nice" when an application is scrolling, such as a text editor, so that fonts don't "jump around" when using fractional scaling. Should hinting be disabled during the live scroll? Clasen said that handling animations and font hinting is "a difficult problem" and "to be honest, I'm not sure what the answer to this is".

Another audience member said that they had "memories of talks about a GPU rasterizer for fonts or something" and wanted to know if there was an update on that. Clasen said that was "something Behdad did a decade ago" called GLyphy. Currently, GTK still renders font glyphs "traditionally" by rasterizing them with Cairo. What would be ideal is to upload scalable Bézier curves "and then have a shader that can handle those". That was what GLyphy was experimenting with, he said.

There is "a working branch somewhere" that uses GLyphy, Clasen said, but it has not been merged yet because "other things were of a higher priority". It was something that "we'll come back to" once other things are sorted out. With that, the session was out of time and attendees were headed to lunch.

[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting my travel to this event.]

Index entries for this article
Conference	GUADEC/2024