Tofu – The opposite of a font

166 points by juanpotato 8 years ago · 65 comments

> WARNING: PLEASE DO NOT RUN THIS CODE ON A HARD DRIVE, I'M NOT SURE HOW LONG IT WOULD TAKE. I USED A RAM DISK

I haven't looked at the implementation, but this seems to imply it is very I/O intensive and already takes a very long time in RAM. Yet, nothing about the problem statement suggests it would be such a task --- it sounds like something that could be very straightforwardly done completely in memory.

chungy 8 years ago

That warning made me want to try it on my hard disk just to see how long it would take...
It was only a couple minutes. That's a bit disappointing.
- sleepychu 8 years ago
  
  SSD or spinning platter?
  - chungy 8 years ago
    
    Spinning platter. ZFS with compression is probably helping that.
    
    panic 8 years ago
    
    Your OS is probably caching the file at the virtual memory layer, so it doesn't even have to hit the filesystem at all.
    
    sillysaurus3 8 years ago
    
    A good lesson not to try to guess where the bottlenecks in a program will be. :)
    
    wereHamster 8 years ago
    
    mount -o sync :)
juanpotatoOP 8 years ago

You're right and I should have added more details. fontforge only accepts file paths and not strings sadly
- monsieurbanana 8 years ago
  
  Libraries to treat strings as files are pretty common, in python they have one in the std:
  https://docs.python.org/2/library/stringio.html
  - juanpotatoOP 8 years ago
    
    But that's a file, it accepts a file path. I already looked into this
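
A minimal sketch of the workaround discussed above, for any API that only accepts a file path: write the string to a short-lived file, preferably on a RAM-backed directory, and hand over the path. The helper name `call_with_temp_path` and the `consumer` callback are hypothetical, and `/dev/shm` being a tmpfs is an assumption about the host. Nothing here calls fontforge itself.

```python
import os
import tempfile


def call_with_temp_path(text, consumer, suffix=".svg", directory=None):
    """Write text to a short-lived file and pass its path to consumer.

    consumer is any callable that insists on a file path (hypothetical
    stand-in for a path-only import function).
    """
    # Assumption: /dev/shm, when present and writable, is RAM-backed.
    if directory is None and os.path.isdir("/dev/shm") and os.access("/dev/shm", os.W_OK):
        directory = "/dev/shm"
    fd, path = tempfile.mkstemp(suffix=suffix, dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # The file is closed before the consumer opens it by path.
        return consumer(path)
    finally:
        os.remove(path)
```

The file still passes through the filesystem layer, so this does not remove the round trip. It only keeps the scratch files in one place and cleans them up.
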
CJefferson 8 years ago

It's just that the python library used won't read SVG from a string, only from a file given a filename, so each character must be written to disk, then read back in.
- userbinator 8 years ago
  
  Wow. To me that's definitely in the realm of "fix this if you need to run this program more than once" inefficiency. After looking at the implementation, it doesn't seem like it takes advantage of TTF's "composite glyphs" feature either, which would be the most straightforward way of generating a font like this --- once you define the box and the digit glyphs, each character is then composed entirely of references to the box and the appropriate digit glyphs.
  - juanpotatoOP 8 years ago
    
    Yeah I tried looking to see how much I would need to change to fix it, but I'm not too great in C which is what fontforge was written in.
    Oh man I gotta look into the composite glyphs. I don't know much about fonts in general. Thanks for this bit of information.
  - Zyst 8 years ago
    
    >To me that's definitely in the realm of "fix this if you need to run this program more than once"
    Maybe the programmer doesn't need to use this more than once?
ClassyJacket 8 years ago

Hobbyist-only coder here who hacks together high-level crap, not memory management and such. Question -
If this program needed to operate from memory instead of a disk, why didn't the coder just... code it that way? Are they saying to use a RAM disk to ensure you don't encounter automatic paging out to disk by the OS?
- codefined 8 years ago
  
  I believe it's a requirement of font forge that they're using.
  It requires file paths, and it will be doing a large amount of reading. Easiest way of solving this is probably just to fake that your memory is a hard drive.
  - djsumdog 8 years ago
    
    As other comments have stated, most operating systems do a pretty good job at caching this type of I/O anyway. It goes back to the idea of not optimizing until you know where the bottlenecks really are. A lot of the time the underlying layers may take care of it for you.

transitorykris 8 years ago

"This is stupid. Yeah probably, but for its very specific use case it's not terribly bad."

Don't put your work down like this. You created something useful for yourself, and likely useful for others. I'd be surprised if someone called it stupid.

djsumdog 8 years ago

Yea I was surprised at that too. This is incredibly useful for debugging and trying to find weird security/unicode gotchas. I can think of a couple of use cases this might be worth trying.
juanpotatoOP 8 years ago

Lol it was a joke. Thanks for the support man!

peterburkimsher 8 years ago

That's great! When I first started trying to learn Chinese, I had a problem because I couldn't read the characters and sometimes couldn't copy-paste from some apps. I made a font that was similar, and then I could use the Unicode Hex Input keyboard. Your font looks so much better though!

juanpotatoOP 8 years ago

Thanks dude!

mintplant 8 years ago

re: license, the SIL Open Font License is common and well-accepted. http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=...

Retr0spectrum 8 years ago

> WARNING: PLEASE DO NOT RUN THIS CODE ON A HARD DRIVE, I'M NOT SURE HOW LONG IT WOULD TAKE. I USED A RAM DISK

How much disk IO does this program generate? Why?

In any case, the OS should cache stuff and give similar performance to using a ramdisk, provided you have enough spare ram.

13of40 8 years ago

> the OS should cache stuff
When I worked at Microsoft back in the 90's I inquired around building 26 as to why I couldn't create a ram disk on what was to become Windows 2000, and this was basically the answer. So yes, if caching doesn't solve this then caching hasn't been implemented correctly.
- MichaelGG 8 years ago
  
  What if you're using a program that fsyncs? And the filesystem isn't just keeping writes cached and flushing only when full. So if I'm writing GB of temp files, I'm causing tons of disk load which may impact other programs. Can't seek to read while writing all my temp data...
  - Dylan16807 8 years ago
    
    > What if you're using a program that fsyncs?
    Remove the fsync or use libeatmydata.
    > So if I'm writing GB of temp files
    The kind that won't fit on a ramdisk in the first place? And this particular use case shouldn't need that.
    
    MichaelGG 8 years ago
    
    The scenario is I'm running a DB on the same system as the temp program. Also, machines with many GB of ram are popular these days.
    
    Dylan16807 8 years ago
    
    If fsync on one file forces all other files to disk, then it is 100% an OS performance problem.
    If your temp files fit in ram along with running programs, then they should stay cached. A ramdisk should only be needed in edge cases where you want to manually force other things out of memory, or because you particularly want to avoid extra writes to disk. In general performance is supposed to be boosted just as much by caching.
    
    MichaelGG 8 years ago
    
    If your disk head is writing the temp file, then it can't be sitting idle waiting to fsync. And it'll eventually write as part of checkpointing.
  - 13of40 8 years ago
    
    That's actually a good point.
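
To make the fsync point above concrete, here is a small sketch of the two write paths being argued about. The helper name `write_scratch` and the file name are hypothetical. Without the `os.fsync` call the data can sit in the OS cache; with it, the call waits until the OS reports the data as flushed.

```python
import os
import tempfile


def write_scratch(path, data, durable=False):
    """Write bytes to path; force them to the device only when durable is True."""
    with open(path, "wb") as handle:
        written = handle.write(data)
        if durable:
            handle.flush()             # push Python's buffer to the OS
            os.fsync(handle.fileno())  # ask the OS to flush its cache to the device
    return written


# Usage: scratch data that can be regenerated does not need the fsync.
with tempfile.TemporaryDirectory() as scratch_dir:
    target = os.path.join(scratch_dir, "glyph.svg")
    write_scratch(target, b"<svg/>")                # can stay in the page cache
    write_scratch(target, b"<svg/>", durable=True)  # waits for the flush
```
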
juanpotatoOP 8 years ago

Fontforge doesn't have a function for loading SVGs as glyphs from strings. Only file paths.

jfk13 8 years ago

Might be interesting to compare SIL's "Unicode BMP Fallback Font",[1] which is significantly more compact (I believe it uses composite glyphs).

[1] http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=...

juanpotatoOP 8 years ago

I'm trying to find how to do this in fontforge
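
A rough sketch of the composite-glyph idea mentioned in this thread, in plain Python with no fontforge calls: each character becomes a list of references to one shared box glyph plus four hex-digit glyphs. The glyph names (`box`, `hex0` to `hexF`) and all the dimensions are hypothetical, as is the two-by-two digit layout. A font tool would then turn each tuple into a reference with an offset.

```python
DIGIT_WIDTH = 250   # hypothetical horizontal step between digits, in font units
DIGIT_HEIGHT = 400  # hypothetical vertical step between the two rows
MARGIN = 100        # hypothetical inset from the box edge


def composite_plan(codepoint):
    """Return (glyph_name, x_offset, y_offset) references for one BMP character."""
    if not 0 <= codepoint <= 0xFFFF:
        raise ValueError("this sketch only lays out four hex digits")
    digits = "%04X" % codepoint
    refs = [("box", 0, 0)]                    # shared outline, defined once
    for i, digit in enumerate(digits):
        row, col = divmod(i, 2)               # two digits per row, two rows
        x = MARGIN + col * DIGIT_WIDTH
        y = MARGIN + (1 - row) * DIGIT_HEIGHT  # first row sits on top
        refs.append(("hex" + digit, x, y))
    return refs
```

With this layout only 17 outlines are ever drawn (the box and 16 digits); every other glyph is a handful of references.
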

lifthrasiir 8 years ago

It is both reasonable and unfortunate that TrueType & OpenType fonts can have no more than 65,535 glyphs. That said, a logical font made of multiple physical fonts can probably be made to support all available planes.

userbinator 8 years ago

> That said, probably the logical font made of multiple physical fonts can be probably made to support all available planes.
Yes, that's how you can fit all of Unicode into one "font" -- use a collection of fonts with at most 64K glyphs each:
https://graphicdesign.stackexchange.com/questions/73166/what...
I'll leave it as an "exercise for the reader" to generate such a font.
- djsumdog 8 years ago
  
  Wow, I did not know about that limitation. That's pretty interesting.
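
As a starting point for that exercise, a small sketch of the arithmetic: split the Unicode code space into consecutive ranges that each fit under a per-font glyph cap. The function name is hypothetical, and the sketch assumes one glyph per code point with no slots reserved, which a real font would likely not get away with (a `.notdef` glyph usually takes one).

```python
def split_code_points(total=0x110000, max_glyphs=65535):
    """Return inclusive (first, last) code point ranges, one per physical font."""
    ranges = []
    start = 0
    while start < total:
        end = min(start + max_glyphs, total)
        ranges.append((start, end - 1))
        start = end
    return ranges
```

Lowering `max_glyphs` to leave room for reserved glyphs, or skipping the surrogate range, changes the count, so treat the result as an estimate.
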

edem 8 years ago

We had a bug some weeks ago which was caused by a missing `trim` and an invisible character. Our tester copied some text from a webpage which had such characters. This font is helpful in those cases.
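
For the same kind of bug, a short sketch of doing the check in code rather than by eye: report every format character, control character, or non-ASCII space in a string. The function name and the choice of Unicode categories (`Cf`, `Cc`, and `Zs` other than a plain space) are assumptions about what counts as "invisible" and may need adjusting.

```python
import unicodedata


def find_invisible(text):
    """Return (index, 'U+XXXX', name) for characters that are easy to miss."""
    hits = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        is_odd_space = category == "Zs" and ch != " "
        if category in ("Cf", "Cc") or is_odd_space:
            name = unicodedata.name(ch, "<unnamed>")  # control characters have no name
            hits.append((index, "U+%04X" % ord(ch), name))
    return hits
```

Note that this also flags ordinary tabs and newlines, since they are control characters.
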

hobarrera 8 years ago

Does anyone know the name of the app that's listing/displaying fonts? It looks pretty neat, and I haven't found anything good-looking like that for Linux/*nix.

relyks 8 years ago

It's GNOME's font manager: https://fontmanager.github.io/
If you're using Ubuntu, there's a package for it
- hobarrera 8 years ago
  
  Oh, thanks, cool.
  It's designed with GNOME in mind, but not part of GNOME itself, so installing it doesn't require pulling a huge part of GNOME either, and that's a nice plus! :)
- juanpotatoOP 8 years ago
  
  yep

LoSboccacc 8 years ago

very nice! this should be the default binary view on many, many things, especially debuggers/inspectors.

how does it handle combining characters? do they get everything in a single box or are they going to be rendered as two boxes?

juanpotatoOP 8 years ago

They _should_ be treated as two separate boxes as I made it so that each glyph is certain to be full width regardless of the default.

dgreensp 8 years ago

I feel like I'm missing the punchline. How big is the generated font?

juanpotatoOP 8 years ago

~20MB

insulanian 8 years ago

I'd call it "Unfont" :-)

boltn 8 years ago

how would one go about running code/operations on ram as opposed to the hd?

k__ 8 years ago

On most operating systems you can use part of your RAM as disk. Format it like a regular disk, mount it and save files on it.
The data is gone if you restart tho.
- ClassyJacket 8 years ago
  
  I've wondered for a while how fast we could make a phone (or PC) that operated entirely in RAM disk and used flash storage just as a one-to-one backup and storage when powered off. Obviously this would require your phone to have 128GB of RAM. You'd write changes to the flash storage, but it'd mirror RAM as closely as possible without destroying power management or storage life.
  Imagine if there was no lag opening any app because everything was in memory. Imagine your code getting simpler because you don't need to load assets off disk - that's done all in one go at boot time. You want to render an image out? Just do it - a reference to a file is a reference to a file, no loading it or wondering if it's loaded.
  Flash storage in phones is fast enough these days that it's probably not worth it, and simply giving traditional phones lots of RAM will probably give 80% of the improvement for 20% of the cost. But I've been curious about the idea for some time.
  - rocky1138 8 years ago
    
    > I've wondered for a while how fast we could make a phone (or PC) that operated entirely in RAM disk and used flash storage just as a one-to-one backup and storage when powered off. Obviously this would require your phone to have 128GB of RAM. You'd write changes to the flash storage, but it'd mirror RAM as closely as possible without destroying power management or storage life.
    PuppyLinux did this for the Eee PC. Worked wonderfully. The only problem was that the distro eventually became outdated, filled with old packages.
    http://www.puppylinux.org/wikka/EeePC
  - Dylan16807 8 years ago
    
    The first Nexus phone had .5GB RAM and .5GB flash.
    In other words, you don't need a ton of space. 8GB split among OS, apps, and running programs would do pretty well, honestly. 16GB is definitely enough. The question becomes more about how much space you want for photos and mp3s, and whether there is actually a benefit to putting those in RAM at all.
  - Someone 8 years ago
    
    The hot stuff already is in the disk cache, so _IF_ the applications are written well, I don't think the difference would be large.
    Also, you may need more RAM than flash, as large resources may/should be compressed in Flash (and personal data should be encrypted)
  - colejohnson66 8 years ago
    
    Then if the phone had originally 4 gigs of ram, you’d need an extra 4 gigs of ram for each app. I guess you could partition your ram to have 4 gigs as scratch, but that would limit the number of programs you can run.
Retr0spectrum 8 years ago

On most Linux distros, /tmp is a ramdisk.
- mcpherrinm 8 years ago
  
  I don't think this is nearly universal - certainly none of the systems I use have /tmp as tmpfs.
  /dev/shm, on the other hand, is almost always guaranteed to be a tmpfs on glibc systems.
- Twirrim 8 years ago
  
  No.... some, sure. Not most.
  RHEL7, and all its derivatives (CentOS, Oracle Linux etc.) don't use tmpfs for /tmp by default, and they are arguably the most used Linux distros in the world.
- JetSpiegel 8 years ago
  
  But with only 1 GB capacity
  - glandium 8 years ago
    
    It depends on your setup. The default when no size is given is to use half the RAM.
    
    colejohnson66 8 years ago
    
    Half? That seems kind of excessive. Are you sure that’s right?
    
    Sidnicious 8 years ago
    
    I would guess that it’s given that capacity, but the memory is allocated lazily.
    
    yjftsjthsd-h 8 years ago
    
    Correct; tmpfs only uses as much memory as is used by the files it contains.
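
Rather than guessing whether /tmp is a ramdisk on a given machine, one can check. A minimal sketch, assuming a Linux system that exposes `/proc/mounts` in the usual "device mountpoint fstype options ..." layout; the function names are hypothetical.

```python
def tmpfs_mount_points(mounts_text):
    """Return the mount points whose filesystem type is tmpfs."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] == "tmpfs":
            points.append(fields[1])
    return points


def is_tmpfs(path, mounts_file="/proc/mounts"):
    """True if path is itself a tmpfs mount point (Linux only)."""
    with open(mounts_file) as handle:
        return path in tmpfs_mount_points(handle.read())
```

Calling `is_tmpfs("/tmp")` and `is_tmpfs("/dev/shm")` on the machine in question answers the disagreement above for that machine only.
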

nicostouch 8 years ago

TOFU = Terrified Of Fucking it Up.

e.g. I wanna try it but I'm TOFU
