A Gentle Guide to Typography: From Chisels to Character Sets

April 03, 2026 · 46 min read

Before there were fonts, before there were printing presses, before there was even an alphabet – there were people who wanted to say things that would last longer than a breath.

They scratched marks into wet clay. They carved shapes into stone. They painted on cave walls with ground-up ochre and spit – the pigments at Lascaux date to around 17,000 years ago.1 But that’s not the oldest mark-making by a long stretch. The First Nations peoples of Australia – the oldest continuous civilisation on Earth – were creating rock art tens of thousands of years earlier. Petroglyphs in the Pilbara region of Western Australia have been dated to at least 30,000 years ago, and charcoal drawings in Arnhem Land’s Nawarla Gabarnmang shelter push past 28,000 years.1a Some researchers argue the tradition extends back 65,000 years or more, to the earliest evidence of human settlement on the continent.1b Writing, in its oldest form, was a physical act: you took a tool and you pushed it into something that would hold the mark after you walked away.

This is where typography starts. Not with software. Not with design theory. With someone pressing a wedge into clay and thinking: I want this to outlive me.

From hand to mould

For thousands of years, every copy of every written document was made by hand. Scribes – often monks in medieval Europe – would sit for hours copying text character by character onto parchment or vellum. Each copy was unique. Each was slightly different. The handwriting of the scribe was the “font”, though nobody called it that.

Then, around 1440, Johannes Gutenberg changed everything.2

Gutenberg didn’t invent printing – the Chinese had been doing block printing for centuries, and Bi Sheng had created movable type from baked clay as early as 1040 AD.3 What Gutenberg invented was movable metal type: individual letters, each cast as a small block of a lead-tin-antimony alloy,4 that could be arranged into words, locked into a frame, inked, and pressed onto paper. When you were done printing one page, you could break the letters apart and rearrange them into something else.

This was revolutionary, and it introduced a bunch of concepts we still use today. So let’s walk through them – starting from the most fundamental.

Characters

A character is the abstract idea of a letter, digit, or symbol. The letter “A” is a character. So is “7”. So is “?”. So is “é”. A character doesn’t have a specific shape – it’s the concept of that symbol. When you think of the letter B, you’re thinking of a character: the second letter of the Latin alphabet, regardless of whether it’s tall and thin or short and round.

This distinction matters because the same character can look wildly different depending on who’s drawing it. Your handwritten “g” looks nothing like the “g” on this screen, but they’re the same character. They carry the same meaning.

Glyphs

A glyph is the specific visual shape that represents a character. If a character is the idea, a glyph is the drawing. The letter “a” is a character; the particular way it looks in this paragraph – its curves, its weight, its proportions – that’s a glyph.

One character can have many glyphs. Think about “a” for a moment. There’s the double-storey version (the kind you’re probably reading now, with a little arch over a closed bowl) and the single-storey version (the simpler one that looks like a circle with a stick, the kind most people write by hand). Both are glyphs of the same character.

This goes further. An italic “a”, a bold “a”, a small-caps “A” – these are all different glyphs of the same character. Gutenberg understood this instinctively. His Bible used around 290 distinct glyphs – far more than the alphabet required – including variant letterforms and common ligatures, all designed to mimic the natural variation of handwriting.5

Typefaces

Now we’re getting to the term people most often mix up.

A typeface is a designed set of glyphs that share a consistent visual style. When someone sits down and draws a complete alphabet – uppercase, lowercase, numbers, punctuation – in a unified style, they’ve created a typeface. Helvetica is a typeface. Garamond is a typeface. Times New Roman is a typeface.

The word “typeface” comes directly from the physical world. In Gutenberg’s workshop, each metal letter block had a face – the raised surface that got inked and pressed onto paper. A set of blocks sharing the same design was a set of type with the same face. A typeface.

When people say “I love that font”, they usually mean the typeface: the overall design, the aesthetic, the personality. And that’s fine – language evolves. But if you want to be precise, the typeface is the design.

Fonts

So what’s a font then?

In the metal-type era, a font was a specific size and style of a typeface. Garamond 12-point italic was one font. Garamond 14-point bold was a different font. They were literally different sets of physical metal blocks. You had to buy them separately and store them in different drawers.

Those drawers, by the way, were called cases. The capital letters were stored in the upper case (the harder-to-reach one, since capitals are used less often) and the small letters in the lower case – which is where we get the terms “uppercase” and “lowercase”.6 (Lovely, isn’t it?)

In the digital world, the distinction has blurred. A font file today usually contains the full set of glyphs for one style of a typeface – Garamond Italic, say, or Garamond Bold. The typeface is the family; the font is the specific file or instance. But in everyday conversation, “font” and “typeface” are used interchangeably, and that’s okay.

Font faces

Font face is a term that lives mostly in the world of CSS and web development. When you write @font-face in a stylesheet, you’re telling the browser: here’s a font file, and here’s what I want you to call it. It’s the bridge between a font file sitting on a server and a name you can use in your design.

In broader typographic conversation, “font face” and “typeface” mean roughly the same thing – the visual design of the letterforms.

Serifs (and their absence)

Look at the letters in a book printed in Times New Roman. See those little feet and flicks at the ends of the strokes? Those are serifs.

The word probably comes from the Dutch schreef, meaning “stroke” or “line”.7 Serifs have been around since Roman times – literally. If you look at the inscriptions on Trajan’s Column in Rome (dedicated 113 AD), the letters have serifs.8 There’s a beautiful theory, advanced by Edward Catich in his 1968 study The Origin of the Serif, that they originated not from the chisel but from the brush: before carving, Roman stonecutters painted the letterforms with a flat brush, and the natural flare of each brush stroke at the start and end of a line became the serif. The chisel then faithfully followed the painted guide.9

Typefaces with serifs – like Garamond, Baskerville, Georgia, and Times New Roman – are called serif typefaces. They feel classic, bookish, warm. Serifs also have a practical function: they help guide the eye along a line of text, creating a subtle visual rail. That’s why they’ve been the default for body text in printed books for centuries.

Typefaces without serifs – like Helvetica, Arial, Futura, and Gill Sans – are called sans-serif typefaces (“sans” is French for “without”). They tend to feel modern, clean, minimal. On screens, especially at small sizes, sans-serif typefaces have historically been easier to read because the fine details of serifs can get lost in low-resolution pixels. (High-resolution screens have closed that gap considerably.)

There are other categories too. Slab serif typefaces (like Rockwell or Courier) have thick, blocky serifs – bold and industrial. Monospaced typefaces give every character the same width, which is why they’re used for code: everything lines up neatly. Script typefaces mimic handwriting. Display typefaces are designed for headlines and large sizes, where they can be dramatic without worrying about readability at 10 points.

Spacing and leading

When Gutenberg assembled his type, the letters didn’t just touch each other. The metal blocks had built-in spacing – a little extra metal on each side of the letter face, so that when you lined them up, there was breathing room between characters.

Spacing (or tracking in modern terminology) is the uniform adjustment of space between all characters in a block of text. Increase the tracking and the text feels airy, open, maybe a little aloof. Decrease it and things get tight, urgent, compressed. Good tracking is invisible – you don’t notice it, but you feel comfortable reading.

Leading (pronounced “ledding”) is the vertical space between lines of text. The name comes from the actual strips of lead that typesetters placed between rows of metal type to push the lines apart.10 More leading gives text room to breathe. Less leading packs it in. The right amount depends on the typeface, the line length, and where the text is being read. Cramped leading is one of the quickest ways to make text feel hostile.

Kerning

Kerning is the adjustment of space between specific pairs of characters. This is different from tracking, which affects all characters equally. Kerning is about individual relationships.

Consider the letters “AV”. Because of their shapes – one leaning left, one leaning right – if you just space them evenly using each letter’s default width, there’ll be an awkward gap between them. It looks like “A V” instead of “AV”. Kerning tucks them closer together so they feel right.

Other classic kerning pairs: “To”, “We”, “Ty”, “VA”, “LT”. Any combination where the shapes of adjacent letters create an optical gap that needs closing.

Good kerning is something you never notice. Bad kerning is something you can’t unsee. (There’s a whole internet subculture dedicated to finding poorly kerned signs. It’s called “keming” – because that’s what “kerning” looks like with bad kerning.)

Metrics and the anatomy of letters

Typographers have a precise vocabulary for the parts of a letter, and it’s unexpectedly beautiful.

The baseline is the invisible line that letters sit on. The x-height is the height of the lowercase “x” – and by extension, most lowercase letters. Tall lowercase letters like “b” and “d” have ascenders that rise above the x-height. Letters like “p” and “g” have descenders that drop below the baseline.

The cap height is the height of capital letters. The bowl is the rounded part of letters like “b”, “d”, “p”, and “o”. The counter is the enclosed (or partially enclosed) space inside a letter – the hole in “o”, the gap inside “e”. The stroke is any main line in a letter. A terminal is where a stroke ends without a serif.

The em is a unit of measurement that originally meant the width of the capital M – because M was typically the widest letter, and its width roughly equalled its height, making a nice square. Today, an em is simply equal to the current point size: in 16-point type, an em is 16 points. It’s used everywhere in typography and CSS. An en is half an em – roughly the width of a capital N, and the unit behind the en-dash (–), which is half the width of an em-dash (—).

But what is a point? And how does it relate to the pixels on your screen?

A point (pt) is the fundamental unit of typographic measurement. The concept dates back to Pierre Simon Fournier, who proposed a standardised point system in 1737, later refined by François-Ambroise Didot in the 1780s.11 In the modern PostScript standard (used by virtually all digital typography), one point is exactly 1/72 of an inch.12 So 72-point type has letters about an inch tall. This wasn’t always the case – before digital standardisation, different countries used slightly different point sizes. The American point (established by the American Type Founders Association in 1886) was 0.01383 inches; the French Didot point was 0.01483 inches – about 7% larger13 – which made international typesetting exciting in all the wrong ways.

A pica is 12 points, or 1/6 of an inch. Picas are used for measuring larger things – column widths, margins, page dimensions. If a designer says “set the body text in 10-point on a 20-pica column”, they mean 10-point type in a column about 3.3 inches wide. You might also encounter the cicero, the continental European equivalent of the pica, which is 12 Didot points (slightly larger than an Anglo-American pica). It’s mostly historical now, but you’ll find it in older European typography manuals.

A pixel (px) is a single illuminated dot on your screen, and its physical size depends entirely on the display. On a 96-DPI (dots per inch) screen – the traditional Windows default – one pixel is 1/96 of an inch, so a CSS “point” (1/72 inch) works out to about 1.33 pixels. On a modern Retina display at 220 DPI, the same point might be 3 or more physical pixels.

This is where it gets confusing. CSS defines 1px as exactly 1/96 of an inch14 – but on high-DPI screens, a CSS pixel might map to 2 or 3 physical device pixels. Your phone’s “logical” resolution (the one websites see) is often half or a third of its actual hardware resolution. The operating system handles the scaling, which is why text looks sharp on a Retina display: there are simply more physical pixels per logical pixel, giving the rasteriser more dots to work with when drawing those Bézier curves.

In practice: points for print, pixels for screens, ems for responsive design. An em in CSS is relative to the current font size, so padding: 1em means “pad by one font-size’s worth, whatever size we’re using”. This makes layouts scale naturally when the user changes their font size – which is why web designers love ems and their cousin, the rem (root em), which is relative to the root element’s font size rather than the current element’s.
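
If you like seeing the arithmetic laid out, here’s a small Python sketch of the unit relationships above – nothing here is an API, just the conversions themselves:

# CSS fixes 1in = 96px and 1pt = 1/72in, so 1pt = 4/3 CSS px.
def points_to_css_pixels(pt: float) -> float:
    return pt * 96 / 72                  # 12pt -> 16px

def css_to_device_pixels(px: float, scale: float) -> float:
    return px * scale                    # scale is 2 or 3 on high-DPI displays

def em_to_pixels(em: float, font_size_px: float) -> float:
    return em * font_size_px             # ems are relative to the current font size

print(points_to_css_pixels(12))          # 16.0
print(css_to_device_pixels(16, 2))       # 32.0 physical pixels on a 2x screen
print(em_to_pixels(1.5, 16))             # 24.0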

Character sets and encodings

Now we leave the world of ink and metal and enter the world of computers. And things get… complicated.

When computers first needed to represent text, someone had to decide: which characters do we support, and how do we store them?

ASCII (American Standard Code for Information Interchange), first published as ASA X3.4-1963 and revised several times through 1986,15 was one of the earliest answers. It used 7 bits to represent 128 characters: the English alphabet (upper and lower), digits 0-9, punctuation, and a handful of control characters (like “new line” and “tab”). It was simple, elegant, and completely inadequate for anyone who didn’t write in English.

To make this tangible, here’s what the letter “R” looks like as actual bits in ASCII:

Character:  R
Decimal:    82
Hex:        52
Binary:     01010010

Seven bits of information. That’s all it takes. The letter “A” is 01000001 (65), “B” is 01000010 (66), and so on. Uppercase and lowercase letters are exactly 32 apart – “a” is 01100001 (97) – which means you can convert between them by flipping a single bit (bit 5, if you’re counting from zero). This wasn’t an accident; the designers of ASCII – Robert Bemer of IBM prominent among them – were very clever about the layout.16
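
You can verify the bit-flipping trick in a couple of lines of Python (it only holds for the ASCII letters, of course – digits and punctuation have no case):

# Toggle case by flipping bit 5 (value 32, or 0b00100000).
def toggle_case(ch: str) -> str:
    return chr(ord(ch) ^ 0b00100000)

print(toggle_case("R"))   # r  (82 ^ 32 == 114)
print(toggle_case("a"))   # A  (97 ^ 32 == 65)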

A character set (or charset) is the complete collection of characters that a system recognises. ASCII’s character set has 128 members. That’s fine for English, but French needs accented characters, German needs ß and umlauts, Greek needs an entirely different alphabet, and that’s before we even get to Chinese, Japanese, Korean, Arabic, Hindi, or the hundreds of other writing systems used by actual humans.

The 1980s and 90s saw a proliferation of extended character sets – ISO 8859-1 for Western European languages, ISO 8859-5 for Cyrillic, Shift JIS for Japanese, Big5 for Traditional Chinese. Each one carved out a different set of 256 (or more) characters. This sort of worked if everyone agreed on which character set they were using, but of course they often didn’t. The result was mojibake: garbled text where characters from one encoding were displayed using another’s mapping. You’ve seen it. Those weird sequences of ä and ’ where accented letters and curly quotes should be? That’s mojibake.
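
You can manufacture mojibake on demand. Encoding text as UTF-8 and then decoding the bytes as Windows-1252 (a superset of ISO 8859-1, and the usual culprit) reproduces exactly those artefacts – a quick Python sketch:

# Write bytes in one encoding, read them back in another: instant mojibake.
text = "déjà vu – it’s here"
garbled = text.encode("utf-8").decode("cp1252")
print(garbled)   # dÃ©jÃ  vu â€“ itâ€™s here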

Unicode: one set to rule them all

Unicode was the attempt to fix this mess, and it’s one of the great technical achievements of the modern era, even if nobody outside of a relatively small group of people appreciates it.

The idea was simple and ambitious: create a single character set that includes every character from every writing system, living or dead, plus mathematical symbols, emoji, musical notation, and anything else humans have ever wanted to write down.

Each character in Unicode gets a unique number called a code point. These are written using a notation you’ll see everywhere: “U+” followed by a hexadecimal number. Hexadecimal (base 16) uses the digits 0-9 and the letters A-F, so each digit represents a value from 0 to 15. It’s used because it maps neatly onto bytes – two hex digits represent exactly one byte. The “U+” prefix just means “Unicode code point”.

So when you see U+0041, that means Unicode code point number 65 (in decimal) – which is the letter “A”. U+03B1 is code point 945 – the Greek letter alpha (α). U+1F600 is code point 128512 – the emoji 😀. Higher code points broadly correspond to rarer scripts, symbols, and later additions, though allocation happens by block rather than strict chronology. The first 128 code points (U+0000 to U+007F) map directly to ASCII, which was a deliberate design choice that made adoption much easier.
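
Python exposes code points directly, which makes the U+ notation easy to play with:

print(hex(ord("A")))          # 0x41  -> U+0041, decimal 65
print(hex(ord("α")))          # 0x3b1 -> U+03B1, decimal 945
print(chr(0x1F600))           # 😀    -> U+1F600, decimal 128512
print(f"U+{ord('😀'):04X}")   # U+1F600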

As of Unicode 16.0 (September 2024), the standard defines 154,998 characters covering 168 scripts.17 Every one of them has a code point and an official name. U+0052 is LATIN CAPITAL LETTER R. U+2603 is SNOWMAN (☃). U+1F4A9 is PILE OF POO (💩). The naming is meticulous, sometimes whimsical, and always permanent – once a character is added, it’s never removed.

But a code point is just a number. To actually store and transmit that number in a computer, you need an encoding – a scheme for turning code points into bytes.

UTF-8 is the most common encoding on the web – used by over 98% of all websites as of 202418 – and the one you should almost always use. It was designed in September 1992 by Ken Thompson and Rob Pike, famously sketched out on a placemat in a New Jersey diner.19 It’s clever: ASCII characters (U+0000 to U+007F) are stored as a single byte, identical to their ASCII values, so all existing ASCII text is automatically valid UTF-8. Characters outside ASCII use 2, 3, or 4 bytes as needed. This makes it compact for English text and capable of representing any Unicode character.

To see the difference, let’s look at how a few characters are stored as actual bytes across the different encodings. First, something simple – the letter “R” (U+0052):

Encoding   Bytes (hex)       Bytes (binary)
ASCII      52                01010010
UTF-8      52                01010010
UTF-16     00 52             00000000 01010010
UTF-32     00 00 00 52       00000000 00000000 00000000 01010010

For basic Latin characters, UTF-8 and ASCII are identical – one byte. UTF-16 pads it to two bytes. UTF-32 pads it to four. You can see why UTF-32 is wasteful for English text: three of those four bytes are zeros, carrying no information.

Now something outside ASCII – the pound sign “£” (U+00A3):

Encoding   Bytes (hex)       What's happening
ASCII      --                Can't represent it (not in the character set)
Latin-1    A3                One byte -- works, but only in this specific encoding
UTF-8      C2 A3             Two bytes (the C2 signals "two-byte sequence")
UTF-16     00 A3             Two bytes
UTF-32     00 00 00 A3       Four bytes

And something further afield – the Japanese character “字” (U+5B57, meaning “character” – how fitting):

Encoding   Bytes (hex)       What's happening
ASCII      --                Can't represent it
Latin-1    --                Can't represent it
Shift JIS  8E 9A             Two bytes (Japanese-specific encoding)
UTF-8      E5 AD 97          Three bytes (the E5 signals "three-byte sequence")
UTF-16     5B 57             Two bytes (falls within the Basic Multilingual Plane)
UTF-32     00 00 5B 57       Four bytes

And finally, an emoji – “😀” (U+1F600):

Encoding   Bytes (hex)          What's happening
ASCII      --                   Can't represent it
UTF-8      F0 9F 98 80          Four bytes (the F0 signals "four-byte sequence")
UTF-16     D8 3D DE 00          Four bytes (a surrogate pair -- two 2-byte code units)
UTF-32     00 01 F6 00          Four bytes (same size as everything else in UTF-32)

Notice how UTF-8 scales: 1 byte for ASCII, 2 for European characters, 3 for most of the world’s living languages, and 4 for emoji and rarer scripts. The leading bits of each byte tell the decoder how many bytes to read. It’s an elegant piece of engineering.
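
The tables above can be reproduced in a few lines of Python. (The utf-16 and utf-32 codecs prepend a byte-order mark by default, so the explicit little-endian variants are used here to keep the byte counts clean.)

for ch in ["R", "£", "字", "😀"]:
    print(ch,
          ch.encode("utf-8").hex(" "),     # the UTF-8 bytes
          len(ch.encode("utf-8")),         # 1, 2, 3, 4 bytes
          len(ch.encode("utf-16-le")),     # 2, 2, 2, 4 bytes
          len(ch.encode("utf-32-le")))     # always 4 bytes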

UTF-16 uses 2 bytes for characters in the Basic Multilingual Plane (the first 65,536 code points, which covers most living languages) and 4 bytes for everything else. Those 4-byte characters are encoded using pairs of 2-byte values called surrogate pairs – a clever hack that lets UTF-16 reach the full Unicode range while keeping the common case compact. UTF-16 is used internally by Windows, Java, and JavaScript. If you’ve ever been bitten by a JavaScript string reporting the wrong .length for an emoji, that’s because JavaScript counts UTF-16 code units, not characters, and your emoji needed a surrogate pair.
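
The surrogate-pair arithmetic itself is pleasingly simple – subtract 0x10000, then split the remaining 20 bits in half. A sketch (and the reason a lone 😀 has .length 2 in JavaScript):

def surrogate_pair(cp: int) -> tuple[int, int]:
    assert cp > 0xFFFF, "BMP characters need no surrogates"
    cp -= 0x10000                  # 20 bits remain
    high = 0xD800 + (cp >> 10)     # top 10 bits
    low = 0xDC00 + (cp & 0x3FF)    # bottom 10 bits
    return high, low

print([hex(u) for u in surrogate_pair(0x1F600)])   # ['0xd83d', '0xde00']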

UTF-32 (sometimes called UCS-4) takes the brute-force approach: 4 bytes for every single character, no exceptions. This makes it simple – the nth character is always at byte offset 4n, so random access is trivial. But it’s wasteful. An English text file in UTF-32 is four times the size of the same file in UTF-8, with three zero bytes for every one byte of actual data.

There are also some historical encodings worth knowing about. UCS-2 was an early 2-byte encoding that predates UTF-16 – it could only represent the first 65,536 code points and had no surrogate pair mechanism, so it couldn’t handle emoji or many CJK characters. It’s effectively obsolete, but you’ll occasionally encounter it in older systems. UTF-7 was designed for email systems that could only handle ASCII – it encoded Unicode characters using only ASCII-safe bytes. It was slow, complex, and is now deprecated for security reasons (it enabled some nasty injection attacks).

The encoding is not the character set. Unicode is the character set (the list of characters and their code points). UTF-8, UTF-16, and UTF-32 are encodings (ways of turning those code points into bytes). This distinction trips people up constantly, but it matters. You might say “this file is Unicode” when you mean “this file is encoded in UTF-8”. Unicode tells you which characters exist. The encoding tells you how they’re stored as bytes.

Representations: how letters become pixels

So we have characters (abstract ideas), code points (numbers assigned to those ideas), encodings (ways to store those numbers), and typefaces (visual designs). The last piece of the puzzle is: how does a computer actually draw a letter on screen?

There are two main approaches.

Bitmap fonts were the early method. Each glyph was stored as a grid of pixels – literally a tiny picture. This was fast to render but didn’t scale well. A bitmap font designed for 12-point looked terrible at 24-point because you were just scaling up the pixel grid, producing jagged edges.

Outline fonts (also called vector fonts) solved this. Instead of storing a grid of pixels, they store the shape of each glyph as a set of mathematical curves – typically Bézier curves, named after Pierre Bézier, the French engineer at Renault who developed them in the 1960s for designing car bodies.20 (Paul de Casteljau at Citroën independently developed equivalent mathematics around the same time, but Renault published first.21) To display the letter, the computer calculates which pixels fall inside the outline and fills them in. This process is called rasterisation, and it’s why outline fonts scale beautifully to any size.
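
The core of the idea is small enough to sketch. Here’s a cubic Bézier (the kind used in PostScript/CFF outlines; TrueType uses quadratics) evaluated by de Casteljau’s algorithm – repeated linear interpolation between control points:

def lerp(p, q, t):
    return (p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t)

def cubic_bezier(p0, p1, p2, p3, t):
    # de Casteljau: interpolate down from 4 points to 1.
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    return lerp(d, e, t)

# Sample an arch-like curve; a rasteriser decides which pixels fall inside
# the closed outline built from segments like this.
arch = [cubic_bezier((0, 0), (0, 100), (100, 100), (100, 0), t / 10) for t in range(11)]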

The two dominant outline font formats are TrueType (developed by Apple and announced in 1991, partly to avoid Adobe’s licensing fees for PostScript Type 1 fonts,22 with files ending in .ttf) and OpenType (announced jointly by Microsoft and Adobe in 1996,23 with files ending in .otf or .ttf). OpenType is essentially TrueType’s successor and adds support for advanced typographic features: ligatures, small caps, stylistic alternates, and more.

Hinting is the process of adjusting how outlines are rasterised at small sizes on low-resolution screens. Without hinting, the mathematical curves of a glyph might fall between pixels, creating blurry or uneven strokes. Hints are instructions embedded in the font that snap the outlines to the pixel grid at small sizes, keeping text crisp. It’s painstaking work, and it’s one of the reasons well-hinted fonts (like the core Microsoft fonts) have historically looked so much better on screen than cheaper alternatives.

Ligatures: when letters merge

A ligature is a single glyph made by combining two or more characters. The most common one in English is “fi” – in many serif typefaces, the dot of the “i” collides with the overhang of the “f”, so designers create a special glyph where the two letters are fused together. Other common ligatures: “fl”, “ff”, “ffi”, “ffl”.

Ligatures started as a practical solution in metal type (it was easier to cast certain letter combinations as a single piece) and survived because they look good. OpenType fonts can contain dozens of ligatures, and modern software can substitute them automatically.

Some typefaces take this further with contextual alternates – glyphs that change shape depending on what’s next to them. This is especially common in script typefaces, where a letter might have a different tail depending on the following letter, mimicking the natural flow of handwriting.

How long is a piece of string (or: what even is a character?)

You’d think counting characters would be simple. You want to allow 600-character comments on your website. How hard can it be? You just… count the characters. Right?

Welcome to one of the most quietly maddening problems in software engineering.

Let’s start with something innocent: the letter “é”. Is that one character? It depends on who you ask. In Unicode, it can be represented two ways. There’s U+00E9, LATIN SMALL LETTER E WITH ACUTE – a single code point, unambiguously one thing. But there’s also the two-code-point sequence U+0065 (LATIN SMALL LETTER E) followed by U+0301 (COMBINING ACUTE ACCENT). These render identically. They mean the same thing. They’re defined as canonically equivalent by the Unicode standard. But one is one code point and the other is two.

So when your user types “café” into your 600-character comment box, how many characters is that? If you count code points, it might be 4 or 5, depending on which representation of “é” their keyboard produced. If you count UTF-8 bytes, it’s 5 or 6. If you count UTF-16 code units (which is what JavaScript’s .length does), “café” happens to match the code-point count – but add an emoji and the numbers diverge again.

Now add emoji. The thumbs-up emoji 👍 is one code point: U+1F44D. But 👍🏽 (thumbs up with a medium skin tone) is two code points: U+1F44D followed by U+1F3FD (a skin tone modifier). They render as a single visible symbol. The family emoji 👨‍👩‍👧‍👦 is seven code points stitched together with invisible joiners (U+200D, ZERO WIDTH JOINER): man + joiner + woman + joiner + girl + joiner + boy. One “character” on screen. Seven code points. Many more bytes.

And flags! The flag emoji 🇬🇧 is two code points: U+1F1EC (REGIONAL INDICATOR SYMBOL LETTER G) followed by U+1F1E7 (REGIONAL INDICATOR SYMBOL LETTER B). The system pairs them up and displays a flag. What happens if you insert a character between them? Now you’ve got two orphaned regional indicators that render as ugly letter boxes. Is this one character? Two?

The Unicode standard defines a concept called grapheme clusters – sequences of code points that together represent a single user-perceived character. This is probably what you mean when you say “character”, and it’s what a well-implemented character counter should count. But getting grapheme cluster segmentation right requires implementing a nontrivial Unicode algorithm (UAX #29, “Unicode Text Segmentation”24). Most programming languages don’t do this by default. Python’s len() counts code points. JavaScript’s .length counts UTF-16 code units. Neither counts what a human would call “characters”.
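
Python makes the mismatch easy to demonstrate. (Counting true grapheme clusters needs UAX #29; the standard library doesn’t implement it, so the last line leans on the third-party regex module – an assumption, not built in.)

import unicodedata

composed = "caf\u00e9"      # é as one code point, U+00E9
decomposed = "cafe\u0301"   # e + U+0301 COMBINING ACUTE ACCENT
print(len(composed), len(decomposed))        # 4 5 -- len() counts code points
print(composed == decomposed)                # False, despite identical rendering
print(unicodedata.normalize("NFC", decomposed) == composed)   # True

family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"  # 👨‍👩‍👧‍👦
print(len(family))                           # 7 code points, one visible symbol
# import regex; len(regex.findall(r"\X", family))   # 1 grapheme cluster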

So your 600-character limit? If you implement it by counting .length in JavaScript, a user could type 300 emoji and hit your limit – because each emoji is two UTF-16 code units. Or they could paste in text with combining accents and get 600 “characters” that look like 400. Or they could use a single family emoji and consume 11 of their 600 “characters” on one symbol.

The correct answer is to count grapheme clusters, validate on the server (since the client can always lie), and honestly, to be generous with your limits because this stuff is harder than it has any right to be.

There’s an old joke among internationalisation engineers: “How many characters are in this string?” “It depends on what you mean by ‘character’.” It’s not really a joke. It’s more of a warning.

When letters lie: homoglyphs and Punycode

Unicode’s ambition – including every character from every writing system – introduced a problem that no one at the printing press ever had to worry about: characters from different scripts that look identical.

The Latin letter “a” (U+0061) and the Cyrillic letter “а” (U+0430) are visually indistinguishable in most typefaces. The same goes for Latin “o” and Cyrillic “о”, Latin “p” and Cyrillic “р”, Latin “e” and Cyrillic “е”. These are called homoglyphs – different characters that produce identical (or nearly identical) glyphs.
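
The ambiguity is invisible to the eye but perfectly visible to the machine:

import unicodedata

latin, cyrillic = "a", "\u0430"
print(latin == cyrillic)            # False -- different code points
print(unicodedata.name(latin))      # LATIN SMALL LETTER A
print(unicodedata.name(cyrillic))   # CYRILLIC SMALL LETTER A
print("раssword" == "password")     # False: the first two letters are Cyrillic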

This is a problem because domain names can contain non-ASCII characters. The system that makes this work is called Internationalised Domain Names (IDN), and under the hood it uses an encoding called Punycode to convert Unicode domain names into ASCII-safe strings that DNS can handle. The domain “münchen.de” becomes “xn--mnchen-3ya.de” in Punycode. The “xn--” prefix tells the system it’s an encoded internationalised domain.
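
Python ships an idna codec that performs this conversion, so you can watch it happen:

print("münchen.de".encode("idna"))           # b'xn--mnchen-3ya.de'
print(b"xn--mnchen-3ya.de".decode("idna"))   # münchen.de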

The security implications are nasty. An attacker can register a domain like “аpple.com” where the first “а” is Cyrillic, not Latin. To the naked eye, this looks exactly like “apple.com”. The underlying Punycode is completely different (“xn--pple-43d.com”), but browsers display the pretty Unicode version. This is called an IDN homograph attack, first described by Evgeniy Gabrilovich and Alex Gontmakher in a 2002 paper,25 and it has been used for real-world phishing.

Browsers have defences. Most will display the Punycode version instead of the Unicode version if the domain mixes scripts suspiciously – if some characters are Latin and others are Cyrillic, for instance. Chrome, Firefox, and Safari each have slightly different rules for when to show the Punycode, and these rules have been refined over years of cat-and-mouse with attackers. But the fundamental problem remains: Unicode gives us more than 154,000 characters, many of which look alike, and any system that displays them needs to decide how much to trust what it’s showing you.

It’s not just URLs. Homoglyphs can appear in code too. A variable name that looks like password but uses a Cyrillic “а” is a different identifier entirely. Malicious pull requests have used this trick to sneak backdoors past code review. Some code editors now flag mixed-script identifiers, and Unicode itself defines a set of security mechanisms (documented in Unicode Technical Report #36, “Unicode Security Considerations”26) for detecting confusable characters.

Gutenberg’s compositor never had this problem. Every letter in his type case was unambiguous – you could pick it up and feel its shape. In the digital world, two characters can be byte-for-byte different but pixel-for-pixel identical. The typeface doesn’t lie; the character set does.

What happens when you press a key

Let’s make all of this concrete. You’re sitting at your computer and you press the letter “R”. What actually happens – and how did we get here?

The scribe’s version (500 AD). A monk in a scriptorium dips a quill in iron gall ink. He looks at the exemplar – the book he’s copying from – and draws an R. His hand shapes the stroke, the bowl, the leg. The letter exists because his muscles moved in a practised pattern. The “input device” is his hand; the “rendering engine” is also his hand. The glyph is one-of-a-kind.

The printer’s version (1500 AD). A compositor stands at a type case. He reaches into the compartment labelled R, picks up a small metal block – reversed, so it’ll print the right way round – and slots it into the composing stick alongside the other letters. Later, the assembled type is locked into a frame, inked with a leather ball, and pressed onto dampened paper. The letter R is now reproducible. The same block can print the same R a thousand times. The “input” is the compositor’s hand selecting the right piece of type; the “rendering” is the press.

The typist’s version (1900 AD). A typist sits at a typewriter and strikes the R key. A mechanical linkage swings a type bar upward. On the end of the bar is a small metal slug with a reversed R on its face. It hits an inked ribbon, which presses against paper, leaving the shape of the letter. One keystroke, one character, one glyph. The “encoding” is purely mechanical: each key is physically connected to exactly one letterform. (This is also where monospaced type became the norm – every character had to occupy the same width so the carriage could advance by a fixed amount after each keystroke.)

The early computer’s version (1980 AD). You press R on the keyboard of an IBM PC. The keyboard controller sends a scan code – a number identifying which physical key was pressed (not which character it represents – that comes later). The operating system’s keyboard driver translates the scan code into a character code. On this machine, that means ASCII: the letter R is stored as the number 82 (binary 01010010). The application receives this number, looks it up in a bitmap font – a grid of pixels for each character – and copies those pixels into video memory. The screen redraws. An R appears. The letter is now a number that becomes a picture.

The modern version (today). You press R. The keyboard sends a scan code (via USB or Bluetooth). The operating system’s input system translates it – through the keyboard layout (QWERTY? AZERTY? Dvorak?) – into a Unicode code point: U+0052, LATIN CAPITAL LETTER R. This code point might be stored in memory as UTF-8 (the single byte 0x52, since R falls within ASCII’s range), or as UTF-16 (the two bytes 0x00 0x52), depending on the application.

Now the text renderer takes over. It looks up the current font – say, a .otf OpenType file for the typeface Inter. Inside that file, it finds the glyph for U+0052: a set of Bézier curves describing the outline of the letter R in this particular design. The renderer checks the kerning table to see if R needs to be nudged closer to or further from the characters on either side. It checks for ligatures – does this R combine with the next character into a special glyph? (Probably not for R, but the system checks every time.) It applies hinting to snap the curves to the pixel grid at the current size. It rasterises the outline – filling in pixels that fall inside the curves – with subpixel rendering to smooth the edges using the red, green, and blue subpixels of your LCD. The result is painted into the application’s window buffer, which is composited with other windows by the operating system and sent to the display.

All of that – scan code to keyboard driver to Unicode code point to glyph lookup to Bézier curves to kerning adjustment to hinting to rasterisation to subpixel rendering to composited display – happens in microseconds. You press R, and R appears. It feels instant because it is.
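
If you want to poke at one stage of that pipeline yourself, the glyph lookup is the easiest to reach. A rough sketch using the third-party fontTools library – the file name here is a stand-in for whatever TrueType/OpenType font you have on disk:

from fontTools.ttLib import TTFont

font = TTFont("Inter-Regular.otf")    # hypothetical path; any .ttf/.otf works
cmap = font.getBestCmap()             # Unicode code point -> glyph name
glyph_name = cmap[ord("R")]           # U+0052 -> this font's glyph for "R"
advance_width, left_side_bearing = font["hmtx"][glyph_name]
print(glyph_name, advance_width, left_side_bearing)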

The journey across the ages. A monk spent minutes per letter. A compositor spent seconds selecting type. A typist connected key to page in a single mechanical stroke. A modern computer does it in microseconds, but the pipeline is deeper: physical key → scan code → character code → Unicode code point → encoding → glyph lookup → outline scaling → kerning → hinting → rasterisation → pixel buffer → display.

More steps than ever before. Each one invisible. Each one built on something a monk, a compositor, or a typist once did by hand.

The LLM’s version (also today). You ask an AI to write a paragraph. Somewhere in a data centre, billions of numerical weights are multiplied together across dozens of layers of a neural network. The model predicts the most likely next token – not quite a character, not quite a word, but a chunk of text from a vocabulary of tens of thousands of pieces. It picks the token for “R”. This token is decoded back into the bytes of the UTF-8 character R (0x52), which is sent over HTTPS to your browser, where it enters the exact same rendering pipeline as before: Unicode code point → glyph lookup → Bézier curves → rasterisation → screen. The R appears. The entire history of typography – scribe, compositor, typist, keyboard, font renderer – is still there, running the last mile. The only difference is who asked for the letter. It used to be a human pressing a key. Now, sometimes, it’s a machine guessing what comes next. (Though arguably the monk was also guessing what came next. He was just copying more carefully.)

Now print it (or: trapped lightning meets chemical engineering)

Everything above gets a letter onto a screen. But what if you want it on paper? What if you hit Ctrl+P and expect a piece of dead tree to come out of a machine with your words on it?

This is where things get properly unhinged. Because a printer is not a screen. A screen has pixels that glow. A printer has to physically deposit material onto a surface. And the chain of events between “the user clicked Print” and “ink is on paper” is one of the most gloriously over-engineered pipelines in all of computing.

The problem. Your computer knows what the document looks like – it’s been rendering it on screen just fine. But the printer is a separate device with its own processor, its own memory, and its own ideas about how to put dots on paper. Somehow, the computer has to describe the page in a way the printer can understand and reproduce.

In the early days, this was brutally simple. Character printers (like daisy-wheel and dot-matrix printers) worked much like typewriters. The computer sent ASCII characters down a cable, and the printer had its own built-in font – literally a physical wheel with letter shapes on it, or a set of pin patterns for each character. You got whatever the printer gave you. Want a different typeface? Buy a different daisy wheel. Want graphics? Good luck.

PostScript changed everything.

In 1984, Adobe released PostScript27 – a full programming language designed for describing pages. Not characters. Not lines of text. Pages. PostScript could describe any combination of text, graphics, and images as mathematical instructions. A PostScript file doesn’t say “print an R at position 40, 100”. It says “move to coordinates (40, 100), select the font Palatino-Roman at 12 points, scale the coordinate system, define a path using these Bézier curves, and fill it”. Sound familiar? Those are the same Bézier curves we met in outline fonts. This wasn’t a coincidence – Adobe co-founder John Warnock developed PostScript and the Type 1 font format together.28

PostScript was revolutionary because it made the printer resolution-independent. The same PostScript file could print on a 300-DPI office laser printer and a 2,400-DPI professional typesetter, and each device would rasterise the curves at its native resolution. The page description was abstract; the physical rendering was the printer’s job.

The downside? PostScript is a Turing-complete programming language. Your printer is literally executing code to figure out what to print. Early PostScript printers needed powerful processors and lots of RAM – they were essentially computers in their own right, and often more expensive than the computer sending them data. Some complex pages could take minutes to process. Occasionally, a malformed PostScript file could crash the printer or send it into an infinite loop, because that’s what happens when your printer runs arbitrary programs.

PDF (Portable Document Format), also from Adobe, is essentially PostScript’s better-behaved descendant. It dropped the full programming language in favour of a more structured, predictable format, while keeping the same fundamental model: vector graphics, Bézier curves, embedded fonts, resolution independence. When you “print to PDF” today, your computer is generating a page description in this format.

Printer drivers are the translators that sit between your operating system and your specific printer. When you click Print, here’s what actually happens:

The application hands the document to the operating system’s printing subsystem. On Windows, this is the GDI (Graphics Device Interface) or the newer XPS (XML Paper Specification) pipeline. On macOS, it’s Quartz, which internally uses PDF as its native page description format – every Mac has essentially been thinking in PDF since OS X launched in 2001.29 On Linux, it’s typically CUPS (Common Unix Printing System, originally created by Michael Sweet at Easy Software Products in 1997 and later acquired by Apple in 200730).

The printing subsystem renders the page into an intermediate format – a spool file. The printer driver then translates this spool file into whatever language the specific printer speaks. This might be PostScript, or it might be PCL (Printer Command Language, HP’s long-running alternative), or for many modern consumer printers it might be a proprietary raster format where the computer does all the rendering and just sends the printer a bitmap of dots to lay down.

Now for the physics.

A laser printer works by exploiting static electricity and heat – a process called electrophotography, invented by Chester Carlson in 1938 and first commercialised by Xerox in 1959.31 Xerox’s first commercial laser printer, the 9700, followed in 1977.32 A photosensitive drum – a cylinder coated in a material (typically organic photoconductor, or OPC) that conducts electricity when exposed to light – sits at the heart of the machine. Here’s the sequence:

  1. The drum is given a uniform negative electrical charge by a corona wire or charge roller.
  2. A laser (or an array of LEDs) scans across the drum, switching on and off thousands of times per second. Where the light hits, the charge dissipates. The laser is drawing the page as a pattern of charged and uncharged areas on the drum’s surface – one row of dots at a time, as the drum rotates.
  3. The drum passes a reservoir of toner – a fine powder of plastic particles mixed with pigment. Toner is attracted to the uncharged areas (where the laser hit) and repelled from the charged areas. The powder sticks to the drum in exactly the pattern of your text and images.
  4. A sheet of paper is fed past the drum. The paper has been given a positive charge, which is stronger than the drum’s remaining negative charge, so the toner transfers from drum to paper.
  5. The paper passes through a fuser – a pair of heated rollers at around 150-200°C (300-390°F).33 The heat melts the plastic in the toner, bonding it permanently to the paper fibres.

That’s it. Your letter R is now toner – melted plastic – fused into paper. The Bézier curves that started as mathematical abstractions in a font file have become a physical pattern of charged and uncharged spots on a rotating drum, which attracted specks of plastic dust, which were melted onto a sheet of ground-up wood pulp. It’s absurd when you think about it.

An inkjet printer takes a different approach but is no less wild. Tiny nozzles – sometimes thousands of them, each thinner than a human hair – fire microscopic droplets of liquid ink onto paper. There are two main technologies: thermal inkjet (used by HP and Canon), in which a tiny resistor heats the ink to around 300°C in microseconds, forming a steam bubble that ejects a droplet,34 and piezoelectric inkjet (used by Epson), which uses a piezoelectric crystal that physically deforms when electricity is applied, squeezing the ink out mechanically.35 Each droplet is about 1-5 picolitres – a picolitre is a trillionth of a litre.36 The precision required is staggering: the nozzles must fire at exactly the right microsecond as the print head sweeps across the page, placing dots at up to 5,760 DPI.

For colour printing, things multiply. A colour laser printer has four separate drums and four toner cartridges (cyan, magenta, yellow, and black – CMYK). Each colour is laid down in a separate pass, with the four layers combining to produce the full colour spectrum. Getting the four colours to align perfectly – registration – is one of the hardest mechanical challenges, and even tiny misalignment shows up as colour fringing on text. A colour inkjet fires four (or more – some have six or eight) colours from separate nozzle arrays in a single pass.

And then there’s the font question. Does the printer use the same fonts as the computer? Sometimes yes, sometimes no. PostScript printers traditionally had a set of built-in fonts (the “PostScript 35” – including Helvetica, Times, Courier, and others) stored in the printer’s own ROM. If your document used one of these, the computer just sent the font name and the printer rendered it locally. If your document used a font the printer didn’t have, the driver had to either embed the font data in the print job (increasing its size) or substitute a similar built-in font (changing how your document looked).

Modern printers mostly receive pre-rasterised data – the computer does the heavy lifting and sends the printer a bitmap. This avoids font substitution problems entirely but means the computer is doing more work and the print data is larger. It’s the same trade-off as always: do you send instructions or pixels? PostScript said instructions. Modern consumer printing says pixels. The professional print industry still says instructions (PDF), because when you’re printing a million copies of a magazine, you need the precision.

The full pipeline, from keypress to paper: You type an R. It becomes a Unicode code point (U+0052). The application looks up the glyph in the font. It renders the page layout – text, kerning, leading, line breaks, all of it. You hit Print. The OS printing subsystem takes the rendered page and converts it to a page description (PostScript, PDF, XPS, or a raw bitmap). The printer driver translates this for your specific printer. The data travels over USB, Wi-Fi, or Ethernet to the printer. The printer’s controller processes the data and drives the marking engine – laser and drum, or inkjet nozzles. Toner is melted or ink is squirted. The paper emerges.

From an abstract idea in Unicode, through mathematical curves, through a page description language, through a driver, through a cable, through trapped lightning in a thinking chip, to a laser drawing on a charged drum, to plastic powder melted by heat onto pressed wood fibre. Gutenberg would be absolutely baffled. But he’d recognise the letter.

Why this matters

Typography sits at the intersection of art, engineering, and language. It’s been refined over nearly 600 years of printing and thousands of years of writing before that. Every choice – serif or sans-serif, tight tracking or loose, generous leading or cramped – changes how text feels, and therefore changes how ideas land.

When typography is good, it’s invisible. The words just flow. You’re not thinking about letterforms or kerning or encodings – you’re thinking about what the text says. That invisibility is hard-won. Behind every comfortable paragraph is centuries of craft: stonecutters and scribes, punchcutters and typesetters, designers and engineers, all working towards the same goal.

Making the words feel effortless.


Sources and further reading

  1. Aujoulat, N. (2005). Lascaux: Movement, Space, and Time. Harry N. Abrams. The Lascaux cave paintings are dated to approximately 17,000 BP (before present) using radiocarbon dating of associated charcoal fragments.

1a. Mulvaney, K. (2013). Murujuga: Rock Art, Heritage, and Landscape Iconoclasm. Australian Scholarly Publishing. Also: David, B. et al. (2013). “How old are Australia’s pictographs? A review of rock art dating.” Journal of Archaeological Science, 40(1). The Nawarla Gabarnmang shelter in Arnhem Land, Northern Territory, contains charcoal rock art dated to at least 28,000 BP. Petroglyphs across the Burrup Peninsula (Murujuga) in the Pilbara region of Western Australia are estimated at 30,000–40,000 years old based on associated archaeological deposits and weathering analysis.

1b. Clarkson, C. et al. (2017). “Human occupation of northern Australia by 65,000 years ago.” Nature, 547(7663), 306–310. Excavations at the Madjedbebe rock shelter in Arnhem Land provide evidence of human occupation – and associated ochre processing for art and ceremony – dating to approximately 65,000 years ago.

  2. Childress, D. (2008). Johannes Gutenberg and the Printing Press. Twenty-First Century Books. The exact date is debated, but most scholars place Gutenberg’s development of practical movable metal type between 1440 and 1450, with the 42-line Bible (the “Gutenberg Bible”) printed c. 1455.

  3. Needham, J. (1985). Science and Civilisation in China, Vol. 5, Part 1. Cambridge University Press. Bi Sheng’s movable type (活字印刷術), made from baked clay, is described in Shen Kuo’s Dream Pool Essays (夢溪筆談, 1088).

  4. Agüera y Arcas, B. (2003). “Temporary Matrices and Elemental Punches in Gutenberg’s DK Type.” Incunabula and Their Readers. Analysis of the type metal alloy used in Gutenberg’s workshop.

  5. Needham, P. (2001). “The Compositor Does Not Exist.” The Library, 7th series, 2(1). Detailed analysis of the Gutenberg Bible’s type repertoire and glyph variations.

  6. Bringhurst, R. (2004). The Elements of Typographic Style, 3rd edition. Hartley & Marks. The standard reference for typographic terminology and history, including the origins of “upper case” and “lower case”.

  7. De Vinne, T.L. (1900). The Practice of Typography: A Treatise on the Processes of Type-Making. The Century Co. Discussion of the etymology of “serif” and its possible Dutch origins.

  8. Catich, E.M. (1968). The Origin of the Serif: Brush Writing and Roman Letters. Catfish Press. Catich’s analysis of Trajan’s Column inscriptions and the brush-origin theory of serifs.

  9. Catich, E.M. (1968). Ibid. Catich, a Catholic priest, calligrapher, and professor at St. Ambrose University, demonstrated his theory by cutting letters with period-appropriate tools.

  10. Bringhurst (2004), op. cit. Also: Lupton, E. (2010). Thinking with Type, 2nd edition. Princeton Architectural Press.

  11. Carter, H. (2002). A View of Early Typography: Up to About 1600. Hyphen Press. Fournier’s Modèles des Caractères de l’Imprimerie (1742) and Didot’s subsequent refinement.

  12. Adobe Systems Incorporated (1999). PostScript Language Reference, 3rd edition. Addison-Wesley. Defines the PostScript point as exactly 1/72 of an inch (Section 4.2).

  13. Lawson, A. (1990). Anatomy of a Typeface. David R. Godine Publisher. Comparison of historical point systems across Europe and America.

  14. W3C (2024). CSS Values and Units Module Level 3. “For a CSS device, these dimensions are anchored either (i) by relating the physical units to their physical measurements, or (ii) by relating the pixel unit to the reference pixel.” The reference pixel is defined as 1/96 of an inch.

  15. Mackenzie, C.E. (1980). Coded Character Sets: History and Development. Addison-Wesley. The definitive history of ASCII’s development, from telegraph codes through the X3.2 committee standardisation.

  16. Bemer, R.W. (1980). “Inside ASCII.” Interface Age. Bemer’s own account of ASCII’s design decisions, including the deliberate 32-offset between upper and lower case.

  17. The Unicode Consortium (2024). The Unicode Standard, Version 16.0. The character count of 154,998 and 168 scripts is from the Unicode 16.0 release announcement, September 2024.

  18. W3Techs (2024). “Usage Statistics of Character Encodings for Websites.” Web Technology Surveys. As of late 2024, UTF-8 is used by 98.2% of all websites.

  19. Pike, R. (2003). “UTF-8 history.” Email to the UTF-8 mailing list, 30 April 2003. Pike describes the dinner at a New Jersey diner with Ken Thompson on 2 September 1992, where they designed UTF-8 on the back of a placemat (or possibly a napkin – accounts vary).

  20. Bézier, P. (1986). The Mathematical Basis of the UNISURF CAD System. Butterworths. Bézier’s own description of the curves he developed at Renault.

  21. Farin, G. (2002). Curves and Surfaces for CAGD: A Practical Guide, 5th edition. Morgan Kaufmann. The history of de Casteljau’s independent and slightly earlier work at Citroën, which remained proprietary until the 1980s.

  22. Stamm, B. (1998). “The Rasterization of TrueType.” Microsoft Typography. Also: Apple Computer (1991). TrueType reference manual. TrueType was announced in 1991 as a response to Adobe’s control over the Type 1 format.

  23. Microsoft Corporation and Adobe Systems (1996). “OpenType Font Format Specification.” The initial OpenType announcement was made at the Seybold conference in 1996.

  24. The Unicode Consortium (2024). Unicode Standard Annex #29: Unicode Text Segmentation. Defines the algorithm for determining grapheme cluster boundaries – what a user would perceive as a single “character”.

  25. Gabrilovich, E. and Gontmakher, A. (2002). “The Homograph Attack.” Communications of the ACM, 45(2). The first academic description of IDN homograph attacks using visually similar characters from different scripts.

  26. The Unicode Consortium (2024). Unicode Technical Report #36: Unicode Security Considerations. Defines mechanisms for detecting “confusable” characters across scripts, including the confusables.txt data file.

  27. Adobe Systems Incorporated (1985). PostScript Language Reference Manual, 1st edition (the “Red Book”). Addison-Wesley. PostScript was developed by Chuck Geschke and John Warnock at Adobe, based on Warnock’s earlier InterPress work at Xerox PARC.

  28. Warnock, J. (2012). Interview for the Computer History Museum’s oral history programme. Warnock describes the co-development of PostScript and the Type 1 font format.

  29. Apple Computer (2001). “Mac OS X: Under the Hood.” Apple Developer Documentation. Quartz’s use of PDF as the native imaging model.

  30. Sweet, M. (2007). “CUPS Purchased by Apple Inc.” CUPS project announcement, 11 July 2007. CUPS was originally released in June 1999.

  31. Owen, D. (2004). Copies in Seconds: How a Lone Inventor and an Unknown Company Created the Biggest Communication Breakthrough Since Gutenberg. Simon & Schuster. The story of Chester Carlson’s invention of xerography and its commercialisation as the Xerox 914.

  32. Xerox Corporation (1977). Xerox 9700 Electronic Printing System documentation. The 9700 was the first commercial laser printer, capable of 120 pages per minute at 300 DPI.

  33. Williams, E.M. (1984). The Physics and Technology of Xerographic Processes. John Wiley & Sons. Fuser temperature ranges depend on the toner formulation, with typical operating temperatures between 150°C and 210°C.

  34. Le, H.P. (1998). “Progress and Trends in Ink-jet Printing Technology.” Journal of Imaging Science and Technology, 42(1). HP’s thermal inkjet process heats ink to ~300°C for ~2 microseconds to form the vapour bubble.

  35. Wijshoff, H. (2010). “The Dynamics of the Piezo Inkjet Printhead Operation.” Physics Reports, 491(4-5), 77-177. A comprehensive review of piezoelectric inkjet physics.

  36. Castrejon-Pita, J.R. et al. (2013). “Future, Opportunities and Challenges of Inkjet Technologies.” Atomization and Sprays, 23(6). Modern photo-grade inkjets can produce droplets as small as 1 picolitre.
