If I scramble the middle thoughts, will you still get me?

Aristotle on 2007-04-26T04:51:42

If you’ve been around on the internet for a while, you’ve certainly seen this amusing – if transparently nonsensical – claim:

Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.

This is hogwash, of course. The best counterexample I’ve seen is part of Matt Davis’ refutation of the scrambled-letters myth: “the magltheuansr of a tageene ceacnr pintaet.” The reality is that you infer a whole lot from context, and based on these cues you can very quickly reduce the set of plausible candidate words for any scrambled lump to a set of one. Human communication is quite redundant.
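For anyone who wants to try the scrambling on their own text, here is a minimal sketch in Python. It is only an assumption about how such text is typically produced, not the method used for the quoted paragraph: the function names `scramble_middle` and `scramble_text` are made up for illustration, and a "word" is taken to be any run of ASCII letters, so "doesn’t" is handled as "doesn" plus "t".

```python
import random
import re


def scramble_middle(word, rng):
    """Shuffle the interior letters of a word; first and last letter stay put."""
    if len(word) <= 3:
        # Nothing to scramble: at most one interior letter.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text, seed=None):
    """Scramble every run of letters in text, leaving punctuation and spaces alone."""
    rng = random.Random(seed)  # pass a seed to get a repeatable result
    return re.sub(r"[A-Za-z]+", lambda m: scramble_middle(m.group(0), rng), text)


if __name__ == "__main__":
    print(scramble_text("The rest can be a total mess and you can still read it."))
```

Feeding it short, common words versus long, rare ones (such as Davis’ example) makes the difference in readability easy to see for yourself.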

However, that does suggest a way to make text harder to read:

WRD_ACCORDING WRD_TO WRD_A WRD_RESEARCHER WRD_AT WRD_CAMBRIDGE WRD_UNIVERSITY, WRD_IT WRD_DOESN’T WRD_MATTER WRD_IN WRD_WHAT WRD_ORDER WRD_THE WRD_LETTERS WRD_IN WRD_A WRD_WORD WRD_ARE, WRD_THE WRD_ONLY WRD_IMPORTANT WRD_THING WRD_IS WRD_THAT WRD_THE WRD_FIRST WRD_AND WRD_LAST WRD_LETTER WRD_BE WRD_AT WRD_THE WRD_RIGHT WRD_PLACE. WRD_THE WRD_REST WRD_CAN WRD_BE WRD_A WRD_TOTAL WRD_MESS WRD_AND WRD_YOU WRD_CAN WRD_STILL WRD_READ WRD_IT WRD_WITHOUT WRD_PROBLEM. WRD_THIS WRD_IS WRD_BECAUSE WRD_THE WRD_HUMAN WRD_MIND WRD_DOES WRD_NOT WRD_READ WRD_EVERY WRD_LETTER WRD_BY WRD_ITSELF WRD_BUT WRD_THE WRD_WORD WRD_AS WRD_A WRD_WHOLE.

Human cognition feeds on probabilities, plausibilities and likelihoods. We need the succession of shapes, voids, heights and depths, first to tell things apart, then to bring them back together. When dissimilar things are too similar, we perceive the whole as static noise on a fundamental level.

There is a whole slew of consequences in all manner of areas that could be tangibly illustrated using this example, from writing to music to linguistics to programming.

F.ex. it reminds me of something Larry once wrote about language design:

I was reading Umberto Eco’s The Search for a Perfect Language, and he makes the point that, over the centuries, many of the designers of “perfect” languages have fallen into the trap of trying to make similar things look similar. He goes on to argue that similar things should look different, because when you don’t, you end up with too little redundancy for effective communication.

It also reminds me of the followup to Imperfect Sound Forever, an excellent essay about how compression (in the sound engineering sense, not the computer science sense) is destroying the “soul” of music.

It reminds me of many things. Cognition and perception are endlessly fascinating topics of great import, with lessons that cross disciplinary boundaries. I have to admit I didn’t actually have any particular one of them in mind in writing this. There is no take-away I will offer here. (The links may provide interesting fodder for reading, of course.)

I just had to get this out… isn’t all this stuff and the connectedness of it all just neat?


(My mind rarely works by following a discernible trail of thought. When I am forced to work my way through one, it is stressful and exhausting. Rather, I tend to have a jumbled mesh of related ideas and concepts – not infrequently with only tenuous relations – in which I flit from place to place, letting the similarities, differences and nuances sink in. I’m not really smart; I just have a good memory and some aptitude at drawing on it to recall possibly-relevant stuff. Anyway, this meandering waffle is by way of an explanation for the preceding meandering waffle. It was just stuff that has been rattling around my noggin for a while.)


Deduction

chromatic on 2007-04-26T05:41:49

My mind rarely works by following a discernible trail of thought. When I am forced to work my way through one, it is stressful and exhausting.

You sound like an inductive thinker. That's how my mind works.

Re:Deduction

Aristotle on 2007-04-26T19:49:43

Wait, what is how your mind works? The referent in that sentence is totally ambiguous so I can’t figure out whether you are saying “we’re alike” or “we differ.”

But yes, inductively is very much how my thinking works.

Re:Deduction

chromatic on 2007-04-27T07:22:07

We're alike, except you're Croup and I'm Vandemar.

Re:Deduction

jplindstrom on 2007-04-27T11:31:15

Hey, I'm reading that book!

So, well, I recognize the names. When I'm done I might even understand your reference.

Re:Deduction

Aristotle on 2007-04-27T15:28:24

Whereas I don’t now – and won’t soon – have any idea what chromatic is talking about. :-)

Re:Deduction

chromatic on 2007-04-27T17:11:23

There are four simple ways for the observant to tell Mr. Croup and Mr. Vandemar apart: first, Mr. Vandemar is two and a half heads taller than Mr. Croup; second, Mr. Croup has eyes of a faded china blue, while Mr. Vandemar's eyes are brown; third, while Mr. Vandemar fashioned the rings he wears on his right hand out of the skulls of four ravens, Mr. Croup has no obvious jewelry; fourth, Mr. Croup likes words, while Mr. Vandemar is always hungry. Also, they look nothing at all alike.

  —Neil Gaiman, Neverwhere, p. 7

why not to use ALL CAPs

markjugg on 2007-04-26T14:56:52

You touched on a point I learned by studying typography. "All Caps" is harder to read because our eyes look at the overall shape of a word. Especially for things viewed from a distance, that's what you see before you can even make out the letters.

It's ironic that people choose to use all-caps to make something more important or visible, when they may actually be making it harder to read.

I think of the road-side signs with changeable plastic letters.

All-caps works OK to emphasize things in small doses, but doing things like writing a three-page warning in all-caps can just add unnecessary strain on readers.

Re:why not to use ALL CAPs

Aristotle on 2007-04-26T20:52:22

Yes! All caps is a bad idea in many cases. It should be used ONLY for VERY short bits of text.

However, that’s not for the reason you cite. “Bouma shape” has been conclusively refuted. In fact, the scrambled-middle-letters myth, while making a ridiculous claim, clearly disproves that we recognise words by their shape, because if the context provides enough clues, you can read scrambled-middle-letters words without even slowing down – despite the fact that the word shape has been completely corrupted. Microsoft Research has a very good article about this:

Word shape is no longer a viable model of word recognition. The bulk of scientific evidence says that we recognize a word’s component letters, then use that visual information to recognize a word. In addition to perceptual information, we also use contextual information to help recognize words during ordinary reading, but that has no bearing on the word shape versus parallel letter recognition debate. It is hopefully clear that the readability and legibility of a typeface should not be evaluated on its ability to generate a good bouma shape.

What happens is that we go through the text sequentially, but doing parallel recognition of short blocks of individual letters. That doesn’t mean we deal with every single letter, though – we go from tentative to partial to (sometimes) full recognition of letters, but by taking the context into account we can reduce the plausibility of occurrence of most words to zero very quickly, so we can stop looking at any one block of letters quite quickly, long before we've visually determined the exact sequence of letters. F.ex., •• grammatical structure •• • sentence alone •• often enough •• dictate ••• choice •• many short words with barely any ambiguity. This is why scrambling the middle letters does not slow you down appreciably when you read text with relatively short words and low “concept density” (for lack of a better word).

You will notice that the mangling I showed is not just all caps, but also prefixes every word with the same string of letters. In fact, it didn’t even need the all caps to effectively destroy the readability of the text:

Wrd_according wrd_to wrd_a wrd_researcher wrd_(sic) wrd_at Wrd_cambridge Wrd_university, wrd_it wrd_doesn’t wrd_matter wrd_in wrd_what wrd_order wrd_the wrd_letters wrd_in wrd_a wrd_word wrd_are, wrd_the wrd_only wrd_important wrd_thing wrd_is wrd_that wrd_the wrd_first wrd_and wrd_last wrd_letter wrd_be wrd_at wrd_the wrd_right wrd_place. Wrd_the wrd_rest wrd_can wrd_be wrd_a wrd_total wrd_mess wrd_and wrd_you wrd_can wrd_still wrd_read wrd_it wrd_without wrd_problem. Wrd_this wrd_is wrd_because wrd_the wrd_human wrd_mind wrd_does wrd_not wrd_read wrd_every wrd_letter wrd_by wrd_itself wrd_but wrd_the wrd_word wrd_as wrd_a wrd_whole.

This is not a whole lot easier to read than the all caps. Why is that? It’s because we can only recognise short blocks of letters in parallel. Now pay attention to how you read this – you will notice that you are forced to scan meticulously, looking for the actual start of each word.
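The prefix mangling itself is trivial to reproduce. The following Python sketch is an illustration under assumptions, not the code used for the samples above: `prefix_words` and its parameters are hypothetical names, a "word" is taken to be a run of ASCII letters with optional internal apostrophes, and the original capitalisation is left alone rather than moved onto the prefix as in the hand-made sample.

```python
import re

# A word: letters, optionally joined by straight or curly apostrophes.
WORD = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")


def prefix_words(text, prefix="wrd_", upper=False):
    """Prepend the same prefix to every word; optionally upper-case the result."""
    mangled = WORD.sub(lambda m: prefix + m.group(0), text)
    return mangled.upper() if upper else mangled


if __name__ == "__main__":
    sample = "The rest can be a total mess and you can still read it."
    print(prefix_words(sample))              # prefix only
    print(prefix_words(sample, upper=True))  # prefix plus all caps
```

Running both variants on the same text makes it easy to compare how much of the slowdown comes from the shared prefix and how much from the capitals.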

We jump through the text 3–7 letters or so at a time. Naturally, th• begi••••• of a wor• is mu•• mo•• impor•••• to thi• tha• its en•. What the above mangling does is that it keeps tripping you up by making the start of every word meaningless.

In summary: the all caps version is painfully slow to read because it makes the shape of each letter much more similar, not the shape of each word.

Again, pay attention to how you read it. It forces you to go literally one or two letters at a time, instead of letting you “surf” the text as we normally do.