Category Archives: Overthinking It

The One Hundred and Twenty-Four Shortest Words in Scrabble

An unhealthy degree of familiarity with the permissible two-letter words is one of the early warning signs of a Scrabble addict. International tournament Scrabble permits 124 two-letter words, but only around half of them are everyday words, with the rest an odd mix of variant spellings, foreign-derived obscurities, and written forms of things you didn’t know were words: like the letter R, spelled AR, obviously, or the expression of scepticism produced when first seeing it played, spelled HM (or HMM). For those people who imagine the game is about playing words from one’s own working vocabulary, it can be quite annoying when an opponent drops GU and points smugly at the appropriate page in the definitionless lexicon rather than talking convincingly about their keen interest in Shetland Island folk music.

So, for anyone fed up of being on either end of that kind of conversation, here is a handy semi-interactive visual guide to the two-letter words in Scrabble.

Despite usually playing the role of smug GU-player, I have some sympathy for the other point of view. Here are five two-letter words doing a bad job of justifying their place in the lexicon:


CH

The Collins dictionary (i.e. International Scrabble) description even says “obsolete form of I”. I chalk this one up to lobbying from players who desperately wanted to be able to play C in a two-letter word.


DI

Collins claims this is a plural form of DEUS. I mean, DEUS isn’t really English either, except as a Plantagenet-era interjection. Is this the foreign word in multiple phrases thing? Though I tip my hat to the scholars trying to make “di ex machina” happen, I am unconvinced by its Scrabbleworthiness.


OD

A hypothetical force proposed by Baron Karl von Reichenbach as pervading all nature and accounting for various physical and psychological phenomena. Here’s a brief history. The hypothesis was not very successful. In W. Gregory’s translation of von Reichenbach’s researches, we are told “the author has given to the new imponderable the name of Od, a name not possessing any meaning, but admitting of being compounded, according to the genius of the German language.” It didn’t catch on at the time, it’s not going to now, and failed coinages are always a little tragic. Best to lay it to rest.

That kind of universal-fluid-life-force theory was big in the mid-19th century. Bulwer-Lytton’s VRIL is a similar concept, maintaining a similarly dubious place in the lexicon, but its usefulness as a way to dump a V is some compensation.

These subterranean philosophers assert that, by one operation of vril, which Faraday would perhaps call “atmospheric magnetism,” they can influence the variations of temperature—in plain words, the weather; that by other operations, akin to those ascribed to mesmerism, electro-biology, odic force, &c., but applied scientifically through vril conductors, they can exercise influence over minds, and bodies animal and vegetable, to an extent not surpassed in the romances of our mystics.

— Edward Bulwer-Lytton, demonstrating his mastery of both science and prose, in Vril, The Coming Race.


OE

Unusually, it’s the North American lexicon which is dodgy here. The Hasbro dictionary search defines OE as “a whirlwind off the Faeroe Islands”, which definition seems to have been copied verbatim, from dictionary to dictionary, without anyone wondering whether it was still useful.

oe, n. A whirlwind off the Faeroe Islands.
The American Practical Navigator, US Govt. NIMA (1982, 2002)

oe (ō), n. A whirlwind off the Faeroe Islands.
Navigation Dictionary, US Hydrographic Office (1956)

Oe. A violent whirlwind off the Ferroe Islands.
Naval Encyclopaedia, Philadelphia (1881)

OE. An island [from the Ang-Sax.] Oes are violent whirlwinds off the Ferroe Islands, said at times to raise the water in syphons.
The Sailor’s Word-Book, London (1867)

150-year-old copypasta.

Collins defines OE as “grandchild”, a variant of Scots dialect “OY”, and various dictionaries list that meaning, and/or oe, “a small island”. It would be nice if either meaning had citations more recent than the time Walter Scott finished translating his collection of Danish ballads, but at least there’s some evidence that OE was actually used in those senses.


KY

A dialect word for cows, apparently used in Scots and northern dialects. Fair enough, maybe, except the OED’s usage quotations have the word spelled KYE in every instance since 1522. Might be time to retire the shorter variant? Same argument goes for NY and FY, come to think of it. Allowing spelling variants that haven’t been used since the invention of spelling just makes you look silly.

The Various Versions of Voldemort’s Real Name, Listed In Order of Levenshtein Edit Distance from “Tom Marvolo Riddle”, with additional commentary

I was going to add an explanatory paragraph here, but then I didn’t.

Language | Riddle’s Name | Lev.Δ | J-W.Δ | ‘Voldemort’ Anagram
English | Tom Marvolo Riddle | 0 | 0.000 | I am Lord Voldemort
Estonian | Tom Marvolon Riddle | 1 | 0.018 | Mina Lord Voldemort
Turkish | Tom Marvoldo Riddle | 1 | 0.064 | Adim Lord Voldemort
Slovakian | Tom Marvoloso Riddle | 2 | 0.033 | A som i Lord Voldemort
Portuguese | Tom Servolo Riddle | 2 | 0.074 | Eis Lord Voldemort
Russian | Tom Narvolo Reddl | 3 | 0.088 | Lord Volan-de-mort
Basque* | Tom Narivoloz Riddle | 3 | 0.137 | Lord Voldemort naiz
Spanish | Tom Sorvolo Ryddle | 3 | 0.167 | Soy Lord Voldemort
German | Tom Vorlost Riddle | 5 | 0.105 | Ist Lord Voldemort
Czech | Tom Rojvol Raddle | 5 | 0.109 | Já Lord Voldemort
Ukrainian | Tom Yarvolod Redl | 5 | 0.119 | Ya Lord Voldemort
Italian | Tom Orvoloson Riddle | 5 | 0.156 | Son io Lord Voldemort
Latin | Tom Musvox Ruddle | 5 | 0.248 | Sum Dux Voldemort
Bulgarian | Tom Mersvoluko Riddŭl | 6 | 0.116 | Tuk sŭm i Lord Voldemor
Hebrew | Tom Vandrolo Ridl | 6 | 0.161 | Ani Lord Voldmort
Catalan | Tod Morvosc Rodlel | 7 | 0.291 | Sóc Lord Voldemort
Serbian | Tom Mervolodomos Ridl | 8 | 0.158 | To smo mi Lord Voldemor
Romanian | Tom Ruvel Doodler | 9 | 0.184 | Eu Lord Voldemort
Belarusian | Tom Val’Dor Redl | 9 | 0.199 | Lord Val’Demort
Lithuanian† | Tomas Malvoras Ridlis | 9 | 0.206 | Aš Valdovas Voldemortas
Swedish | Tom Gus Mervolo Dolder | 10 | 0.232 | Ego sum Lord Voldemort
Norwegian | Tom Dredolo Venster | 10 | 0.249 | Voldemort den Store
Hungarian | Tom Rowle Denem | 10 | 0.259 | Nevem Voldemort
Faroese | Tom Evildo Reger | 10 | 0.270 | Eg eri Voldemort
French | Tom Elvis Jedusor | 12 | 0.306 | Je suis Voldemort
Finnish | Tom Lomen Valedro | 13 | 0.206 | Ma olen Voldemort
Esperanto | Tom Vlades Mistero | 13 | 0.267 | Mi estas Voldemort
Icelandic‡ | Trevor Delgome | 14 | 0.307 | Eg er Voldemort
Dutch | Marten Asmodom Vilijn | 15 | 0.361 | Mijn naam is Voldemort
Slovenian§ | Mark Neelstin | 15 | 0.485 | Mrlakenstein
Danish | Romeo G Detlev Jr | 16 | 0.414 | Jeg er Voldemort

* Good try, Basque, but there are too many Is in NARIVOLOZ

† Some sources have LORDAS instead of VALDOVAS. This isn’t an anagram either way (count the Ms, or Is).

‡ If you know what Neville’s toad was named in Iceland, let me know.

§ Well the anagram is as elegant as any, Slovenia, but changing “Voldemort” and “Tom Riddle” is somewhat cheating.
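
For anyone who wants to check the Lev.Δ column or the anagram footnotes at home, here is a minimal Python sketch. It assumes the textbook unit-cost Levenshtein distance (each insertion, deletion or substitution costs 1) and an anagram test that ignores case, spaces and punctuation. The function names are illustrative, and none of this is necessarily the tooling that produced the table above.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance between a and b, using a two-row table."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


def surplus_letters(name: str, phrase: str) -> Counter:
    """Letters that appear more often in name than in phrase (case-insensitive)."""
    def letters(text: str) -> Counter:
        return Counter(ch for ch in text.lower() if ch.isalpha())
    return letters(name) - letters(phrase)


def is_anagram(name: str, phrase: str) -> bool:
    """True when neither side has any letters left over."""
    return not surplus_letters(name, phrase) and not surplus_letters(phrase, name)


print(levenshtein("Tom Marvolo Riddle", "Tom Servolo Riddle"))          # 2
print(is_anagram("Tom Marvolo Riddle", "I am Lord Voldemort"))          # True
print(surplus_letters("Tom Narivoloz Riddle", "Lord Voldemort naiz"))   # Counter({'i': 1})
```

The last line is the Basque footnote in code form: one I too many.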


Ok, so, why is it that the Romance and Slavic language versions generally just run with an embellished middle name, but the Scandinavian language versions are so tortured? “Riddle” is a Germanic word, so it seems weird that none of them opted to keep it, or a cognate.

My guess is the J-G of the first person pronoun is just really hard to work into a proper name (especially, perhaps, with a V already called for). The Swedish translator cleverly opted to put the reveal into Latin; the Norwegian phrase breaks the form completely; and the Danish translator managed to get “Jeg er” in there, but at the cost of making Voldemort’s birth name “Romeo G[åde] Detlev, Jr”.

He-Who-Must-Not-Be-Named indeed.

Biogeography, and Big Year species discovery curves

In 2015, Noah Strycker is birding a global Big Year: i.e. trying to spot as many bird species as possible in the year, without the usual geographical constraints that big year birders often adopt. His blog on the Audubon site makes excellent reading. Aside from the impressive list of bird species being racked up, the logistics involved are mind-boggling. Virtually every week has seen him fly into a new country or part-of-country, with a new local birder/guide to ferry him around birding sites, and stunningly little wasted time. For example, having run out of gettable birds in Colombia (no mean feat!), rather than risk a day or two idle, he scrambled a trip to Jamaica (“An old friend of mine, Liz Ames, happens to be finishing a three-month field research season in Jamaica… we’re hoping to make a clean sweep of Jamaica’s endemic birds in the next 48 hours”). A few weeks later, a delayed flight reduced his time in Iceland to a 7-hour overnight layover, but Iceland in summer doesn’t really get dark, so he pulled an all-nighter and racked up 36 more species.

The fact that he’s birding so consistently and efficiently means the discovery curve is really smooth. Which means we can do things like try to predict his end-of-year total.


Noah Strycker’s 2015 Species Accumulation Curve, Jan-July

Michael Preston has a series of posts addressing exactly that. In the most recent, at the end of June, he finds that Noah’s accumulation curve is still mostly linear, but chooses to use a fourth-order polynomial to fit to the previous global big year attempters, Alan Davies (not that one) and Ruth Miller.

But when the accumulation is done with consistency, the curve will follow the standard discovery curve pattern. Counts of individuals-per-species within a given area follow a power law curve, so the discovery curve is the inverse – usually modelled as a logarithmic or asymptotic curve tailing off to a theoretical maximum. Which is to say, the first day you go out, you’ll see lots of species, and then it becomes more and more a hunt for rarities (and, obviously, this is why Noah’s moving around so much. In order to keep a high number of species per day, he has to move on when the ROI in his current location falls below the expected ROI for the rest of the world in the available time left).
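
To see the shape being described, here is a toy simulation in Python. It is only a sketch under loud assumptions: a single area with 1,000 species, a Zipf-like abundance in which the k-th commonest species is sighted with weight 1/k, and 5,000 random sightings with a fixed seed. None of those numbers come from Noah’s list.

```python
import random


def discovery_curve(num_species=1000, num_sightings=5000, seed=0):
    """Cumulative count of distinct species after each random sighting."""
    rng = random.Random(seed)
    species = range(num_species)
    # Assumed abundance: the k-th commonest species has sighting weight 1/k.
    weights = [1 / (rank + 1) for rank in species]
    sightings = rng.choices(species, weights=weights, k=num_sightings)

    seen = set()
    curve = []
    for bird in sightings:
        seen.add(bird)
        curve.append(len(seen))
    return curve


curve = discovery_curve()
new_in_first_500 = curve[499]               # species found in sightings 1-500
new_in_last_500 = curve[-1] - curve[-501]   # species added by sightings 4501-5000
print(new_in_first_500, new_in_last_500)
```

Under those assumptions the first 500 sightings turn up far more new species than the last 500: lots of ticks on day one, then a hunt for rarities.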

So I think it will be a struggle to maintain a linear accumulation. I predict the gradient will continue to level, and by the end of the year, a logarithmic trend line will fit better than a linear one.

Being a global exercise, it’s not obviously true that that will happen. After all, if the ROI drops, Noah can always move on. One could argue that only a negligible number of species are so widespread that collecting them early will undermine later collection efforts. But early-counting global species isn’t necessary, only early-counting species that occur in more than one region. The great flyways of the world lead to overlapping bird lists all across the Americas, within Eurasiafrica, and round the Asian Pacific Rim. Species collected in South America will have reduced Noah’s North American collections, and by the time he gets to Australasia in December, his efforts across the rest of the world will have diminished the ‘gettable’ list for Australia and New Zealand, perhaps leaving him only the endemic species. Though granted, there are quite a few in that part of the world.

The other interesting thing about Noah’s accumulation curve is that it already shows some degree of curvature in interesting places. Let’s do some biogeography.


Ecozones of the world, by Wikipedian ArnoLagrange

Ecozones, as Wikipedia puts it, delineate large areas of the Earth’s surface within which organisms have been evolving in relative isolation over long periods of time. They take fancy Latinate names, but are more or less what you’d expect: Neotropic (South/Central America), Nearctic (North America), Palearctic (Eurasia and North Africa), Afrotropic (Sub-Saharan Africa), Indomalaya (South-East Asia), Australasia, Oceania, Antarctic. Within each zone, there is a shared biogeography, which translates to a lot of species overlap. Not anywhere near total species overlap, of course, but enough that the common and farther-flying species are common across much of the zone. Enough that, on the Big Year scale, it starts affecting the accumulation curve.

Here’s a graph of species seen by Noah per day (so far), coloured by ecozone.


Graph of species counted by Noah Strycker per day, in Big Year 2015, Jan-Jul, coloured by ecozone.

A trend emerges:
Aside from Antarctica (travel is a bit constrained when you’re on a cruise ship), there’s a clear downward trend to the ‘peak’ days (peak days correspond to Noah’s arrival at a new birding site), and a less clear but still noticeable downward trend (R² ≈ 0.2) within each zone as a whole. The spike at the beginning of each new ecozone resets the trend to some extent, i.e. we see a logarithmic pattern within each ecozone. I’m not sure whether the lower initial spikes in North America and Europe are due to early counting of overlapping species or simply that temperate ecosystems are less rich in species to begin with. In particular, the inter-migration between North and South America seems to be producing a macro-ecozone effect.

So what happens if we model the entire year as a series of ecozones, with each one following a simple exponential drop-off:


Model of species accumulation as exponential fall-off from ecozone peak.
Light green = modelled species seen per day
Grey = modelled species accumulation curve based on the per-day figures
Blue = actual species accumulation curve
Green dots = actual curve linear trend line projected to EOY.

The modelled species per day value is floor[$ecozone_peak * log2($days_in_zone + 1)/$days_in_zone]. The ecozone peak values for Antarctica (12), South America (108), North America (49), Europe (57) and Africa (75) are taken from Noah’s list, being the most successful day spent in each ecozone (all in the first week of their respective zone, counting North America as Guatemala 2015-04-20 onwards). The peak figures for Asia and Australasia are purely guesses, which makes this model not especially powerful as a prediction tool.
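
As a rough sketch, that formula looks like this in Python. The function names are hypothetical, and numbering days from 1 on arrival in a zone is an assumption; the peak of 108 is the South America figure quoted above.

```python
import math
from itertools import accumulate


def modelled_species_per_day(ecozone_peak: int, days_in_zone: int) -> int:
    """floor[peak * log2(days + 1) / days], with days_in_zone counted from 1."""
    return math.floor(ecozone_peak * math.log2(days_in_zone + 1) / days_in_zone)


def modelled_accumulation(ecozone_peak: int, total_days: int) -> list:
    """Running total of modelled species over the first total_days in a zone."""
    daily = [modelled_species_per_day(ecozone_peak, day)
             for day in range(1, total_days + 1)]
    return list(accumulate(daily))


# South America (peak 108): modelled new species on days 1, 3 and 7.
print([modelled_species_per_day(108, day) for day in (1, 3, 7)])  # [108, 72, 46]
# Running total after the first three days.
print(modelled_accumulation(108, 3))  # [108, 193, 265]
```

Day 1 returns the peak itself, because log2(2) is exactly 1, and every later day is a shrinking fraction of it.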

So that’s not bad, it tracks the curve quite well in North America (May-June), Europe (Jun), and Africa (Jul), albeit a little lower than reality.

But the smooth fall off doesn’t look anything like Noah’s sighting pattern. As I’ve mentioned, Noah is moving around a lot. And each individual locale he flies into also comes with its own species accumulation curve. Although he takes heroic pains to traverse ecosystems and altitudes to capture as diverse a set as possible, there is inevitable overlap at each new centre he visits, and possibly he is choosing which ecosystems to visit in descending order of richness too. After all, you might find there’s a trip to Jamaica to be squeezed in! So there’s another level to this – within each ecozone are a number of sites, and each site has its own peak-and-falloff, and each peak is lower than the last.

What happens if we model each ecozone as a series of locales, taking a representative nine days at each, and put a drop-off on the accumulation at each one?


Model of species accumulation as exponential fall-off from ecozone peak and individual locale peaks.
Light green = modelled species seen per day
Grey = modelled species accumulation curve based on the per-day figures
Blue = actual species accumulation curve
Green dots = actual curve linear trend line projected to EOY.

The modelled species per day value is now floor[$ecozone_peak * log2($days_in_zone + 1)^1.6/$days_in_zone * 0.8^(($days_in_locale - 1) mod 9)], so introducing a geometric drop off (and unjustifiably futzing with the ecozone term to compensate).
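
The same thing as a Python sketch, with the arbitrary constants pulled out as named parameters. The function and parameter names are hypothetical; the defaults (exponent 1.6, decay 0.8, nine-day locales) are the ones in the formula above, and days are again assumed to be counted from 1.

```python
import math


def modelled_species_per_day_with_locales(
    ecozone_peak: int,
    days_in_zone: int,
    days_in_locale: int,
    zone_exponent: float = 1.6,
    locale_decay: float = 0.8,
    locale_length: int = 9,
) -> int:
    """Ecozone fall-off multiplied by a geometric fall-off within each locale."""
    zone_term = math.log2(days_in_zone + 1) ** zone_exponent / days_in_zone
    # The locale counter wraps every locale_length days, modelling a move to a new site.
    locale_term = locale_decay ** ((days_in_locale - 1) % locale_length)
    return math.floor(ecozone_peak * zone_term * locale_term)


# South America (peak 108), staying put so both counters advance together.
for day in (1, 3, 9, 10):
    print(day, modelled_species_per_day_with_locales(108, day, day))
```

Day 10 is where the modulo wraps: the locale term resets to 1, so the modelled count jumps back up, which is the arrival-day spike the model is trying to imitate.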

The alignment is now surprisingly successful, even given how many arbitrary constants there are in that function! The proof of the pudding will come when Noah enters Asia and then Australasia, and I can plug in the final two ecozone peak figures and see how well the model tracks.

Interestingly, even with all the curve matching, the current prediction for the end of year figure, using what I hope are reasonable predictions for the peaks in the remaining tropical ecozones Asia (70) and Australasia (80), comes out to almost exactly the same as the linear trend line shows: about 6,500. But then, so far I have only accounted for the two observable inverse-power-law patterns, and not the hypothetical global one. If that is a factor, then 70 and 80 are probably over-estimates, and we’re probably looking at an end of year figure nearer 6,100.

There is a further line of reasoning to support that, which is: Noah, or someone at Audubon, has almost certainly done this exercise already, and they chose to aim for 5,000 species, not 6,000. That suggests to me that 6,000 was not considered a foregone conclusion, and so perhaps they are expecting it to be low 6,000s rather than up in the 6,500s. Who knows.

It occurs to me, though, that Noah may have one more trick up his sleeve at the end of the year. He’s travelling, Phileas Fogg style, eastwards around the globe. Were it me, come the end of December, with New Zealand finally birded out, I would be awfully tempted to cash in the hours banked crossing time zones: catch a Dec 30th flight to Hawaii, and win a bonus day to chase down 48 Hawaiian endemics and the last few Pacific pelagics. Which would quite nicely tick off the eighth ecozone, and thus the world.

The gender of the dwarves in The Hobbit

The discussion last year about casting a woman in a major role in the forthcoming film version of The Hobbit (my money’s on Smaug) reminded me that the book is quite quiet on the subject of the dwarves’ genders. From (rusty) memory, I remembered only Thorin (son of Thrain) as being definitely male. But it turns out that there are gender cues, very sparsely scattered.

So, while re-reading the book recently, I kept a note of all the occasions when a gender is actually revealed, either by gendered pronoun or by family relationships. (Subsequent gender mentions after the initial one are not recorded below.)

[Page numbers are from the Collins Modern Classics edition, and are included not necessarily as a reference aid, but to demonstrate how rarely the minor dwarf characters are afforded singular pronouns.]

The reveals, in order, are:

Balin and Dwalin

When [Bilbo] got back Balin and Dwalin were talking at the table like old friends (as a matter of fact they were brothers).

Ch. 1. An Unexpected Party, p20


Thorin

Thorin, an enormously important dwarf, in fact no other than the great Thorin Oakenshield himself.

Ch. 1. An Unexpected Party, p22


Dori

“Half a minute!” said Dori, who was at the back next to Bilbo, and a decent fellow. He made the hobbit scramble on his shoulders as best he could with his tied hands, and then off they all went at a run…

Ch. 4. Over Hill and Under Hill, p87


Bombur

“Why, O why did I ever bring a wretched little hobbit on a treasure hunt!” said poor Bombur, who was fat, and staggered along with the sweat dripping down his nose in his heat and terror…

Ch. 4. Over Hill and Under Hill, p88


Fili

“Come here Fili, and see if you can see the boat Mr. Baggins is talking about.”
Fili thought he could; so when he had stared a long while to get an idea of the direction, the others brought him a rope.

Ch. 8. Flies and Spiders, p178


Bofur

There they were at last, twelve of them counting poor old Bombur, who was being propped up on either side by his cousin Bifur, and his brother Bofur…

Ch. 8. Flies and Spiders, p201


Kili

“And who are these?” he asked, pointing to Fili and Kili and Bilbo.
“The sons of my father’s daughter,” answered Thorin, “Fili and Kili of the race of Durin, and Mr. Baggins who has travelled with us out of the West.”

Ch. 10. A Warm Welcome, p238

Oin and Gloin

Oin and Gloin were sent back to their bundles at the top of the tunnel. After a while a twinkling gleam showed them returning, Oin with a small pine-torch alight in his hand, and Gloin with a bundle of others under his arm.

Ch. 13. Not At Home, p285

It may be worth mentioning that in the case of the last four dwarves, these brief references are the only ones. Thorin, Balin, Bombur, and Dori all have slightly more of a “speaking role” in the story, and do get a smattering of “he”s and “his”s elsewhere.

I briefly wondered whether Tolkien’s dwarves, like Pratchett’s, go to war and wear beards and use male pronouns regardless of their biological status, but decided that while finding and reading Tolkien’s History Of The Dwarves In 18 Volumes would be my usual course of action, it would totally undermine the whimsy behind this post. For the same reason, I have made no effort to track down further biographical detail on the thirteen dwarves.

All that aside, the startling conclusion of this post is that Ori, Nori and Bifur do not have a defined gender (within the artificially small scope of The Hobbit as a standalone work of fiction). Not, perhaps, enough ungendered dwarves that it Raises Interesting Questions About Our Assumptions Hmmm, but enough that Peter Jackson can, without abusing the text, give Dori a pair of axe-wielding sisters…

Edit, after the fact: Introducing new characters is cheating, Jackson.