Biogeography, and Big Year species discovery curves

In 2015, Noah Strycker is birding a global Big Year, i.e. trying to spot as many bird species as possible in the year, without the usual geographical constraints that Big Year birders often adopt. His blog on the Audubon site makes excellent reading. Aside from the impressive list of bird species being racked up, the logistics involved are mind-boggling. Virtually every week has seen him fly into a new country or part of a country, with a new local birder or guide to ferry him around birding sites, and stunningly little wasted time. For example, having run out of gettable birds in Colombia (no mean feat!), rather than risk a day or two idle, he scrambled a trip to Jamaica (“An old friend of mine, Liz Ames, happens to be finishing a three-month field research season in Jamaica… we’re hoping to make a clean sweep of Jamaica’s endemic birds in the next 48 hours”). A few weeks later, a delayed flight reduced his time in Iceland to a 7-hour overnight layover, but Iceland in summer doesn’t really get dark, so he pulled an all-nighter and racked up 36 more species.

The fact that he’s birding so consistently and efficiently means the discovery curve is really smooth. Which means we can do things like try to predict his end-of-year total.

Noah Strycker’s 2015 Species Accumulation Curve, Jan-July

Michael Preston has a series of posts addressing exactly that. In the most recent, at the end of June, he finds that Noah’s accumulation curve is still mostly linear, but chooses a fourth-order polynomial to fit the curves of the previous global Big Year attempters, Alan Davies (not that one) and Ruth Miller.

But when the accumulation is done with consistency, the curve will follow the standard discovery curve pattern. Counts of individuals-per-species within a given area follow a power law curve, so the discovery curve is the inverse, usually modelled as a logarithmic or asymptotic curve tailing off to a theoretical maximum. Which is to say, the first day you go out, you’ll see lots of species, and then it becomes more and more a hunt for rarities. Obviously, this is why Noah’s moving around so much: in order to keep a high number of species per day, he has to move on when the ROI in his current location falls below the expected ROI for the rest of the world in the available time left.
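To make the shape of that discovery curve concrete, here is a minimal simulation sketch of birding one fixed area. Everything in it is an assumption for illustration: the 1/rank abundance weights stand in for the power law, and the species count, sightings per day and function name are invented, not taken from Noah’s data.

```python
import random

def simulate_discovery(num_species=500, sightings_per_day=200, days=30, seed=1):
    """Return the cumulative number of distinct species seen after each day."""
    rng = random.Random(seed)
    # Assumed abundance: the k-th commonest species has weight 1/k (Zipf-like).
    weights = [1.0 / k for k in range(1, num_species + 1)]
    seen = set()
    cumulative = []
    for _ in range(days):
        # Each sighting is an individual bird drawn in proportion to abundance.
        seen.update(rng.choices(range(num_species), weights=weights, k=sightings_per_day))
        cumulative.append(len(seen))
    return cumulative

curve = simulate_discovery()
first_ten_days = curve[9]
last_ten_days = curve[29] - curve[19]
print(first_ten_days, last_ten_days)
```

Under these assumptions most of the list arrives in the first few days and the last ten days add only a handful of rarities, which is the tailing-off described above.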

So I think it will be a struggle to maintain a linear accumulation. I predict the gradient will continue to level, and by the end of the year, a logarithmic trend line will fit better than a linear one.
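One way to check that prediction at the end of the year would be to compare R² for a straight-line fit against day number with R² for a fit against the logarithm of day number. The sketch below assumes ordinary least squares is an acceptable yardstick; the two input series are synthetic placeholders, not Noah’s counts, and the function names are my own.

```python
import math

def r_squared(xs, ys):
    """R^2 of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a single predictor, R^2 is the squared correlation coefficient.
    return (sxy * sxy) / (sxx * syy)

def compare_trends(cumulative):
    """Return (R^2 of linear fit, R^2 of logarithmic fit) against day number."""
    days = list(range(1, len(cumulative) + 1))
    log_days = [math.log(d) for d in days]
    return r_squared(days, cumulative), r_squared(log_days, cumulative)

# Synthetic placeholder curves with arbitrary scales.
steady = [10 * d for d in range(1, 366)]
levelling = [1000 * math.log(d + 1) for d in range(1, 366)]
print(compare_trends(steady))
print(compare_trends(levelling))
```

Fed the real daily totals, whichever of the two numbers is larger says which trend line fits better.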

This being a global exercise, it’s not obviously true that that will happen. After all, if the ROI drops, Noah can always move on. One could argue that only a negligible number of species are so widespread that collecting them early will undermine later collection efforts. But it doesn’t take early counting of truly global species, only early counting of species that occur in more than one region. The great flyways of the world lead to overlapping bird lists all across the Americas, within Eurasiafrica, and round the Asian Pacific Rim. Species collected in South America will have reduced Noah’s North American collections, and by the time he gets to Australasia in December, his efforts across the rest of the world will have diminished the ‘gettable’ list for Australia and New Zealand, perhaps leaving him only the endemic species. Though granted, there are quite a few in that part of the world.

The other interesting thing about Noah’s accumulation curve is that it already shows some degree of curvature in interesting places. Let’s do some biogeography.

Ecozones of the world, by Wikipedian ArnoLagrange

Ecozones, as Wikipedia puts it, delineate large areas of the Earth’s surface within which organisms have been evolving in relative isolation over long periods of time. They take fancy Latinate names, but are more or less what you’d expect: Neotropic (South/Central America), Nearctic (North America), Palearctic (Eurasia and North Africa), Afrotropic (Sub-Saharan Africa), Indomalaya (South and South-East Asia), Australasia, Oceania, Antarctic. Within each zone, there is a shared biogeography, which translates to a lot of species overlap. Not anywhere near total species overlap, of course, but enough that the common and farther-flying species are common across much of the zone. Enough that, on the Big Year scale, it starts affecting the accumulation curve.

Here’s a graph of species seen by Noah per day (so far), coloured by ecozone.

Graph of species counted by Noah Strycker per day, in Big Year 2015, Jan-Jul, coloured by ecozone.

A trend emerges:
Aside from Antarctica (travel is a bit constrained when you’re on a cruise ship), there’s a clear downward trend to the ‘peak’ days (peak days correspond to Noah’s arrival at a new birding site), and a less clear but still noticeable downward trend (R² ≈ 0.2) within each zone as a whole. The spike at the beginning of each new ecozone resets the trend to some extent, i.e. we see a logarithmic pattern within each ecozone. I’m not sure whether the lower initial spikes in North America and Europe are due to early counting of overlapping species or simply that temperate ecosystems are less rich in species to begin with. In particular, the inter-migration between North and South America seems to be producing a macro-ecozone effect.

So what happens if we model the entire year as a series of ecozones, with each one following a simple exponential drop-off:

Model of species accumulation as exponential fall-off from ecozone peak.
Light green = modelled species seen per day
Grey = modelled species accumulation curve based on the per-day figures
Blue = actual species accumulation curve
Green dots = actual curve linear trend line projected to EOY.

The modelled species per day value is floor[$ecozone_peak * log2($days_in_zone + 1) / $days_in_zone], which equals the peak on the first day in a zone and tails off after that (strictly speaking a log(n+1)/n decay rather than a true exponential). The ecozone peak values for Antarctica (12), South America (108), North America (49), Europe (57) and Africa (75) are taken from Noah’s list, each being the most successful day spent in that ecozone (all in the first week of their respective zones, counting North America as Guatemala 2015-04-20 onwards). The peak figures for Asia and Australasia are pure guesses, which makes this model not especially powerful as a prediction tool.
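For anyone who wants to play with the numbers, the per-day formula translates into a few lines of Python. This is a sketch rather than the spreadsheet behind the graph: the function names are mine, days are assumed to be counted from 1 within each zone, and the length of stay passed to accumulate is whatever you choose to plug in.

```python
import math

def species_per_day(ecozone_peak, days_in_zone):
    """floor(peak * log2(days_in_zone + 1) / days_in_zone), counting days from 1."""
    return math.floor(ecozone_peak * math.log2(days_in_zone + 1) / days_in_zone)

def accumulate(zones):
    """zones is a list of (peak, days_spent) pairs; returns the running total per day."""
    total = 0
    running = []
    for peak, days_spent in zones:
        for day in range(1, days_spent + 1):
            total += species_per_day(peak, day)
            running.append(total)
    return running

# 108 is the South America peak quoted above; the sampled days are arbitrary.
print([species_per_day(108, d) for d in (1, 2, 7, 15)])  # [108, 85, 46, 28]
```

The first day in a zone returns the peak itself, and by day 15 the modelled daily count has fallen to roughly a quarter of it.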

So that’s not bad: it tracks the curve quite well in North America (May-June), Europe (Jun), and Africa (Jul), albeit a little lower than reality.

But the smooth fall off doesn’t look anything like Noah’s sighting pattern. As I’ve mentioned, Noah is moving around a lot. And each individual locale he flies into also comes with its own species accumulation curve. Although he takes heroic pains to traverse ecosystems and altitudes to capture as diverse a set as possible, there is inevitable overlap at each new centre he visits, and possibly he is choosing which ecosystems to visit in descending order of richness too. After all, you might find there’s a trip to Jamaica to be squeezed in! So there’s another level to this – within each ecozone are a number of sites, and each site has its own peak-and-falloff, and each peak is lower than the last.

What happens if we model each ecozone as a series of locales, taking a representative nine days at each, and put a drop-off on the accumulations?

Model of species accumulation as exponential fall-off from ecozone peak and individual locale peaks.
Light green = modelled species seen per day
Grey = modelled species accumulation curve based on the per-day figures
Blue = actual species accumulation curve
Green dots = actual curve linear trend line projected to EOY.

The modelled species per day value is now floor[$ecozone_peak * log2($days_in_zone + 1)^1.6 / $days_in_zone * 0.8^(($days_in_locale - 1) mod 9)], which introduces a geometric drop-off of 20% per day within each nine-day locale (and unjustifiably futzes with the ecozone term to compensate).
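Here is the same formula as a Python sketch. Two readings are assumptions on my part: the exponent 1.6 is applied to the value of the log2 term, and locales are treated as back-to-back nine-day blocks, so the day within a locale can be derived from the day within the zone. The constant and function names are invented for the example.

```python
import math

LOCALE_LENGTH = 9    # the "representative nine days" at each locale
LOCALE_DECAY = 0.8   # geometric drop-off per day within a locale
ZONE_EXPONENT = 1.6  # the adjustment to the ecozone term

def species_per_day(ecozone_peak, days_in_zone, days_in_locale):
    """Ecozone fall-off multiplied by a per-locale geometric fall-off, floored."""
    zone_term = math.log2(days_in_zone + 1) ** ZONE_EXPONENT / days_in_zone
    locale_term = LOCALE_DECAY ** ((days_in_locale - 1) % LOCALE_LENGTH)
    return math.floor(ecozone_peak * zone_term * locale_term)

def zone_series(ecozone_peak, zone_length):
    """Daily counts for one zone, assuming locales are consecutive nine-day blocks."""
    # With consecutive blocks, the modulo makes days_in_zone an equivalent input.
    return [species_per_day(ecozone_peak, d, d) for d in range(1, zone_length + 1)]

# 75 is the Africa peak quoted above; the 27-day stay is a hypothetical length.
series = zone_series(75, 27)
locale_peaks = series[0::LOCALE_LENGTH]
print(locale_peaks)
```

Each arrival at a new locale produces a fresh spike, and the spikes shrink as the zone gets birded out, which is the saw-tooth shape in Noah’s actual daily counts.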

The alignment is now surprisingly successful, even given how many arbitrary constants there are in that function! The proof of the pudding will come when Noah enters Asia and then Australasia, and I can plug in the final two ecozone peak figures and see how well the model tracks.

Interestingly, even with all the curve matching, the current prediction for the end of year figure, using what I hope are reasonable predictions for the peaks in the remaining tropical ecozones Asia (70) and Australasia (80), comes out to almost exactly the same as the linear trend line shows: about 6,500. But then, so far I have only accounted for the two observable inverse-power-law patterns, and not the hypothetical global one. If that is a factor, then 70 and 80 are probably over-estimates, and we’re probably looking at an end of year figure nearer 6,100.

There is a further line of reasoning to support that, which is: Noah, or someone at Audubon, has almost certainly done this exercise already, and they chose to aim for 5,000 species, not 6,000. That suggests to me that 6,000 was not considered a foregone conclusion, and so perhaps they are expecting it to be low 6,000s rather than up in the 6,500s. Who knows.

It occurs to me, though, that Noah may have one more trick up his sleeve at the end of the year. He’s travelling, Phileas Fogg style, eastwards around the globe. Were it me, come the end of December, with New Zealand finally birded out, I would be awfully tempted to cash in the hours banked crossing time zones: catch a Dec 30th flight to Hawaii, and win a bonus day to chase down 48 Hawaiian endemics and the last few Pacific pelagics. Which would quite nicely tick off the eighth ecozone, and thus the world.