environmentalresearchweb blog
Glaciers and wiggle room
I haven’t written all that many blogs, but this one could be the dullest I have ever written. How do you persuade readers to be interested in how confident the rules of statistics allow them to be?
Statistics is a minefield in large part because we call on it mainly when we have to work out an explanation of what is happening, given a few scraps of more or less reliable information. You can tiptoe through the minefield and hope to reach the other side without getting blown up, or you can turn yourself into a professional statistician. But the latter option is time-consuming.
One option that sometimes crops up is to start not with a small but with a large amount of information and work out what the statistics are. This happened to me a while ago when I wanted to work out how confident I ought to be about the mass balance of a glacier in the Canadian High Arctic, White Glacier.
We measure the mass balance by inserting stakes into the glacier surface. Once a year we measure the rise or fall of the surface with respect to the top of each stake. Mass is volume times density, so the calculation of mass change (per unit of surface area) is simple: rise or fall times density. We measure the density in snow pits. To get the mass balance of the whole glacier, we add up the stake balances, multiplying each by the fraction of total glacier area that it represents.
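The bookkeeping in that last step can be sketched in a few lines of Python. This is only an illustration of the area-weighted sum described above: the function name, the three stakes and all of their values are invented for the example, and the densities are assumed to be in kilograms per cubic metre so that the result comes out in kilograms per square metre (numerically the same as millimetres of water equivalent).

```python
def glacier_mass_balance(height_changes_m, densities_kg_m3, area_fractions):
    """Area-weighted whole-glacier balance in kg per square metre.

    One kg per square metre equals one millimetre of water equivalent.
    """
    if not (len(height_changes_m) == len(densities_kg_m3) == len(area_fractions)):
        raise ValueError("need one height change, density and area fraction per stake")
    if abs(sum(area_fractions) - 1.0) > 1e-9:
        raise ValueError("area fractions should sum to 1")
    # Stake balance = rise or fall times density; weight each by its area share.
    return sum(
        frac * dh * rho
        for dh, rho, frac in zip(height_changes_m, densities_kg_m3, area_fractions)
    )


# Hypothetical three-stake glacier (values invented for illustration only).
balance = glacier_mass_balance(
    height_changes_m=[-1.0, -0.5, 0.5],       # surface fall (-) or rise (+)
    densities_kg_m3=[900.0, 900.0, 400.0],    # ice at the low stakes, snow at the top
    area_fractions=[0.25, 0.5, 0.25],         # share of glacier area per stake
)
print(balance)  # -400.0
```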
Now the statistical fun begins. People have been measuring several dozen or more stakes per year for several decades on White Glacier, so we have a large sample and a “forward” problem: what is the uncertainty in the annual average mass balance?
There is a simple answer from classical statistics. This uncertainty is equal to a measure of the uncertainty in each single stake measurement divided by the square root of the number of stake measurements. It is a pity that this simple answer is, for practical purposes, wrong.
The uncertainty in the stake measurement itself is not the problem. You can measure changes of stake height to within a few millimetres, even when it is cold and windy. What you want is a measure of how representative your single stake is of the part of the glacier which it supposedly represents. Working this out is much trickier, but a typical number is a couple of hundred to a few hundred millimetres.
Next, and fatally from the minefield standpoint, the number of stake measurements is not the number whose square root you need to divide into the stake uncertainty. The statisticians are not to blame for the prevalence of this crude error, although I think their predecessors of a few generations ago could have done a better job of discouraging its spread.
The early statisticians coined the term degrees of freedom for the correct divisor. It is the number of numbers that are free to vary in your formula. The trouble is that degrees of freedom doesn’t suggest much, if anything, to normal people, who have ill-advisedly seized on the fact that it may be equal to the sample size and used it as a crutch with which to tiptoe through the minefield. Recently, the statisticians have tried the term effective sample size instead. It is better, because it suggests that the actual sample size may not be the right number, but “effective” is still mysterious jargon for most people, including many scientists.
I think wiggle room would be a still better alternative. The essential point is that you are only allowed to divide by the square root of your sample size if all your samples are independent, meaning that you cannot predict any one stake measurement from any other. The more dependent or correlated they are, the less your wiggle room, the smaller your divisor, and the bigger your uncertainty.
Suppose you measure the mass balance at 49 stakes, a convenient number because its square root is seven. If the stakes were independent you could divide your stake uncertainty by seven to get the uncertainty of your whole-glacier mass balance. If the stake uncertainty happens to be 210 mm, another convenient number, the whole-glacier uncertainty comes out as only 30 mm.
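For anyone who wants to check the arithmetic, here is the independent-stakes calculation as a tiny Python sketch. The function name is invented for the example; the numbers are the convenient ones used above.

```python
import math


def uncertainty_of_mean(stake_uncertainty_mm, n_independent):
    """Classical rule: single-stake uncertainty over the square root of n.

    Only valid if the n measurements really are independent.
    """
    return stake_uncertainty_mm / math.sqrt(n_independent)


whole_glacier_mm = uncertainty_of_mean(210.0, 49)
print(whole_glacier_mm)  # 30.0
```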
![]()
Correlations between series of annual balances measured at thousands of pairs of stakes on White Glacier. The pairs are arranged by how far apart the two stakes are in altitude, and by their correlation — +1 being perfect correlation and 0 being complete lack of correlation. White means no stake pairs; otherwise, the paler the little box the more pairs fall into it. The green dots are the best estimates, according to statistical theory, of the real as opposed to the observed correlation at each value of vertical separation. (Notice that the observed correlation is sometimes negative even when the “real” correlation is quite large.) The red curve, again from theory, summarizes the green dots.
Too bad that, call it what you will, the wiggle room for the mass balance of White Glacier is about one. The large collection of correlations in the graph is strongly skewed towards predictability. If you know one stake balance, you can do a fair to excellent job of predicting others. That the correlations drop off as the stakes grow further apart turns out not to make much difference to the wiggle room. For the purpose of estimating uncertainty, it is as if we had only one stake, not dozens.
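To get a feel for how quickly correlation eats the wiggle room, here is a rough Python sketch. It is not the calculation behind the graph: it assumes the textbook simplification that every stake has the same weight and the same uncertainty, and that every pair of stakes shares one common correlation, in which case the effective sample size is n / (1 + (n - 1) × correlation). The function name and the value 0.9 are invented for illustration; the 49 stakes and 210 mm are the convenient numbers from the earlier example.

```python
import math


def effective_sample_size(n_stakes, common_correlation):
    """Wiggle room for the mean of n equally weighted, equally uncertain
    stakes that all share one pairwise correlation (a simplifying assumption).
    """
    return n_stakes / (1.0 + (n_stakes - 1) * common_correlation)


def whole_glacier_uncertainty(stake_uncertainty_mm, n_stakes, common_correlation):
    """Single-stake uncertainty over the square root of the wiggle room."""
    n_eff = effective_sample_size(n_stakes, common_correlation)
    return stake_uncertainty_mm / math.sqrt(n_eff)


print(effective_sample_size(49, 0.0))  # 49.0: independent stakes, full wiggle room
print(effective_sample_size(49, 1.0))  # 1.0: perfectly correlated, one stake's worth

# Hypothetical common correlation of 0.9 (invented, not read off the graph):
# the wiggle room is already only a little above one.
print(effective_sample_size(49, 0.9))
print(whole_glacier_uncertainty(210.0, 49, 0.9))
```

Under these assumptions, 49 strongly correlated stakes leave the whole-glacier uncertainty close to the 210 mm of a single stake, rather than the 30 mm that independence would give.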
The stakes on White Glacier are so highly correlated that we have to live with the fact that we only know our mass balances to within a few hundred millimetres. This is a serious constraint, considering that typical annual mass balances these days are negative by only a few hundred millimetres. No wonder it has taken a long time for signals of expected change to emerge.
