Thursday, December 6, 2012

A new take on causality

It's not often that something fundamentally new comes along on the topic of causality. That notion is one of the most basic concepts in science and philosophy, indeed in all human thinking (non-human as well, I would guess). Finding causal links helps us interpret the world, make predictions, render the unpredictable environment around us a little less unpredictable. But we still have a lot to learn about causality, and especially how to infer causal links using data.

This is clear from a fascinating recent study that I think will ultimately have quite an impact on applied studies of causal links in fields ranging from economics and finance to ecology. This paper by George Sugihara and colleagues -- its entitled "Detecting Causality in Complex Ecosystems" -- is well worth a few hours of study, as it explores some history of attempts to detect causal links from empirical data and then demonstrates a new technique that appears to be a significant advance on past techniques. 

The key problem in inferring causal links from data, of course, is that mere correlation does not imply causation. The two things in question, A and B, might both be linked to some other causal factor C, but actually have no causal links running from one to the other. In economics, Clive Granger became famous for proposing, in this paper 1969, a way to go beyond correlation. He reasoned that if some thing X causally influences some other thing Y, then including X in a predictive scheme should make predictions of Y better. Conversely, excluding X should make predictions worse. Causal factors, in other words, can be identified as those that reduce predictive accuracy when excluded.

This notion of ‘Granger causality’ makes obvious intuitive sense, and has found many applications, especially in econometrics. However, read the original paper and you quickly see that the theory was developed explicitly for use with stochastic variables, especially in linear systems. As Granger noted, “The theory is, in fact, non-relevant for non-stochastic variables.” Which is unfortunate as so much of the world seems to be more suitably described by nonlinear, deterministic systems.

I've just written for Nature Physics a short essay describing the Sugihara et al. work. I assume many people won't have access to that article (oddly enough, I don't either!) so I thought I'd include a few words here. One problem with Granger causality, the authors point out, is that intimate connections between the parts of any nonlinear system make ‘excluding’ a variable more or less impossible. They demonstrate this for a simple nonlinear system of two variables describing the direction interaction of, say, foxes and rabbits. Call the populations X and Y. Following Granger, you might exclude Y and see if you can predict X. If exclusion of Y reduces your ability to predict, then you've found a causal link. But this recipe yields nothing in this case, because of the nonlinearity. The mathematical model they study definitely, by construction, has a causal links between the two. But the Granger method won't show it.

Why? A key result in dynamical system theory — known as the Takens embedding theorem — implies that one can always reconstruct the dynamical attractor for a system from data in the form of lagged samples of just one variable. In effect, X(t) (fox numbers in time) is always predictable from enough of its earlier values. Hence, excluding Y doesn’t make X any less predictable. The notion of Grange causality would erroneously conclude that Y is non-causal.

To get around this problem, Sugihara and colleagues use the embedding theorem to their advantage. The reconstruction trick can be done for both variables X and Y. I won't dwell on technical details which can be found in the paper, but this yields two mathematical "manifolds" -- essentially, subsets of the space of possible dynamics that describe the actual dynamics that happen. Both of these describe the dynamical attractor of the entire system, one using the variable X, the other the variable Y. Now, sensibly, if X has a causal influence on Y, one should expect this influence to show up as a direct link between the dynamics on these two manifolds. Knowing states on one manifold (for Y) at a certain time should make it possible to know the states on the other (for X) at the same time.

That IS technical, but it's really not complicated. The original paper offers links to some beautiful simulations that aid understanding. The strength of the paper is to show how taking this small step into dynamical system theory pays big results. To begin with, it gives superior performance over the Granger method for several test problems. More impressively, it appears to have already resolved an outstanding puzzle in contemporary ecology.

Ecologists have for decades debated what’s going on with two fish species, the Pacific sardine and northern anchovy, the populations of which on a global scale alternate powerfully on a decadal timescale (see fig below). These data, some suggest, imply that these species must have some direct competition or other interaction, as when the numbers of one go up, those of the other go down. Failing any direct observation of such interactions, however, others have proposed that the global synchrony betrays something else — global forcing from changing sea surface temperatures which just happen to affect the two species differently.

Strikingly, the results from the new method -- Sugihara and colleagues give it memorable name "convergent cross mapping" -- seem to resolve the matter in one stroke. The analysis shows no evidence at all for a direct causal link between the two species, and clear evidence for a link from sea surface temperature to each species. In this case, the correlation is NOT reflecting causation, but simultaneous response to a third factor, though a response in opposite directions.

So there you go -- following the basic ideas of dynamical system theory and actually reconstructing attractors for nonlinear systems makes it possible to tease out causal links far more powerfully than correlation studies alone. This is a major advance on our understanding of causality and I find it hard to believe this technique won’t find immediate application in economics and finance as well as in ecology, neuroscience and elsewhere. If you're involved in time series analysis, looking for correlations and causal relations, give it a read. 


  1. That sounds like interesting work. Do you happen to know if there is a pre-print anywhere in the public domain?


  3. A working paper on a more complete correlation coefficient...

  4. Excellent read, Positive site, where did u come up with the information on this posting?I have read a few of the articles on your website now, and I really like your | | | | | | | | | |

  5. Cortinas de Cristal Malaga
    It is good to see you verbalize from the heart and clarity on this important subject can be easily observed. You have provided great information in your post.

  6. Nice article, thanks for the information. It's very complete information. I will bookmark for next reference
    jaring futsal | jaring golf | jaring pengaman proyek |
    jaring pengaman bangunan | jaring pengaman gedung

  7. Thats a good' article, i usually amazed with' this thing, i asked myself about this opinion, I wish You'll a better articles that can make another people impressive.. don't make the article feel rigit and isn't interesting and poor, i like to read this' article and i think this is "good".thank you' m'y brother.

  8. This blog is so nice to me. I will keep on coming here again and again. Visit my link as well.. הלוואה בפריסה נוחה

  9. Wonderful points most people wrote in this article.Great Blob.I think you have made some genuinely interesting elements and guideline.
    Cheap Assignment Writing Service | Cheap Dissertation Writing Service

  10. My Dissertation Writer can help you with your dissertations, Cheap Dissertation Writing Service , research papers & thesis. Qualified Experts & Reasonable Prices. Simply Email us the Details.

    Cheap Dissertation Writing Services

  11. Thanks for sharing. I hope it will be helpful for too many people that are searching for this topic.
    Movers and Packers Delhi
    Movers and Packers Gurgaon
    Movers and Packers Mumbai

  12. maryland dumpster rental Established in 2003, EnviroSolutions is a vertically integrated solid waste collection, disposal and recycling company located in the Northeast and Mid-Atlantic regions of the United States.

  13. Just admiring your work and wondering how you managed this blog so well. It’s so remarkable that I can't afford to not go through this valuable information whenever I surf the internet! Car Insurance Brandon FL

  14. I havent any word to appreciate this post.....Really i am impressed from this post....the person who create this post it was a great human..thanks for shared this with us.