In News

By Wayne Cheung

Have you ever wondered whether a butterfly can really cause a hurricane? Are you confused about all the correlations and causations in real life, and how can we go about finding out how everything is dependent on each other? When the system is small enough, there are already many existing statistical techniques that measure correlation between variables. But when things get too complicated, and involve many variables, then a typical method that is a bit more advanced and sophisticated, called Bayesian Network Inference is used.

Ok, ‘Network Inference’ sounds reasonable, it sounds like you’re trying to deduce some kind of networks underpinning all the things that you see in real-life, but what is ‘Bayesian’?

Bayesian actually refers to a probabilistic approach in statistical analysis. Thomas Bayes was a priest who worked part-time at mathematics, and he introduced a new probabilistic interpretation into statistics. His simple and elegant Bayes’ rule is now widely developed and adopted in the analysis of many different types of data.

The idea of Bayesian statistics has two important features. The first one is that in real-life, there is so much randomness and uncertainty in any experiment, that the result in Bayesian statistics is always a probability distribution — you are never 100% sure about anything. The second is introducing the idea of priors, where in addition to analysing the data, you can introduce some of your own assumptions about the system into the mathematics.

Combining this with the network inference part, you end up with a method that can construct networks to describe the cause and effect of a system, with a given probability to tell you how confident that you are about the network.

CHEUNG_carcrashgraph

Cause and effect of car crash.

What is this useful for, you ask? Well, the fact is this method is so generic it can be used to analyse completely different types of data. Let’s say to look at the causes of car crash. Or in medicine, with the newly developed field of bioinformatics, huge amounts of biological data are constantly being generated every day measuring genes or proteins or other biological units. If you think about this carefully, and if you manage to discover some novel cell networks that have never been discovered before, you could gain a better understanding of how diseases evolve in the human body, and you might be well on your way to discover the next vaccine for HIV or the next cure for cancer.

 

Wayne Cheung was one of the recipients of a 2013/14 AMSI Vacation Research Scholarship.