By Peter Wreford, RMIT University
My AMSI vacation research project involved applying graph theory to the genetics of drought-resistant chickpeas. Chickpeas are a highly nutritious food, containing dietary fibre, protein, folate and minerals. They are a staple food in India. Finding the genes responsible for drought resistance in chickpeas could contribute to food security.
I was given data for drought-resistant and drought-susceptible strains of the chickpea plant. For each strain there was a well-watered control group and a water-stressed group. RNA counts were collected for about three replicates at 30, 50 and 70 days.
RNA is copied from DNA, a molecule found in the nucleus of the plant's cells. DNA contains the code for all the proteins that make up a living thing. A section of DNA that codes for one protein is called a gene. When a gene is expressed, a copy of one strand of that section is made, and this copy is called RNA. The RNA travels out of the nucleus, where it is used to produce the protein. Expression of one gene can increase or decrease the expression of another gene.
High correlation between the RNA counts for a pair of genes was assumed to indicate interaction between those genes. Each gene was represented as a node (a dot) in a network, and each detected interaction was represented as an edge (a line) joining the two genes.
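The construction described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the project: the choice of Pearson correlation, the use of its absolute value (so that a gene which suppresses another also produces an edge), the 0.9 threshold, the gene names and the counts are all assumptions made for the example.

```python
import numpy as np

def correlation_edges(counts, gene_names, threshold=0.9):
    """Return the set of gene pairs whose absolute Pearson correlation
    is at least `threshold` (an assumed cut-off, not from the project).

    counts: 2-D array with one row per gene and one column per sample.
    """
    corr = np.corrcoef(counts)  # gene-by-gene correlation matrix
    edges = set()
    n = len(gene_names)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= threshold:
                # frozenset makes the edge undirected: {A, B} == {B, A}
                edges.add(frozenset((gene_names[i], gene_names[j])))
    return edges

# Made-up RNA counts for four hypothetical genes across six samples.
counts = np.array([
    [10, 20, 30, 40, 50, 60],  # geneA
    [12, 22, 29, 41, 52, 58],  # geneB: rises with geneA
    [60, 50, 40, 30, 20, 10],  # geneC: falls as geneA rises
    [5, 40, 7, 35, 6, 38],     # geneD: no clear relationship
], dtype=float)
genes = ["geneA", "geneB", "geneC", "geneD"]

edges = correlation_edges(counts, genes)
for pair in sorted(sorted(e) for e in edges):
    print(pair)
```

In this toy data geneA, geneB and geneC end up joined to one another, while geneD has no edges. With only a handful of samples per gene, high correlations can easily arise by chance, which is one reason the threshold matters.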
Because some data were missing for the drought-resistant plants, the comparison was made between the control and water-stressed networks for the drought-susceptible plants. Unfortunately the two networks turned out to be very different. It had been expected that most genes would interact similarly in both groups, but this was not the case.
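One simple way to put a number on how similar two networks are is the Jaccard index of their edge sets: the number of edges present in both networks divided by the number of edges present in either. The article does not say which similarity measure was used, so the sketch below is an assumption, and the gene names and edges are invented.

```python
def edge_jaccard(edges_a, edges_b):
    """Jaccard similarity of two edge sets: shared edges / all edges."""
    union = edges_a | edges_b
    if not union:
        return 1.0  # two empty networks are treated as identical
    return len(edges_a & edges_b) / len(union)

def edge(u, v):
    """Undirected edge, so edge('g1', 'g2') == edge('g2', 'g1')."""
    return frozenset((u, v))

# Hypothetical control and water-stressed networks.
control = {edge("g1", "g2"), edge("g2", "g3"), edge("g3", "g4")}
stressed = {edge("g2", "g1"), edge("g3", "g4"),
            edge("g1", "g4"), edge("g4", "g5")}

similarity = edge_jaccard(control, stressed)
print(f"Jaccard similarity: {similarity:.2f}")
```

A value of 1 means the two networks have exactly the same edges, and a value of 0 means they share none.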
It was suspected that the data for many genes might be of poor quality, so I tried removing genes according to a variety of rules. One rule was that a retained gene must have similar RNA counts across replicates, since there was no reason for replicates to have very different counts. The similarity between the networks improved somewhat but remained low.
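A replicate-consistency rule of this kind could be sketched as follows. The article does not give the exact rule, so this example assumes one possible version: keep a gene only if the coefficient of variation (standard deviation divided by mean) of its replicate counts is below a cut-off. The cut-off of 0.2, the gene names and the counts are all hypothetical.

```python
import numpy as np

def consistent_genes(replicate_counts, gene_names, max_cv=0.2):
    """Keep genes whose replicate counts have a coefficient of
    variation (sample std / mean) of at most `max_cv`.

    replicate_counts: one row per gene, one column per replicate,
    all taken from the same group and the same time point.
    """
    counts = np.asarray(replicate_counts, dtype=float)
    means = counts.mean(axis=1)
    stds = counts.std(axis=1, ddof=1)  # sample standard deviation
    kept = []
    for name, mean, std in zip(gene_names, means, stds):
        # Genes with a zero mean carry no signal and are dropped too.
        if mean > 0 and std / mean <= max_cv:
            kept.append(name)
    return kept

# Made-up counts for three replicates of three hypothetical genes.
replicates = [
    [100, 105, 95],  # geneA: replicates agree closely
    [100, 10, 300],  # geneB: replicates disagree badly
    [0, 0, 0],       # geneC: never expressed
]
kept = consistent_genes(replicates, ["geneA", "geneB", "geneC"])
print(kept)
```

Here only geneA survives the filter. A stricter cut-off removes more noisy genes but also leaves fewer genes in the networks being compared.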
The failure to find similar networks could have been because the plants in the water-stressed group were dying. They were denied water completely, which may have caused a great difference in their gene interaction network. It could also have been because high correlation did not in fact indicate interaction between the genes.
There are a number of ideas for future work on this project. A new rule for selecting the genes to be used in the networks has been proposed. The community structure of the networks will be investigated to see which genes belong to different clusters in the control and stressed networks. The drought-resistant plants will also be compared with the drought-susceptible plants.
Peter Wreford was one of the recipients of a 2015/16 AMSI Vacation Research Scholarship.