Dealing with Sensitive Data

Distinguished Professor Kerrie Mengersen, Queensland University of Technology and Centre for Data Science


Kerrie Mengersen is a Distinguished Professor of Statistics and Director of the Centre for Data Science at QUT, Australia. She is an elected member of the Australian Academy of Science and the Academy of the Social Sciences in Australia, and is a Vice-President of the International Statistical Institute. Kerrie is passionate about developing methods to break open data and reveal insights that can help address critical challenges in health, environment, society and industry. She can be found perched on a stool at the intersection of statistics, machine learning, AI, and technology, constantly amazed and challenged by the current and future-promised traffic.


Monday, 24 June
6pm - 7.30pm AEST
Distinguished Professor Kerrie Mengersen
Room 206, Steele Building (03)
The University of Queensland

Also broadcast over Zoom
Watch lecture recording


Talk abstract

Many datasets of interest to statisticians are subject to privacy conditions. These conditions can constrain access, analysis, sharing and release of results. In this presentation, we will consider two ways in which this issue might be addressed. The first is through federated learning, in which the analysis is undertaken in such a way that the data remain in situ and private. The second is synthetic generation of the data, such that the simulated data retain the salient characteristics of the original data while preserving the required privacy. We provide some extensions to the class of models that can be considered in federated learning, and an overview of synthetic generation of tabular data. The exposition of these ideas will be motivated by the creation of an Australian Cancer Atlas.
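As a rough illustration of the federated idea only, and not the method presented in the talk or the key readings, the Python sketch below estimates a pooled mean and variance when each site shares just three aggregate statistics and the raw records stay where they are. The names SiteSummary, local_summary and pooled_mean_variance, and the three-site data, are hypothetical. Sharing aggregates does not by itself guarantee privacy; the sketch shows only the flow of information.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SiteSummary:
    """Aggregate statistics a site shares; no individual records."""
    n: int
    total: float
    total_sq: float


def local_summary(values: Sequence[float]) -> SiteSummary:
    # Runs at each data custodian; only these three aggregates leave the site.
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_variance(summaries: List[SiteSummary]) -> Tuple[float, float]:
    # Runs at the coordinating server, which never sees individual records.
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)  # sample variance
    return mean, variance


# Hypothetical data held separately at three sites.
site_data = [[2.0, 4.0, 6.0], [1.0, 3.0], [5.0, 7.0, 9.0, 11.0]]
summaries = [local_summary(d) for d in site_data]
mean, variance = pooled_mean_variance(summaries)
```

The pooled estimates match those computed on the combined data, because the mean and variance depend on the data only through these sums. Models such as the structured latent variable models mentioned above do not reduce to simple sums in this way, which is where more elaborate federated inference schemes come in.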

This research is in collaboration with QUT colleagues Conor Hassan and Dr Robert Salomone, and is funded by the Australian Research Council and Cancer Council Queensland.

Key reading:

C Hassan, R Salomone, K Mengersen (2023) Federated variational inference methods for structured latent variable models. arXiv preprint arXiv:2302.03314

C Hassan, R Salomone, K Mengersen (2023) Deep generative models, synthetic tabular data and differential privacy: an overview and synthesis. arXiv preprint arXiv:2307.15424