Developmental Biology - Personalized Medicine
Artificial Intelligence and the Single Cell
Artificial intelligence may someday be used for error correction in single-cell analysis.
The Human Cell Atlas aims to map all tissues of the human body to create a database for the future of personalized medicine. Such a map makes it possible to distinguish healthy cells from individual diseased cells, and helps researchers understand exactly which genes are switched on or off at any given moment in life. This enormous, visionary project is made possible by a technology known as single-cell RNA sequencing.
"Previously, data could only be obtained from large groups of cells as these measurements required so much RNA. The results were always only averages of all the cells used. Now we're able to get precise data for every single cell."
Maren Büttner, doctoral student, Institute of Computational Biology (ICB), Helmholtz Zentrum München, Germany.
However, the increased sensitivity of the technique also means increased susceptibility to the batch effect. Maren Büttner explains: "Fluctuations between measurements can occur if the temperature of the device deviates even slightly, or if the processing time of the cells changes." Although several models exist for the correction of these deviations, those methods are highly dependent on the actual magnitude of the effect. "We therefore developed a user-friendly, robust and sensitive measure called kBET that quantifies differences between experiments and therefore facilitates the comparison of different correction results," Büttner says.
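The idea behind a k-nearest-neighbor batch-effect test can be illustrated in a few lines of Python. The sketch below is not the kBET implementation and makes several simplifying assumptions: exactly two batches (labelled 0 and 1, both present), Euclidean distance on a small set of points, and a Pearson chi-squared test with one degree of freedom. For each cell it compares the batch composition of its k nearest neighbors with the composition of the whole data set, and it reports the fraction of cells for which the two differ significantly. The function name `knn_batch_rejection_rate`, its parameters and the simulated data are illustrative only.

```python
import math
import random


def knn_batch_rejection_rate(points, batches, k=10, alpha=0.05):
    """Fraction of cells whose local batch mix differs from the global mix.

    Simplified sketch: assumes exactly two batches, labelled 0 and 1,
    and that both labels occur in the data.
    """
    n = len(points)
    global_frac = sum(batches) / n  # overall share of batch 1
    rejected = 0
    for i, p in enumerate(points):
        # k nearest neighbours by Euclidean distance, excluding the cell itself
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        neighbours = others[:k]
        obs1 = sum(batches[j] for j in neighbours)
        obs0 = k - obs1
        exp1 = k * global_frac
        exp0 = k - exp1
        # Pearson chi-squared statistic: observed vs. expected batch counts
        stat = (obs0 - exp0) ** 2 / exp0 + (obs1 - exp1) ** 2 / exp1
        # Survival function of the chi-squared distribution with 1 degree of freedom
        p_value = math.erfc(math.sqrt(stat / 2))
        if p_value < alpha:
            rejected += 1
    return rejected / n


# Simulated data: 100 cells in two dimensions, alternating batch labels.
random.seed(0)
cells = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
labels = [i % 2 for i in range(100)]

# Well-mixed case: both batches come from the same distribution.
mixed_rate = knn_batch_rejection_rate(cells, labels)

# Strong batch effect: batch 1 is shifted far away from batch 0.
shifted = [(x + 100.0 * b, y) for (x, y), b in zip(cells, labels)]
separated_rate = knn_batch_rejection_rate(shifted, labels)

print("rejection rate, well mixed:", mixed_rate)
print("rejection rate, separated: ", separated_rate)
```

A rejection rate near zero suggests that the batches are well mixed, while a rate near one indicates that neighborhoods are dominated by a single batch. Computing this number before and after a correction step gives a way to compare correction methods that does not rely on inspecting a plot.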
A second phenomenon known as dropout events also poses a major challenge in single-cell sequencing.
"Let's say we sequence a cell and observe that a particular gene in that cell does not emit any signal at all. The underlying cause can be biological or technical in nature. Either the gene is not read by the sequencer because it is simply not expressed, or it was not detected for technical reasons."
Fabian Theis PhD, ICB Director and professor of Mathematical Modeling of Biological Systems, TUM.
To identify dropout events, Gökcen Eraslan and Lukas Simon developed a deep learning algorithm, a type of method loosely modeled on learning processes in networks of neurons. Known as the deep count autoencoder, the algorithm learns to simplify complex data by first compressing it and then reconstructing it.
Drawing on a new probabilistic model which compares original to reconstructed data, the algorithm determines whether a missing gene signal has a biological or a technical cause.
Fabian Theis: "This model even determines cell-type specific corrections without two different cell types becoming artificially similar. As one of the first deep learning methods in the field of single-cell genomics, the algorithm scales up to handle data sets containing millions of cells. We are not smoothing results. Our goal is to identify and correct errors."
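How a probabilistic model can weigh the two explanations for a zero reading can be sketched with Bayes' rule. The example assumes a zero-inflated negative binomial (ZINB) noise model, which is an assumption made here rather than a detail given in this article, and the parameter values are invented for illustration. In a trained autoencoder, parameters of this kind would be estimated per gene and per cell by the network. The function names are hypothetical.

```python
def nb_zero_prob(mu, theta):
    """P(count = 0) under a negative binomial with mean mu and dispersion theta."""
    return (theta / (theta + mu)) ** theta


def dropout_posterior(mu, theta, pi):
    """P(a zero is a technical dropout | count = 0) under a ZINB model.

    pi is the prior dropout probability. A zero can arise either from the
    dropout component (probability pi) or from the negative binomial
    component (probability (1 - pi) * P_NB(0)).
    """
    p_zero_nb = nb_zero_prob(mu, theta)
    return pi / (pi + (1.0 - pi) * p_zero_nb)


# Illustrative values only: same dispersion and dropout prior, different means.
high_expression = dropout_posterior(mu=10.0, theta=2.0, pi=0.3)
low_expression = dropout_posterior(mu=0.1, theta=2.0, pi=0.3)

print("P(technical | zero), expected mean 10: ", round(high_expression, 3))
print("P(technical | zero), expected mean 0.1:", round(low_expression, 3))
```

Under these assumptions, a zero for a gene that the model expects to be strongly expressed is most likely a technical dropout, whereas a zero for a gene with a low expected expression is consistent with the gene simply not being expressed in that cell.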
Abstract
Single-cell transcriptomics is a versatile tool for exploring heterogeneous cell populations, but as with all genomics experiments, batch effects can hamper data integration and interpretation. The success of batch-effect correction is often evaluated by visual inspection of low-dimensional embeddings, which are inherently imprecise. Here we present a user-friendly, robust and sensitive k-nearest-neighbor batch-effect test (kBET; https://github.com/theislab/kBET) for quantification of batch effects. We used kBET to assess commonly used batch-regression and normalization approaches, and to quantify the extent to which they remove batch effects while preserving biological variability. We also demonstrate the application of kBET to data from peripheral blood mononuclear cells (PBMCs) from healthy donors to distinguish cell-type-specific inter-individual variability from changes in relative proportions of cell populations. This has important implications for future data-integration efforts, central to projects such as the Human Cell Atlas.
Authors
Maren Büttner, Zhichao Miao, F. Alexander Wolf, Sarah A. Teichmann and Fabian J. Theis.
Acknowledgements
The authors thank A. Böttcher for motivating this study, and T. Illicic for carrying out pilot analyses. We thank in particular M. Subramaniam and J. Ye (UCSF) for the PBMC data. We are grateful to the members of the Teichmann and Theis labs for valuable discussions and comments on the manuscript. M.B. is supported by a DFG Fellowship through the Graduate School of Quantitative Biosciences Munich (QBM). Z.M. is supported by a Single Cell Gene Expression Atlas grant from the Wellcome Trust (nr. 108437/Z/15/Z). F.A.W. acknowledges support by the Helmholtz Postdoc Programme, Initiative and Networking Fund of the Helmholtz Association. F.J.T. acknowledges financial support by the German Science Foundation (SFB 1243 and Graduate School QBM) and by the Bavarian government (BioSysNet). This collaboration was supported by a Helmholtz International Fellow Award to S.A.T.
The work was developed in close collaboration with Dr. Sarah Amalia Teichmann from the Wellcome Sanger Institute, who is also involved in the Human Cell Atlas. In 2017, Dr. Teichmann received a Helmholtz International Fellow Award, which promotes cooperation between Helmholtz scientists and top-ranking colleagues abroad.
Helmholtz Zentrum München, the German Research Center for Environmental Health, pursues the goal of developing personalized medical approaches for the prevention and therapy of major common diseases such as diabetes mellitus, allergies and lung diseases. To achieve this, it investigates the interaction of genetics, environmental factors and lifestyle.
Feb 1, 2019
Artificial Intelligence test sequences RNA per single cell. Image: Quantitative Biosciences Munich (QBM).