A collaborative effort led by Florian Merkle used whole genome sequencing (WGS) to analyse 143 commonly used human pluripotent stem cell (hPSC) lines at single nucleotide resolution, so that researchers can use the resulting insights to rationally select the lines most suitable for a given purpose.
hPSCs are remarkable because they can self-renew indefinitely while retaining the ability to differentiate into many cell types. These properties make hPSCs a powerful resource for studying early human development, modelling disease, drug discovery, and increasingly also for developing cell therapies for use in humans. Yet despite their widespread use in research, their genomes have not been systematically studied.
In a paper published today in Cell Stem Cell, the research team from Cambridge University and collaborators at Harvard University, the Broad Institute of MIT and Harvard, and elsewhere found that some cell lines carried large structural mutations, or small dominant mutations associated with cancer and other diseases, that could alter cellular phenotypes and compromise the safety of hESC-derived cellular products transplanted into humans. However, they were encouraged to see that most small structural and single nucleotide variants, which can be identified only through WGS analyses, were present in hESC genomes and human blood-derived genomes at similar frequencies.
Lead author Florian Merkle said:
This study extends our understanding of the types of genetic mutations that are recurrently acquired in human stem cells, including cancer-associated variants that may compromise their safety or function. The detailed genetic characterisation confirms that while some human embryonic cell lines should be avoided, the majority closely resemble those of human populations, confirming them as a powerful model system to study human genetics since each cell line has a unique constellation of common and rare genetic variants.
In addition to sharing the raw and analysed WGS data from these cell lines, the research team are particularly proud of the user-friendly data portal they created to allow anyone with access to an internet browser to explore their data down to the level of individual aligned sequence traces at single-nucleotide resolution. They have openly shared the code used to make this portal to enable other groups to readily re-create similar portals to explore and disseminate their own data. This means that other research groups will now be able to select the cell lines that are best suited for their study and avoid those that may be unsafe or might give misleading results. The team hopes that this will lead to greater reproducibility in stem cell-based research and support the strong track record of safety in approved stem cell-based therapies.
Florian continued:
We believe that whole genome sequencing analysis will gradually become the method of choice for characterising this valuable cell type. Hopefully this study can serve as a blueprint for other groups seeking to deeply characterise their lead cell lines.
Merkle et al., Whole genome analysis of human embryonic stem cells enables rational line selection based on genetic variation. Cell Stem Cell 29, 1–15, March 3, 2022.