Researchers at the Columbia University Vagelos College of Physicians and Surgeons have developed a predictive AI system designed to accurately forecast gene activity in human cells. The work, led by Dr. Raul Rabadan and his team, aims to deepen the understanding of how cells function in both healthy and diseased states.
In connection with the study, published in Nature under the title “A foundation model of transcription across human cell types,” Rabadan described the potential of predictive, generalizable computational models. “Predictive generalizable computational models allow [us] to uncover biological processes in a fast and accurate way,” he stated, highlighting how such models can enhance traditional experimental approaches.
Traditional research methods in cell biology tend to be retrospective, focusing on events that have already occurred or are currently under way within cells. A typical experiment measures how cells respond to incremental changes in conditions, but it cannot foresee the full range of changes a cell might undergo. The team therefore set out to build a prospective model that could anticipate those changes.
Rabadan asserted, “Having the ability to accurately predict a cell’s activities would transform our understanding of fundamental biological processes.” He suggested that the technology could shift biology from a science that describes seemingly random processes to one that can predict the underlying systems governing cellular behaviour.
Earlier AI models have typically focused on specific cell types and diseases. “Previous models have been trained on data in particular cell types, usually cancer cell lines or something else that has little resemblance to normal cells,” Rabadan pointed out. To address this limitation, the Columbia University team developed the General Expression Transformer (GET), which is designed to reveal regulatory grammars across 213 human fetal and adult cell types.
GET predicts gene expression from chromatin accessibility data and genomic sequences, drawing on patterns derived from millions of cells obtained from healthy human tissues. The approach resembles that of large language models like ChatGPT: the model formulates rules about how cells function, much as a language model picks up the rules of grammar. “Here it’s exactly the same thing: we learn the grammar in many different cellular states,” Rabadan explained, describing how the model then applies its learned rules to specific cell conditions.
The study's authors also emphasized the model's adaptability, writing, “GET also shows remarkable adaptability across new sequencing platforms and assays, enabling regulatory inference across a broad range of cell types and conditions, and uncovers universal and cell-type-specific transcription factor interaction networks.”
The team also tested GET on diseased cells, in the context of pediatric leukemia. The model predicted that inherited mutations that predispose individuals to leukemia disrupt transcription factor interactions in lymphocytes, helping to clarify the functional relevance of those mutations. Laboratory experiments confirmed the prediction.
Beyond disease-associated mutations, GET also serves as a tool for exploring noncoding and regulatory regions of the genome. Rabadan noted, “The vast majority of mutations found in cancer patients are in so-called dark regions of the genome.” These mutations, which generally do not affect protein function, have long gone unexplored. The research team anticipates that models like GET could illuminate these regions.
Rabadan sees the model as part of a broader shift in the field: “It’s really a new era in biology that is extremely exciting; transforming biology into a predictive science.” Such AI tools could improve the understanding of complex diseases and cellular behaviours, potentially leading to faster and more accurate experimentation.
Source: Noah Wire Services