Human microbiome aging clocks based on deep learning and tandem of permutation feature importance and accumulated local effects


The human gut microbiome is a complex ecosystem that both affects and is affected by its host's status. Previous analyses of gut microflora revealed associations between specific microbes and host health and disease status, genotype and diet. Here, we developed a method of predicting the biological age of the host from the microbiological profile of its gut microbiota, using a curated dataset of 1,165 healthy individuals (3,663 microbiome samples). Our predictive model, a human microbiome clock, has the architecture of a deep neural network and achieves a mean absolute error of 3.94 years in cross-validation. The performance of the deep microbiome clock was also evaluated on several additional populations. We further introduce a platform for biological interpretation of the individual microbial features used in age models, which relies on permutation feature importance and accumulated local effects. This approach has allowed us to define two lists of 95 intestinal biomarkers of human aging. We further show that this list can be reduced to 39 taxa that convey the most information on their host’s aging. Overall, we show that (a) microbiological profiles can be used to predict human age; and (b) microbial features selected by models are age-related.


The human gut is colonized by a dense microbial community, calculated to consist of 10^14 cells, which is an order of magnitude higher than the number of cells in the host [1]. Gut microbiota is a complex ecosystem that carries out multiple important functions in the organism. Apart from being a core element of the digestive system, microbiota regulates immunity, processes xenobiotics, produces important metabolites, and even affects higher neural functions [2–4]. The influence, however, is not one-sided: microbiota does not simply determine certain host characteristics, as it also responds to signals from the host via multiple feedback loops [5]. Some of these feedback loops were found to be reflected in the microbiota composition.

For example, multiple studies indicate that irritable bowel diseases can develop following an intense immune response to an intestinal infection. Microbiota responds to a pro-inflammatory milieu with a decreased number of beneficial bacteria, which lack mechanisms to survive under such hostile conditions. In return, host immunity reacts to suppress the blooming pathogenic community, which produces chronic inflammation [6]. Such changes happen constantly throughout an individual’s life. They may be deleterious or beneficial, and may reflect strictly individual choices or the effects of factors that are widespread across populations.

Metagenomic studies have provided valuable insights into how the gut microflora progresses with age. They revealed that gut colonization occurs during birth with the bacteria living in the birth canal. The “pioneer microbiome” consists of facultative aerobes (e.g. Escherichia, Enterococcus) that get replaced during breastfeeding with obligate anaerobes (e.g. Bifidobacterium infantis) [7]. Upon weaning, another community shift happens towards more adult-like microbiomes [8]. These early stages of colonization are extremely important, as normal infant microbiota promotes intestinal mucus formation, prevents pathogen blooming, and regulates T-cells. The importance of early colonization is further emphasized by studies that indicate higher occurrences of eczema and food allergies in children with atypical microbiota [9] development (e.g. increased abundance of Clostridium and Escherichia microbes) [10]. Factors such as the mode of birth delivery (vaginal or cesarean), infant diet (breast milk or formula), and the maternal microbiome greatly influence microbiome development.

Although infant microbiome succession is well studied and can be used to assess the risks of various health conditions, its transition to the adult microbiome is less understood. Moreover, the composition variability attributed to geographic location, medical history, diet, and other factors makes it hard to analyze adult microbiomes as effectively as those of infants. Age-related studies of the human microbiome have failed to produce a straightforward theory of gut flora aging. Some studies indicate decreasing biodiversity in the elderly gut [11,12]. However, that is not the case for all data sets, and healthy elderly people may have microbiomes as diverse as those of the younger population [13,14]. Other findings include changes in the abundance of specific taxa in aging microbiota. Such bacterial genera as Bacteroides, Bifidobacterium, Blautia, Lactobacilli, and Ruminococcus have been shown to decrease in the elderly, while Clostridium, Escherichia, Streptococci, and Enterobacteria increase [15,16]. However, these patterns are not strictly established, as results vary greatly across different studies. This may be attributed to different methodologies as well as to unbalanced data sets that may contain people of different lifestyles [17].

Despite these complications, the consensus is that the elderly gut has lower counts of short-chain fatty acid (SCFA) producers such as Roseburia and Faecalibacterium and an increased number of aerotolerant and pathogenic bacteria. Such shifts can lead to dysbiosis, which in turn contributes to the onset of multiple age-related diseases [9]. The idea that the gut microflora can be a major contributor to the aging process is not new. As early as the beginning of the 20th century, the Nobel Prize-winning Russian scientist Ilya Metchnikoff proposed that malicious microbes processing undigested food (especially peptolytic bacteria, e.g. Escherichia and Clostridium) lead to autointoxication. Treating autointoxication with pro- and prebiotics (such as Lactobacillus preparations) was suggested as a way to alleviate the age-associated decline in organismal function. Recent studies have demonstrated promising results in line with this century-old hypothesis.

The standard way of separating the gut microbiome into three chronological states (child, adult, and elderly microbiomes) lacks a clear set of rules. Among them, the adult microbiome remains the greatest mystery. It has no established succession stages, as in newborns, and does not normally reflect the gradual detrimental processes typical of an old organism. This raises the question of whether the normal adult microbiome progresses at all or is in a state of stasis. Considering that the aging process is gradual and involves accumulation of damage and other deleterious changes [21] (as also indicated by a number of biomarkers such as DNA methylation clocks [22,23]), it is logical to suppose that gut microbiome succession is also gradual [24]. However, attempts to use microbiome features to predict chronological age have been inconclusive.

A support vector machine model trained on human metagenomic data to classify samples as young or old was shown to be only 10-15% more accurate than random assignment, as indicated by the Area Under the Curve (AUC) score [25]. Another study, using a co-abundance clustering approach, demonstrated general trendlines of microbiota composition for hosts aged 0-100 [26]. According to that study, specific clades of the gut community differ significantly in abundance between young adults and the middle-aged. However, the lack of dietary and lifestyle data prevented the authors from putting together a conclusive theory of gut microflora progression. Compared to the well-established DNAm aging clocks, which achieve a mean absolute error (MAE) below 5 years, these results of microflora-based age prediction leave much room for improvement.
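The MAE figures quoted for aging clocks are plain averages of absolute prediction errors. A minimal sketch in Python, assuming hypothetical host ages that are not taken from any of the cited studies:

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages, in years."""
    if len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must have the same length")
    if not true_ages:
        raise ValueError("inputs must not be empty")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical chronological and predicted ages (years) for four hosts.
true_ages = [25, 40, 63, 71]
predicted_ages = [29, 37, 60, 77]

# Absolute errors are 4, 3, 3 and 6 years, so the MAE is 16 / 4.
mae = mean_absolute_error(true_ages, predicted_ages)
print(mae)  # prints 4.0
```

An MAE of 4.0 here means that, on average, the predicted age is four years away from the true age, in either direction.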

The renaissance of deep learning (DL) that started in 2015 resulted in unprecedented machine learning performance in image, voice, and text recognition, as well as in a range of biomedical applications [29] such as drug repurposing [30] and target identification [31]. One of the most impactful uses of DL in biomedicine was the application of generative models to de novo molecular design [32–36]. In the context of aging research, these new methods can be combined for geroprotector discovery [37–41]. Indeed, since 2013, many aging clocks have been developed for humans and for model organisms. The published aging clocks utilizing deep learning were developed using standard clinical blood tests [42], facial images [43], physical activity data [44], and transcriptomic data [45]. These clocks were used to rank the features that contribute most to prediction accuracy, using permutation feature importance (PFI), deep feature selection (DFS) and other techniques. They were also used to assess the population specificity of the various data types.
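PFI scores a feature by how much a model's error grows when that feature's values are shuffled across samples. The sketch below illustrates the idea under stated assumptions and is not the implementation used in any of the cited clocks: `toy_age_model`, its coefficients, and the three-feature abundance rows are hypothetical stand-ins for a trained regressor and real profiles.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MAE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mae(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only; all other columns stay in place.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            error = mae(targets, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_age_model(row):
    """Hypothetical regressor: uses features 0 and 1, ignores feature 2."""
    return 30.0 + 100.0 * row[0] - 50.0 * row[1]


# Hypothetical relative abundances of three taxa in five samples.
rows = [
    [0.10, 0.30, 0.05],
    [0.20, 0.10, 0.40],
    [0.35, 0.25, 0.15],
    [0.05, 0.40, 0.30],
    [0.45, 0.05, 0.20],
]
targets = [toy_age_model(r) for r in rows]  # the toy model fits these exactly

scores = permutation_importance(toy_age_model, rows, targets)
for j, score in enumerate(scores):
    print(f"feature {j}: mean MAE increase = {score:.2f} years")
```

Because the toy model never reads feature 2, shuffling that column leaves every prediction unchanged and its importance is exactly zero. The two features the model does use receive positive scores.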

The goal of this study was to build an age predictor from whole genome sequencing (WGS) data aggregated from multiple sources, using various machine learning techniques, and to use it to examine patterns of continuous microflora succession. Here, we report a method to estimate a host’s age based on their microflora taxonomic profile, assess the importance of specific taxa in organismal aging, and suggest candidate geroprotective microbiological interventions.


We demonstrated the feasibility of age prediction by applying machine learning approaches to taxonomic microflora profiles. Our most accurate DNN regressor achieved an MAE of 3.94 years. This performance is comparable with the previously published MAEs of 1.9 for the PhotoAgeClock, 2.7 for the state-of-the-art methylation aging clock, 7.8 for the transcriptomic aging clock and 5.5 for the hematological aging clock. We also developed a method for microbiological feature selection and annotation. It combines a two-fold feature importance assessment, using the PFI and ALE approaches, applied after training a DNN. This technique allows both selecting the most relevant features as biomarkers and quantifying their influence on the target variable, i.e. age. Using this method, we identified 95 and 39 prokaryotic taxa as biomarkers of intestinal aging. Despite the reduced predictive power of this set when compared to the whole taxonomic profiles, it let us assign individuals to three age groups (young, middle-aged and old) 86% more accurately than random classification (0.71 versus 0.34 F-score). The identified biomarkers include species whose abundance is positively or negatively correlated with predicted age. These species may be investigated further by the community to improve our understanding of human aging and its relationship with the gut microbiome.
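ALE quantifies how a model's prediction changes as one feature moves through its observed range. The sketch below shows a first-order ALE computation in Python and is not the authors' code: `toy_age_model`, the abundance rows, the quantile binning, and the count-weighted centering are all assumptions chosen for illustration.

```python
from bisect import bisect_left


def ale_1d(predict, rows, j, n_bins=4):
    """First-order ALE of feature j, evaluated at quantile-based bin edges."""
    values = sorted(r[j] for r in rows)
    n = len(values)
    # Bin edges at empirical quantiles; duplicate edges are merged.
    edges = sorted({values[(k * (n - 1)) // n_bins] for k in range(n_bins + 1)})
    if len(edges) < 2:
        raise ValueError("feature is constant; ALE is undefined")
    n_intervals = len(edges) - 1
    sums = [0.0] * n_intervals
    counts = [0] * n_intervals
    for r in rows:
        # Interval k covers (edges[k], edges[k + 1]]; the minimum joins interval 0.
        k = max(bisect_left(edges, r[j]) - 1, 0)
        low = r[:j] + [edges[k]] + r[j + 1:]
        high = r[:j] + [edges[k + 1]] + r[j + 1:]
        # Local effect: prediction change across the sample's own interval.
        sums[k] += predict(high) - predict(low)
        counts[k] += 1
    # Accumulate the mean local effects from the lowest edge upward.
    accumulated = [0.0]
    for s, c in zip(sums, counts):
        accumulated.append(accumulated[-1] + (s / c if c else 0.0))
    # Center so that the count-weighted average effect is zero.
    offset = sum(
        0.5 * (accumulated[k] + accumulated[k + 1]) * counts[k]
        for k in range(n_intervals)
    ) / len(rows)
    return edges, [a - offset for a in accumulated]


def toy_age_model(row):
    """Hypothetical regressor: uses features 0 and 1, ignores feature 2."""
    return 30.0 + 100.0 * row[0] - 50.0 * row[1]


# Hypothetical relative abundances of three taxa in five samples.
rows = [
    [0.10, 0.30, 0.05],
    [0.20, 0.10, 0.40],
    [0.35, 0.25, 0.15],
    [0.05, 0.40, 0.30],
    [0.45, 0.05, 0.20],
]

for j in range(3):
    edges, ale = ale_1d(toy_age_model, rows, j)
    print(f"feature {j}: edges={edges} ale={[round(a, 2) for a in ale]}")
```

For this linear toy model the ALE curve of feature 0 rises with a slope of 100 years per unit of abundance, the curve of feature 1 falls, and the curve of the unused feature 2 is flat at zero. The sign of a curve gives the direction of a feature's influence on predicted age, which PFI alone does not report.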

Fedor Galkin, Aleksandr Aliper, Evgeny Putin, Igor Kuznetsov, Vadim N. Gladyshev, Alex Zhavoronkov



