Hosein Fooladi BSc MSc
Research
Thesis title: "Development of machine learning models for domain generalization in the chemical space"
Thesis outline: Most machine learning models have been developed under two assumptions. First, a large amount of training data (inhibitor and non-inhibitor ligands) is available for the protein target of interest. Second, at test time (mostly virtual screening), the chemical space is similar to that of the training ligands, i.e., training and test data come from the same distribution. Both assumptions can be violated in practice. First, for a target of interest (which may be a novel pharmaceutical target), the available data are often scarce (few or no known inhibitors), which makes applying classical machine learning and deep learning models far from straightforward. Second, the test chemical space (the chemical space for prediction) usually differs from the training one (in terms of molecular weight, scaffolds, number of rings, presence of macrocycles, etc.), which again makes the predictions of machine learning models unreliable: it is well known that such models often perform poorly when the test distribution differs from the training distribution.
My doctoral project aims to propose predictive models for molecular property and activity (ligand-target interaction) that generalize well to unseen domains. A domain can be a new target for which few or no ligands are known. A domain can also be a new region of chemical space (ligands with different scaffolds, molecular weights, etc.) for a well-known target. In other words, we want to achieve the maximum applicability domain, meaning reliable predictions across an expanded protein and small-molecule chemical space.
Supervisor: Johannes Kirchmair, Advisor: Thierry Langer