Education
October 2022 - KU Leuven
PhD in AI for Medical Imaging
During my PhD, I developed a next-generation phenotyping model for computer-aided diagnosis of genetic syndromes from 3D facial images, combining geometric deep learning and metric learning techniques to enable interpretable classification and analysis of genetic syndromes.
2018 - KU Leuven, Belgium
Master of Artificial Intelligence - Engineering and Computer Science Track
(GPA 75.96/100) The MAI programme at KU Leuven explores and builds on the fascinating challenge of developing digital systems that can autonomously process, reason, and solve complex problems. This programme provided me with an in-depth, fundamental knowledge of Artificial Intelligence and deep neural networks.
2017 - Sharif University of Technology, Iran
Bachelor of Computer Science
(GPA of last two years: 18.06/20) The Computer Science Bachelor programme is part of the Department of Mathematical Sciences at Sharif University of Technology, so its courses provided a strong theoretical background as well as intensive programming experience.
Selected Publications & Research Activity
2025
Paper Published: A 3D clinical face phenotype space of genetic syndromes using a triplet-based singular geometric autoencoder
IEEE ACCESS
Clinical diagnosis of syndromes benefits strongly from objective facial phenotyping. This study introduces a novel approach to enhance clinical diagnosis through the development and exploration of a low-dimensional metric space referred to as the clinical face phenotypic space (CFPS). As a facial matching tool for clinical genetics, such a CFPS can enhance clinical diagnosis: it helps to interpret the facial dysmorphism of a subject by placing it within the space of known dysmorphisms. In this paper, a triplet loss-based autoencoder built with geometric deep learning (GDL) is trained using multi-task learning, which combines supervised and unsupervised learning approaches. Experiments are designed to illustrate the following properties of CFPSs that can aid clinicians in narrowing down their search space: a CFPS can 1) classify syndromes accurately, 2) generalize to novel syndromes, and 3) preserve the relatedness of genetic diseases, meaning that clusters of phenotypically similar disorders reflect functional relationships between genes. The proposed model consists of three main components: an encoder based on GDL optimizing distances between groups of individuals in the CFPS, a decoder enhancing classification by reconstructing faces, and a singular value decomposition layer maintaining orthogonality and optimal variance distribution across dimensions. This allows for the selection of an optimal number of CFPS dimensions as well as improving the classification capacity of the CFPS, which outperforms the linear metric learning baseline in both syndrome classification and generalization to novel syndromes. We further demonstrated the usefulness of each component of the proposed framework, highlighting their individual impact. From a clinical perspective, the unique combination of these properties in a single CFPS results in a powerful tool that can be incorporated into current clinical practices to assess facial dysmorphism.
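The triplet-loss objective at the core of this line of work can be sketched as follows. This is a generic NumPy illustration, not the paper's implementation; the margin value and the squared-Euclidean distance are assumptions for the example.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on embedding vectors.

    Pulls the anchor towards the positive (same syndrome class) and
    pushes it away from the negative (different class) until the two
    squared distances differ by at least `margin`.
    """
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to negative
    return max(d_pos - d_neg + margin, 0.0)

# Toy embeddings: the positive lies much closer to the anchor than the
# negative, so the margin is already satisfied and the loss is zero.
a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
n = np.array([1.0, 1.0])
loss = triplet_loss(a, p, n)  # 0.0
```

Minimising this quantity over many sampled triplets is what shapes the metric space so that distances between faces become clinically meaningful.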
2024
Paper Published: Co-occurrence graph-enhanced hierarchical prediction of ICD codes
IEEE ICASSP
Recent healthcare applications of natural language processing involve multi-label classification of health records using the International Classification of Diseases (ICD). While prior research highlights intricate text models and explores external knowledge such as the hierarchical ICD ontology, fewer studies integrate code relationships from whole datasets to enhance ICD coding accuracy. This study presents a modular approach that sequentially combines graph-based integration of ICD code co-occurrence with a hard-coded hierarchy-enriched text representation drawn from the ICD ontology. Findings reveal that: 1) the combined model yields significant performance gains, beyond the significant gain from each enhancement module in isolation; 2) the graph-based module's efficacy is more pronounced when applied to features enhanced using the hierarchical ICD ontology; and 3) hierarchy depth affects performance, with enrichment at the deepest level performing best.
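The co-occurrence statistic such a graph-based module builds on can be sketched minimally as below; the dense-matrix representation and integer-index labelling are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def cooccurrence_matrix(records, num_codes):
    """Count how often each pair of ICD codes is assigned to the same record.

    `records` is a list of sets of integer code indices; the result is a
    symmetric matrix whose off-diagonal entries are pairwise co-occurrence
    counts and whose diagonal holds per-code document frequencies.
    """
    M = np.zeros((num_codes, num_codes), dtype=int)
    for codes in records:
        for i in codes:
            for j in codes:
                M[i, j] += 1
    return M

# Three toy discharge summaries labelled with code indices 0..3.
records = [{0, 1}, {0, 1, 2}, {2, 3}]
M = cooccurrence_matrix(records, 4)
# M[0, 1] == 2: codes 0 and 1 are assigned together in two records.
```

A graph neural network (or any adjacency-based smoothing) can then treat a normalised version of this matrix as the edge weights between code nodes.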
2023
Paper Published: A review of deep learning methods for automated clinical coding
IEEE ICCAE
Clinical coding is an administrative function in hospitals that involves the transformation of clinical notes into structured codes that can be analyzed statistically. Coded data has a number of benefits, including speeding up the administration processes of insurance companies and hospitals, improving global data sharing, and facilitating statistical analysis and forecasting. Current healthcare practice involves an individual clinical coder interpreting information about an aspect of patient care and assigning standardised codes. Thus, the current manual coding process is labor-intensive, time-consuming, and error-prone. Computer-assisted coding has emerged in the healthcare industry in recent years. AI-based systems, together with expert-led services, can help reduce labor costs, facilitate the administration process, and provide more informed and efficient healthcare. Consequently, researchers are increasingly interested in the use of deep neural networks to automate the process of clinical coding. The objective of this brief literature review is to summarize and describe the characteristics of the International Classification of Diseases (ICD), list the commonly used ICD-coded datasets, discuss the state-of-the-art deep learning models for ICD coding and the effect of injecting the ICD ontology into these models, and present the interpretability mechanisms that have been developed and implemented for clinical coding.
2022
Paper Published: Multi-Scale Part-Based Syndrome Classification of 3D Facial Images
IEEE ACCESS
ABSTRACT: Identification and delineation of craniofacial characteristics support the clinical and molecular diagnosis of genetic syndromes. Deep learning (DL) frameworks for syndrome identification from 2D facial images are trained on large clinical datasets using standard convolutional neural networks for classification. In contrast, despite the increased availability of 3D scanners in clinical setups, similar frameworks remain absent for 3D facial photographs. The main challenges involve working with smaller datasets and the need for DL operations applicable to 3D geometric data. Therefore, to date, most 3D methods refrain from working across multiple syndromic groups and/or are solely based on traditional machine learning. The first contribution of this work is the use of geometric deep learning with spiral convolutions in a triplet-loss architecture. This geometric encoding (GE) learns a lower-dimensional metric space from 3D facial data that is used as input to linear discriminant analysis (LDA) performing multiclass classification. Benchmarking is done against principal component analysis (PCA), a common technique in 3D facial shape analysis, and against related work based on 65 distinct 3D facial landmarks as input to LDA. The second contribution of this work is a part-based approach to 3D facial shape analysis and multi-class syndrome classification, which is applied to both GE and PCA. Based on 1,786 3D facial photographs of controls and individuals from 13 different syndrome classes, a five-fold cross-validation was used to investigate both contributions. Results indicate that GE performs better than PCA as input to LDA, especially for more compact (lower-dimensional) spaces. In addition, a part-based approach increases performance significantly for both GE and PCA, with a more significant improvement for the latter; that is, this contribution enhances the power of the dataset. Finally, and interestingly, ablation studies within the part-based approach show that the upper lip is the most distinguishing facial segment for classifying genetic syndromes in our dataset, which follows clinical expectation. This work stimulates an enhanced use of advanced part-based geometric deep learning methods for 3D facial imaging in clinical genetics.
2021
Paper Published: Matching 3D Facial Shape to Demographic Properties by Geometric Metric Learning: A Part-Based Approach
IEEE Transactions on Biometrics, Behavior, and Identity Science
ABSTRACT: Face recognition is a widely accepted biometric identifier, as the face contains a lot of information about the identity of a person. The goal of this study is to match the 3D face of an individual to a set of demographic properties (sex, age, BMI, and genomic background) that are extracted from unidentified genetic material. We introduce a triplet-loss metric learner that compresses facial shape into a lower-dimensional embedding while preserving information about the property of interest. The metric learner is trained for multiple facial segments to allow a global-to-local part-based analysis of the face. To learn directly from 3D mesh data, spiral convolutions are used along with a novel mesh-sampling scheme, which retains uniformly sampled points at different resolutions. The capacity of the model for establishing identity from facial shape against a list of probe demographics is evaluated by enrolling the embeddings for all properties into a support vector machine classifier or regressor and then combining them using a naive Bayes score fuser. Results obtained by a 10-fold cross-validation for biometric verification and identification show that part-based learning significantly improves the system's performance, whether encoding with our geometric metric learner or with principal component analysis.
2019
Poster Presentation: Patterns of 3D Facial Sexual Dimorphism: A Triplet Loss Approach
Presented at ICVSS 2019, Sicily, Italy
ABSTRACT: Recent advances in deep learning and computer vision have made a significant impact on a variety of domains. However, there is a fundamental and often overlooked shortcoming of these techniques when applied to biological and medical image data. Hence, it is of interest to examine the biological plausibility and interpretability of the knowledge learned by deep networks. Within this study, we aim to use 3D images to investigate the existence of multiple patterns of sexual dimorphism in human faces. The outcomes of this study can potentially provide a powerful alternative to current linear biological shape analysis, which yields only a single metric for sexual dimorphism. We first train an autoencoder for 3D facial images with a 32-dimensional latent space. Once obtained, the trained model is fine-tuned for sex classification with the deep metric learning function called triplet loss (TL). The advantages of this combination are: (1) triplet loss is trained with triplets of faces, so the number of training samples can practically increase to n³; (2) due to the metric measures included in the TL function, the latent space reflects the similarity within sex groups, hence distances are meaningful measures for classification purposes; (3) the potentially multiple patterns of sexual dimorphism learned by the network can be visualized with the help of the decoder, which improves the interpretability of the learned model.
2018 - 2019
Master Thesis Mentorship: Deep learning of latent manifolds applied to 3D facial images
Bram Vandendriessche, Michaël Vanderstuyft; Master of Computer Sciences
ABSTRACT: A human face is the reflection of numerous factors that affect its appearance. Genetic as well as environmental factors play a key role in what it looks like. An example is the similarity between identical twins, who share the same DNA but might be influenced by different environmental factors: BMI, smoking, an accident, and so on. The effect of DNA on human appearance is not fully understood, complicating the use of this information. Knowledge about the relations between the facial structure and the factors that have an impact on it could form the basis of a procedure to retrieve a facial image from a DNA sample. This would be an invaluable addition to the toolbox of forensic experts, facilitating faster identification of victims and suspects. One of the steps towards this goal is to obtain a good representation of faces. This work therefore investigates the characteristics of a model obtained by training a disentangled variational autoencoder on 3D facial images. The focus lies on examining the properties of the latent space and the latent variables. The model succeeds in capturing the facial variations using 63 variables. We study its generalisation capabilities, as well as the relation between its latent space and the data space. The presented model generalises well over a wide range of faces, allowing random sampling from the latent space to generate synthetic 3D faces. It does not, however, always generate anatomically correct faces, owing to the limited number of variables as well as the correlation between latent variables.
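The disentangled variational autoencoder examined in this thesis adds a KL-divergence penalty to the reconstruction loss. Below is a minimal NumPy sketch of that closed-form term for a diagonal Gaussian posterior; the β-weighting shown is one common way to encourage disentanglement and is an assumption here, not the thesis's exact objective.

```python
import numpy as np

def kl_divergence(mu, log_var):
    """Closed-form KL divergence between N(mu, exp(log_var)) and N(0, I),
    summed over the latent dimensions (63 in the thesis)."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def beta_vae_loss(reconstruction_error, mu, log_var, beta=4.0):
    """beta > 1 weights the KL term more heavily, which is one standard
    recipe for pushing the latent variables towards disentanglement."""
    return reconstruction_error + beta * kl_divergence(mu, log_var)

# A latent code whose posterior matches the prior exactly incurs zero
# KL penalty, so the loss reduces to the reconstruction error alone.
mu = np.zeros(63)
log_var = np.zeros(63)
assert kl_divergence(mu, log_var) == 0.0
```

Sampling random codes from the N(0, I) prior and decoding them is exactly what makes the "generate synthetic 3D faces" experiment above possible.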
2018 - 2019
Master Thesis Mentorship: Predicting DNA deduced traits from 3D faces: a triplet loss approach
Bram De Cooman, Master of Mathematical Engineering
ABSTRACT: Recognizing people's faces is a natural and relatively easy task for humans. Creating an automatic system that is able to extract meaningful features from a face and use these to predict an individual's sex, age, BMI, or ancestry is much harder, though. In this thesis we try to solve this problem using convolutional neural networks. In recent years, these kinds of networks have achieved state-of-the-art results in a broad range of applications. To train the convolutional models with a limited amount of data, we employ the triplet loss function. This loss function creates an embedding space in which distances correspond to dissimilarities of the target labels. This strategy has been used successfully before in many deep metric learning applications. It is, however, limited to the prediction of discrete classes. Hence, we generalized this classical triplet loss function to allow for continuous target traits, such as age and BMI. Together with this generalization, we also introduced two new triplet mining strategies to obtain fast and robust convergence of the trained models, depending on the specific task. Once we obtain the highly structured embedding spaces as an output of the trained networks, the actual prediction of the trait values can be done using classical estimators, such as an SVM or KNN predictor. Evaluating the best obtained models on an independent test set shows that we can predict sex with an accuracy of 97.9%, age with an MAE of 3.39 years, BMI with an MAE of 2.82 kg/m², and 6 SUGIBS genomic background components with a positive R². Finally, we combined the best models to create powerful biometric systems in both the verification and identification setups. Evaluating their performance shows that our approach outperformed some previous attempts within our research group.
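The generalisation of the triplet loss to continuous targets described above can be sketched as follows. Scaling the required margin by the difference in label distances is one plausible formulation, assumed here for illustration rather than taken from the thesis itself.

```python
import numpy as np

def continuous_triplet_loss(anchor, positive, negative,
                            y_a, y_p, y_n, scale=1.0):
    """Triplet loss for continuous traits such as age or BMI.

    Instead of fixed discrete classes, the required separation in the
    embedding space grows with how much more the negative's label
    differs from the anchor's label than the positive's label does.
    """
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    # Label-dependent margin: larger label gaps demand larger distances.
    margin = scale * (abs(y_a - y_n) - abs(y_a - y_p))
    return max(d_pos - d_neg + margin, 0.0)

# Ages: the negative (age 60) already sits much farther from the anchor
# (age 30) in embedding space than the positive (age 32), so loss is 0.
a, p, n = np.array([0.0]), np.array([0.5]), np.array([3.0])
loss = continuous_triplet_loss(a, p, n, y_a=30, y_p=32, y_n=60, scale=0.1)
```

With this formulation, "positive" and "negative" are just the triplet members whose labels are respectively nearer to and farther from the anchor's, which recovers the classical discrete triplet loss when labels are class indices.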
2018 - 2019
Master Thesis Mentorship: Geometric Deep Learning for Phenotypic Trait Recognition
Balder Croquet, Master of Artificial Intelligence
ABSTRACT: We research the use of recent geometric deep learning techniques in the context of phenotypic trait recognition based on the morphology of the face. More specifically, the techniques are used to develop an end-to-end learning system capable of learning these traits directly from the intrinsic structure, where the data can be represented as a graph or a manifold mesh. This approach is compared to the previous state of the art, where convolutional neural networks are used on engineered feature maps. We found that so-called geometric deep learning techniques often match or even outperform conventional convolutional neural network approaches on the task of phenotypic trait recognition, in particular the spline convolution operator combined with a voxel or Graclus pooling operator. We propose a new technique capable of generating a saliency map for the best-performing geometric deep learning model. The goal is to visualise the decision-making process by highlighting the facial features that were important in determining the phenotypic traits. The new technique is a generalisation of a probabilistic sampling technique on the Euclidean domain, formulated for both the graph and manifold mesh representations. Visual inspection of the results for sex prediction showed the area underneath the nostrils, the chin, the forehead, and the cheekbones to be important facial features, affirming the nose and chin as clear indicators known to domain experts.
2018 - 2019
Master Thesis Mentorship: A Comparison of Deep Metric Loss Layers for Facial Expression Recognition
Michiel Vanschoonbeek, Master of Artificial Intelligence
ABSTRACT: In this thesis, we compare different loss layers for facial expression recognition (FER), the task of classifying facial images into universal expression classes. Because of high inter-class and low intra-class similarity, it is a complicated problem. Traditional methods based on handcrafted feature extractors have been applied successfully in lab-controlled environments; however, they perform poorly on images taken in real-world conditions, so their applicability is limited. The advent of deep learning offered new possibilities for the development of FER systems. Convolutional neural networks have proven to be very useful for general computer vision tasks. Recently, deep metric loss layers have been proposed to improve performance. In deep metric learning, a similarity function is learned over the input images in a high-dimensional space, often called an embedding space. The loss functions penalize embedding vectors from the same expression class that lie far away from each other and vectors from different classes that lie close to each other; ideally, they form clusters after convergence. We implement and evaluate different deep metric loss layers and compare them on overall classification accuracy and individual expression class performance. Each model is trained on the same dataset and network architecture. Because images are classified based on the spatial arrangement of the vectors in the output space, the structure of the embedding space is determinative for classification performance. For each loss function, we show a two-dimensional visualization of the latent space with t-SNE, focusing on the differences with other loss functions. Afterwards, the advantages and disadvantages of each approach are compared. Every reviewed metric loss layer was able to achieve better performance than softmax loss, the traditional loss function for classification tasks. There is a significant improvement in accuracy for easily confused classes, like "Fear", "Anger", and "Sadness", for which softmax loss performs poorly. Exponential triplet loss shows the best overall performance with 70% accuracy, an increase of almost 6%, thereby exceeding human accuracy.
2018 - 2019
Master Thesis Mentorship: Internal Representation Analysis of 3D Faces Using Variational Autoencoders
Ruben De Clercq, Master of Mathematical Engineering
ABSTRACT: Internal representations are a form of knowledge representation. Such a representation could, for example, be a semantic description of the concepts defining a dataset. Traditionally, the internal representation of a dataset is found by principal component analysis. Recent work in machine learning has provided new means to find internal representations. In this work we look at the autoencoder and some of its variants. These neural networks map the data to a latent space in a nonlinear fashion and can be used to generate synthetic samples. Various networks are implemented to reconstruct 3D facial data stored in 2D UV maps and to generate new samples. These networks are trained on a dataset of 4,909 UV-mapped facial scans and labels, by penalising the loss between the reconstruction and the network input. This loss consists of the average Euclidean distance and a measure based on the mean curvature of the surface. A comparison of the experiments shows that it is possible to reconstruct faces with high accuracy from a latent space of dimension 70 and to generate new plausible faces using random sampling. It is also possible to generate accurate faces given the labels age, BMI, and sex.
