Unraveling Complexities in Material Science: A New Approach to Clustering and Visualization
Qatar Environment & Energy Research Institute - QEERI
In the realm of materials science, the burgeoning field of data-to-knowledge has unveiled promising avenues for understanding and leveraging various material properties. However, certain classes of materials, such as Metal–Organic Frameworks (MOFs), present a challenge due to their multi-dimensional and interrelated physicochemical properties.
Dr. Fadwa El Mellouhi, Senior Scientist, and Dr. Satyanarayana Bonakala, Postdoctoral Researcher, from Qatar Environment & Energy Research Institute - QEERI, along with Dr. Michaël Aupetit, Senior Scientist, and Dr. Halima Bensmail, Principal Scientist, from Qatar Computing Research Institute (QCRI), are actively tackling this challenge by studying the intricacies of MOFs using advanced clustering and visualization techniques. Leveraging an in-house database encompassing geometrical, chemical, and adsorption properties of MOFs, the team embarked on a journey to decipher the underlying patterns and relationships.
Traditional methods like principal component analysis (PCA) fell short of visually uncovering distinct clusters within the dataset. The team then explored combinations of data projection and clustering techniques, including t-distributed stochastic neighbor embedding (t-SNE), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). While these approaches highlighted the overlapping nature of the data, their results were hard to reproduce and sensitive to parameter variations.
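To make the parameter sensitivity concrete, here is a minimal sketch, not the team's pipeline: a small hand-written DBSCAN run on eleven made-up 2D points (standing in for a projected dataset) at three neighborhood radii. The coordinates, the `eps` values, `min_pts=3`, and the helper names `region_query`, `dbscan`, and `summarize` are all illustrative assumptions; a real analysis would typically rely on an established library implementation.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one integer label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels


def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    return len(set(labels) - {NOISE}), labels.count(NOISE)


# Made-up toy data: two dense groups with a single point between them.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.9, 1.4, 1.5, 1.6, 1.7, 1.8]
points = [(x, 0.0) for x in xs]

for eps in (0.05, 0.15, 0.55):
    n_clusters, n_noise = summarize(dbscan(points, eps, min_pts=3))
    print(f"eps={eps}: {n_clusters} cluster(s), {n_noise} noise point(s)")
```

On this toy data the smallest radius labels every point as noise, the middle one finds two clusters with one noise point, and the largest merges everything into a single cluster, which is the kind of parameter dependence described above.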
In light of these challenges, the team pioneered an approach centered around Gaussian mixture models (GMM) with eigenvalue decomposition discriminant analysis (EDDA). This method provided stability and robustness, offering a more reliable means of clustering MOF data.
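As a rough illustration of model-based clustering with a constrained covariance, the sketch below fits a two-component Gaussian mixture to made-up one-dimensional data with expectation-maximization, forcing both components to share one variance. This is only the simplest analogue of the constrained covariance structures the eigenvalue-decomposition family allows; it is not the team's implementation, and the data, the function names `normal_pdf` and `fit_gmm_1d`, the deterministic starting values, and the fixed iteration count are all assumptions made for the example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, n_iter=50):
    """EM for a two-component 1D Gaussian mixture with one shared variance."""
    n = len(data)
    # Deterministic start: no random initialization, so reruns give the same fit.
    means = [min(data), max(data)]
    grand_mean = sum(data) / n
    var = sum((x - grand_mean) ** 2 for x in data) / n
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, var) for w, m in zip(weights, means)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and the single shared variance.
        sizes = [sum(r[k] for r in resp) for k in range(2)]
        weights = [s / n for s in sizes]
        means = [
            sum(r[k] * x for r, x in zip(resp, data)) / sizes[k] for k in range(2)
        ]
        var = sum(
            r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data) for k in range(2)
        ) / n
    # Hard assignment: the component with the larger responsibility.
    labels = [max(range(2), key=lambda k: r[k]) for r in resp]
    return weights, means, var, labels


# Made-up toy data: two well-separated groups.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, var, labels = fit_gmm_1d(data)
print("weights:", [round(w, 3) for w in weights])
print("means:", [round(m, 3) for m in means])
print("shared variance:", round(var, 3))
print("labels:", labels)
```

Because the starting values are fixed and the model has few free parameters, repeated runs return the same partition, which loosely mirrors the stability argument made for the mixture-model approach.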
Furthermore, the team introduced an interactive divide-and-conquer methodology, empowering analysts to make informed decisions regarding cluster formation. This human-in-the-loop approach, supported by visualizations like LogDA plots, fostered collaborative exploration and validation of clustering results.
The team's efforts culminated in discovering three distinct clusters within the complex MOF dataset, characterized by their correlation and distribution across various features. This breakthrough underscores the nuanced nature of MOF materials and highlights the inadequacy of conventional clustering pipelines in capturing their intricacies.
The team is now focused on refining the methodology and expanding its applicability to diverse datasets beyond materials science. Incorporating indicators and developing interactive tools are among their priorities to enhance the analytical process and facilitate deeper insights into complex datasets.
In conclusion, the team's journey exemplifies the power of innovative methodologies in unraveling the complexities of material science, paving the way for discoveries and advancements in the field.