Unraveling Complexities in Material Science: A New Approach to Clustering and Visualization


In the realm of materials science, the burgeoning field of data-to-knowledge has unveiled promising avenues for understanding and leveraging various material properties. However, certain classes of materials, such as Metal–Organic Frameworks (MOFs), present a challenge due to their multi-dimensional and interrelated physicochemical properties.

Dr. Fadwa El Mellouhi, Senior Scientist, and Dr. Satyanarayana Bonakala, Postdoctoral Researcher, both from the Qatar Environment & Energy Research Institute (QEERI), along with Dr. Michaël Aupetit, Senior Scientist, and Dr. Halima Bensmail, Principal Scientist, both from the Qatar Computing Research Institute (QCRI), are tackling this challenge by studying MOFs with advanced clustering and visualization techniques. Using an in-house database of geometrical, chemical, and adsorption properties of MOFs, the team set out to uncover the underlying patterns and relationships.

Traditional methods such as principal component analysis (PCA) fell short of visually revealing distinct clusters within the dataset. The team then explored combinations of data projection and clustering techniques, including t-distributed stochastic neighbor embedding (t-SNE), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). While these approaches highlighted the overlapping nature of the data, they lacked reproducibility and were sensitive to parameter variations.
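The parameter sensitivity mentioned above can be illustrated with a minimal sketch. It assumes scikit-learn is available and uses synthetic, overlapping 2D data as a stand-in for the MOF database, which is not public in this article. It applies DBSCAN directly, without a t-SNE projection step, and the `count_clusters` helper and the chosen `eps` values are hypothetical, picked only for illustration.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Synthetic stand-in for an overlapping dataset (not the MOF database).
X, _ = make_blobs(n_samples=300, centers=[[0, 0], [2, 0], [1, 2]],
                  cluster_std=1.0, random_state=0)

def count_clusters(data, eps, min_samples=5):
    """Return the number of DBSCAN clusters, ignoring noise (label -1)."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(data)
    return len(set(labels.tolist()) - {-1})

# Sweep the neighborhood radius and record how many clusters appear.
cluster_counts = {eps: count_clusters(X, eps)
                  for eps in (0.01, 0.2, 0.4, 0.8, 100.0)}
for eps, n in cluster_counts.items():
    print(f"eps={eps}: {n} cluster(s)")
```

At the extremes the outcome is predictable: a very small radius labels every point as noise, and a very large one merges everything into a single cluster. The intermediate values are where the cluster count can shift with small changes in `eps`, which is the kind of sensitivity the team reported.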

In light of these challenges, the team pioneered an approach centered around Gaussian mixture models (GMM) with eigenvalue decomposition discriminant analysis (EDDA). This method provided stability and robustness, offering a more reliable means of clustering MOF data.
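As a rough illustration of model-based clustering with constrained covariance structures, the sketch below fits Gaussian mixture models under several covariance constraints and keeps the one with the lowest Bayesian information criterion (BIC). This is not the team's GMM-EDDA implementation, which the article does not show. It assumes scikit-learn, whose four `covariance_type` options are only a loose analogue of the covariance parameterizations used in EDDA, and it runs on synthetic data. The `select_gmm` helper is hypothetical.

```python
from itertools import product

from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic data with three well-separated groups (not the MOF database).
X, _ = make_blobs(n_samples=600, centers=[[0, 0], [8, 0], [0, 8]],
                  cluster_std=1.0, random_state=0)

def select_gmm(data, max_components=5, seed=0):
    """Fit GMMs over several covariance constraints; keep the lowest-BIC model."""
    best_model, best_bic = None, float("inf")
    for k, cov in product(range(1, max_components + 1),
                          ("spherical", "diag", "tied", "full")):
        model = GaussianMixture(n_components=k, covariance_type=cov,
                                n_init=3, random_state=seed).fit(data)
        bic = model.bic(data)
        if bic < best_bic:
            best_model, best_bic = model, bic
    return best_model

best = select_gmm(X)
labels = best.predict(X)  # hard cluster assignment for each sample
print(best.n_components, best.covariance_type)
```

Fixing `random_state` makes the run repeatable, and selecting the model by BIC replaces hand-tuned density parameters with a single explicit criterion. Real data with overlapping groups would likely need the analyst review described in the article rather than an automatic choice.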


"Our methodology follows five steps: (A) clustering of the data with GMM-EDDA in..."

Furthermore, the team introduced an interactive divide-and-conquer methodology, empowering analysts to make informed decisions regarding cluster formation. This human-in-the-loop approach, supported by visualizations like LogDA plots, fostered collaborative exploration and validation of clustering results.

The team's efforts culminated in discovering three distinct clusters within the complex MOF dataset, characterized by their correlation and distribution across various features. This breakthrough underscores the nuanced nature of MOF materials and highlights the inadequacy of conventional clustering pipelines in capturing their intricacies.

Going forward, the team plans to refine the methodology and extend it to datasets beyond materials science. Incorporating indicators and developing interactive tools are among their priorities for improving the analytical process and gaining deeper insight into complex datasets.

In conclusion, the team's journey exemplifies the power of innovative methodologies in unraveling the complexities of material science, paving the way for discoveries and advancements in the field.
