Knowledge Integration in Autonomous (Perovskite) Experimentation
Autonomous tools can make and characterize new materials wicked fast (thousands of samples per minute). The machine-learning (ML) decision-maker must be equally fast — and it must interface with (and gain the trust of) humans, which requires distilling high-dimensional information.
Some ML algorithms can input & output knowledge in a human-interpretable form — e.g., mathematical equations or language. This LinkedIn post reviews some of these attempts in our lab, and briefly highlights emerging areas.
Case 1: Underlying Physics is Well Known
Bayesian inference can extract “hidden variables” from observable experimental data. For instance, bulk, interface, and defect properties can be inferred from device-level measurements of completed solar-cell devices. This requires a forward model (at minimum heuristics, ideally a full numerical device model), which is reconciled probabilistically with the experimental data. Ongoing work focuses on combining multiple experimental information streams to increase inference precision and accuracy.
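As a toy illustration of this workflow — the forward model, parameter names, and numbers below are my own illustrative assumptions, not the actual device simulator — a simple grid-based Bayesian update can recover a hidden parameter from noisy device-level measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_model(defect_density, bias):
    # Toy stand-in for a numerical device model: higher defect density
    # suppresses the measured photocurrent at a given bias.
    return 20.0 / (1.0 + defect_density) - 0.5 * bias

# Synthetic "experimental" measurements at several bias points
true_defect = 1.5
bias = np.linspace(0.0, 1.0, 8)
noise_sigma = 0.3
observed = forward_model(true_defect, bias) + rng.normal(0, noise_sigma, bias.size)

# Grid-based Bayesian inference: prior x likelihood, normalized
grid = np.linspace(0.01, 5.0, 500)       # candidate hidden-variable values
prior = np.ones_like(grid) / grid.size   # flat prior
log_like = np.array([
    -0.5 * np.sum((observed - forward_model(d, bias))**2) / noise_sigma**2
    for d in grid
])
post = prior * np.exp(log_like - log_like.max())
post /= post.sum()

estimate = grid[np.argmax(post)]  # MAP estimate of the hidden variable
print(f"MAP defect density: {estimate:.2f} (true: {true_defect})")
```

The posterior distribution (not just the point estimate) is what makes this useful in practice: its width tells you how well the measurement constrains the hidden variable, and combining multiple measurement streams narrows it.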
Examples:
Case 2: Some Underlying Physics is Known
Bayesian optimization is a powerful tool, but it is only as smart as the data. We use probabilistic constraints to “blend” secondary data sources into Bayesian optimization acquisition functions. Examples of secondary data sources, and the successes that have resulted, include:
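The blending mechanism itself can be sketched in a few lines — here as a toy expected-improvement acquisition weighted by a feasibility probability from a hypothetical secondary source (e.g., a stability model). All model forms and numbers below are illustrative assumptions, not the group's actual pipeline:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    # EI for maximization, given a surrogate posterior mean/std per candidate
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

def feasibility_probability(x):
    # Toy secondary source: probability a composition x is "stable";
    # this penalizes candidates above x ~ 0.6.
    return norm.cdf((0.6 - x) / 0.1)

x = np.linspace(0, 1, 101)
mu = np.sin(3 * x)             # surrogate posterior mean (toy)
sigma = 0.2 * np.ones_like(x)  # surrogate posterior std (toy)
best = mu.max() - 0.1          # best observation so far (toy)

acq_plain = expected_improvement(mu, sigma, best)
acq_blend = acq_plain * feasibility_probability(x)  # probabilistic constraint

print("plain argmax:  ", x[np.argmax(acq_plain)])
print("blended argmax:", x[np.argmax(acq_blend)])
```

The blended acquisition shifts the next suggested experiment away from the region the secondary source flags as likely infeasible, without hard-excluding it — the constraint is probabilistic, so a sufficiently promising candidate can still win.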
Case 3: Little is Known about Underlying Physics
We wanted to know what equations describe perovskite film degradation. We applied sparse regression to extract underlying differential equations governing perovskite (MAPI) film degradation during environmental testing. We found that MAPI films degrade following the Verhulst logistic function — descriptive of auto-catalytic reactions. This suggests that to improve MAPI stability, we should improve film purity (e.g., remove nucleation sites for degradation). (open-access paper)
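A minimal sketch of the sparse-regression idea (SINDy-style sequentially thresholded least squares) on synthetic data — the degradation variable and coefficients below are made up for illustration, not the paper's measured values:

```python
import numpy as np

# Synthetic "degraded fraction" x(t) obeying the logistic law
# dx/dt = r*x*(1 - x/K), generated from its closed-form solution.
r, K = 0.8, 1.0
t = np.linspace(0, 15, 400)
x = K / (1 + (K / 0.01 - 1) * np.exp(-r * t))

dxdt = np.gradient(x, t)  # numerical time derivative

# Candidate library of terms the equation might contain: [1, x, x^2, x^3]
library = np.column_stack([np.ones_like(x), x, x**2, x**3])
names = ["1", "x", "x^2", "x^3"]

# Sequentially thresholded least squares: fit, zero small coefficients, refit
coef, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
for _ in range(5):
    small = np.abs(coef) < 0.1
    coef[small] = 0.0
    active = ~small
    coef[active], *_ = np.linalg.lstsq(library[:, active], dxdt, rcond=None)

print({n: round(c, 3) for n, c in zip(names, coef)})
```

On this clean synthetic data the procedure keeps only the x and x² terms, recovering the logistic form dx/dt = r·x − (r/K)·x² with coefficients near 0.8 and −0.8. Real experimental data is far noisier — hence the year of careful data collection mentioned below.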
In fairness, sparse regression is not yet "on-demand" — it took us over a year to collect high-quality data. Other approaches embed physics directly into the neural-network architecture. One example is Sili Deng's team embedding the law of mass action and the Arrhenius equation into the network architecture to model chemical reactions — but this requires assumptions about the underlying physics (open-access paper).
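The spirit of the physics-embedding approach can be sketched simply: rather than a black-box fit, hard-code the Arrhenius form k(T) = A·exp(−Ea/(R·T)) into the model and learn only its physical parameters. The sketch below uses a linearized least-squares fit instead of backpropagation, and all rate data and "true" values are synthetic assumptions for illustration:

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)
rng = np.random.default_rng(2)

# Synthetic rate-constant measurements over a temperature sweep
T = np.linspace(300, 500, 40)     # temperatures (K)
true_A, true_Ea = 1e7, 5e4        # assumed ground truth (illustrative)
k_obs = true_A * np.exp(-true_Ea / (R * T)) * (1 + 0.01 * rng.normal(size=T.size))

# The Arrhenius form is fixed; only (A, Ea) are learned.
# Linearize: ln k = ln A - (Ea/R) * (1/T), then solve by least squares.
X = np.column_stack([np.ones_like(T), 1.0 / T])
y = np.log(k_obs)
(lnA_fit, slope), *_ = np.linalg.lstsq(X, y, rcond=None)
A_fit, Ea_fit = np.exp(lnA_fit), -slope * R

print(f"A ~ {A_fit:.3g}, Ea ~ {Ea_fit:.3g} J/mol")
```

Because the functional form is constrained to be physical, the learned parameters (a pre-exponential factor and an activation energy) are directly interpretable — the trade-off being that the assumed physics must actually hold.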
LOOKING AHEAD
ML is no longer purely data-driven. Knowledge can be embedded into ML through its representations, constraints, and network architecture & parameters. Knowledge can be extracted from ML in the form of fitted equation parameters, or the underlying equations themselves. Expect more and more knowledge integration with ML, especially as ML methods (e.g., large language models, transformers) continue to mature.