Knowledge Integration in Autonomous (Perovskite) Experimentation

Autonomous tools can make and characterize new materials wicked fast (thousands of samples per minute). The machine-learning (ML) decision-maker must also be fast — and it must interface with (gain the trust of) humans, which requires distilling high-dimensional information.

Some ML algorithms can input & output knowledge in a human-interpretable form, e.g., mathematical equations or language. This LinkedIn post reviews some of these attempts in our lab, and briefly highlights emerging areas.


Case 1: Underlying Physics is Well Known

Bayesian inference can extract “hidden variables” (bulk & interface properties) from observable experimental data. For instance, the bulk, interface, and defect properties can be inferred from device-level measurements of completed solar-cell devices. This requires a forward model (at minimum heuristics, ideally a full numerical device model), which is reconciled probabilistically with the experimental data. Ongoing work focuses on combining multiple experimental information streams to increase inference precision and accuracy.

Examples:

  • Infer underlying causes of low solar cell performance (open-access paper).
  • Infer bulk defect properties from device-level measurements (paper; open-access).
  • Infer ideal processing conditions (time-temp profile) from solar cell device measurements — with advantages over standard design-of-experiments. (open-access paper)
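
To make the Case 1 idea concrete, here is a minimal sketch of grid-based Bayesian inference. It is not the method from the papers above. The forward model is an assumed toy (the ideal-diode relation between open-circuit voltage and saturation current density J0), the "measurements" are synthetic, and every name and number is hypothetical. The hidden variable is log10(J0); the observable is Voc at several illumination levels.

```python
import math

KT_Q = 0.02585  # thermal voltage near 300 K, in volts

def forward_voc(log10_j0, jsc):
    """Toy forward model (assumption): ideal-diode open-circuit voltage in volts."""
    j0 = 10.0 ** log10_j0
    return KT_Q * math.log(jsc / j0 + 1.0)

def grid_posterior(jsc_values, voc_measured, grid, noise_sd=0.005):
    """Posterior over log10(J0) on a grid, with a flat prior and Gaussian noise."""
    log_post = []
    for theta in grid:
        log_like = 0.0
        for jsc, voc in zip(jsc_values, voc_measured):
            resid = voc - forward_voc(theta, jsc)
            log_like += -0.5 * (resid / noise_sd) ** 2
        log_post.append(log_like)
    peak = max(log_post)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Synthetic "experiment": a known hidden value plus fixed pseudo-noise (hypothetical).
true_log10_j0 = -12.0
jsc_values = [0.002, 0.005, 0.010, 0.020]   # A/cm^2, four illumination levels
offsets = [0.002, -0.003, 0.001, -0.001]    # volts
voc_measured = [forward_voc(true_log10_j0, j) + e
                for j, e in zip(jsc_values, offsets)]

grid = [-15.0 + 0.01 * i for i in range(601)]  # candidate log10(J0) values
posterior = grid_posterior(jsc_values, voc_measured, grid)

map_estimate = grid[posterior.index(max(posterior))]
post_mean = sum(t * p for t, p in zip(grid, posterior))
post_sd = math.sqrt(sum(p * (t - post_mean) ** 2 for t, p in zip(grid, posterior)))
print(f"MAP log10(J0) = {map_estimate:.2f}, posterior sd = {post_sd:.3f}")
```

Adding a second measurement type (another information stream) would amount to adding its log-likelihood term inside the loop, which narrows the posterior.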


Case 2: Some Underlying Physics is Known

Bayesian optimization is a powerful tool, but it is only as smart as the data. We use probabilistic constraints to “blend” secondary data sources into Bayesian optimization acquisition functions. Examples of secondary data sources, and the successes that have resulted, include:

  • DFT data of perovskite phase stability as a function of composition, which let us identify a 3x more stable perovskite composition while probing only a fraction of the available compositions, and scale up quickly from films to full devices. (open-access paper)
  • Human observations of film quality, and prior data on similar equipment, which resulted in faster process optimization for new perovskite manufacturing tools (and anecdotally, greater human trust in ML). (paper, open-access)
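
One common way to express this kind of blending, sketched here under assumptions rather than as the exact form used in the papers above, is to multiply a standard acquisition function (expected improvement) by a probability derived from the secondary data source. The logistic `stability_probability`, the candidate values, and all names below are hypothetical.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best):
    """Expected improvement for maximization, given surrogate mean and std."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)

def stability_probability(dft_energy, threshold=0.0, softness=0.02):
    """Assumed soft constraint: probability that a composition is phase stable,
    from a DFT-like energy in eV (lower means more stable)."""
    return 1.0 / (1.0 + math.exp((dft_energy - threshold) / softness))

def blended_acquisition(mu, sigma, best, dft_energy):
    """Expected improvement weighted by the secondary-source probability."""
    return expected_improvement(mu, sigma, best) * stability_probability(dft_energy)

# Hypothetical surrogate predictions and DFT energies for three compositions.
candidates = [
    {"name": "A", "mu": 0.80, "sigma": 0.10, "dft_energy": -0.05},
    {"name": "B", "mu": 0.95, "sigma": 0.15, "dft_energy": 0.10},
    {"name": "C", "mu": 0.70, "sigma": 0.05, "dft_energy": -0.10},
]
best_so_far = 0.75

plain = {c["name"]: expected_improvement(c["mu"], c["sigma"], best_so_far)
         for c in candidates}
blended = {c["name"]: blended_acquisition(c["mu"], c["sigma"], best_so_far,
                                          c["dft_energy"])
           for c in candidates}

plain_choice = max(plain, key=plain.get)
next_candidate = max(blended, key=blended.get)
print("Plain EI picks:", plain_choice, "| blended acquisition picks:", next_candidate)
```

In this toy example, plain expected improvement favors candidate B, while the blended score steers the next experiment to A because the secondary data flags B as likely unstable. The constraint is soft: B is down-weighted, not forbidden.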


Case 3: Little is Known about Underlying Physics

We wanted to know what equations describe perovskite film degradation. We applied sparse regression to extract underlying differential equations governing perovskite (MAPI) film degradation during environmental testing. We found that MAPI films degrade following the Verhulst logistic function, which is descriptive of auto-catalytic reactions. This suggests that to improve MAPI stability, we should improve film purity (e.g., remove nucleation sites for degradation). (open-access paper)
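
As a rough illustration of how sparse regression can recover a logistic rate law, the sketch below fits dx/dt to a small library of candidate terms [1, x, x^2, x^3] and iteratively prunes small coefficients (sequentially thresholded least squares). The data are synthetic, generated from dx/dt = k x (1 - x) with an assumed k, not taken from the paper; x stands for a hypothetical degraded fraction, and the threshold is arbitrary.

```python
import math

def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def least_squares(rows, target, active):
    """Least squares via normal equations, restricted to the active columns."""
    a = [[sum(r[i] * r[j] for r in rows) for j in active] for i in active]
    b = [sum(r[i] * y for r, y in zip(rows, target)) for i in active]
    return solve_linear(a, b)

def sparse_fit(rows, target, threshold=0.1):
    """Refit repeatedly, dropping terms whose coefficient magnitude is below threshold."""
    n_terms = len(rows[0])
    active = list(range(n_terms))
    coef = [0.0] * n_terms
    for _ in range(n_terms + 1):
        sol = least_squares(rows, target, active)
        kept = [i for i, c in zip(active, sol) if abs(c) >= threshold]
        if kept == active:          # nothing pruned: converged
            for i, c in zip(active, sol):
                coef[i] = c
            return coef
        if not kept:                # everything pruned: no model found
            return coef
        active = kept
    return coef

# Synthetic degradation curve (assumption): logistic growth of the degraded fraction.
k_true, t_mid, dt = 0.8, 10.0, 0.05
times = [i * dt for i in range(401)]
x = [1.0 / (1.0 + math.exp(-k_true * (t - t_mid))) for t in times]

# Central-difference derivative on interior points, and the candidate library.
dxdt = [(x[i + 1] - x[i - 1]) / (2.0 * dt) for i in range(1, len(x) - 1)]
rows = [[1.0, xi, xi ** 2, xi ** 3] for xi in x[1:-1]]

coef = sparse_fit(rows, dxdt)
print("coefficients for [1, x, x^2, x^3]:", [round(c, 3) for c in coef])
```

Only the x and x^2 terms survive, with coefficients close to +k and -k, which is the logistic form. Real measurements are noisy, so differentiation and the choice of threshold need far more care than this sketch suggests.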

In fairness, sparse regression is not yet "on-demand": it took us over a year to collect high-quality data. Other approaches embed physics directly into the neural-network architecture. One example is Sili Deng's team, which embedded the law of mass action and the Arrhenius equation into the neural-network architecture to model chemical reactions. This, however, requires assumptions about the underlying physics (open-access paper).
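
To give a feel for what "embedding" these laws can mean, here is a hedged sketch of a single physics-shaped unit. It is an illustration under assumptions, not necessarily the cited team's architecture. In log space, the law of mass action and the Arrhenius equation combine into a linear expression, so the reaction orders behave like weights, ln A and Ea act like a bias, and the exponential serves as the activation. All names and values are hypothetical.

```python
import math

R_GAS = 8.314  # gas constant, J/(mol K)

def reaction_rate(concentrations, orders, ln_a, e_a, temperature):
    """Rate = A * exp(-Ea / (R T)) * prod(c_i ** n_i), evaluated in log space.

    orders (n_i), ln_a and e_a are the quantities a network would learn.
    """
    ln_rate = ln_a - e_a / (R_GAS * temperature)
    ln_rate += sum(n * math.log(c) for n, c in zip(orders, concentrations))
    return math.exp(ln_rate)

# Hypothetical two-species reaction, first order in one species and second in the other.
conc = [2.0, 3.0]
orders = [1.0, 2.0]

rate_no_barrier = reaction_rate(conc, orders, ln_a=math.log(5.0), e_a=0.0,
                                temperature=300.0)
rate_cold = reaction_rate(conc, orders, ln_a=math.log(5.0), e_a=50_000.0,
                          temperature=300.0)
rate_hot = reaction_rate(conc, orders, ln_a=math.log(5.0), e_a=50_000.0,
                         temperature=350.0)
print(rate_no_barrier, rate_cold, rate_hot)
```

Because the functional form is fixed in advance, fitted parameters map directly onto reaction orders and activation energies. The price is the one noted above: the form itself is an assumption about the physics.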


LOOKING AHEAD

ML is no longer purely data driven. Knowledge can be embedded into ML in its representations, constraints, and network architecture & parameters. Knowledge can be extracted from ML in the form of equation fitting parameters, or the underlying equations themselves. Expect more and more knowledge integration with ML, especially as ML methods (e.g., large language models, transformers) continue to mature.
