Uncovering Structure-Property Relationships of Materials by Subgroup Discovery


Figure 1: The authors demonstrated the usefulness of subgroup discovery on two materials-science problems: (left) 82 octet binary materials, to find descriptors that predict their crystal structure as either zincblende or rocksalt; and (right) 24,400 configurations of gas-phase gold clusters with 5 to 14 atoms (2,440 configurations per size), to discern relationships between geometrical and physicochemical properties. The most stable gold cluster geometry at 0 kelvin for each size is shown.

Finding functions that describe the properties of materials (descriptors) from high-dimensional data using human intuition is difficult and often subjective. Fortunately, developments in big-data analytics tools such as machine learning, compressed sensing, and data mining, combined with the growth of materials-science repositories, are creating opportunities to find descriptors that help to understand and screen novel materials. Our goal is to develop and apply such tools to gain insights and to predict new materials from the large collections of materials data stored in the Novel Materials Discovery (NOMAD) Archive.

With this goal in mind, B. R. Goldsmith, M. Boley, J. Vreeken, M. Scheffler, and L. M. Ghiringhelli have recently demonstrated that subgroup discovery (SGD), a form of local pattern discovery for labeled data, can uncover interpretable descriptors from materials-science data obtained by first-principles calculations (B. Goldsmith et al., New J. Phys. 19 (1) (2017) 013031). The experiments demonstrate that SGD can find a simple and intuitive descriptor that correctly classifies 79 of the 82 octet binary semiconductors as having either a zincblende or a rocksalt crystal structure. Additionally, SGD helps find patterns that relate the physicochemical and geometrical properties of gas-phase gold clusters. The authors posit that subgroup discovery will serve as a useful tool for extracting insights from big data of materials, and that its continued development will help pave the way toward novel materials discovery.
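
To make the idea of local pattern discovery concrete, here is a minimal sketch of a subgroup search in Python. It is not the authors' algorithm or code. It assumes a brute-force search over conjunctions of threshold conditions and uses weighted relative accuracy (WRAcc), a common subgroup quality function that may differ from the one used in the paper. The feature names `f1` and `f2`, the function names, and all data values are hypothetical and synthetic; they are not materials data.

```python
from itertools import combinations

# Synthetic toy rows (NOT real materials data): two hypothetical
# numeric features and a binary class label.
ROWS = [
    {"f1": 0.1, "f2": 2.0, "label": 1},
    {"f1": 0.3, "f2": 1.5, "label": 1},
    {"f1": 0.4, "f2": 1.2, "label": 1},
    {"f1": 0.2, "f2": 0.5, "label": 0},
    {"f1": 0.8, "f2": 1.8, "label": 0},
    {"f1": 0.9, "f2": 0.3, "label": 0},
    {"f1": 0.7, "f2": 1.4, "label": 0},
    {"f1": 0.6, "f2": 0.8, "label": 0},
]


def covers(row, selectors):
    """True if the row satisfies every (feature, op, threshold) condition."""
    return all(
        row[f] <= t if op == "<=" else row[f] > t
        for f, op, t in selectors
    )


def wracc(rows, selectors, target=1):
    """Weighted relative accuracy: coverage * (target share inside - overall)."""
    covered = [r for r in rows if covers(r, selectors)]
    if not covered:
        return 0.0
    share_sub = sum(r["label"] == target for r in covered) / len(covered)
    share_all = sum(r["label"] == target for r in rows) / len(rows)
    return (len(covered) / len(rows)) * (share_sub - share_all)


def candidate_selectors(rows, features):
    """One '<=' and one '>' condition per observed value of each feature."""
    selectors = []
    for f in features:
        for t in sorted({r[f] for r in rows}):
            selectors.append((f, "<=", t))
            selectors.append((f, ">", t))
    return selectors


def best_subgroup(rows, features, max_depth=2):
    """Exhaustively score conjunctions of up to max_depth conditions."""
    selectors = candidate_selectors(rows, features)
    best, best_q = (), 0.0
    for depth in range(1, max_depth + 1):
        for combo in combinations(selectors, depth):
            q = wracc(rows, combo)
            if q > best_q:
                best, best_q = combo, q
    return best, best_q


if __name__ == "__main__":
    description, quality = best_subgroup(ROWS, ["f1", "f2"])
    print(" AND ".join(f"{f} {op} {t}" for f, op, t in description), quality)
```

The result is a human-readable conjunction of conditions that describes a region of the data where one class is strongly over-represented, which is what makes a subgroup usable as an interpretable descriptor. Exhaustive enumeration is only practical for toy inputs like this one; searches over many features would need pruning or heuristics.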