The NOMAD (Novel Materials Discovery) Laboratory is a European Centre of Excellence (CoE), funded by the European Union under the Horizon 2020 program. Eight complementary research groups of the highest scientific standing in computational materials science, along with four high-performance computing (HPC) centers, form the synergetic core of this CoE (see THE TEAM). The CoE is also part of the Psi-k, CECAM, and ETSF communities.

We are currently creating a large database of homogeneous and normalized materials data. Building on this database, the available HPC infrastructure and the envisaged developments will enhance the discovery of new scientific phenomena and novel devices and will advance materials science and engineering. At first, we will use data from theory and modeling. Later, the database will be expanded to include experimental data.

New commercial products, from smartphones to solar cells to artificial hips, are typically built from new or improved materials. Choosing the right material is difficult. Computational materials science can help make this choice. Using information about the atoms and molecules present in a specific material, computers can predict the properties and functions of the material. This information can then be used to decide if the material is right for a specific product. Such computational approaches, typically based on density-functional theory and beyond, can be used for known materials. They are also powerful enough to let researchers consider materials that do not yet exist, and thus to develop entirely new materials that could help address fundamental issues in fields such as energy storage and transformation, mobility, safety, information, and health.

The NOMAD CoE starts from the NOMAD Repository (see also the YouTube movie), which contains the input and output files of many high-quality calculations performed all over the world. The NOMAD Repository is unique in that it is not restricted to one or a few simulation programs ("codes") but accepts output from all important codes. In spring 2016, the NOMAD Repository contained input and output files of more than 2 million calculations, corresponding to more than 2 billion CPU-core hours spent on various high-performance computers all over the planet.

As the data in the NOMAD Repository arise from many different codes, the data and the data structures are very heterogeneous. This is fine for the purpose of the Repository but not for the planned data analytics. Thus, the scientists of the NOMAD CoE first build conversion layers to transform all these data into a code-independent format: the NOMAD Database.
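The idea of a conversion layer can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two source formats ("code_a", "code_b"), their key names, the record fields, and the choice of eV as the common unit are hypothetical and do not describe the actual NOMAD Database format.

```python
from dataclasses import dataclass

# Conversion factors to a common unit (eV); choosing eV is an assumption.
HARTREE_TO_EV = 27.211386245988
RYDBERG_TO_EV = HARTREE_TO_EV / 2


@dataclass(frozen=True)
class NormalizedCalculation:
    """Hypothetical code-independent record; field names are illustrative."""
    source_code: str
    formula: str
    total_energy_ev: float


def from_code_a(raw: dict) -> NormalizedCalculation:
    # Hypothetical "code A" reports the total energy in Hartree under "etot".
    return NormalizedCalculation("code_a", raw["system"], raw["etot"] * HARTREE_TO_EV)


def from_code_b(raw: dict) -> NormalizedCalculation:
    # Hypothetical "code B" reports the total energy in Rydberg under "TotalEnergy".
    return NormalizedCalculation("code_b", raw["Formula"], raw["TotalEnergy"] * RYDBERG_TO_EV)


# Registry mapping a code name to its conversion layer.
CONVERTERS = {"code_a": from_code_a, "code_b": from_code_b}


def normalize(code_name: str, raw: dict) -> NormalizedCalculation:
    """Convert code-specific output into the common record type."""
    try:
        converter = CONVERTERS[code_name]
    except KeyError:
        raise ValueError(f"no conversion layer registered for {code_name!r}") from None
    return converter(raw)
```

With one converter per code, supporting a further code means registering one more function, while everything downstream only ever sees the common record type.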

A Materials Encyclopedia will open up new opportunities by providing new tools to search and retrieve information from the large materials data pool. It will comprehensively characterize materials by their computed properties. Its search engine makes it possible to retrieve those materials that exhibit one or more required features.
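As a rough illustration of retrieving materials by required features, the following sketch filters a small in-memory list by property ranges. The function name, the record layout, and the property names are hypothetical, and the entries are made up; this is not the Encyclopedia's actual search interface.

```python
def find_materials(records, **required_ranges):
    """Return records whose properties all fall within the given (low, high) ranges.

    A record that lacks one of the required properties is excluded.
    """
    matches = []
    for record in records:
        properties = record["properties"]
        if all(
            name in properties and low <= properties[name] <= high
            for name, (low, high) in required_ranges.items()
        ):
            matches.append(record)
    return matches


# Illustrative, made-up entries; not actual NOMAD data.
records = [
    {"formula": "A2B", "properties": {"band_gap_ev": 1.2, "bulk_modulus_gpa": 95.0}},
    {"formula": "AB3", "properties": {"band_gap_ev": 3.4, "bulk_modulus_gpa": 210.0}},
    {"formula": "AC", "properties": {"band_gap_ev": 1.5}},
]

hits = find_materials(records, band_gap_ev=(1.0, 2.0), bulk_modulus_gpa=(50.0, 150.0))
print([r["formula"] for r in hits])  # prints ['A2B']
```

Requiring every listed property to be present is a design choice of this sketch: the entry "AC" matches the band-gap range but is dropped because its bulk modulus is unknown.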

Big data in materials science contain hidden structures and correlations that may not be detectable by standard tools. We develop Big-Data Analytics tools to extract this information and will use it to analyze materials properties and functions and to predict trends, anomalies, and possibly even novel materials. Key elements of the methodology are data mining and statistical or machine learning, in particular kernel-based methods, neural networks, compressed sensing, and causal inference.
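To give a flavor of one of the kernel-based methods mentioned above, here is a minimal sketch of kernel ridge regression with a Gaussian kernel, written in plain Python. It is a textbook formulation chosen for illustration, not the CoE's own analytics code; the function names, the hyperparameter defaults, and the toy data are assumptions.

```python
import math


def gaussian_kernel(x, y, length_scale=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * length_scale ** 2))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def fit_kernel_ridge(features, targets, regularization=1e-6, length_scale=1.0):
    """Return dual coefficients alpha solving (K + regularization * I) alpha = y."""
    n = len(features)
    gram = [
        [
            gaussian_kernel(features[i], features[j], length_scale)
            + (regularization if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return solve_linear_system(gram, targets)


def predict(features, alpha, query, length_scale=1.0):
    """Predict the target at `query` as a kernel-weighted sum over training points."""
    return sum(a * gaussian_kernel(x, query, length_scale) for a, x in zip(alpha, features))


# Toy data (made up): one descriptor per sample, target is its square.
train_features = [[0.0], [1.0], [2.0], [3.0]]
train_targets = [0.0, 1.0, 4.0, 9.0]
alpha = fit_kernel_ridge(train_features, train_targets)
```

With a small regularization value the model nearly interpolates the training targets; in practice, the regularization and the length scale would be chosen by cross-validation rather than fixed as here.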

Seeing helps understanding. We develop an infrastructure for remote visualization of the multi-dimensional NOMAD data. Advanced graphics and a virtual-reality environment will allow for interactive data exploration, training, and dissemination.

A sophisticated technological platform for the integrated design of the data workflow will handle the demands of the encyclopedia, visualization, and data analytics. This High-Performance Computing (HPC) infrastructure will support modeling on big data and provide application-enabling services. The CoE will also open up new HPC opportunities by making access to existing materials data easier and by developing new tools to search, retrieve, and manage large datasets. In addition, it will offer to perform new high-quality calculations for materials where important information is missing from the database.

Our Outreach commitment brings data-driven materials modeling closer to societal exploitation. We stimulate interest in and public awareness of the importance of materials science, and we ensure that all of NOMAD's tools are accessible to industrial users and to basic and applied research in industry and academia. To make sure that the NOMAD CoE is valuable and relevant to end users, we will further expand our extensive network of researchers, industry representatives, students, and other stakeholders. We are organizing workshops and schools in collaboration with the Psi-k and CECAM networks (see NEWS).

 

contact concerning general aspects of the CoE: Kylie O'Brien

contact concerning the NOMAD Encyclopedia: Georg Huhs

contact concerning Big-Data Analytics: Luca Ghiringhelli

contact concerning Advanced Visualization: Rubén García Hernández

contact concerning HPC Infrastructure: Atte Sillanpää

contact concerning Outreach: Kylie O'Brien