Battery Materials
Batteries power everything from phones to EVs to grid-scale energy storage. They are a critical component of the energy transition. According to McKinsey, lithium-ion battery demand is set to grow to 4,700 GWh per year by 2030, driving a $400 billion supply chain.
There is still plenty of room, and need, for innovation in batteries. Electric vehicle (EV) manufacturers want higher energy density to reduce range anxiety, while grid storage operators want lower costs and longer cycle life.
New battery technologies are challenging Li-ion's dominance: solid-state batteries promise to double the amount of energy that can be stored per dollar of cost. These batteries are of intense interest to academia and industry.
Data and machine learning have plenty to contribute to the hunt for new materials. The most promising approaches are text mining, virtual screening and inverse design.
Text mining of academic literature
There is a lot of research into battery materials, but structured data about material properties is still scarce. This is because researchers publish their work in academic papers, which are unstructured text. Shu Huang and Jacqueline M. Cole of the University of Cambridge developed BatteryDataExtractor, which mines academic papers for material property information.
At the core of their approach is BERT, a transformer-based language model related to the models behind ChatGPT. They fine-tuned the model on text about batteries and then used it to detect the names of materials as well as the abbreviations used. Because BERT can be fine-tuned for question answering, it can be asked questions about the text. The authors employ a clever "double-turn" question system: they first ask for the value of a particular property mentioned in the text and then ask which material that property value belongs to.
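The double-turn idea can be sketched in a few lines of Python. This is a minimal illustration, not the BatteryDataExtractor code: the function names, the question wording and the example sentence are all assumptions, and `toy_qa` is a rule-based stand-in for the fine-tuned BERT question-answering model so that the sketch runs without any downloads.

```python
import re

def toy_qa(question, context):
    """Rule-based stand-in for an extractive QA model (hypothetical)."""
    if question.startswith("What is the value"):
        # First turn: return the first number followed by a known unit.
        match = re.search(r"\d+(?:\.\d+)?\s*(?:mAh/g|Wh/kg|V)\b", context)
    else:
        # Second turn: return the first token that looks like a chemical formula.
        match = re.search(r"\b(?:[A-Z][a-z]?\d*){2,}\b", context)
    return match.group(0) if match else ""

def double_turn(qa, context, prop):
    """Ask for the property value first, then for the material it belongs to."""
    value = qa(f"What is the value of the {prop}?", context)
    if not value:
        return None  # property not mentioned, nothing to link
    material = qa(f"Which material has a {prop} of {value}?", context)
    if not material:
        return None
    return {"material": material, "property": prop, "value": value}

# Made-up example sentence, for illustration only.
sentence = "The LiCoO2 cathode delivered a specific capacity of 140 mAh/g."
record = double_turn(toy_qa, sentence, "specific capacity")
print(record)
```

Because `double_turn` only needs a callable that maps a question and a context to an answer string, a real question-answering model could be passed in instead of `toy_qa` without changing the two-turn logic.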

After scanning 223,877 scientific papers, Huang and Cole created a database with 210,416 records about 16,315 unique battery chemicals. This kind of structured data forms the basis of machine-learning-aided materials discovery.
Virtual Screening
Battery manufacturers screen different materials in experiments to find good ones. However, the universe of possible battery materials is very large, so it's important to prioritise.
Machine learning can help here by predicting the properties of a large number of materials, so that those most likely to be suitable can be tested first.
A nice example of this approach is provided by Cubuk, Sendek & Reed, Screening billions of candidates for solid lithium-ion conductors: A transfer learning approach for small data. They trained ML models to predict lithium-ion conductivity from different material features. Because data on lithium-ion conductivity is scarce, they had to grapple with their model overfitting to the small dataset. To avoid overfitting, they first trained a model on material features studied in the literature (so-called structural descriptors). This first model then produced the training data for a second model, which used more basic but widely available material features as inputs (so-called elemental descriptors).
Using their machine learning model, the authors then screened 20 billion materials for possible suitability.
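The two-stage training and the screening step described above can be sketched as follows. This is a toy illustration under strong assumptions, not the authors' method: each descriptor is reduced to a single number, both models are plain least-squares lines, and all values and candidate names are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

# Step 1: small labelled set, first model on a structural descriptor (made-up numbers).
labelled_structural = [1.0, 2.0, 3.0, 4.0]
labelled_conductivity = [2.1, 3.9, 6.2, 7.8]
teacher = fit_line(labelled_structural, labelled_conductivity)

# Step 2: the first model labels a larger pool; the second model learns to
# reproduce those labels from the widely available elemental descriptor.
pool_elemental = [0.5, 1.0, 1.5, 2.0, 2.5]
pool_structural = [1.1, 1.9, 3.2, 3.8, 5.0]
pseudo_labels = [predict(teacher, s) for s in pool_structural]
student = fit_line(pool_elemental, pseudo_labels)

# Step 3: screen candidates for which only the elemental descriptor is known.
def screen(candidates, model, top_k):
    scored = {name: predict(model, x) for name, x in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]

candidates = {"cand_a": 0.8, "cand_b": 2.2, "cand_c": 1.4}  # hypothetical names
shortlist = screen(candidates, student, top_k=2)
print(shortlist)
```

The point of the second model is that it only needs inputs that exist for every candidate, which is what makes screening a very large set feasible.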
Inverse design with generative ML
Machine learning is also used to design materials with desired features directly. There are broadly two approaches: The first is to train a generative machine learning model to generate material candidates and to then predict their features using a separate machine learning model. The second approach is to have a generative model take the desired properties of the material into account directly.
Song et al., Computational discovery of new 2D materials using deep learning generative models, is an example of the two-step approach. They train a generative adversarial network to generate materials. A random forest classifier then screens the generated materials for the desired characteristic, in this case being a two-dimensional material, which is important for many electronic applications. After pre-screening with the random forest, two further verification steps that are not based on machine learning qualify the materials.
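The generate-then-filter structure of the two-step approach can be sketched as below. Only the pipeline shape is taken from the text: the generator here is a random sampler standing in for a trained generative adversarial network, the scoring rule is an arbitrary placeholder standing in for a trained random forest, and the element symbols are dummies with no chemical meaning.

```python
import random

ELEMENTS = ["A", "B", "C", "D", "E", "F"]  # placeholder symbols, not real elements
FAVOURED = {"D", "E"}                      # arbitrary rule, not real chemistry

def generate_candidates(n, rng):
    """Stand-in for a trained generator: propose n random two-element candidates."""
    return [tuple(sorted(rng.sample(ELEMENTS, 2))) for _ in range(n)]

def classifier_score(candidate):
    """Stand-in for a trained classifier's probability of the desired property."""
    return 0.9 if FAVOURED & set(candidate) else 0.1

def two_step_design(n, threshold, seed=0):
    rng = random.Random(seed)
    unique = set(generate_candidates(n, rng))  # generators often repeat themselves
    # Keep only the candidates the classifier considers promising.
    return sorted(c for c in unique if classifier_score(c) >= threshold)

shortlist = two_step_design(n=50, threshold=0.5)
print(shortlist)
```

In a real workflow the shortlist would then go through the further verification steps that are not based on machine learning before any candidate is reported.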

Pathak et al., Deep Learning Enabled Inorganic Material Generator, provide an example of directly generating a material for a desired set of properties. They use a variational autoencoder in which the encoder is trained to predict not just a latent representation of materials but also their properties. The material properties are then used as an input to the decoder. To create new materials, researchers provide the desired properties together with samples from the latent space. The decoder then produces a material that is predicted to have the desired features.
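The generation step of such a conditional model can be sketched as follows. This shows only the data flow at sampling time, under loud assumptions: the decoder is a single linear layer with hand-picked weights rather than a trained network, materials are represented as fractions over four placeholder symbols, and there is one scalar target property.

```python
import math
import random

ELEMENTS = ["A", "B", "C", "D"]  # placeholder symbols, not real elements

# Hypothetical fixed weights standing in for a trained decoder.
W_LATENT = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2], [0.0, 0.1]]  # per element, per latent dim
W_PROPS = [[1.0], [0.2], [-0.3], [0.5]]                         # per element, per property

def decode(z, props):
    """Map a latent sample plus desired properties to composition fractions."""
    logits = [
        sum(w * v for w, v in zip(W_LATENT[i], z))
        + sum(w * p for w, p in zip(W_PROPS[i], props))
        for i in range(len(ELEMENTS))
    ]
    # Softmax so that the fractions are positive and sum to one.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return {el: e / total for el, e in zip(ELEMENTS, exps)}

def sample_materials(props, n, seed=0):
    """Draw n latent samples from a standard normal and decode each one."""
    rng = random.Random(seed)
    return [decode([rng.gauss(0, 1), rng.gauss(0, 1)], props) for _ in range(n)]

proposals = sample_materials(props=[1.0], n=3)
print(proposals)
```

Changing `props` while keeping the same latent samples changes the decoded compositions, which is the mechanism that lets the desired properties steer generation in a trained model.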

Interesting startups in the space
The primary way that data can help battery companies is through faster experimentation. Aionics and Chemix both offer a data platform and machine learning capabilities to help their clients accelerate experimentation. Aionics takes more of a professional services approach, allowing clients to upload data, while Chemix advertises its internal experimentation capabilities.
Cuberg follows a more integrated approach. They specialise in a particular type of lithium-ion battery, in which they drive performance through improvements to the electrolytes, the electrodes and the overall battery design. As part of Northvolt, they have the ability to manufacture their own batteries.
Further reading
The review articles below are an excellent starting point:
Ling, 2022, A review of the recent progress in battery informatics
Ng, Sun & Seh, 2023, Machine learning-inspired battery material innovation
Lombardo et al., 2022, Artificial Intelligence Applied to Battery Research: Hype or Reality?