
Google is using old news reports and AI to predict flash floods

Flash floods are among the deadliest weather events in the world, killing more than 5,000 people every year. They are also among the most difficult to predict. But Google thinks it has solved this problem in an unlikely way: by reading the news.

Although people have collected a lot of weather data, flash floods are too short-lived and localized to be measured comprehensively in the way temperatures or even river flows are tracked over time. That data gap means that deep learning models, which are increasingly able to predict the weather, cannot predict flash floods.

To solve that problem, Google researchers used Gemini, Google's large language model, to search 5 million news articles from around the world, isolate reports of 2.6 million different floods, and turn those reports into a geotagged time series called Groundsource. It's the first time the company has used language models for this type of work, said Gila Loike, a product manager at Google Research. The study and dataset were shared publicly Thursday morning.
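The article doesn't describe Groundsource's actual schema or pipeline, but the general pattern, turning an LLM's per-article extractions into a deduplicated, geotagged time series, can be sketched as below. The JSON schema (`date`/`lat`/`lon`/`location`) and the sample records are illustrative assumptions, not Google's format:

```python
import json
from datetime import date

# Hypothetical LLM output: one JSON record per flood report extracted
# from a news article. Schema and values are invented for illustration.
llm_output = """
[
  {"date": "2024-07-02", "lat": 30.05, "lon": 31.23, "location": "Cairo"},
  {"date": "2024-06-15", "lat": -1.29, "lon": 36.82, "location": "Nairobi"},
  {"date": "2024-07-02", "lat": 30.05, "lon": 31.23, "location": "Cairo"}
]
"""

def build_time_series(raw_json):
    """Deduplicate extracted reports and order them into a geotagged time series."""
    records = json.loads(raw_json)
    # Two reports of the same place on the same day count as one flood event.
    unique = {(r["date"], round(r["lat"], 2), round(r["lon"], 2)): r
              for r in records}
    return sorted(unique.values(), key=lambda r: date.fromisoformat(r["date"]))

events = build_time_series(llm_output)
```

Deduplication matters here because several outlets typically cover the same flood; keying on date and rounded coordinates is one simple (assumed) way to collapse them into a single event.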

With Groundsource as a real-world baseline, the researchers trained a model built on an LSTM (Long Short-Term Memory) neural network that ingests global weather forecasts and generates the probability of flash flooding in a given area.
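The article gives no architectural details beyond "LSTM over forecast inputs, flood probability out," so the following is a minimal NumPy sketch of that generic shape, not a reconstruction of Google's model. The feature count, hidden size, and 48-hour window are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """A single LSTM cell with a logistic read-out: sequence in, probability out."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(n_in + n_hidden)
        # One stacked weight matrix for the input, forget, cell, and output gates.
        self.W = rng.normal(0.0, scale, (4 * n_hidden, n_in + n_hidden))
        self.b = np.zeros(4 * n_hidden)
        self.w_out = rng.normal(0.0, scale, n_hidden)
        self.b_out = 0.0
        self.n_hidden = n_hidden

    def forward(self, seq):
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        for x in seq:  # one forecast time step at a time
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
            c = f * c + i * np.tanh(g)   # update the cell's long-term memory
            h = o * np.tanh(c)           # expose a gated short-term state
        # Logistic head: probability of a flash flood in this grid cell.
        return sigmoid(self.w_out @ h + self.b_out)

# Untrained example: 48 hourly forecast steps of 3 made-up features
# (say precipitation, soil moisture, temperature) for one grid cell.
model = TinyLSTM(n_in=3, n_hidden=8)
forecast = np.random.default_rng(1).normal(size=(48, 3))
prob = model.forward(forecast)
```

In practice such a model would be trained against the Groundsource events as labels; the untrained weights here only demonstrate the data flow.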

Google's flash flood prediction model now highlights risks in 150 countries on the company's Flood Hub platform, and Google shares its data with emergency response organizations around the world. António José Beleza, an emergency response officer with the Southern African Development Community who tested Google's forecasting model, says it helps his organization respond more quickly to floods.


The model still has limitations. Its resolution is fairly low, identifying risks in areas of 20 square kilometers. And it's not as accurate as the U.S. National Weather Service's flood warning system, in part because Google's model doesn't include local radar data, which allows precipitation to be tracked in real time.

Part of the point, however, is that the project is designed to work in places where local governments cannot afford expensive weather sensor infrastructure or lack extensive meteorological records.


“As we collect millions of reports, the Groundsource dataset actually helps rebalance the map,” Juliet Rothenberg, program manager on Google’s Resilience team, told reporters this week. “It allows us to extrapolate to other regions where there isn’t as much information available.”

Rothenberg said the team hopes that using LLMs to develop quantitative data sets from written, qualitative sources can be applied to efforts to build data sets on other short-lived but important phenomena, such as heat waves and mudslides.

Marshall Moutenot, the CEO of Upstream Tech, a company that uses similar deep learning models to predict river flows for clients such as hydropower companies, said Google's contribution is part of a growing effort to collect data for deep-learning-based weather forecast models. Moutenot co-founded dynamic.org, a group that curates machine-learning-ready weather data for researchers and startups.

“Data scarcity is one of the most difficult challenges in geophysics,” says Moutenot. “At the same time, there’s too much data about the Earth, and then when you want to evaluate against the truth, there’s not enough. This was a very creative approach to getting that data.”

