The Nexus Linking IBM, California Wine, and Climate Modeling

If this were a “Jeopardy” clue, the category would be “IoT” and the answer would be “Who is physicist Hendrik Hamann?”

This month, IBM announced the creation of a cloud-based geospatial information and analytics service. It’s designed to help people analyze big data sets comprising information such as satellite images, climate models, and tweets, or more localized data like feeds from IoT sensors. The company calls the technology the PAIRS Geoscope; PAIRS stands for the not exactly catchy Physical Analytics Integrated Data Repository and Services. Big Blue pitches it as something of a magic crystal ball that developers can build into apps. The apps would let users figure out nearly anything about a particular region, big or small, present or future.

Satellite imagery, climate modeling, IoT, Twitter: There was a lot in that announcement to get my attention, but almost too much to grasp. But the lead IBM researcher on the project, Hendrik Hamann, told me it all started with friends and wine. Hamann may be based at IBM Research in Yorktown Heights, N.Y., but his story is a quintessentially Silicon Valley kind of tale. A lot of tech projects here start with friends and wine.

“About six years ago,” Hamann recalled, “we in the research division were looking at where tech is heading—the Internet of Things, drones, nanosatellites. We started looking for interesting applications given these trends.”

Agriculture came to mind, and, says Hamann, “If I’m going to work in agriculture, why don’t I work on grapes and making wine?”

Why not, indeed. It turned out that the CIO for the E. & J. Gallo Winery in Modesto, Calif., had once worked at IBM. Bingo! Hamann pretty soon had an invite to visit Gallo, where he toured the vineyards and the winemaking operation and connected with Nick Dokoozlian, Gallo’s vice president of viticulture, chemistry, and enology.

“The sophistication of data collection was pretty amazing,” says Hamann. “They have yield maps from vineyards down to a couple of meters. But though they had a lot of data, there was little they could do with it.”

IBM Research and Gallo set up a 10-acre test site, filling it with sensors that tracked soil, sun, and weather data. They paired that data with information from weather forecasts, satellites, and harvests. Optimizing irrigation was the initial goal for the winery.

Working with a winery certainly had its advantages, but as a research test bed, it turned out, a vineyard left a lot to be desired.

“I made the decision to work on grapes not knowing much about agriculture,” Hamann says. “It turns out that with grapes, it takes a long time to see results—more than a season. So it took us two full years to complete an initial study. Had I known that, I might have picked something else, where I could have given my manager quarterly results.”

By 2015, Hamann says, the technology (which uses machine learning to extract insights from multiple layers of information) had proved itself. Gallo improved yields on the test site while reducing water use. The partnership quickly found another use for IBM’s AI: analyzing variables such as proximity to the winery, weather patterns, elevation, and days of sunshine to identify suitable locations for new vineyards. Previously, this process would take Gallo months, but with the new system, deciding where to plant new vines takes minutes, says Hamann.

Getting this second application running inspired Hamann and his team to think about applications beyond agriculture. First, he says, they zeroed in on other businesses that use weather, satellite, and soil data, like insurance companies and utilities. Then the researchers considered more broadly what people might want to do with a combination of local sensor data, image data, and other large, publicly available data sets. They built a cloud-based service that automatically sorts through geospatial or time-based data. In the words of an IBM news release, the service collates them “into a tidy aligned and indexed structure designed for efficient retrieval and query.”
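
To make that idea concrete, here is a minimal Python sketch of one way to line up records from different sources by space and time. It is not IBM's implementation and does not use the PAIRS API. The grid size, the hourly time bucket, the layer names, and the functions space_time_key and build_index are all assumptions invented for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone
import math

CELL_DEG = 0.01   # assumed grid resolution in degrees (hypothetical)
BUCKET_S = 3600   # assumed time bucket of one hour (hypothetical)


def space_time_key(lat, lon, when):
    """Map a coordinate and a timezone-aware timestamp to a (row, col, hour) key."""
    row = math.floor(lat / CELL_DEG)
    col = math.floor(lon / CELL_DEG)
    bucket = int(when.timestamp()) // BUCKET_S
    return (row, col, bucket)


def build_index(records):
    """Group (layer, lat, lon, when, value) records that share a grid cell and hour."""
    index = defaultdict(dict)
    for layer, lat, lon, when, value in records:
        key = space_time_key(lat, lon, when)
        index[key].setdefault(layer, []).append(value)
    return index


if __name__ == "__main__":
    noon = datetime(2017, 8, 1, 12, 5, tzinfo=timezone.utc)
    later = datetime(2017, 8, 1, 12, 40, tzinfo=timezone.utc)
    # Made-up sample readings: a soil sensor and a satellite pixel near each other.
    sample = [
        ("soil_moisture", 37.6391, -120.9969, noon, 0.21),
        ("satellite_ndvi", 37.6395, -120.9961, later, 0.63),
    ]
    for key, layers in build_index(sample).items():
        print(key, layers)
```

Once a sensor reading and a satellite pixel land under the same key, a query for one cell and one hour returns every layer at once, which is the kind of linking in space and time the release describes.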

“The applications are infinite,” Hamann says. “It’s really just about contextualizing information, linking it in space and time.”

Hamann pulled out two examples of possible applications that could be built using the PAIRS tool.

“When was the last time you saw one of those articles on the 10 best places to live blah blah blah in an airline magazine?” he asked. (Last week. And by the way, I always read those articles and rant about how ridiculous their analyses are.) Making such analyses better—or at least more individualized—would be easy for PAIRS, he indicated. “For quality of living, weather is important, climate is important, trees, things about the neighborhood like are there a lot of kids or is it diverse. Searching multiple layers of geospatial information, IoT data, and Twitter feeds—roughly 10 to 15 percent of tweets are geocoded—could refine that.”

What else could PAIRS tell me that I might find immediately useful?

“How about,” Hamann suggests, “What is the best place in your neighborhood to watch the stars?” Given that I had been running around in the dark trying to find an angle on the recent lunar eclipse, he had me with that one.

But the possibility that probably excites Hamann the most is that the PAIRS system might improve climate models. “We have done a lot of studies in which we use massive amounts of machine learning to rapidly analyze historical climate forecasts against actual weather station data,” he says. “It is amazing what you can learn about the deficiencies and strengths of the models. And once you know that, you can improve them to provide better climate forecasts.”
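
As a rough illustration of what checking forecasts against station readings can look like, the Python sketch below computes two plain error statistics, mean bias and root-mean-square error. It swaps simple arithmetic in for the machine learning Hamann describes, and the function name forecast_errors and the sample temperatures are hypothetical.

```python
import math


def forecast_errors(pairs):
    """Return (bias, rmse) for a list of (forecast, observed) value pairs."""
    if not pairs:
        raise ValueError("need at least one forecast/observation pair")
    diffs = [forecast - observed for forecast, observed in pairs]
    bias = sum(diffs) / len(diffs)                             # average signed error
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))   # typical error size
    return bias, rmse


if __name__ == "__main__":
    # Made-up temperatures in degrees Celsius: (model forecast, station reading).
    sample = [(21.0, 20.0), (19.0, 20.0), (23.0, 20.0)]
    bias, rmse = forecast_errors(sample)
    print(f"bias={bias:.2f} rmse={rmse:.2f}")
```

A positive bias would suggest a model that runs warm at that station, the sort of deficiency a modeler could then try to correct.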

You can try out the PAIRS tool here. Let me know in the comments if you come up with anything interesting.

Source: IEEE Spectrum Computing