On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Summary
Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have yet to see an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performance on seven tasks across multiple geospatial subdomains including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that involve only the text modality, such as toponym recognition, location description recognition, and US state-level/county-level dementia time series forecasting, these task-agnostic LLMs can outperform task-specific fully-supervised models in a zero-shot or few-shot learning setting. However, on other geospatial tasks, especially tasks that involve multiple data modalities (e.g., POI-based urban function classification, street view image-based urban noise intensity classification, and remote sensing image scene classification), existing foundation models still underperform task-specific models. Based on these observations, we propose that one of the major challenges of developing an FM for GeoAI is to address the multimodal nature of geospatial tasks. We will discuss the distinct challenges of each geospatial data modality and suggest the possibility of a multimodal foundation model which can reason over various types of geospatial data through geospatial alignments. We will conclude this talk by discussing the unique risks and challenges of developing such a model for GeoAI.
Speaker’s information
Dr. Gengchen Mai is an Assistant Professor (since Aug. 1st 2022) at the Department of Geography at UGA, and a Graduate Program Faculty member of the School of Computing, UGA. His research areas include Spatially Explicit Artificial Intelligence, Geographic Knowledge Graphs, and Geographic Question Answering. He is the recipient of many prestigious awards, including the AAG 2021 Dissertation Research Grants, the AAG 2022 William L. Garrison Award for Best Dissertation in Computational Geography, the AAG 2023 J. Warren Nystrom Dissertation Award, the Top 10 WGDC 2022 Global Young Scientist Award, and the Jack and Laura Dangermond Graduate Fellowship. He has 59 peer-reviewed publications including 7 first-author journal articles and 9 first-author CS/GIScience conference proceedings.
Time: 10:00-10:30 a.m. US Central Time (Thursday, May 25th, 2023)
The meeting was not recorded, but you may look for the related paper at: https://gengchenmai.github.io/csp-website/
Host: Jiaxin Du, Data Science Ambassador@TAMIDS, PhD candidate@LAUP, TAMU