

Geospatial Analysis
and
Generative AI
Gerard Mor data scientist @ CIMNE






Barcelona, February 24th 2026
From Territorial Data to Intelligent Decision Support
Introduction
CIMNE BEE Group develops advanced methodologies that combine geospatial analytics, data-driven modelling, and Generative AI to transform heterogeneous territorial data into actionable knowledge.

By bridging geospatial science with modern AI architectures, we enable scalable analysis pipelines that support urban planning, energy transition, and climate resilience strategies.
Architecture LEADNET

self-hosted LLM
Open datasets
On pilot premises
LEADNET UI
- User authentication
-
User prompting
- RAG open questions
- RAG ready-to-run pipelines
- Personalised Capacity Building material
- Map & time series & aggregations visualisations
CIMNE premises
Private datasets
Cap. Build. text documents
Video / Audio documents
MCP
secured MCP
secured MCP
ChromaDB
Tagging
Opensearch
public repo
private repo
Private text documents
private repo
public repo
public repo
Geospatial Analysis
(geosp)








geosp / Research lines in Geospatial Analysis
-
Data acquisition, harmonization, and semantic interoperability pipelines
- hypercadaster_ES: Automatic gathering, advanced inference, and interoperability of cadastral data
- social_ES: Automated geospatial data integration on socio-demographics, households characteristics and economic indicators.
- greenshadow: Solar exposure, PV potential estimator and shadow modelling for urban environments.
-
AI-driven modeling:
- Air temperature and humidity downscaling at 100m grid
- Thermal energy demand of buildings
- Electricity and gas energy consumption of buildings
-
Use cases:
- Heat Vulnerability Index at building level
geosp / Data acquisition, harmonisation and semantic interoperability








geosp / Data acquisition / hypercadaster_ES
Python library designed for comprehensive analysis of Spanish official cadastral data. It provides tools for downloading addresses, parcels and buildings cadastral information, integrating attributes of external geographic datasets (administrative levels, DEM, OSM...), and performing advanced building geometry inference, shading analysis, and energy simulation data preparation.
Public repository: https://github.com/BeeGroup-cimne/hypercadaster_ES




geosp / Data acquisition / social_ES
Python library to ingest, clean and harmonise most updated Spanish demographics, socioeconomic and other social-related datasets from National Statistics Institute.
Example datasets:
- Annual Household Income Distribution dataset
- Population Education and Employment Status Census
- Estimated Essential Characteristics of Population and Households by building (hypercadaster_ES is being used in this estimation)
Public repository: https://github.com/BeeGroup-cimne/social_ES


geosp / Data acquisition / greenshadow
Python library for environmental shading analysis using LiDAR data and custom algorithms to simulate the solar shading of rural and urban areas in maximum detail. It uses hillshade techniques combined with cast shadow calculations to provide accurate solar radiation analysis.
Public repository: https://github.com/BeeGroup-cimne/greenshadow




DSM
DSM without vegetation
DEM
LIDAR flight


geosp / Data acquisition / greenshadow
Python library for environmental shading analysis using LiDAR data and custom algorithms to simulate the solar shading of rural and urban areas in maximum detail. It uses hillshade techniques combined with cast shadow calculations to provide accurate solar radiation analysis.
Public repository: https://github.com/BeeGroup-cimne/greenshadow




Slope estimation
Aspect estimation
Class


geosp / Data acquisition / greenshadow
Hillshade during December 12th 2023


geosp / Data acquisition / greenshadow


Direct component of the solar radiation on December 12th 2023
Diffuse component of the solar radiation on December 12th 2023
geosp / Modelling








geosp / Modelling / Air temperature and humidity downscaling

Public repository: https://github.com/BeeGroup-cimne/CR_BCN_meteo


geosp / Modelling / Thermal energy demand of buildings

1 - Select a subset of real buildings and their context
2 - Define building envelopes archetypes according to building code
3 - Define user behaviour patterns according to demographics and socioeconomic profiles
4 - Define building systems archetypes according to EPC and cadastral data
5 - Define microlocal weather input files


geosp / Modelling / Electricity and gas energy consumption

Predict electricity and gas consumption at building level, based on a Graph Neural Network
Input data is socio-economic, demographics, energy demand, city graph (buildings, districts, postal codes, census tract...), building characteristics, and weather conditions.
geosp / Use Cases






geosp / Use cases / Heat Vulnerability Index at building level
The Heat Vulnerability Map of Barcelona is a geospatial analysis tool that identifies buildings most at risk buildings during extreme heat events considering:
- Building characteristics
- Climate Variability and Extreme Events
- Demographic Indicators
- Infrastructure Indicators
- Energy indicators
- Socio-economic Indicators
The framework is based on the dimensions defined in the IPCC’s Third Assessment Report: Exposure, Sensitivity, and Adaptive Capacity.
It provides an assessment of all residential buildings in Barcelona(61,000)


geosp / Use cases / Heat Vulnerability Index at building level

Generative Artificial Inteligence
(genai)








genai / Research lines in Generative AI
- Local Generative AI
- LLMs
- MCPs
- RAGs
- Agents
- RAG use cases
- taxonomizer
- validis
- invoget
- beechat
- Agent use cases
- Copilot for devs (Claude Code, opencode...)
- Openclaw
genai / Local Generative AI








genai / Local Gen AI / Large Language Models (LLMs)
- Why Local LLMs?
- Data sovereignty, governance and GDPR compliance
- Reduced latency and offline capability
- Full control over inference pipelines and model behaviour
- Main Types of Local Models
- General-purpose foundation models: Llama (Meta), OSS (OpenAI), Qwen (Alibaba), Gemma (Google), Kimi, GLM...
- Code-specialized models: DeepSeek-Coder, Qwen code...
- Small edge models: Phi, TinyLlama...
- Specific tasks: DeepSeek OCR, Docling, Chandra OCR...
- Local Deployment Architectures
- GPU inference servers (vLLM)


genai / Local Gen AI / Model Context Protocol (MCPs)
Model Context Protocol (MCP) standardizes how LLMs interact with external tools, data sources, and execution environments through structured context exchange.
-
Core Concepts
- Context Providers: databases, APIs, geospatial pipelines
- Tools: functions exposed to the model (queries, scripts, analytics)
- Structured Messages: schema-based communication between model and system
-
Why MCP Matters?
- Decouples models from infrastructure
- Enables reusable AI workflows
- Facilitates secure access to enterprise data


genai / Local Gen AI / Model Context Protocol (MCPs)



genai / Local Gen AI / Retrieval Augmented Generation (RAGs)
RAG combines language models with external knowledge retrieval to generate grounded, context-aware responses. It tends to avoid LLMs hallucinations, as the context has ground-truth or specific-simulated data.
-
Typical Pipeline
- Document ingestion and chunking
- Embedding generation
- Vector database indexing
- Context retrieval at query time
- Augmented LLM generation
-
Key Components
- Embedding models (bge, e5, instructor)
- Vector stores (Qdrant, Weaviate, FAISS)
- Metadata filtering and semantic search (Opensearch)


genai / Local Gen AI / Retrieval Augmented Generation (RAGs)



genai / Local Gen AI / AI Agents
An AI agent is a system where an LLM plans actions, calls tools, evaluates results, and iterates toward a goal.
-
Core Capabilities
- Planning and reasoning loops
- Tool execution (code, APIs, GIS workflows)
- Memory and state management
- Multi-step problem solving
-
Architectures
- ReAct / tool-calling agents
- Multi-agent orchestration
- Human-in-the-loop agents
-
Practical Use Cases
- Development copilots (Claude Code, Opencode)
- OpenClaw-style research agents


genai / Local Gen AI / AI Agents

genai / RAG Use Cases








genai / RAG Use Cases / taxonomizer
Python library designed to automatically translate, normalize, and generate structured taxonomies of attribute names across heterogeneous datasets.
- Core Capabilities
- Automatic attribute naming harmonization
- Semantic clustering and taxonomy generation
- Multilingual translation of dataset schemas
- Real-time inference through direct communication with locally deployed LLMs
- Architecture
- Python processing layer for schema parsing
- LLM-powered semantic reasoning
- Ontology-aligned output structures
- Value for Geospatial & Data Pipelines


genai / RAG Use Cases / validis
Retrieval-Augmented Generation (RAG) system that sits between energy consultancies and end-users (citizens/clients) to ensure that contractual, identity, and supply-point information is coherent before Datadis data access is granted.
-
Processed Documents
- National identity documents
- Energy contracts
- Energy invoices
-
Key Functionality
- OCR + LLM-based document parsing
- Entity extraction and cross-document comparison
- Automated alignment verification (identity vs. contracts vs. invoices)
- Structured validation outputs via API
-
Deployment Context
- Integrated as an API tool within an energy consultancy workflow
- Supports automated compliance and administrative verification


genai / RAG Use Cases / invoget
RAG-based system that extracts structured information from heterogenous energy invoices using OCR-enhanced LLM pipelines.
- Core Features
- OCR + LLM parsing of invoice documents
- Automatic extraction of prices, tariffs, and attributes
- Harmonization into a predefined energy ontology
- Storage into structured databases
- Technical Pipeline
- Document ingestion
- OCR preprocessing
- Semantic extraction via LLM
- Ontology mapping and validation
- Database persistence
- Impact
- Large-scale invoice analytics
- Standardized energy cost intelligence


genai / RAG Use Cases / beechat
RAG-based conversational interface built on OpenWebUI, enabling semantic search and dialogue over internal project documentation.
-
Knowledge Sources (Nextcloud, GitHub)
- Technical documentation
- Project deliverables, concept notes and research papers
- Functionalities
- Connects to GitHub, Nextcloud, DBs
- Context-aware chat over institutional knowledge
- Semantic retrieval from large document collections
- Continuous knowledge enrichment
- Benefits
- Reduces information silos
- Accelerates onboarding and research workflows
- Centralizes access to historical project knowledge
genai / Agent Use Cases








genai / Agent use cases / Copilot for devs
AI development agents integrate LLM reasoning directly into software engineering workflows, enabling automated code generation, refactoring, debugging, and repository understanding.
Public repositories:



genai / Agent use cases / Copilot for devs
Value for Research & Engineering Teams
- Accelerates prototyping and experimentation
- Reduces repetitive development tasks
- Enables AI-assisted engineering within secure environments
- Privacy: We are using our self-hosted LLM (Qwen 3 code)


genai / Agent use cases / Openclaw
OpenClaw is an autonomous agent framework designed to execute complex research and analytical workflows by combining LLM reasoning, tool execution, and iterative planning.
Public repository: https://github.com/openclaw/openclaw



genai / Agent use cases / Openclaw
-
Core Capabilities
- Interaction with local LLM infrastructure for secure processing
- Multi-step task planning and execution
- Autonomous information retrieval and synthesis
- Code generation and execution for geospatial pipelines
- Integration with APIs, databases, and local analytical tools
-
Potential Applications in Geospatial Analysis
- Automated exploration of spatial datasets and metadata
- Generation of preprocessing scripts for GIS workflows
- Literature review and synthesis of urban climate research
- Autonomous creation of spatial indicators and analytical reports
- Assistance in ontology-driven geospatial modelling
Q&A
Gerard Mor Martinez data scientist @ CIMNE
gmor@cimne.upc.edu








Geospatial analysis and Generative AI
By CIMNE BEE Group
Geospatial analysis and Generative AI
Research lines and products relative to geospatial analysis and generative AI
- 7