GIS for energy simulation and analysis 1 : case studies.

Introduction

Case Studies :

Gridded wind generation, data, modelling and visualisation

PhD: Gridded modelling of wind generation, using GIS

  • A grid was chosen for its simplicity, ability to harmonise datasets and allow array based maths.
  • The power of using GIS and Python in this case was the ability to create a bespoke framework and adapt data and simulation methods to it.
  • Numpy arrays used to create model grid
    • conceptual as did not need to be spatially accurate
    • Only a common key necessary
    • Looping, slicing and mathematical operations become trivial
  • Assigning data to this conceptual framework, using a geographical grid was done in ArcGIS
    • Spatial referencing necessary
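
A minimal sketch of the conceptual grid idea, assuming a hypothetical 3 x 4 grid and made-up capacity values. The key scheme (row-major integers) and all numbers are illustrative, not those used in the PhD model; the point is only that a common key is enough to move data between the GIS layer and the array.

```python
import numpy as np

# Hypothetical 3 x 4 conceptual grid: each cell gets an integer key that
# would also be stored on the matching polygon in the GIS layer.
n_rows, n_cols = 3, 4
cell_keys = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)

# Illustrative data exported from the GIS as (key, value) pairs.
capacity_by_key = {0: 10.0, 5: 25.0, 11: 40.0}  # MW, made-up values

# Assign the GIS data to the grid through the common key alone.
capacity = np.zeros((n_rows, n_cols))
for key, mw in capacity_by_key.items():
    row, col = divmod(key, n_cols)
    capacity[row, col] = mw

# Array maths and slicing then work on whole grids at once.
capacity_factor = np.full((n_rows, n_cols), 0.3)  # assumed uniform value
generation = capacity * capacity_factor           # MW per cell
top_row_total = generation[0, :].sum()
```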

SpWind  -  PhD Spatiotemporal Wind model III

Once the grid has been chosen, there is a four phase process for creating a generation timeseries

  1. Analyse available land, Exclude land from development
  2. Identify high value areas
  3. Allocate annual capacity to available zones
  4. Simulate

PhD: Gridded modelling of wind generation, Phase 1: GIS analysis of available land

  • Layers of spatial restrictions on development merged
  • Land use of GB wind farms examined - only common types used.
  • Plots show uses of onshore turbines and the evaluation of the exclusion analysis - very few farms built in exclusion zones
  • ArcGIS and matplotlib

Phase 2 & 3 :  Allocate capacity to available high quality areas

Using National Grid Energy Scenarios (circa 2012)

Allocate these to land identified in previous slide, where wind  quality is good.

Phase 4 : Wind Simulation - Past and Present Research

Simulating generation from wind turbines has evolved from station data to reanalysis data 

  • Generation from wind turbines has been simulated from weather data for many decades
  • The most recent crop of simulation studies can find roots in work from Graham Sinden, who used onshore MIDAS station data
  • Teams at Reading, Imperial and Edinburgh Universities (as well as UCL of course) have adapted these methods for reanalysis data,
    • making significant improvements in accuracy, scope and resolution
    • mostly focussed on wind turbine performance and demand supply matching
      • Only UCL modelling demand
    • recently this field has expanded to include research in Germany
    • and incorporated into energy systems optimisations
    • Since I last worked on this there has, no doubt, been further progress
  • Hardware, software and data have all improved and continue to do so - the relationship between weather and power remains more or less the same ...

Phase 4 : Wind Simulation Fundamentals

  1. establish the wind speed at the location of simulation
    • Assume that nearby measurement represents the site
      • Mast data or grid point reanalysis
    • Or try to more closely represent conditions by altering wind speeds somehow
      • Statistical or dynamic downscaling
  2. Estimate the wind speed at the height of turbine
    • Extrapolate upwards using law of choice
    • Or down if using pressure level wind speeds
  3. Convert wind speed to power
    • Using measured relationship
    • Or physical relationship
    • Choose whether to include swept area and air density

Simulating or estimating generation relies on a fundamental 3 step process; details will come in the following model examples

SpWind  -  PhD Spatiotemporal Wind model

GB only, CFSR driven, gridded model of wind generation for scenario disaggregation 

  • Use only grid point centroid values, assuming that these represent the conditions within the grid square.
    • To establish that this was the case, points were evaluated against 10 m MIDAS data
    • Strong correlation and low error found (plots)
    • High sites less well represented, these aren't used for development however
    • Might not be suitable for areas with complex terrain
  • Only 1 near surface height therefore extrapolate using the power law with a single assumed exponent for onshore and one for offshore
    • Simplification of surface roughness and atmospheric conditions (room for improvement) 
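
The power law height extrapolation can be sketched as below. The exponent values, reference height and hub height are assumptions chosen for illustration (typical textbook magnitudes), not the values used in SpWind.

```python
def power_law_wind_speed(u_ref, z_ref, z_hub, alpha):
    """Extrapolate wind speed from height z_ref to z_hub (power law)."""
    return u_ref * (z_hub / z_ref) ** alpha

# Assumed exponents, for illustration only (not the values from the thesis).
ALPHA_ONSHORE = 0.14
ALPHA_OFFSHORE = 0.11

u10 = 6.0  # m/s at 10 m, hypothetical grid point value
u80_onshore = power_law_wind_speed(u10, 10.0, 80.0, ALPHA_ONSHORE)
u80_offshore = power_law_wind_speed(u10, 10.0, 80.0, ALPHA_OFFSHORE)
```

A single exponent per surface type keeps the method simple; a larger exponent gives a stronger increase of wind speed with height, which is why the onshore value is set higher than the offshore one here.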

ESTIMO_wind - RESTLESS wind simulation

MERRA driven, wind farm specific, Global Capacity factor timeseries derivation

  • Closest four points bilinearly interpolated to farm location
  • Curve fitted using wind speeds at 2, 10 and 50 m using a linearised version of the log law equation
    • friction velocity and roughness length estimated using least squares optimisation (scipy).
  • This stage of the model can be evaluated by comparing timeseries to met masts
  • Some sites perform better than others - accuracy is higher where all nearby points have similar surface conditions
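
A rough sketch of the two steps above: bilinear interpolation from four surrounding grid points, then a least-squares fit of the linearised log law to speeds at 2, 10 and 50 m. This is not the ESTIMO_wind code. It uses numpy.polyfit in place of the scipy routine mentioned above, ignores displacement height, and the function names and test values are hypothetical.

```python
import numpy as np

KAPPA = 0.4  # von Karman constant

def bilinear(x, y, x0, x1, y0, y1, v00, v10, v01, v11):
    """Bilinear interpolation at (x, y) from four surrounding grid values.

    v00 is the value at (x0, y0), v10 at (x1, y0), v01 at (x0, y1),
    v11 at (x1, y1).
    """
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    bottom = v00 * (1 - tx) + v10 * tx
    top = v01 * (1 - tx) + v11 * tx
    return bottom * (1 - ty) + top * ty

def fit_log_law(heights, speeds):
    """Least-squares fit of u(z) = (u_star / KAPPA) * ln(z / z0).

    Linearised as u = a * ln(z) + b, with a = u_star / KAPPA and
    b = -a * ln(z0).
    """
    a, b = np.polyfit(np.log(heights), speeds, 1)
    u_star = a * KAPPA
    z0 = np.exp(-b / a)
    return u_star, z0

def log_law_speed(z, u_star, z0):
    """Wind speed at height z from the fitted log law."""
    return (u_star / KAPPA) * np.log(z / z0)

# Synthetic check: build speeds from known parameters, then recover them.
heights = np.array([2.0, 10.0, 50.0])
true_u_star, true_z0 = 0.35, 0.05
speeds = log_law_speed(heights, true_u_star, true_z0)
u_star, z0 = fit_log_law(heights, speeds)
u_hub = log_law_speed(80.0, u_star, z0)  # extrapolate to an assumed 80 m hub
```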

SpWind  -  PhD Spatiotemporal Wind model II

  • Convert to power using one offshore and one onshore archetypal curve and uniform hub heights
    • A simplified method ...
  • Not a very high degree of accuracy
  • Method does, however, facilitate scenario analysis, particularly when paired with demand modelling, described later
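
Converting hub-height wind speed to power with a single archetypal curve might look like the sketch below. The curve points, cut-out speed and 500 MW cell capacity are invented for illustration and are not the curves used in SpWind.

```python
import numpy as np

# Hypothetical archetypal power curve: wind speed (m/s) against output as a
# fraction of rated capacity. Values are illustrative, not the thesis curves.
CURVE_SPEED = np.array([0.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 25.0])
CURVE_OUTPUT = np.array([0.0, 0.0, 0.1, 0.3, 0.6, 0.9, 1.0, 1.0])
CUT_OUT = 25.0  # m/s, assumed shut-down speed

def capacity_factor(hub_speed):
    """Look up capacity factor from hub-height wind speed."""
    hub_speed = np.asarray(hub_speed, dtype=float)
    cf = np.interp(hub_speed, CURVE_SPEED, CURVE_OUTPUT)
    # np.interp holds the last value beyond the curve, so zero it explicitly.
    return np.where(hub_speed > CUT_OUT, 0.0, cf)

hourly_speed = np.array([2.0, 6.0, 12.0, 30.0])
hourly_cf = capacity_factor(hourly_speed)
generation_mw = hourly_cf * 500.0  # 500 MW assumed capacity in the cell
```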

ESTIMO_wind - RESTLESS wind simulation

  • All wind farms simulated separately using data from The Wind Power database (thewindpower.net)
  • Site specific wind speed heights and farm curves
  • Increased detail necessitates the use of Legion - great when it works ...

ESTIMO_wind - RESTLESS wind simulation

  • The use of MERRA, including multiple wind speed heights, combined with site specific simulation and metadata significantly improves the accuracy of historic simulation
  • Evaluating simulated wind generation is relatively straightforward due to good information on capacity and generation
  • Countries with large diverse capacity are easier to simulate, as demonstrated by the plots of simulated vs measured German and UK wind

ESTIMO_wind - RESTLESS wind simulation

  • Location specific simulations, or countries with less spatial diversity, are less easy to simulate and verify, as demonstrated by the plots of a single UK wind farm and Finland.

ESTIMO_wind - RESTLESS wind simulation

  • Scenario modelling using European data can be more sophisticated than SpWIND due to more data
    • e.g. data on different classifications of wind farm as shown in the plots
  • Scenarios first use operational, then under construction, approved and planned
    • Therefore representative of evolving capacity.
  • Overall the RESTLESS method significantly improves accuracy and capability of Energy Space Time wind simulation
    • The maintained MERRA database also enables significant other work

Method improvements

Factors not incorporated in model

  • Wake Effects
  • Downtime
  • Lag in turbine operation
  • Energy Density over blades
  • Air density
  • Decline in performance with age
  • Wind speed changes < 1 hour in frequency
  • Location specific turbine curves
  • Operating efficiency
  • Atmospheric stability, surface roughness and orography in height correction
  • Despite the considerable work and excellent data there is still a lot of room for improvement in the simulation, mainly down to the factors described to the right
  • Some of this can be theoretically incorporated through statistical alterations of the curves, as seen on the right
  • Or better calibration, though this is not trivial
  • Improved wind speed data can also be introduced in future studies
  • Particularly in regional studies or those addressing climate change.

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

Matplotlib 3d wireframes, animated using a video editor

SpWind  -  PhD outputs

Hourly variability in wind generation, electricity demand and residual demand.  Matplotlib and ArcGIS

Increased variability in both scenarios, with higher capacity factors throughout the later years under Gone Green (on the left), especially in the colder months.

Predictable variability under both scenarios for all years.  Little evidence of the impact of heat pumps on the temporal patterns of electricity demand.

The Gone Green scenario experiences greater variability as a result of more wind capacity, particularly offshore.

Plot panels: Wind Generation, Electricity Demand, Residual Demand

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

Hourly variability in residual demand - matplotlib images (adapted)

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

See  Sinden (2007) for original method

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

  • Correlation coefficient between timeseries of generation from pairs of grid squares under different scenarios
  • Demonstrating that in terms of increasing spatial diversity, much can be achieved using onshore only farms in GB
  • Due to the long thin nature of the land mass and the weather patterns.
  • Matplotlib 2d histograms
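
A sketch of the pairwise correlation calculation, using synthetic data in place of model output. The random series, the number of grid squares and the bin count are all assumptions; the binning is a 1D stand-in for the 2D histograms in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hourly generation for 5 grid squares (columns) over 1000 hours.
# A shared weather signal plus local noise stands in for real model output.
shared = rng.normal(size=(1000, 1))
local = rng.normal(size=(1000, 5))
generation = shared + local

# Correlation matrix between every pair of grid squares.
corr = np.corrcoef(generation, rowvar=False)

# Keep each pair once (upper triangle, excluding the diagonal).
i, j = np.triu_indices_from(corr, k=1)
pair_corr = corr[i, j]

# Bin the coefficients across the possible range of -1 to 1.
counts, edges = np.histogram(pair_corr, bins=10, range=(-1.0, 1.0))
```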

Blog: Animated maps of renewable energy modelling

PhD: Gridded modelling of wind generation - animated output

Case Studies :

Gridded solar generation, data, modelling and visualisation

Solar generation

MERRA driven, gridded, global capacity factor timeseries derivation

  • Like wind, methods for simulating solar generation are well established
  • Even better, some of them are already coded in Python.
  • For that reason Stefan Pfenninger's GSEE model was used for PV generation estimation in RESTLESS, as described in the plot
  • It was adapted slightly to run for all MERRA grid points for all hours
  • Assuming a 1 MW fixed panel with optimum direction and tilt
  • 207,937 points * 8760 hours * 37 years
    • > 67 billion generation values
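
The size of the job follows directly from the arithmetic. The storage estimate below assumes 4-byte floats, which is an assumption for illustration rather than a statement about how the outputs were stored.

```python
points = 207_937       # MERRA grid points, from the slide
hours_per_year = 8760  # ignores leap days
years = 37

values = points * hours_per_year * years
print(f"{values:,}")  # 67,396,540,440

# Assuming each value is held as a 4-byte float (float32).
gigabytes = values * 4 / 1e9  # roughly 270 GB
```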

*Legion required again

  • PV output simulated using top of atmosphere and ground level irradiance and ground temperature data and a physical model - adapted from Stefan's method
  • All hours from 1980 simulated for entire grid - Non tracking panels 1 MW per grid centroid assumed.
  • Aggregated to country by assuming either geographically distributed or population weighted capacity (method currently being tested). Can be aggregated using any desired geography.
  • Output = long time series of capacity factors for all global countries - easy to use in scenarios.  Renewables.ninja provides European timeseries (only 10 years)

Module - Global PV simulation

Simulated monthly mean global capacity factors using 1980 meteorology

Solar generation - global hourly capacity factors using 1980 meteorology

Solar generation

  • Evaluation and calibration of a solar simulation model is harder to carry out than for wind, due to limited information on installed capacity and uncertain measured generation timeseries
  • The meteorology is much simpler, however. If some confidence can be gained in the simulation, scenario modelling simply means multiplying capacity factors by assumed future capacities
  • Climate change is unlikely to affect irradiance as much as wind speed over larger spatial scopes.
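
Scenario modelling from capacity factors is a single multiplication per timestep, as in this sketch. The capacity factors, years and capacities are hypothetical.

```python
# Hypothetical hourly capacity factors for one country (fraction of rated
# output) and assumed installed PV capacity by scenario year, in MW.
capacity_factors = [0.0, 0.12, 0.45, 0.30, 0.0]
scenario_capacity_mw = {2020: 5_000, 2030: 12_000}

# Generation in MW for each scenario year and hour.
generation_mw = {
    year: [cf * capacity for cf in capacity_factors]
    for year, capacity in scenario_capacity_mw.items()
}
```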

Case Studies :

Capacity visualisation

Blog: Animated maps of renewable energy capacity https://esenergyvis.wordpress.com/

Animated map of historical wind capacity

ArcGIS mapping, Python plotting and a video editor

Case Studies :

Air Pollution

DEFRA Gridded background pollution projections

  • DEFRA gridded 1km x 1km 2011 - 2030 projections of NOx and PM
  • Disaggregated by source including roads, domestic, commercial, rail, vehicles and point sources
  • Internal and external
  • Current year tethered to reality, projections = scenarios
  • Can be used in environmental modelling
  • Rasters visualised using ArcGIS - harmonised legend values, export to image, gif through photoshop
  • Photoshop good for small gifs
  • Larger number of images = after effects
  • Significant levels of background pollution come from outside of the country
  • Move to Northern Scotland for clean air!

The 5 worst polluted areas in the country

  • Using the same data and visualisation only, it is possible to examine some of the aspects of the projections
  • E.g. road pollution clearly reduces significantly through electrification in these scenarios
  • But pollution from other forms of transport, including air and sea, becomes proportionally more significant
  • Can you tell which areas are the worst?

GB Boroughs ranked by background NOx pollution : 2011 - 2030

  • Using a GIS, the gridded air pollution data was joined with census geographies representing boroughs - a fundamental GIS process.
  • That data was then exported to a spreadsheet (this could be done in Python, including internally in ArcGIS)
  • The mean NOx value by borough was calculated each year and the boroughs ranked each year
  • The result is a simplification of some spatiotemporal data into a much easier to understand form
  • The plot shows interesting artefacts of the scenarios
    • London boroughs remain the most polluted
    • There are significant improvements for boroughs with high levels of transport pollution (e.g. Lewisham)
    • Boroughs with airports and sea ports nearby clearly fall
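
The mean-and-rank step could be done in pandas rather than a spreadsheet, roughly as follows. The borough names, years and NOx values are made up, and the table layout (one row per grid cell, borough and year after the spatial join) is an assumption.

```python
import pandas as pd

# Made-up grid cell records after the spatial join: one row per grid cell,
# borough and year, with the background NOx value for that cell.
cells = pd.DataFrame({
    "borough": ["A", "A", "B", "B", "C", "C"] * 2,
    "year": [2011] * 6 + [2030] * 6,
    "nox": [40.0, 50.0, 30.0, 32.0, 60.0, 20.0,
            20.0, 22.0, 25.0, 27.0, 18.0, 16.0],
})

# Mean NOx per borough per year, with one column per year.
mean_nox = cells.groupby(["year", "borough"])["nox"].mean().unstack("year")

# Rank within each year; rank 1 is the most polluted borough.
ranks = mean_nox.rank(ascending=False).astype(int)
```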

Gridded Air Pollution

The London Atmospheric Emissions Inventory provides data on background air pollution on a 20 m grid

Roadside air pollution

  • DEFRA also provide estimates of roadside air pollution, estimated from traffic counts and emissions factors as well as measured pollution and dispersion modelling.
  • This dataset was combined with data on schools to identify establishments within 150 m of roads exceeding the EU limit value of 40 µg/m3 of NO2.
    • Used in the poisoned playgrounds campaign
  • A GIS was used to calculate euclidean distances to all roads in the dataset
  • Results presented in a web GIS
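
The distance test can be sketched without a GIS as a point-to-segment Euclidean distance in projected coordinates. The road links, school locations and NO2 values below are invented; only the 150 m distance and the 40 µg/m3 limit come from the text above.

```python
import math

def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point P to the line segment A-B (metres)."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

NO2_LIMIT = 40.0  # ug/m3, limit value quoted above
BUFFER_M = 150.0  # search distance quoted above

# Hypothetical road links: (x1, y1, x2, y2, annual mean NO2), in metres.
roads = [
    (0.0, 0.0, 1000.0, 0.0, 55.0),
    (0.0, 500.0, 1000.0, 500.0, 30.0),
]
# Hypothetical schools: name -> (x, y).
schools = {"School 1": (200.0, 100.0), "School 2": (500.0, 450.0),
           "School 3": (1200.0, 0.0)}

# Schools within the buffer of at least one road exceeding the limit.
flagged = sorted(
    name for name, (x, y) in schools.items()
    if any(no2 > NO2_LIMIT
           and point_segment_distance(x, y, x1, y1, x2, y2) <= BUFFER_M
           for x1, y1, x2, y2, no2 in roads)
)
```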

Case Studies :

Simulating energy demand

SpWind  -  PhD SpDEAM - Spatiotemporal demand model

PhD: Demand modelling necessitates more data harmonisation

Very few datasets are in the correct framework, therefore considerable work done harmonising to grid. ArcGIS outputs.

SpWind  -  PhD SpDEAM

Case Studies :

Simulating energy demand : Buildings

SpWind  -  PhD SpDEAM

Extending the Estimation of energy demand to non domestic

  • Building footprints from Ordnance Survey Mastermap
  • Building heights from OS 
    • More data from e.g. Lidar
  • Derive simple volume 
  • Non domestic building use from OS Addressbase premium
  • Assign to built form
  • Handle multiple uses
  • Energy intensity by use from academic research
  • Energy demand  = floor area * intensity
  • Simple method, but uses data that covers whole of GB and requires very little in model computation
  • Significant improvements are available through better data, e.g. VOA, Metered data and Lidar
    • Massive increase in complexity
    • see simstock and 3d stock
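
A minimal sketch of the simple method above, assuming floor area is estimated from footprint and height with a fixed storey height. The storey height, intensity values and use classes are hypothetical placeholders, not figures from the research cited.

```python
STOREY_HEIGHT_M = 3.5  # assumed floor-to-floor height, not from the source

# Hypothetical energy intensities by use, kWh per m2 of floor area per year.
INTENSITY_KWH_M2 = {"office": 200.0, "retail": 250.0}

def floor_area(footprint_m2, height_m):
    """Estimate total floor area from footprint and building height."""
    storeys = max(1, round(height_m / STOREY_HEIGHT_M))
    return footprint_m2 * storeys

def annual_demand_kwh(footprint_m2, height_m, use_shares):
    """Energy demand = floor area * intensity, split across multiple uses.

    use_shares maps a use class to its share of floor area (summing to 1).
    """
    area = floor_area(footprint_m2, height_m)
    return sum(area * share * INTENSITY_KWH_M2[use]
               for use, share in use_shares.items())

volume_m3 = 400.0 * 14.0  # simple volume: footprint * height
demand = annual_demand_kwh(400.0, 14.0, {"office": 0.75, "retail": 0.25})
```

Splitting floor area by share is one way to handle multiple uses in a single building; other allocation rules are possible.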

Mastermap vs. Openstreetmap

LIDAR point cloud to building extrusion - UCL

Lidar data continued

There are LIDAR derived building height datasets available from EMU analytics, or the raw data from the environment agency

See website

Case Studies :

Simulating energy demand : People

SpWind  -  PhD SpDEAM

Gridded population data

  • Datasets vary in scope and resolution and also in spatial referencing
  • All are based on census data, redistributed to a grid
  • Therefore represent evening domestic population
  • Some datasets use ancillary data, e.g. nighttime lights or surveys of workplaces to improve representation
  • Work to be done on day time population, therefore not the best option for all questions
  • 1 km x 1 km longitudinal data for Europe available via geostat
  • 1 km x 1 km data for GB via Centre for Ecology and Hydrology - same grid as air pollution, useful for modelling
  • Global data available from multiple sources, the most popular is the Global Rural Urban Mapping Project (GRUMP).
    • 30 arc second grid, close to 1 km, but causes a grid mismatch with regional coordinate systems (see plot)
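
The grid mismatch can be seen by working out the size of a 30 arc second cell at different latitudes. This sketch assumes a spherical Earth with a mean radius of 6371 km, so the numbers are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

def cell_size_km(latitude_deg, arc_seconds=30.0):
    """Approximate (north-south, east-west) size of a lat/lon grid cell."""
    angle_rad = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_KM * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

ns_equator, ew_equator = cell_size_km(0.0)
ns_gb, ew_gb = cell_size_km(55.0)  # roughly the latitude of northern England
```

The north-south size stays near 0.93 km, but the east-west size shrinks with the cosine of latitude, to a little over 0.5 km at 55°N, so cells do not line up with a 1 km x 1 km projected grid.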

Population data

  • Gridded population densities can be used in conjunction with projections of population for simulation models etc.
  • Need to be careful of data artefacts in many cases due to calibration etc
  • e.g. plot

Population data

Much of the analysis done on datasets which include geotagging is essentially population mapping, albeit at a potentially high temporal resolution, for example Google location services

Here Python mapping modules were used to show all of the location data my phone collected (before I turned it off, because it is creepy)

Case Studies :

Simulating energy demand : Transport

Roads

  • Department for Transport produce vehicle km data for major roads from measurements of Annual Average Daily flow
    • by vehicle type
    • by road
    • these can cut city boundaries, so a GIS is needed to estimate what proportion is in the city (see plot)
  • Minor road data is not attributed to roads, but there is a subset of measurements which can be used
  • See also, OS opendata for roads shapefiles and sub national road consumption statistics
  • Spatial transport energy demand can then be estimated

Minor roads - Birmingham

Major roads - Birmingham

Rail

  • Passenger km data are available from the Office for Rail Regulation (ORR), disaggregated by operator
  • The General Transit Feed Specification (GTFS) provides data on route, stations, stops and operators
  • The two data were joined
  • A simplified rail network describing trips between stations by operator was created (see plot)
  • The network was clipped around cities and a proportion of the passenger km assigned to travel within the city boundaries
  • The GIS provides the ability to make more reasonable assumptions on data allocation to different geographies.
  • Freight data are available by operator from ORR, but GTFS aggregate these trips to a single code so assignment must be done on a land based proportion

Data :

Meteorological data - overview.

Overview

  • Choosing data
  • Met mast data
    • MIDAS
    • Wunderground
    • Offshore Buoys
    • Data above 10 m
  • Reanalysis datasets
    • NASA MERRA
    • NCEP CFSR
    • Alternatives and developments
  • Forecasted data
    • Short term
    • Long term
  • Maintained Database
    • Access
    • Contents
    • Usage case studies
      • Population weighting
      • Points
      • Grids

Provide an overview of the different types of data, their strengths and weaknesses, access and usage. For more detail see knowledge_base.doc, which includes links that are also provided at the end of this presentation

Choosing Historical Meteorological Data

Met Mast Data

For academic projects in GB the Met Office Integrated Data Archive System (MIDAS) Land and Marine Surface Stations Data (1853-current) database is the best choice

  • Free for research
  • 100's of stations spread out across GB
  • Accessible through the Centre for Environmental Data Analysis (CEDA), registration necessary
  • Multiple variables
  • Hourly data available inc. wind speed and rainfall
  • < 10 m elevation
  • Biased to populated areas
  • Not homogeneous w. some data quality issues

Offshore Buoys

If offshore sites are of interest there are data from weather stations at sea

  • The British Oceanographic Data Centre (BODC) National Oceanographic Database (NODB)
  • Approx 20 offshore buoys and permanent "lightships"
  • Data at 6 m above sea
  • Wind speed, temp, wave height
  • Timeseries available as well as current conditions
  • API and bulk download

Offshore Masts

Temporary masts are often installed at planned wind farm locations

  • Wind speed timeseries at hub equivalent heights e.g. 100m
  • Data available from the Marine Data Exchange - not all reported data can be extracted - map shows accessible timeseries
  • These sites will occur less outside of GB, excluding Denmark, due to less offshore wind capacity
  • Several years often available
  • Wind resource evaluation
  • Model calibration

Reanalysis Datasets

Gridded global or regional data, huge number of weather variables for up to 100 years

  • Derived from Numerical Weather Prediction Models (NWP)
  • Tethered to observed data
  • Convention to provide 3 hourly "forecasts"
    • High resolution data = hourly
  • From 10 - 100 years of hindcasted data
  • Homogeneous and clean
  • Resolution is compromised for scope in both space and time, four common combinations
    • Long global coarse in space and time
    • Long global coarse in space
    • Long regional fine in space coarse in time
    • Short regional fine in space and time
  • Quality varies regionally .....

NASA MERRA

37 years of global data at fine(ish) spatial and fine temporal resolution

  • Preferred dataset, db maintained to near present day
  • 1979 - Present (1-2 month lag)
    • MERRA 1 ends 2010, MERRA 2 = latest product
  • 0.5° × 0.66° grid with 72 layers.
  • Very large number of variables. Many Hourly.
    • Wind, Temperature, Water, Humidity - see database slide for those maintained on NAS
  • Wind speed at 2, 10, 50 m above surface as well as pressure levels - key for simulation ...
  • Widely used in research and commercially - may be replaced by regional reanalyses or ERA5 in near future
  • Files retrievable in NetCDF or GRIB form
    • GRIB is the meteorologist's file format of choice
    • NetCDF files contain better metadata - see later slides on access
  • Interpolated and extrapolated data used in simulation (will describe in detail later)
    • Can estimate the accuracy of the data and method by comparing to data from tall met masts
    • Plots show performance .....

NCEP CFSR

37 years of global data at fine(ish) spatial and fine temporal resolution

  • Very similar to MERRA in many ways
  • 37 years of hourly data, multiple layers
  • CFSR v1 to 2010, CFSR v2 from 2010
  • Slightly improved spatial resolution on MERRA
  • Only 1 wind speed height above surface
    • multiple pressure levels can be used - see next presentation
  • Used in PhD and HighRES
  • Gives some novelty as MERRA is more widely used.
  • GB database on NAS, not maintained

Alternative Reanalysis Datasets

Regional reanalyses provide enhanced spatial resolution, other satellite derived data are available;

  • The choice of regional reanalysis depends on Geography
  • COSMO REA - see figure
    • Fewer variables, wind at 2 and 10 m
    • pressure level data available
    • 2 or 6 km resolution!
    • 2 km only for central Europe
  • ECMWF ERA 5 
    • currently under development
    • global, hourly, 30 km
    • improvement on MERRA according to some
    • 1950 - present (long!)
    • Python module for access
  • CMSAF AVHRR
    • satellite derived irradiance
    • improved accuracy on MERRA
    • Less stable 
  • Using Multiple products may improve individual accuracy, but detaches homogeneous meteorology
  • Others exist - see region specific originators + downscaling

Using reanalysis data - fundamentals

Reanalysis data represents a high quality record of past meteorology; this can be used to represent the future, with caution.

  • Long time series from reanalysis provide information on how climate has changed as well as how weather varies over multiple temporal resolutions (hours - decades) - plot shows minimum annual temperatures by country in Europe.
    • These timeseries have therefore been used to examine how other things are affected by this change and variability, e.g. weather driven energy generation and demand
    • The process is often called hindcasting - projecting a past timeseries into the future
  • However - not all change and variability is captured
    • Weather and climate will not be the same in the future as they have been in the past.
  • Therefore when using the data it may be better to select extreme or "average" periods or introduce stochasticity through the use of multiple periods for single simulations etc. DISCUSS .....

Forecasts

Where hindcasting is not appropriate, forecasted data are available at multiple scales and resolutions. Don't just add 2 degrees to reanalysis data!

  • Forecast modelling, as opposed to simulation modelling, should use a different dataset - for example:
  • Short term prediction or forecasting, e.g. simulation of wind generation on the grid to balance the grid, predict price or reduce the carbon content of activity (Baking messaging....)
    • Global Forecast System (GFS) - see gif, gives gridded 3 hourly data via FTP
      • up to ten days, 0.5 degrees
      • multi decade historical data for training
    • ECMWF also provide gridded forecasts to members
    • Wunderground (e.g.) give site specific hourly forecasts
      • to ten days
      • free via api (with limits)
    • Ensembles (multiple datasets) can be used to fill time, accuracy degrades with time
  • Long term projections of climate change are available from multiple sources
    • UKCP09 = weather generator, Coupled Model Intercomparison Project (CMIP5) to 2100
    • Coordinated Regional Downscaling Experiment (CORDEX) 10 km, same framework as COSMO
    • Coordinated Regional Downscaling Experiment (CORDEX) 10 km, same framework as COSMO

Examples of visualising gridded met data

https://www.windy.com/?51.448,-0.138,5

Population weighting weather data

Population weighting a gridded weather dataset is a way to get a single value for each timestep that represents the weather experienced by a subset of people, for example in a country.  

  • Recreate MERRA grid in GIS, using centroid coordinates and Voronoi polygons
  • GRUMP population GPW v4 data
    • Extract population by country
    • Sum all pop grids within a MERRA grid cell; pop grids are smaller, so this is an approximate method, with some crossover.
    • New pop grid by country with 0's outside borders
    • Multiply timeslice weather by new grid, sum and divide by total country population
  • Temperature at 2 m (K), wind speed at 2 m (m/s), net downward solar radiation at the surface (W/m2), specific humidity at 2 m (kg/kg), wet bulb temperature (K) and dew point temperature (K) for all countries, 1980 - present on NAS
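
The weighting step itself is a weighted mean over the grid for each timestep. The sketch below uses a tiny hypothetical 2 x 3 grid with made-up population and temperature values, in place of the real MERRA and population grids.

```python
import numpy as np

# Hypothetical 2 x 3 MERRA-like grid. Population summed into each weather
# cell for one country, with zeros outside its borders.
population = np.array([[0.0, 2.0e6, 1.0e6],
                       [0.0, 5.0e5, 0.0]])

# Temperature (K) for two timesteps on the same grid: shape (time, row, col).
temperature = np.array([
    [[280.0, 285.0, 283.0], [279.0, 281.0, 278.0]],
    [[270.0, 275.0, 274.0], [269.0, 272.0, 268.0]],
])

# Multiply each timeslice by the population grid, sum over space and divide
# by the total country population.
weighted = (temperature * population).sum(axis=(1, 2)) / population.sum()
```

Cells with zero population drop out of the sum, so weather outside the country's borders has no influence on the result.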

Accessing and interpreting the maintained database

  • On the UCL network (inc. VPN), a Network Attached Storage (NAS) can be accessed (details available from me)
  • NASA MERRA
    • 1979 - near present, global grid, 23 hourly variables (see next slide)
    • Mix of MERRA 1 and MERRA 2
    • MERRA grid shapefiles
  • NCEP CFSR GB subset to approx 2010 
  • Population weighted weather timeseries for whole globe as detailed previously
  • Access via browser:

MERRA data on the maintained database

Links and further material

Methods for evaluating accuracy and variability

  • Mean Absolute Deviation (MAD) - using the in built pandas method .mad(), which does not totally agree with the manual method. A higher number describes larger deviation around the mean (or median) value.
  • Unbiased Variance (Var) - using the in built pandas method .cov(). See Covariance description below.
  • Standard Deviation (StD) - using the in built pandas method .std(). A measure of dispersion around the mean. Low standard deviation indicates values are close to the mean.
  • Mean - using the in built pandas method .mean(). Indicates the average value of the timeseries.
  • Pearson’s Correlation Coefficient (P) - using the in built pandas method .corr(). 1 equals perfect positive correlation. Correlation is a normalised version of covariance (therefore between -1 and 1), which allows comparison of different variables.
  • Kendall’s Tau Correlation Coefficient (KTau) - using the in built pandas method .corr() with method="kendall". A rank based measure, also between -1 and 1.
  • Coefficient of determination (R2) - using scipy.stats.linregress(x, y) and squaring the rvalue it returns. 1 equals a perfect linear fit.
  • Root Mean Squared Error (RMSE m/s) - using a formula. The square root of the mean squared difference between timeseries, so large errors count for more than small ones.
  • Covariance (COV) - using the in built pandas method .cov(). The same measure as variance, but between timeseries (pairwise covariance). Covariance is a numerical measure of how two variables vary together. A covariance of 0 indicates no linear relationship between the variables (which is not the same as independence), while a high and positive covariance means that one variable tends to be big when the other is big.
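
Several of these measures can be written out directly, which also helps when checking built-in methods against a manual calculation. This sketch uses numpy with made-up wind speed values; the mean absolute deviation is coded by hand because pandas removed .mad() in version 2.0. The function names are illustrative.

```python
import numpy as np

simulated = np.array([4.0, 6.0, 5.0, 8.0, 7.0])  # made-up wind speeds, m/s
measured = np.array([5.0, 6.5, 4.5, 9.0, 7.5])

def mean_absolute_deviation(x):
    """Mean absolute deviation around the mean."""
    return np.mean(np.abs(x - np.mean(x)))

def rmse(a, b):
    """Root mean squared error between two series."""
    return np.sqrt(np.mean((a - b) ** 2))

mad = mean_absolute_deviation(simulated)
variance = np.var(simulated, ddof=1)            # unbiased, as pandas .var()
std = np.std(simulated, ddof=1)                 # as pandas .std()
covariance = np.cov(simulated, measured)[0, 1]  # unbiased by default
pearson = np.corrcoef(simulated, measured)[0, 1]
r_squared = pearson ** 2                        # for a simple linear fit
error = rmse(simulated, measured)
```

Note the ddof=1 arguments: numpy defaults to the biased (population) variance and standard deviation, whereas pandas defaults to the unbiased versions, which is one way a manual calculation can disagree with a built-in one.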