Data Sources#
PyPSA-Eur is compiled from a variety of data sources, each subject to its own license. The data inventory table on this page provides an overview of the data sources used in PyPSA-Eur.
Managing Data Versions#
Many of the data sources used in PyPSA-Eur are updated regularly. To ensure reproducibility, PyPSA-Eur uses a versioning system for data sources which allows users to select specific versions of the data sources to use in their models.
Note
Users select and control which version of each data source is used through the configuration file. See _data_cf for details.
Creating a new version of the data sources#
To create a new version of the data sources, you can use the helper script in scripts/create_zenodo_deposition_cli.py.
Here are the steps that this script helps you to navigate:
1. Locate the data for the new version and place it under data/<dataset_name>/archive/<version>/. For example, to create a new version 2029-01-01 of the worldbank_population dataset, place the data into a folder named data/worldbank_population/archive/2029-01-01/. We follow the version names of the original dataset, so make sure to use the same version name as the original dataset.
2. If you want to use the script, run it now. It will guide you through the process outlined below.
3. Create a new Zenodo deposition for the new version of the data. You can do this by visiting the Zenodo website and creating a new deposition.
   * Fill in all relevant metadata, such as the title, description, and keywords, based on the previous version of the dataset, and make changes as needed.
   * Make sure to add the previous version's Zenodo deposit as a related identifier with the relation “is new version of”, “doi” as the identifier scheme, and the DOI of the previous version as the identifier.
   * Upload the data files from the data/<dataset_name>/archive/<version>/ folder to the Zenodo deposition.
4. Once the deposition is complete, publish it on Zenodo.
5. Update data/versions.csv to include the new version of the dataset.
   * Create a new row in the CSV file based on the previous version, updating the version number and Zenodo URL.
   * Make sure to tag this new version with the tags ['latest', 'supported'].
   * Remove the latest tag from the previous version.
   * If the previous version is no longer supported or outdated, remove the supported tag and add the deprecated tag.
6. Commit the changes to the repository and create a pull request.
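The tag bookkeeping for data/versions.csv can be sketched in Python. This is a minimal illustration and not the helper script's implementation: it assumes the file has the columns dataset, source, version, tags, url and note, and that tags are stored as a Python-style list literal such as ['latest', 'supported']. The function name add_new_version, the older version 2028-01-01 and the placeholder URLs are hypothetical.

```python
import ast
import copy


def add_new_version(rows, dataset, version, url, source="archive"):
    """Return a copy of `rows` with a new 'latest' version appended.

    `rows` is a list of dicts, one per CSV line (assumed column layout).
    """
    rows = copy.deepcopy(rows)  # leave the caller's data untouched
    previous = None
    for row in rows:
        if row["dataset"] != dataset or row["source"] != source:
            continue
        tags = ast.literal_eval(row["tags"])
        if "latest" in tags:
            # The previous version keeps 'supported' but loses 'latest'.
            tags.remove("latest")
            row["tags"] = str(tags)
            previous = row
    if previous is None:
        raise ValueError(f"no 'latest' row found for {dataset}/{source}")
    # Base the new row on the previous one, then update version, URL and tags.
    new_row = dict(previous, version=version, url=url,
                   tags=str(["latest", "supported"]))
    rows.append(new_row)
    return rows


# Hypothetical existing entry; the version and URLs are placeholders.
rows = [
    {"dataset": "worldbank_population", "source": "archive",
     "version": "2028-01-01", "tags": "['latest', 'supported']",
     "url": "https://zenodo.org/record/<old_id>", "note": ""},
]
updated = add_new_version(rows, "worldbank_population", "2029-01-01",
                          "https://zenodo.org/record/<new_id>")
for row in updated:
    print(row["version"], row["tags"])
```

Marking an outdated version as deprecated is left out on purpose, since that decision depends on whether the previous version is still supported.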
Adding a new data source#
The process of adding a new data source is similar to creating a new version of an existing data source, with some additional steps.
It is also possible to use the helper script in scripts/create_zenodo_deposition_cli.py to guide you through the process.
1. Create a new folder for the new data source in the data/ directory, e.g. data/my_new_data_source/.
2. Place the data files for the new data source in the data/<dataset_name>/archive/<version>/ folder. We follow the version names of the original dataset, so make sure to use the same version name as the original dataset.
3. If you want to use the script, run it now. It will guide you through the process outlined below.
4. Create a new Zenodo deposition for the new data source. You can do this by visiting the Zenodo website and creating a new deposition.
   * Add all relevant metadata, such as the title, authors, description, and keywords.
   * When adding the license, make sure that the license of the dataset is compatible with redistribution (i.e. uploading to Zenodo). Most of our data is originally CC-BY-4.0 licensed. If you have doubts about the license, reach out to the maintainers.
   * Make sure to set the version name to be the same as the version name used for the data files, e.g. 2029-01-01.
   * Upload the data files from the data/<dataset_name>/archive/<version>/ folder to the Zenodo deposition.
5. Once the deposition is complete, publish it on Zenodo.
6. Update data/versions.csv to include the new data source.
   * Create a new row in the CSV file with the following columns:
     * dataset: The name of the dataset as used in the folder name, e.g. my_new_data_source.
     * source: The source of the dataset. For Zenodo uploads the source is by definition archive.
     * version: The version name of the dataset as used in the folder name, e.g. 2029-01-01.
     * tags: A list of tags for the dataset. Make sure to include the latest and supported tags.
     * url: The link to the Zenodo deposition of the dataset, e.g. https://zenodo.org/record/<zenodo_id>. Check whether the respective retrieve_<dataset_name> rule in rules/retrieve.smk requires a direct download link or the link to the Zenodo record.
     * note: An optional note about the dataset.
7. Implement a retrieve rule for your dataset in rules/retrieve.smk. This rule should download the data from the Zenodo deposition and place it in the data/<dataset_name>/archive/<version>/ folder. Take inspiration from existing rules in the file, e.g. the rule retrieve_worldbank_urban_population.
8. Add an additional rule for the primary source of the data, i.e. the original source of the data. You may be able to use the same rule for archive and primary sources, but sometimes dedicated rules are needed. Create an entry in data/versions.csv for the primary source as well, with the URL pointing to the original source of the data. Again, take inspiration from existing rules in the file, e.g. the rule retrieve_worldbank_urban_population_primary. This rule will also help us in the future to update to new versions of the data set.
9. Add the new data source to:
   * the data section in the configuration file config/config.default.yaml
   * doc/configtables/data.csv for the documentation
   * the data_sources.rst data inventory for PyPSA-Eur
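Before opening a pull request, it can help to sanity-check the new rows. The sketch below is an illustration under stated assumptions rather than part of PyPSA-Eur: it assumes the six columns listed above, tags stored as a Python-style list literal, and the convention that each dataset and source pair has exactly one row tagged latest. The function names archive_folder and check_versions and the example.org URL are hypothetical.

```python
import ast
from collections import Counter
from pathlib import PurePosixPath

REQUIRED_COLUMNS = ("dataset", "source", "version", "tags", "url", "note")


def archive_folder(dataset, version):
    """Folder where the archived files of one dataset version are placed."""
    return PurePosixPath("data") / dataset / "archive" / version


def check_versions(rows):
    """Return a list of human-readable problems found in `rows`."""
    problems = []
    latest = Counter()  # number of 'latest' rows per (dataset, source)
    for i, row in enumerate(rows):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        tags = ast.literal_eval(row["tags"])
        key = (row["dataset"], row["source"])
        latest[key] += "latest" in tags  # True counts as 1, False as 0
        if "latest" in tags and "supported" not in tags:
            problems.append(f"row {i}: 'latest' without 'supported'")
    for key, count in latest.items():
        if count != 1:
            problems.append(f"{key}: expected one 'latest' row, found {count}")
    return problems


# Hypothetical entries for the archive and primary sources of a new dataset.
rows = [
    {"dataset": "my_new_data_source", "source": "archive",
     "version": "2029-01-01", "tags": "['latest', 'supported']",
     "url": "https://zenodo.org/record/<zenodo_id>", "note": ""},
    {"dataset": "my_new_data_source", "source": "primary",
     "version": "2029-01-01", "tags": "['latest', 'supported']",
     "url": "https://example.org/original-data", "note": ""},
]
print(archive_folder("my_new_data_source", "2029-01-01"))
print(check_versions(rows))
```

An empty list from check_versions means none of the assumed conventions were violated; it does not verify that the URLs resolve or that the retrieve rules exist.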
Data inventory#
| Short name | Long name | Description | Owner | Link to website | License |
|---|---|---|---|---|---|
| enspreso_biomass | ENSPRESO biomass potentials for Europe | This collection contains datasets from ENSPRESO2, an EU-28 wide, open dataset on renewable energy potentials, at national (NUTS0), regional and high-resolution (1 x 1 km and 5 x 5 km) levels for the 2010-2050 period. Within ENSPRESO, ENergy Systems Potential Renewable Energy SOurces, and the ENSPRESO2 updates, technical potentials are provided for wind, solar and biomass, based on coherent GIS-based land-restriction scenarios. […] For biomass, agriculture, forestry and waste sectors are considered. The temporal resolution for wind and solar is both annual. | European Commission Joint Research Centre | https://data.jrc.ec.europa.eu/dataset/74ed5a04-7d74-4807-9eab-b94774309d9f | CC-BY-4.0 |
| osm | Open Street Map electricity transmission grid | Transmission grid topology and infrastructure on substations, lines and cables from Open Street Map (OSM). The dataset is built from OSM data and is not versioned. The latest dataset can be built using the scripts in the repository. | Open Street Map contributors | | ODbL-1.0 |
| worldbank_urban_population | Urban population (% of total population) | Percentage of urban population by country, United Nations Population Division. World Urbanization Prospects: 2018 Revision. | World Bank | | CC-BY-4.0 |
| hotmaps_industrial_sites | Hotmaps industrial sites | This repository publishes over 5000 georeferenced industrial sites of energy-intensive industry sectors, together with GHG emissions, production capacity, fuel demand and excess heat potentials calculated from emission and production data. | | hotmaps/industrial_sites/industrial_sites_Industrial_Database/ | CC-BY-4.0 |
| co2stop | CO2 Storage Potentials | An assessment of the CO2 storage potential in Europe, including storage units, traps, and maps. | European Commission Joint Research Centre | | Reuse policy following 2011/833/EU |
| nitrogen_statistics | Nitrogen Statistics and Information | Statistics and information on the worldwide supply of, demand for, and flow of the mineral commodity nitrogen. | United States Geological Survey (USGS) | https://www.usgs.gov/centers/nmic/nitrogen-statistics-and-information | Public Domain |
| eu_nuts2013 | Nomenclature of Territorial Units for Statistics (NUTS) 2013 - shapefiles | Shapefiles of EU’s Nomenclature of Territorial Units for Statistics (NUTS) 2013, which is a hierarchical system for dividing up the economic territory of the European Union. | Eurostat | | Reuse policy following 2011/833/EU |
| eu_nuts2021 | Nomenclature of Territorial Units for Statistics (NUTS) 2021 - shapefiles | Shapefiles of EU’s Nomenclature of Territorial Units for Statistics (NUTS) 2021, which is a hierarchical system for dividing up the economic territory of the European Union. | Eurostat | | Reuse policy following 2011/833/EU |
| eurostat_balances | Energy Balances | European energy balances by country and fuel, as reported by Eurostat. | Eurostat | | Reuse policy following 2011/833/EU; newer versions of the same data are available as CC-BY-4.0 through the Eurostat API |
| eurostat_household_balances | Eurostat Household Energy Balances | Disaggregated final energy consumption in households - quantities (nrg_d_hhq) | Eurostat | https://ec.europa.eu/eurostat/databrowser/product/page/NRG_D_HHQ | CC-BY-4.0 |
| luisa_land_cover | The LUISA base map 2018 | The LUISA Base Map 2018 is a high-resolution land use/land cover map developed and produced by the Joint Research Centre of the European Commission. | European Commission Joint Research Centre | https://data.jrc.ec.europa.eu/dataset/51858b51-8f27-4006-bf82-53eba35a142c | CC-BY-4.0 |
| jrc_idees | JRC-IDEES-2021 | The JRC-IDEES-2021 release contains a consistent set of disaggregated energy-economy-emissions data for each Member State of the European Union, covering all sectors of the energy system for the 2000-2021 period: industry, buildings, transport, and power generation. | European Commission Joint Research Centre | https://data.jrc.ec.europa.eu/dataset/82322924-506a-4c9a-8532-2bdd30d69bf5 | CC-BY-4.0 |
| scigrid_gas | Scientific Grid Model of European Gas Transmission Networks | Gas transmission data model | DLR Institute for Networked Energy Systems | https://web.archive.org/web/20241112092853/https://www.gas.scigrid.de/ | CC-BY-4.0 |
| synthetic_electricity_demand | Interannual Electricity Demand Calculator | Generates country-level electricity consumption time series based on weather data and correlates historical electricity demand to temperature | | | CC-BY-4.0 |
| copernicus_land_cover | Copernicus Global Land Service | Land cover and land use inventory of the European continent | Copernicus | https://land.copernicus.eu/en/products/global-dynamic-land-cover | CC-BY-4.0 |
| ship_raster | Global Shipping Traffic Density | Used to build a ship density raster, which is then used to compute the availability matrix for renewables | World Bank | https://datacatalog.worldbank.org/search/dataset/0037580/Global-Shipping-Traffic-Density | CC-BY-4.0 |
| eez | Maritime Boundaries World EEZ | Used to estimate potentials for offshore wind in each country’s EEZ | Marine Regions | | CC-BY-4.0 |
| nuts3_population | Population by NUTS3 region | Average annual population to calculate regional GDP data (thousand persons) by NUTS 3 region (nama_10r_3popgdp) | Eurostat | https://ec.europa.eu/eurostat/databrowser/bulk?lang=en&searchFilter=nama_10r_3pop | CC-BY-4.0 |
| gdp_per_capita | Gridded global datasets for Gross Domestic Product over 1990–2015 | Gross Domestic Product per capita (PPP) | Kummu, M et al. | | CC-BY-4.0 |
| population_count | World - Population Counts | Spatial distribution of population | WorldPop | https://data.humdata.org/dataset/worldpop-population-counts-for-world https://hub.worldpop.org/doi/10.5258/SOTON/WP00647 | CC-BY-4.0 |
| ghg_emissions | Total GHG emissions and removals in the EU | National emissions reported to the UNFCCC and to the EU under the Governance Regulation | European Environment Agency | | CC-BY-4.0 |
| gebco | General Bathymetric Chart of the Oceans | Gridded bathymetric data for ocean and land, providing elevation data in meters, on a 15 arc-second interval grid. | GEBCO Compilation Group | | Public domain |
| attributed_ports | Global - International Ports | International ports with attributes describing name, port functions, total capacity and location | World Bank Group | https://datacatalog.worldbank.org/search/dataset/0038118/Global---International-Ports | CC-BY-4.0 |
| corine | CORINE Land Cover 2012 | Pan-European land cover for 44 thematic classes with 2012 as reference year | Copernicus | https://land.copernicus.eu/en/products/corine-land-cover/clc-2012 | Custom similar to CC-BY |
| emobility | Motor and passenger vehicles count | | Bundesanstalt für Straßenwesen (BASt) | | CC-BY-4.0 |
| h2_salt_caverns | Technical potential of salt caverns for Hydrogen Storage in Europe | Salt cavern potentials in GWh/sqkm | Dilara et al. | https://www.sciencedirect.com/science/article/abs/pii/S0360319919347299?via%3Dihub | CC-BY-4.0 |
| lau_regions | Local Administrative Units | Used for local administration regions when building geothermal potentials | Eurostat | https://ec.europa.eu/eurostat/web/gisco/geodata/administrative-units | Permission to download only if used for non-commercial purposes |
| aquifer_data | International Hydrogeological Map of Europe | Groundwater data | BGR | | Right to use without restriction but no right to redistribute |
| osm_boundaries | OSM Boundaries | OSM-Boundaries was created to enable users to easily extract boundaries such as country borders, state borders, and equivalents from the OpenStreetMap databases | Ground Zero Communications AB | | ODbL |
| gem_europe_gas_tracker | Europe Gas Tracker | Methane and hydrogen infrastructure in Europe, including pipelines, LNG terminals, gas power plants and extraction sites. | Global Energy Monitor | https://globalenergymonitor.org/projects/europe-gas-tracker/ | CC-BY-4.0 |
| gem_gspt | Global Steel Plant Tracker | Steel plant global locations and characteristics, including production capacity, ownership, and emissions data. | Global Energy Monitor | https://globalenergymonitor.org/projects/global-steel-plant-tracker/ | CC-BY-4.0 |
| tyndp | Ten Year Network Development Plan (TYNDP) electricity transmission grid | Transmission grid topology based on the ENTSO-E/ENTSO-G TYNDP scenarios, including planned and existing lines. | ENTSO-E/ENTSO-G | | CC-BY-4.0 |
| powerplants | Power plants matching dataset | Global dataset of power plants with their location, capacity and technology type. | The powerplantmatching contributors | | CC-BY-4.0 |
| costs | Technology cost assumptions | Technology cost and performance assumptions for Europe for various technologies, including renewables and fossil fuels. | The technologydata contributors | | CC-BY-4.0 |
| country_runoff | Country level runoff data | Country-level runoff data, daily sums, for Europe, used for rescaling hydro-electricity availability in weather years not covered by EIA hydro-generation statistics. | Fabian Neumann | see rule retrieve_country_runoff in the PyPSA-Eur repository | CC-BY-4.0 |
| country_hdd | Country level heating degree days | Country-level heating degree days for Europe, used for rescaling heat demand in weather years not covered by energy statistics. | Fabian Neumann | see rule retrieve_country_runoff in the PyPSA-Eur repository | CC-BY-4.0 |
| natura | Natura 2000 protected areas | Protected areas in Europe as defined by the Natura 2000 network. | European Environment Agency | https://www.eea.europa.eu/en/datahub/datahubitem-view/6fc8ad2d-195d-40f4-bdec-576e7d1268e4 | CC-BY-4.0 |
| bfs_road_vehicle_stock | Swiss Road Vehicle Stock | Stock of road motor vehicles in Switzerland. | Swiss Federal Statistics Office | https://www.bfs.admin.ch/bfs/de/home/statistiken/kataloge-datenbanken.assetdetail.33827666.html | custom (OPEN BY ASK) |
| bfs_gdp_and_population | Swiss Population | Population data for Switzerland. | Swiss Federal Statistics Office | https://www.bfs.admin.ch/bfs/en/home/news/whats-new.assetdetail.7786557.html | custom (OPEN BY ASK) |
| mobility_profiles | German Vehicle Activity Profiles | Vehicle activity profiles for different vehicle types and road types in Germany, based on monitoring data from the Federal Highway Research Institute (BASt). These profiles provide insights into travel behavior and patterns, which can be used for transport modeling and analysis. | Federal Highway Research Institute (BASt) | https://www.bast.de/DE/Themen/Digitales/HF_1/Massnahmen/verkehrszaehlung/Stundenwerte.html?nn=414410 | CC-BY-4.0 |
| dh_areas | Shapes of district heating areas | | ISI Fraunhofer-Institut für System- und Innovationsforschung | | CC-BY-4.0 |
| geothermal_heat_utilisation_potentials | Potentials for Geothermal heat utilisation | | ISI Fraunhofer-Institut für System- und Innovationsforschung | | CC-BY-4.0 |
| jrc_ardeco | Annual Regional Database of the European Commission’s Directorate General for Regional and Urban Policy | The database contains a set of long time-series variables and indicators for EU regions, as well as for regions in some EFTA and candidate countries, at various statistical scales (NUTS1, NUTS2, NUTS3, metro regions). | European Commission | | similar to CC-BY |