Week 3: Python for geo-scripting

WUR Geoscripting WUR logo

Vector data handling with Python

Jan Verbesselt, Jorge Mendes de Jesus, Aldo Bergsma, Dainius Masiliunas - 22 January, 2017

GDAL/OGR using Python

By now you should know what the following abbreviations stand for, and what can you do with the software:

  • GDAL
  • OGR

Whereas the following may be new to you:

  • OSR: spatial reference (projections!)

Documentation:

Check out these relevant (cook)books:

OGR and the data drivers

OGR supports many different vector formats:

  • ESRI formats such as shapefiles, personal geodatabases and ArcSDE
  • Other software such as MapInfo, GRASS, Microstation
  • Open formats such TIGER/Line, SDTS, GML, KML
  • Databases such as MySQL, PostgreSQL, Oracle Spatial, etc.
  • Webservices such as Web Feature Service 1.0,1.1,2.0 (WFS output as GML)

OGR Data drivers: A driver is an object that knows how to interact with a certain data type (e.g. a shape file). You need to know the right driver to read or write data. Some drivers might only be able to read content, but the majority can read and write.

try:
  from osgeo import ogr
except:
  import ogr
From the terminal you can also type the following to obtain an overview of supported formats:
echo "Terminal command to obtain an overview of supported formats"
ogrinfo --formats

More info about the different OGR formats.

Points

What does "WKT" mean?

from osgeo import ogr
# Create a point geometry
wkt = "POINT (173914.00 441864.00)"
pt = ogr.CreateGeometryFromWkt(wkt)
print(pt)
# print help(osgeo.ogr)
## POINT (173914 441864)

WKT (Well Known Text) is a sort of markup language that describes spatial information in a clean text format, there is an equivalent in binary format (easier for computers to process and more efficient for data transfer).

Explanation of WKT: Wikipedia

Define a spatial reference

A special reference defines: An ellipsoid, a datum using that ellipsoid, and either a geocentric, geographic or projection coordinate system.

Therefore spacial references come in multiple forms according to our data needs. There is even a search engine for it: http://spatialreference.org

The following example is based on WGS84, see:http://spatialreference.org/ref/epsg/wgs-84/

from osgeo import osr
##  spatial reference
spatialRef = osr.SpatialReference()
spatialRef.ImportFromEPSG(4326)  # from EPSG - Lat/long

Reproject a point

from osgeo import ogr
from osgeo import osr

## lat/long definition
source = osr.SpatialReference()
source.ImportFromEPSG(4326)

# http://spatialreference.org/ref/sr-org/6781/
# http://spatialreference.org/ref/epsg/28992/
target = osr.SpatialReference()
target.ImportFromEPSG(28992)

#transform from lat/long to Dutch coord. system
transform = osr.CoordinateTransformation(source, target)
point = ogr.CreateGeometryFromWkt("POINT (5.6660 51.9872)")
point.Transform(transform)
print point.ExportToWkt()
## POINT (174150.990103958 444348.392444438)

Question 1: On the previous example where WKT is printed out, in which coordinate system is the WKT represented?

Create a shape file

Concept:

This can be visualised as a kind of a pyramid of elements:

Driver
    Datasource
        Layer
            Feature
                Geometry
                    Point

Start a new Python script within the Python console of QGIS (recommended for today!):

You can set the working directory using:
import os
os.chdir('pathtoyourworkingdirectory')
print os.getcwd()

Now you can continue (you do not need to repead the import os if you have already done so!):

  • Set spatial reference
  • create shape file
  • create layer
  • create point
  • Put point as a geometry inside a feature
  • Put feature in a layer
  • Flush
## Loading the modules
from osgeo import ogr,osr
import os
#os.chdir('./data')
## Is the ESRI Shapefile driver available?
driverName = "ESRI Shapefile"
drv = ogr.GetDriverByName( driverName )
if drv is None:
    print "%s driver not available.\n" % driverName
else:
    print  "%s driver IS available.\n" % driverName

## choose your own name
## make sure this layer does not exist in your 'data' folder
fn = "points.shp"
layername = "anewlayer"

## Create shape file
ds = drv.CreateDataSource(fn)
print ds.GetRefCount()

# Set spatial reference
spatialReference = osr.SpatialReference()
spatialReference.ImportFromProj4('+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs')

# you can also do the following
# spatialReference.ImportFromEPSG(4326)

## Create Layer
layer=ds.CreateLayer(layername, spatialReference, ogr.wkbPoint)
## Now check your data folder and you will see that the file has been created!
## From now on it is not possible anymore to CreateDataSource with the same name
## in your workdirectory untill your remove the name.shp name.shx and name.dbf file.
print(layer.GetExtent())

## What is the geometry type???
## What does wkb mean??

## ok lets leave the pyramid top and start building the bottom,
## let's do points
## Create a point
point1 = ogr.Geometry(ogr.wkbPoint)
point2 = ogr.Geometry(ogr.wkbPoint)

## SetPoint(self, int point, double x, double y, double z = 0)
point1.SetPoint(0,4.897070,52.377956) 
point2.SetPoint(0,5.104480,52.092876)

## Actually we can do lots of things with points: 
## Export to other formats/representations:
print "KML file export"
print point2.ExportToKML()

## Buffering
buffer = point2.Buffer(1,1)
print buffer.Intersects(point1)

## More exports:
buffer.ExportToGML()

## Back to the pyramid, we still have no Feature
## Feature is defined from properties of the layer:e.g:

layerDefinition = layer.GetLayerDefn()
feature1 = ogr.Feature(layerDefinition)
feature2 = ogr.Feature(layerDefinition)

## Lets add the points to the feature
feature1.SetGeometry(point1)
feature2.SetGeometry(point2)

## Lets store the feature in a layer
layer.CreateFeature(feature1)
layer.CreateFeature(feature2)
print "The new extent"
print layer.GetExtent()

## So what is missing ????
## Saving the file, but OGR doesn't have a Save() option
## The shapefile is updated with all object structure 
## when the script finished of when it is destroyed, 
# if necessay SyncToDisk() maybe used

ds.Destroy()
## below the output is shown of the above Python script that is run in the terminal
## ESRI Shapefile driver IS available.
## 
## 1
## (0.0, 0.0, 0.0, 0.0)
## KML file export
## 5.10448,52.092876,0
## True
## The new extent
## (4.89707, 5.10448, 52.092876, 52.377956)

Now, we can open the created shape file in QGIS using the Python Console.

## add a vector layer to the QGIS interface
qgis.utils.iface.addVectorLayer(fn, layername, "ogr") 
aLayer = qgis.utils.iface.activeLayer()
print aLayer.name()

The method addVectorLayer takes three arguments:

  • the first argument is the path to the data source – the shapefile in our case
  • the second argument is the basename – the name that the layer takes in the table of contents
  • the third argument is the provider key. Basically, the function wants to know what driver will be used to read this data. For our purposes, “ogr” will be used most of the time with vector data.

Another example of how we can use this method:

qgis.utils.iface.addVectorLayer('/home/user/Git/Python/data/points.shp', 'anewlayer', "ogr")

Open Shape File in QGIS

Modify a shape file

Now let's try to add an extra point to the shape file:

#drv is the ESRI shapefile driver from above and fn the filename
ds = drv.Open(fn, 1)
## check layers and get the first layer
layernr = ds.GetLayerCount()
print layernr
layer = ds.GetLayerByIndex(0)
print layer

## get number of features in shapefile layer
features_number = layer.GetFeatureCount()
print "number of features for this layer:", features_number

## get the feature definition:
featureDefn = layer.GetLayerDefn()

## create a point
point = ogr.Geometry(ogr.wkbPoint)
point.SetPoint(0,5.6909725,50.8513682)
print point
## similarly 
## point.AddPoint(2,1)

## create a new feature
feature = ogr.Feature(featureDefn)
feature.SetGeometry(point)
# Lets store the feature in file
layer.CreateFeature(feature)
layer.GetExtent()
ds.Destroy()

Convert a shapefile to JSON format

Conversion of file formats is easier using the bash command line or even online:

See here for an web conversion example

Using the command line

ogr2ogr -f GeoJSON -t_srs crs:84 points.geojson points.shp

ogr2ogr is a command-line version of ogr and is mainly used for file conversion and/or processing. The syntax is in reverse, with new filename first and orginal filename last.

If you want to learn more about GEOJSON:

How to visualise a spatial file?

There are multiple solutions, GIS software (QGIS), web platforms (leaflet), PNG (mapnik)

With leaflet a GeoJSON can be easly vizualized inside a webpage. Example simple webgis:

import folium
map_osm = folium.Map(location=[45.5236, -122.6750])
map_osm.save('osm.html')

 

 

import folium
import os
pointsGeo=os.path.join("points.geojson")
map_points = folium.Map(location=[52,5.7],tiles='Mapbox Bright', zoom_start=6)
map_points.choropleth(geo_path=pointsGeo)
map_points.save('points.html')

 

 

Webtutorial about folium: web-mapping-with-python-and-folium

Clean up

From the terminal you can remove the created "points.shp" and related files using:

echo "in the terminal you can use the following bash commands to easily remove files"
echo "list all the files starting with points*"
ls points*
echo "remove all the files starting with points*"
rm -v points*
## in the terminal you can use the following bash commands to easily remove files
## list all the files starting with points*
## points.dbf
## points.geojson
## points.html
## points.prj
## points.shp
## points.shx
## remove all the files starting with points*
## removed 'points.dbf'
## removed 'points.geojson'
## removed 'points.html'
## removed 'points.prj'
## removed 'points.shp'
## removed 'points.shx'

What have we learned?

  1. Get or create a writeable layer
  2. Add fields if necessary
  3. Create a feature
  4. Populate the feature
  5. Add the feature to the layer
  6. Close the layer
  7. Modify a Shape file
  8. Visualise a vector object using Python/Leaflet

Assignment

Now create your own shapefile containing two locations (e.g. a location of the GAIA building in Wageningen). E.g. go to Google Earth, extract the coordinates and redo the above excercise (BONUS: if you use a for loop) and export to a KML file.

You can also use the points defined here: https://geoscripting-wur.github.io/IntroToVector/.

Bonus: you can create a map using Python code.

For making nice maps see the following python notebooks.

Deadline

Upload your documented and well structured Ipython Notebook to a GitHub repository and send the link before 10:00.

More information

If you have time left you can start the following tutorial made by Prof. P. Lewis.