Week 1, Lesson 2: Scripting for Geo-Information Sciences, demonstration of capabilities

WUR Geoscripting

Today's items

Good morning! Here is what you will do today:

Time                    Activity
Morning                 Self-study: go through the demos below and finalise yesterday's lesson
13:30 to 15:30          Presentation & discussion
Rest of the afternoon   Do/finalise the exercise

Objective of today

Demonstration of the potential of geo-scripting, illustrated by geospatial showcases (e.g. data manipulation, visualization, analysis). Every showcase is followed by:

  • A reflection on why scripting is of interest for that particular problem.
  • An assessment of the R knowledge required to tackle that particular problem.

Learning outcomes of today:

The objective of this lecture is to provide an overview of the capabilities of scripting for solving common problems in the Geo-Information world. The lesson is organized as a set of examples that you can run, so that you can visualize the output and understand how scripting helped tackle each geo-issue. After each example, a reflection is provided on why scripting is beneficial for that particular issue, as well as which scripting skills were required in the example.

More specifically, at the end of the day you should:

  • Have an idea of what can be achieved using scripting.
  • Be able to identify situations where scripting is beneficial over a click-based approach.
  • Have identified how scripting can benefit your own project (e.g. your master's thesis research questions).

Why scripting?

Automation

The use of scripting is particularly well suited for automation. Automated systems can take different forms:

  • Online systems that deliver updated information every day without any need for human intervention (e.g. weather prediction).
  • Simpler systems that, for instance, extract the acquisition date from a satellite image file name (a tiny sketch is shown below).
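As an illustration of the second case, here is a minimal sketch of extracting a date from a file name. The file name is a hypothetical example following the pre-collection Landsat naming convention and is not part of the demos below.

## Hypothetical Landsat scene name; characters 10-16 hold the year and day of year
fname <- 'LE71700552007309ASN00.tif'
doy_string <- substr(fname, 10, 16)              # e.g. "2007309"
acq_date <- as.Date(doy_string, format = '%Y%j') # convert year + day-of-year to a Date
acq_date                                         # "2007-11-05"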

Demo: Integrated analysis chain from download to analysis

In the script below we do the following:

  • Download SRTM data for a country, e.g. Belgium.
  • Randomly sample the height data.
  • Visualise the results.

## Libraries needed
library(raster)
library(spatstat)
library(rgeos)
## You can choose your own country here
bel <- raster::getData('alt', country='BEL', mask=TRUE) ## SRTM 90m height data
belshp <- raster::getData('GADM', country='BEL', level=2) ## administrative boundaries
## Create random points
dran <- runifpoint(500, win = as.vector(extent(bel)))
## Make the random point spatial points
S <- SpatialPoints(data.frame(x = dran$x, y = dran$y), 
            proj4string = CRS(proj4string(belshp)))
## Select only the ones within belgium
Sint <- gIntersection(S, belshp)
## Create a map
plot(bel)
plot(belshp, add=TRUE)
plot(Sint, add = TRUE, col = "red", pch = 19, cex = 0.2)

Now we can sample the height data and plot it against the random point ID:

out <- extract(bel, Sint, df = TRUE)
colnames(out) <- c("id", "height")
head(out)
##   id height
## 1  1     -2
## 2  2      5
## 3  3      0
## 4  4      0
## 5  5     19
## 6  6     -1
plot(out, type = "p", pch = 19)

This example can easily be run for every country in the world with minimal human intervention, and therefore constitutes another case of automation.
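To make that concrete, below is a minimal sketch (not part of the original demo) of how the chain above could be wrapped in a function and repeated for several countries; the function name countryHeights and the choice of countries are purely illustrative.

## Sketch: wrap the download-sample-extract chain in a reusable function
countryHeights <- function(iso3, n = 500) {
    alt <- raster::getData('alt', country = iso3, mask = TRUE)  # SRTM heights
    adm <- raster::getData('GADM', country = iso3, level = 2)   # administrative boundaries
    pts <- runifpoint(n, win = as.vector(extent(alt)))          # random points in the bounding box
    sp <- SpatialPoints(data.frame(x = pts$x, y = pts$y),
                        proj4string = CRS(proj4string(adm)))
    sp <- gIntersection(sp, adm)                                # keep only points inside the country
    extract(alt, sp, df = TRUE)                                 # sample the heights
}
## e.g. heights <- lapply(c('BEL', 'NLD', 'LUX'), countryHeights)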

Question 1: Analyse the height (or climate) data for your own country by modifying the script above!

The contribution of scripting

  • Ability to automatically download data from any place in the world.
  • Quick and efficient analysis by sampling data randomly.
  • Can easily be repeated for other countries and other datasets (reproducible).
  • Fully integrated analysis chain, from data download to output.

Basic scripting skills required

  • String handling (generate automatic file names, recognize patterns, etc.).
  • Querying lists and data.frames.
  • Error handling.
  • Handling raster and vector data.
  • Understanding and using basic spatial analysis functions.

Flexibility

With GIS software you are often limited to the set of features the software provides; anything beyond that is either very hard to implement or even impossible. Scripting offers much greater flexibility, allowing you to:

  • Explore your data in all its dimensions.
  • Implement complex and custom-made algorithms and functions.

Demo 1: Visualize the temporal profile of a pixel

The example below illustrates how a small function allows you to quickly visualize the temporal profile of a raster time series.

## Download a spatio-temporal dataset
download.file(url = 'https://raw.githubusercontent.com/loicdtx/bfastSpatial/master/data/tura.rda', destfile = 'tura.rda', method = 'auto')
load('tura.rda')

There is now a variable named tura in your environment; it is an NDVI time series over an area of Ethiopia.
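If you want to check what you just loaded, the lines below are a quick, optional way to inspect the object (the exact output depends on the dataset):

tura              # print a summary of the raster time series object
nlayers(tura)     # number of layers (time steps)
head(getZ(tura))  # acquisition dates stored in the z slot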

library(zoo)
## Define the function to extract and plot the time series
click2ts <- function(x) {
    val <- click(x, n = 1)  # wait for one click on the map and return the values of that pixel
    z <- getZ(x)            # acquisition dates stored in the z slot
    plot(zoo(t(val), z),    # build a zoo time series and plot it
         type = 'p', pch = 20, xlab = 'Time', ylab = 'NDVI (-)')
}

Then run the code block below. R will wait for you to click on the map, and shortly afterwards the time series will appear.

plot(tura, 1)
click2ts(tura)

The contribution of scripting

Exploring the temporal dimension of a dataset is often not possible in classical GIS environments. The flexibility of scripting allows you to do so with minimal effort.

Basic scripting skills required

  • Understanding the structure of spatial objects.
  • Writing functions.

Demo 2: Implement a custom made function to a raster time-series

The example below uses a time series of MODIS VCF (Vegetation Continuous Fields) data, which corresponds to percentage tree cover, to investigate temporal trends in tree cover over the Netherlands.

## Trend calculator
library(zoo)
library(lubridate)
library(raster)

## Download data (if it doesn't work, try with method='wget')
download.file(url = 'https://raw.githubusercontent.com/GeoScripting-WUR/Scripting4Geo/gh-pages/data/MODIS_VCF_2000-2010_NL.rds', destfile = 'MODIS_VCF_2000-2010_NL.rds', method = 'auto')
## Read the data
modis <- readRDS('MODIS_VCF_2000-2010_NL.rds')
## Clean data (values > 100 correspond to water)
modis[modis > 100] <- NA
## Visualize
plot(modis, 1)
# For the example we will reduce the extent so that processing does not take too long
e <- extent(340101, 370323, 5756221, 5787772)
plot(e, add=TRUE)
modis_sub <- crop(modis, e)
plot(modis_sub, 1)

# Define function to calculate temporal trends
fun <- function(x) {
    ts <- zoo(x, time)  # 'time' is the vector of acquisition dates defined below
    df <- data.frame(t = decimal_date(index(ts)), vcf = c(ts))
    out <- try(lm(vcf ~ t, data = df)$coefficients[2], silent = TRUE)
    if (inherits(out, 'try-error'))
        out <- NA
    return(out)
}
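
## Optional sanity check (a sketch, not part of the original demo): try the trend
## function on one arbitrarily chosen cell before running it on the whole raster.
time <- getZ(modis)                # acquisition dates used inside fun() (also set in the next step)
px <- as.numeric(modis_sub[1000])  # values of cell number 1000 across all layers
fun(px)                            # slope of the linear tree cover trend for that pixel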

# Run the function spatially (this may take a few minutes, time for a break?)
time <- getZ(modis)
out <- calc(x = modis_sub, fun = fun)

# Visualize output
plot(out)
hist(out, main = 'Tree cover change at 250 m resolution (2000-2010)', xlab = 'Percentage change')

You should see a map displaying the temporal trend in tree cover percentage for the 2000-2010 period.

Contribution of scripting

Scripting, thanks to its flexibility, offers the possibility to apply any custom-made function to any dataset.

Basic scripting skills required

  • Writing functions.
  • Manipulating spatial objects.
  • Knowing the details of the functions/algorithms to be implemented spatially.

Visualization

We saw earlier that simple maps can be produced with R for display in reports or other static media. R also offers more advanced visualisation capabilities, particularly suited for web content.

Demo 1: Interactive maps

Check out the leaflet package for R, which creates interactive web maps with the JavaScript 'Leaflet' library.

Question 2: Try this in your own R environment.

library(leaflet)
m <- leaflet()
m <- addTiles(m)
m <- addMarkers(m, lng=5.665349, lat=51.987870, popup="Wageningen University")
m

See here for the result of the above code: Wageningen Leaflet Demo

For more information: http://www.htmlwidgets.org/showcase_leaflet.html

Demo 2: Interactive maps

The map below (created using the googleVis package) displays statistics about the national parks of Europe. The data are automatically read from this Wikipedia page, cleaned and formatted into a data frame, and a web map is created using the gvisGeoChart() function. If you hover over the dots, the values of the mapped variables are displayed. The googleVis package provides wrappers that make it easy to produce this kind of map using the Google Visualization API.

Note: This is advanced code and is shown here for reference purposes. You do not need to try to understand it right now.

library(googleVis)
library(XML)
library(httr)
# Set googleVis options to change the behaviour of plot.gvis,
# so that only the chart component of the HTML file is written into the output file.
op <- options(gvis.plot.tag='chart')

# Read table from html
url <- "http://en.wikipedia.org/wiki/List_of_national_parks"
#x <- readHTMLTable(readLines(url), which=3, stringsAsFactors = FALSE)
page <- GET(url, user_agent("httr"))
x <- readHTMLTable(text_content(page), which=3, stringsAsFactors = FALSE)

# Clean up df 
colnames(x) <- c('country', 'oldest', 'number', 'area_tot', 'country_percentage')
x$oldest <- as.numeric(x$oldest)
x$number <- as.numeric(gsub("\\*", "", x$number))
x$area_tot <- as.numeric(gsub("(,)|(\\[.*\\])", "", x$area_tot))
x$country_percentage <- as.numeric(gsub("(%)|(\\[.*\\])", "", x$country_percentage))

nationalParks <- x

g <- gvisGeoChart(nationalParks, locationvar="country", colorvar = "oldest", sizevar = "number",
                  options=list(region="150", displayMode="markers", colorAxis="{colors: ['green', 'blue']}"))

plot(g)

Demo 3: Interactive maps

The above map is nice, but still pretty static: it only displays two variables, which we did not choose, namely the year the first park was created (dot color) and the total number of parks in the country (dot size). To make the map a bit more interactive we can embed it in a shiny app. Shiny is among the latest developments in the R world, supported by RStudio; it makes it possible to build interactive plots. Such an app can be run either in an RStudio session or online.

More advanced examples are possible, as shown below, by integrating shiny and leaflet. (Example provided by RStudio.)
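To give a flavour of what such an integration can look like, here is a minimal sketch of a shiny app embedding the Wageningen leaflet map, with a slider controlling the zoom level. The layout and variable names are purely illustrative; this is not the RStudio example mentioned above.

library(shiny)
library(leaflet)

ui <- fluidPage(
    sliderInput('zoom', 'Zoom level', min = 1, max = 18, value = 12),
    leafletOutput('map')
)

server <- function(input, output, session) {
    output$map <- renderLeaflet({
        ## Rebuild the map whenever the slider changes
        leaflet() %>%
            addTiles() %>%
            setView(lng = 5.665349, lat = 51.987870, zoom = input$zoom) %>%
            addMarkers(lng = 5.665349, lat = 51.987870, popup = 'Wageningen University')
    })
}

## shinyApp(ui, server)  # uncomment to run the app interactively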

Contribution of scripting

  • "User friendly" interface to JavaScript libraries.

Basic scripting skills required

  • Dataframe manipulation.
  • String manipulation (subsetting, pattern matching).
  • Function writing.

Reproducibility

All examples shown in this demo have at least one characteristic in common: they are all reproducible. While you may not always remember the parameters you selected when analyzing data or creating a map in a GIS environment, scripting leaves a fully reproducible record of what you have done.
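Two small habits that support reproducibility are sketched below; these lines are illustrative (referring back to the random sampling demo) and are not part of the demos above.

set.seed(42)    # fix the random seed so that e.g. runifpoint() draws the same points on every run
sessionInfo()   # record the R version and loaded package versions alongside your results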

Today's summary

To summarize, we learned today that scripting can be used for many purposes:

  • Automate steps and deal with large amounts of data.
  • Solve complex problems (run custom-made functions, whereas traditional GIS environments are limited to the features they provide).
  • Make work reproducible (also beneficial for collaboration).
  • Visualize data and build interactive geo-data visualization tools.

Assignment

Submit a document of max half a page via Blackboard in which you give (per team) a brief description of an idea or project in which you plan to use GeoScripting. Highlight:

  • The potential contribution of Geo-scripting for yourself.
  • The specific R or Python knowledge you would need to learn about.

Keep it short, to the point and structured (e.g. a list of items is OK; full sentences are not needed).

The deadline is before the end of today. Again, provide feedback to at least one other team on their assignment.

Inspiration and more info