Vector - Raster
Learning objectives
- Plot spatial vector and raster data;
- Transform vector and raster data;
- Write and read spatial vector formats (e.g. KML, GML, GeoJSON and Shapefile);
- Apply basic operations on vector data, such as masking and cropping;
- Be able to extract raster data using vector data.
Introduction
In the previous tutorials we saw how to deal with raster data using R. This tutorial is an introduction on how to handle vector data in R, as well as how to handle the combination of vector and raster data.
Some packages for working with vector and raster data in R
In this tutorial we will use the sf package. This package focuses solely on vector data. It provides a standardized encoding of vector data and uses GDAL to read and write data, GEOS for geometrical operations and PROJ for projection conversions and datum transformations.
The GDAL library is well-documented (http://gdal.org/), but with a catch for R and Python programmers. The GDAL (and associated OGR) library and command line tools are all written in C and C++. Bindings are available that allow access from a variety of other languages including R and Python but the documentation is all written for the C++ version of the libraries. This can make reading the documentation rather challenging. Fortunately, the sf package, providing GDAL bindings in R, is also well documented with lots of examples. The same is valid for the Python libaries.
Thus, functionality that you commonly find in expensive GIS software
is also available within R, using free but very powerful software
libraries. Here is handy
‘cheatsheet’ for spatial operations with sf. The functions
of the sf package are prefixed by st_
, short for
‘spatial type’.
The possibilities are huge. In this course we can only scratch the surface with some essentials, which hopefully invites you to experiment further and use them in your research. Details can be found in the book Applied Spatial Data Analysis with R and several vignettes authored by Roger Bivand, Edzer Pebesma and Virgilio Gomez-Rubio. This book can be accessed for free through the following link!
Raster and vector integration and conversion
Historically, the first package handling vectors in R was the sp package, and the first package handling rasters was the raster package. They had cross-integration so that you could perform operations such as cropping a raster by a vector or extracting raster information over a vector location. However, the sp package got deprecated and sf was made as its successor, with much easier handling of the data (as a regular data.frame). Many of the packages that previously handled sp objects, including raster, got updated to also handle sf objects. Next, the stars package was developed by the creators of the sf package as a means of having multidimensional rasters. However, the older raster package was still the go-to solution for raster handling in R. Finally, as you noticed in a previous tutorial, in 2020 the raster package was deprecated and replaced with the terra package, which is a much faster C++ version of raster, but it also includes its own definition of vector data.
The result is that some packages provide or work with sp,
sf and terra vectors, and some provide or work with
raster, stars and terra rasters. This can be
quite confusing. Therefore, in the course we currently use only the
sf package for handling vectors and the terra package
for handling rasters. The workflow is then to convert any object that is
not sf or terra into sf or terra, do
processing in sf and/or terra, and optionally convert
the objects back if integration with some other package is necessary
(e.g. sf
to SpatVector
for use in
terra in some cases). Below is a matrix showing how to convert
the various objects.
From/To | sf | terra | raster |
---|---|---|---|
sp
SpatialPoints /SpatialLines /SpatialPolygons |
st_as_sf() |
vect() |
None |
sf sf data.frame |
None | vect() |
None |
terra SpatVector |
st_as_sf() |
None | None |
raster
RasterLayer /RasterStack /RasterBrick |
None | rast() |
None |
stars stars |
None | rast() |
None |
terra SpatRaster |
None | None | brick() |
raster extent |
st_bbox() |
ext() |
None |
sf bbox |
None | ext() |
extent() |
terra SpatExtent |
st_bbox() |
None | None so far |
The matrix above shows lossless conversions, also known as casting or coercing data types. You can also convert data from raster to vector format and vice-versa. However, whenever you start converting between rasters and vectors, you should wonder whether you are taking the right approach to solve your problem. An approach that does not involve converting your data from vector to raster or the opposite should almost always be preferred.
As a result, because these functions are only useful for some very particular situations, we only give a brief description of them below.
- Vector to raster:
- There is one function that allows to convert a vector object to a
raster object. It is the
rasterize()
function.
- There is one function that allows to convert a vector object to a
raster object. It is the
- Raster to vector:
- Three functions allow to convert raster data to vector:
as.points()
,as.contour()
andas.polygons()
produceSpatVector
(terra) objects. The conversion to polygons can be useful to convert the result of a classification. In that case, setdissolve =
toTRUE
so that the polygons with the same attribute value will be dissolved into multi-polygon regions.
- Three functions allow to convert raster data to vector:
Geometric operations
Raw raster data do not usually conform to any notion of administrative or geographical boundaries. Vector data (and extents) can be used to mask or crop data to a desired region of interest.
- Crop:
- Cropping consists in reducing the extent of a spatial object to a
smaller extent. As a result, the output of
crop()
will automatically be rectangular and will not consider any features such as polygons to perform the subsetting. It is often useful to crop data as tightly as possible to the area under investigation to reduce the amount of data and have a more focused view when visualizing the data. crop()
uses objects of classSpatExtent
to define the new extent, or any object that can be coerced to an extent (see?ext
for more info on this). This means that practically all spatial objects (raster or vector) can be used directly in crop. Considering two rastersr1
andr2
withr2
smaller thanr1
, you can simply usecrop(r1, r2)
in order to cropr1
to the extent ofr2
.- You can easily define an extent interactively (by clicking) thanks
to the
draw()
function.
- Cropping consists in reducing the extent of a spatial object to a
smaller extent. As a result, the output of
- Mask:
mask()
can be used with almost all spatial objects to mask (= set toNA
) values of a raster object. When used with a polygon object,mask()
will keep values of the raster overlayed by polygons and mask the values outside of polygons.- Note the very useful
inverse =
argument ofmask()
, which allows to mask the inverse of the area covered by the features. We will use this feature of mask later in the tutorial to exclude water areas of a raster, defined in an independent polygon object.
- Buffer:
buffer()
Calculate a buffer around all cells that are not NA in a SpatRaster, or around the geometries of a SpatVector (currently only implemented for points).- Note that the distance unit of the buffer width parameter is meters if the CRS is (+proj=longlat), and in map units (typically also meters) if not.
- See
?buffer
for more info.
- Extract:
- The most common operation when combining vector and raster data is
the extraction, using the function
extract()
. It simply consists in extracting the values of a raster object for locations specified by a vector object (e.g. points). - When using
extract()
with polygons or lines, individual features of the vector layer may overlay or intersect several pixels. In that case a function (fun =
) can be used to summarize the values into one. Note that although most often the functionsmin
,max
,mean
ormedian
are used for the spatial aggregation, also any custom-made function can be used. The result is a matrix of raster values at the given locations. See?extract()
example section.
- The most common operation when combining vector and raster data is
the extraction, using the function
Visualise spatial vector layer together with a Landsat 8 image from Wageningen
Step by step we will:
- Download the Landsat 8 data of Wageningen;
- Download and prepare administrative boundary data of the Netherlands;
- Download water area data of Wageningen;
- Mask the data to match the boundaries of the city;
- Mask the data to exclude water bodies.
Prepare the data
# Install necessary packages and load them
if(!"terra" %in% installed.packages()){install.packages("terra")}
if(!"sf" %in% installed.packages()){install.packages("sf")}
library(terra)
## terra 1.7.46
## Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
# Create data directory if it does not yet exist
if (!dir.exists("data")) {
dir.create("data")
}
# Download to data directory
download.file(url = 'https://github.com/GeoScripting-WUR/VectorRaster/releases/download/tutorial-data/landsat8.zip', destfile = 'data/landsat8.zip')
# Unzip to data directory
unzip('data/landsat8.zip', exdir = "data")
# Identify the right file
landsatPath <- list.files(path = "data/", pattern = glob2rx('LC8*.tif'), full.names = TRUE)
wagLandsat <- rast(landsatPath)
We can start by visualizing the data. Since it is a multispectral
image, we can use plotRGB()
to do so.
# plotRGB does not support negative values, so we remove them
wagLandsat[wagLandsat < 0] <- NA
# Band names can be changed here
names(wagLandsat) <- c("band1", "band2", "band3", "band4", "band5", "band6", "band7")
# Select which bands to assign to the red, green, and blue colour channels
plotRGB(wagLandsat, 5, 4, 3)
We have chosen to visualize the Landsat image as a false color composite, meaning that the chosen bands do not match the RGB channels. Indeed, we have plotted the near-infrared band as red, the red as green, and the green as blue.
# Download and unzip the Dutch municipalities boundaries
download.file(url = 'https://github.com/GeoScripting-WUR/Scripting4GeoIntro/releases/download/data/data.zip', destfile = 'data/GADM.zip')
unzip('data/GADM.zip', exdir = "data")
# Load level 2 dataset
nlCitySf <- st_read("data/gadm41_NLD_2.json")
## Reading layer `gadm41_NLD_2' from data source
## `C:\Users\tijug\Desktop\geoscripting\VectorRaster\data\gadm41_NLD_2.json'
## using driver `GeoJSON'
## Simple feature collection with 355 features and 13 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 3.3608 ymin: 50.7235 xmax: 7.227 ymax: 53.5546
## Geodetic CRS: WGS 84
## Simple feature collection with 6 features and 13 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 6.2237 ymin: 52.6132 xmax: 7.0923 ymax: 53.0942
## Geodetic CRS: WGS 84
## GID_2 GID_0 COUNTRY GID_1 NAME_1 NL_NAME_1 NAME_2 VARNAME_2
## 1 NLD.1.1_1 NLD Netherlands NLD.1_1 Drenthe NA AaenHunze NA
## 2 NLD.1.2_1 NLD Netherlands NLD.1_1 Drenthe NA Assen NA
## 3 NLD.1.3_1 NLD Netherlands NLD.1_1 Drenthe NA Borger-Odoorn NA
## 4 NLD.1.4_1 NLD Netherlands NLD.1_1 Drenthe NA Coevorden NA
## 5 NLD.1.5_1 NLD Netherlands NLD.1_1 Drenthe NA DeWolden NA
## 6 NLD.1.6_1 NLD Netherlands NLD.1_1 Drenthe NA Emmen NA
## NL_NAME_2 TYPE_2 ENGTYPE_2 CC_2 HASC_2 geometry
## 1 NA Gemeente Municipality NA NL.DR.AH MULTIPOLYGON (((6.5699 52.9...
## 2 NA Gemeente Municipality NA NL.DR.AS MULTIPOLYGON (((6.6408 53.0...
## 3 NA Gemeente Municipality NA NL.DR.BO MULTIPOLYGON (((6.7457 52.8...
## 4 NA Gemeente Municipality NA NL.DR.CO MULTIPOLYGON (((6.8716 52.6...
## 5 NA Gemeente Municipality NA NL.DR.DW MULTIPOLYGON (((6.2732 52.6...
## 6 NA Gemeente Municipality NA NL.DR.EM MULTIPOLYGON (((7.0923 52.8...
It seems that the municipality names are in the NAME_2
column. So we can subset the sf data.frame
to the city of
Wageningen alone. To do so we can use simple data frame
manipulation/subsetting syntax.
# Remove rows with NA
nlCitySf <- nlCitySf[!is.na(nlCitySf$NAME_2),]
# Filter Wageningen
wagContour <- nlCitySf[nlCitySf$NAME_2 == 'Wageningen',]
We can use the resulting wagContour
object, to mask the
values out of Wageningen, but first, since the two objects are in
different coordinate systems, we need to reproject the projection of one
to the other.
Question 1: Would you rather reproject a raster or a vector layer? Give two reasons why you would choose to reproject a raster or vector.
# Get the target CRS from the wagLandsat raster object
targetCRS <- st_crs(wagLandsat)
# Use sf to transform wagContour to the same projection as wagLandsat
wagContourUTM <- st_transform(wagContour, targetCRS)
Now that the two objects are in the same CRS, we can do the masking and visualize the result. Let’s first crop and then mask, to see the difference.
Crop, mask and visualise
wagLandsatCrop <- crop(wagLandsat, wagContourUTM)
wagLandsatSub <- mask(wagLandsat, wagContourUTM)
# Set graphical parameters (one row and two columns)
opar <- par(mfrow = c(1, 2))
plotRGB(wagLandsatCrop, 5, 4, 3)
plotRGB(wagLandsatSub, 5, 4, 3)
plot(st_geometry(wagContourUTM), add = TRUE, border = 'green', col = 'transparent', lwd = 3) # set fill colour to transparent
# Reset graphical parameters
par(opar)
In the figure above, the left panel displays the output of
crop
, while the second panel shows the result of masking
the Landsat scene using the contour of Wageningen as input.
We also have a water mask of Wageningen in vector format. Let’s download it and also reproject it to the CRS of the Landsat data.
Important functions are st_read
and st_write
.
These are very powerful functions that enable reading and writing simple
features or layers from a file or database.
# Download and unzip files
download.file(url = 'https://github.com/GeoScripting-WUR/VectorRaster/releases/download/tutorial-data/wageningenWater.zip', destfile = 'data/wageningenWater.zip')
unzip('data/wageningenWater.zip', exdir = "data")
# Load the Water Shapefile directly as an sf object
water <- st_read('data/Water.shp')
## Reading layer `Water' from data source
## `C:\Users\tijug\Desktop\geoscripting\VectorRaster\data\Water.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 632 features and 32 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 156404.9 ymin: 437540.3 xmax: 181084.9 ymax: 448260.9
## Projected CRS: Amersfoort / RD New
## [1] "IDENTIFICA" "TOP10_ID" "BREEDTE" "BREEDTEKLA" "BRUGNAAM"
## [6] "FUNCTIE" "FYSIEKVOOR" "FYSIEKVO_1" "HOOFDAFWAT" "NAAM_FR_CS"
## [11] "NAAM_NL_CS" "SCHEEPSLAA" "SLUISNAAM" "STROOMRICH" "TYPEINFRAS"
## [16] "TYPEWATER" "VOORKOMENW" "HOOGTENIVE" "STATUS" "DIMENSIE"
## [21] "OBJECTBEGI" "OBJECTEIND" "VERSIEBEGI" "VERSIEEIND" "BRONACTUAL"
## [26] "BRONBESCHR" "BRONNAUWKE" "BRONTYPE" "TDN_CODE" "VISUALISAT"
## [31] "GEOMETRIE_" "GEOMETRI_1" "geometry"
# Transform the water object to match the projection of our other data
waterUTM <- st_transform(water, targetCRS)
Note the use of inverse = TRUE
in the code below, to
mask the pixels that intersect with the features of the vector
object.
# Mask the Landsat data
wagLandsatSubW <- mask(wagLandsatSub, mask = waterUTM, inverse = TRUE)
# Plot
plotRGB(wagLandsatSubW, 5, 4, 3)
plot(st_geometry(waterUTM), col = 'lightblue', add = TRUE, border = '#3A9AF0', lwd = 1) # use a site such as https://htmlcolorcodes.com/ to find the hexidecimal code for colours
Also, some of our friends want these exact data too (e.g. the
water
polygon object).
One friend of ours is a software engineer and he wants a GeoJSON. Another friend is a GIS-analyst in QGIS and as a backup he wants the file in Geographic Markup Language (GML). These fileformats (GeoJSON and GML, but also KML and Shapefile) are commonly used in spatial analysis. Let’s try to give them the files in those formats!
You can try for yourself and e.g. start by converting them to KML and opening them in Google My Maps.
# Create output directory if it does not yet exist
if (!dir.exists("output")) {
dir.create("output")
}
# Export the simple feature to a KML.
try(st_write(water, "output/water.kml", driver = "kml", delete_dsn = ifelse(file.exists("output/water.kml"), TRUE, FALSE)))
# Note that the code above checks for existence of the KML file and sets the delete_dsn option to true or false accordingly
You can now open the created water.kml
file in Google My Maps or Google Earth
Pro. You can also try out other formats like
e.g. GeoJSON
. Another option for visualisation is an
interactive map using mapview, which
in its turn is based on leaflet. The output of the
simple example below can be viewed here.
Extract raster values along a transect
Another use of the extract()
function can be to
visualize or analyse data along transects. In the following example, we
will run a transect across Belgium and visualize the change in
elevation.
Let’s first download the elevation data of Belgium:
# Download to data directory
download.file(url = 'https://github.com/GeoScripting-WUR/VectorRaster/releases/download/tutorial-data/belgium_elev.zip', destfile = 'data/belgium_elev.zip')
# Unzip to data directory
unzip('data/belgium_elev.zip', exdir = "data")
# Download data
bel <- rast('data/belgium_elev.tif')
# Display metadata
bel
## class : SpatRaster
## dimensions : 264, 480, 1 (nrow, ncol, nlyr)
## resolution : 0.008333333, 0.008333333 (x, y)
## extent : 2.5, 6.5, 49.4, 51.6 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326)
## source : belgium_elev.tif
## name : BEL_msk_alt
## min value : -105
## max value : 691
We can start by visualizing the data.
Everything seems correct.
We want to look at a transect, which we can draw by hand by selecting
two points by clicking. The draw('line')
function will help
us do that. Once you run the function, you will be able to click in the
plotting window of R (The bel
object should already be
present in the plot panel before running draw('line')
).
Press esc once you have selected the two extremities of the
line.
## Loading required package: sp
## The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
## which was just loaded, will retire in October 2023.
## Please refer to R-spatial evolution reports for details, especially
## https://r-spatial.org/r/2023/05/15/evolution4.html.
## It may be desirable to make the sf package available;
## package maintainers should consider adding sf to Suggests:.
## The sp package is now running under evolution status 2
## (status 2 uses the sf package in place of rgdal)
Let’s first write this line to a file, using a few different file
formats. Note that draw()
returns a SpatVector. That means
we cannot (directly) use st_write()
here like we did
before. Using the overview table at the start of this tutorial, we can
change that using st_as_sf()
. After writing to disk, you
might try opening these new files in R, or in different software, such
as QGIS or ArcGIS.
# To a GeoJSON
st_write(st_as_sf(line), 'output/line.geojson')
# To an ESRI Shapefile
st_write(st_as_sf(line), 'output/line.shp')
Now the elevation values can simply be extracted using
extract()
.
We can plot the result as follows.
Question 2: Take a look at the x-axis of the graph above. What unit corresponds to the given values? Can you think of interpretation problems when following the method we apply here?
Extra
Check out this overview of examples for creating static and interactive maps in R, making use of packages like mapview and leaflet.