Intro to Raster

Learning objectives

Read & write raster data
Perform basic raster file operations/conversions
Perform simple raster calculations

Introduction

Raster data is like any image. Although it may portray various properties of objects in the real world, these objects don’t exist as separate objects; rather, they are represented using pixels of various values which are assigned a color.

Today’s tutorial is about constructing a simple spatio-temporal analysis using raster data, R, and git.

In a previous tutorial you briefly saw how to read and use vector data from a file into your R environment. These vector read/write operations were made possible thanks to the GDAL library. You can check the project home page at http://www.gdal.org/. You will be surprised to see that a lot of the software you have used in the past to read gridded geospatial data use GDAL (i.e.: ArcGIS, QGIS, GRASS, etc). In this tutorial, we will use GDAL indirectly via the terra package. However, it is also possible to call GDAL functionalities directly through the command line from a terminal, which is equivalent to calling a system() command directly from within R. In addition, if you are familiar with R and its string handling utilities, it may facilitate the building of the expressions that have to be passed to GDAL. (Note: This is also doable in Bash scripting, as learned in a previous tutorial, and you can even combine the two.)

Let’s start working with terra by performing system setup checks.

# Example to perform system set-up checks
if(!"terra" %in% installed.packages()){install.packages("terra")}
library(terra)

## terra 1.7.78

gdal()

## [1] "3.8.4"

The previous function should return the version number of the current version of GDAL installed on your machine. Starting with GDAL 2.0 vector processing becomes incorporated into GDAL. In case the function above returns an error, or if you cannot install terra at all, you should verify that all required software and libraries are properly installed. Please refer to the system setup page.

Overview of the terra package

The raster package used to be the reference R package for raster processing, with Robert J. Hijmans as its the original developer. The introduction of the raster package to R was a revolution for geo-processing and analysis using R. The raster package is now deprecated, as Robert Hijmans has developed a successor to it called terra which is both simpler and much faster, as it’s rewritten in C++.

Among other things the terra package allows to:

Read and write raster data of most commonly used formats.
Perform most raster operations, such as creation of raster objects, performing spatial/geometric operations (re-projections, resampling, etc), filtering and raster calculations.
Work on large raster datasets thanks to its built-in block processing functionalities.
Perform fast operations thanks to optimized back-end C++ code.
Visualize and interact with the data.
etc…

Tip: Check the home page of the terra package. The package is extremely well documented, including vignettes and demos. See also the reference manual there. Another useful resource for information on spatial data handling with terra can be found here.

Explore the terra objects

The terra package produces and uses R objects of two main classes: SpatRaster and SpatVector. A SpatRaster represents a spatially referenced surface divided into three dimensional cells (rows, columns, and layers). A SpatVector represents geometries as well as attributes (variables) describing the geometries. Note: we will be using terra only for raster processing, as you will see in subsequent tutorials.

Let’s take a look into the structure of a SpatRaster.

# Generate a SpatRaster
r <- rast(ncol = 40, nrow = 20)
class(r)

## [1] "SpatRaster"
## attr(,"package")
## [1] "terra"

# Simply typing the object name displays its general properties / metadata
r

## class       : SpatRaster 
## dimensions  : 20, 40, 1  (nrow, ncol, nlyr)
## resolution  : 9, 9  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84)

From the metadata displayed above, we can see that the SpatRaster contains all the properties that geo-data should have; that is to say a coordinate reference system, an extent and a pixel size.

Multi-layer SpatRasters can also fairly easily be generated directly in R, as shown in the example below.

# Using the previously generated SpatRaster
# Let's first put some values in the cells of the layer
r[] <- rnorm(n = ncell(r))

# Create a SpatRaster with 3 layers
s <- c(r, r*2, r)

# Let's look at the properties of the resulting object
s

## class       : SpatRaster 
## dimensions  : 20, 40, 3  (nrow, ncol, nlyr)
## resolution  : 9, 9  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
## source(s)   : memory
## names       :     lyr.1,     lyr.1,     lyr.1 
## min values  : -3.186141, -6.372283, -3.186141 
## max values  :  2.944902,  5.889803,  2.944902

Also note that using c() here behaves different from using list():

# Create a list of three separate SpatRasters
s2 <- list(r, r*2, r)

# Let's look at the properties of the resulting object
s2

## [[1]]
## class       : SpatRaster 
## dimensions  : 20, 40, 1  (nrow, ncol, nlyr)
## resolution  : 9, 9  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
## source(s)   : memory
## name        :     lyr.1 
## min value   : -3.186141 
## max value   :  2.944902 
## 
## [[2]]
## class       : SpatRaster 
## dimensions  : 20, 40, 1  (nrow, ncol, nlyr)
## resolution  : 9, 9  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
## source(s)   : memory
## name        :     lyr.1 
## min value   : -6.372283 
## max value   :  5.889803 
## 
## [[3]]
## class       : SpatRaster 
## dimensions  : 20, 40, 1  (nrow, ncol, nlyr)
## resolution  : 9, 9  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
## source(s)   : memory
## name        :     lyr.1 
## min value   : -3.186141 
## max value   :  2.944902

SpatRaster manipulations

Reading and writing from/to file

The actual data used in geo-processing projects often comes as geo-data, stored on files such as GeoTIFF or other commonly used file formats. Reading data directly from these files into the R working environment is made possible thanks to the terra package. The main command for reading raster objects from files is the rast() function, which returns a SpatRaster.

Writing a SpatRaster to file is achieved using the writeRaster() function.

To illustrate the reading and writing of raster files, we will use data subsets that we have prepared for the course and need to be downloaded from the repository. For that, first make sure your working directory is set properly. Then run the following line; it will handle the download:

# Start by making sure that your working directory is properly set
# If not you can set it using setwd()
getwd()

## [1] "/home/osboxes/IntroToRaster"

# Create data directory if it does not yet exist
if (!dir.exists("data")) {
  dir.create("data")
}

# Download the data
# In case the download code doesn't work, use method = 'wget'
download.file(url = 'https://github.com/GeoScripting-WUR/IntroToRaster/releases/download/tahiti/gewata.zip', destfile = 'data/gewata.zip')

# Unpack the archive
unzip('data/gewata.zip', exdir = "data")

Gewata is the name of the dataset that we just downloaded. It is a multi-layer GeoTIFF object, and its file name is LE71700552001036SGS00_SR_Gewata_INT1U.tif, informing us that this is a subset from a scene acquired by NASA’s Landsat 7 ETM+ sensor. Let’s not worry about the region that the data covers for now, we will find a nice way to discover that later on in the tutorial.

Now that we have downloaded and unpacked the GeoTIFF file, it should be present in our working directory. We can investigate the content of the working directory (or any directory) using the list.files() function.

# When passed without arguments, list.files() returns a character vector, listing the content of the working directory
list.files()

# To get only the files with .tif extension in the data directory
list.files('data', pattern = glob2rx('*.tif'))

# Or if you are familiar with regular expressions
list.files('data', pattern = '^.*\\.tif$')

We can now load this object in R:

gewata <- rast('data/LE71700552001036SGS00_SR_Gewata_INT1U.tif')

Let’s take a look at the structure of this object.

gewata

## class       : SpatRaster 
## dimensions  : 593, 653, 6  (nrow, ncol, nlyr)
## resolution  : 30, 30  (x, y)
## extent      : 829455, 849045, 825405, 843195  (xmin, xmax, ymin, ymax)
## coord. ref. : UTM Zone 36, Northern Hemisphere 
## source      : LE71700552001036SGS00_SR_Gewata_INT1U.tif 
## names       : LE717~T1U_1, LE717~T1U_2, LE717~T1U_3, LE717~T1U_4, LE717~T1U_5, LE717~T1U_6 
## min values  :           4,           6,           3,          18,           6,           2 
## max values  :          39,          56,          71,         102,         138,         408

The metadata above informs us that the gewata object is a relatively small (593x653 pixels) SpatRaster with 6 layers.

Data type is (still) important

When writing files to disk using writeRaster() or the filename = argument in most raster processing functions, you are able set an appropriate data type. Using the datatype = argument, you can save some precious disk space compared to the default datatype, and thus increase read and write speed.

Geoprocessing: in memory vs. on disk

When looking at the documentation of most functions of the terra package, you will notice that the list of arguments is almost always ended by .... These ‘three dots’ are called an ellipsis; it means that extra arguments can be passed to the function. Often these arguments are those that can be passed to the writeRaster() function; meaning that most geoprocessing functions are able to write their output directly to file, on disk. This reduces the number of steps and is always a good consideration when working with big raster objects that tend to overload the memory if not written directly to file.

Cropping a SpatRaster

crop() is the terra package function that allows you to crop data to smaller spatial extents. It accepts objects of classes SpatRaster, SpatVector or SpatExtent to crop to. One way of obtaining such a SpatExtent object interactively is by using the draw() function. In the example below, we will manually draw a regular extent that we will use later to crop the gewata SpatRaster.

# Plot the first layer of the SpatRaster
plot(gewata, 1)
e <- draw()

Now you have to define a rectangular bounding box that will define the spatial extent of the extent object. Click twice, for the two opposite corners of the rectangle. Note that R will wait until you finish drawing before it will let you run any more code (!). Now we can crop the data following the boundaries of this extent.

# Crop gewata using e
gewataSub <- crop(gewata, e)

# Visualize the new cropped object
plot(gewataSub, 1)

You should see on the resulting plot that the original image has been cropped.

Creating layer stacks

To end this section on general files and raster object manipulations, we will see how multi-layer objects can be created from multiple single-layer objects. We have already derived the Normalized Difference Vegetation Index (NDVI) from the Landsat 7 imagery in advance for you. The resulting product is an image (2-dimensional SpatRaster) where high pixel values represent green vegetated areas. The dataset that we are working with is a series of NDVI images of a region in Tahiti over time. Let’s create a multi-layer NDVI object, composed of NDVI layers derived from Landsat acquisitions at different dates. We first need to fetch the data file by file.

# Again, make sure that your working directory is properly set
getwd()

# Download and unzip the data
download.file(url = 'https://github.com/GeoScripting-WUR/IntroToRaster/releases/download/tahiti/tura.zip', destfile = 'data/tura.zip')
unzip(zipfile = 'data/tura.zip', exdir = 'data')

# Retrieve the content of the tura sub-directory
turaList <- list.files(path = 'data/tura/', full.names = TRUE)

The object turaList contains the file names of all the single layers we have to stack. Let’s open the first one to visualize it.

plot(rast(turaList[1]))

We see an NDVI layer, with the clouds masked out. Now let’s create a multi-layer SpatRaster.

turaStack <- rast(turaList)
turaStack

Now that we have our SpatRaster with 166 layers in memory, let’s write it to disk using the writeRaster() function. Note that we adjust the layer names to the original file names (in which information on dates is written). Also note that the data range is comprised between -10000 and +10000, therefore such a file can be stored as signed 2 byte integer (INT2S).

# Create output directory if it does not yet exist
if (!dir.exists("output")) {
  dir.create("output")
}

# Write this file to the 
writeRaster(x = turaStack, filename = 'output/turaStack.tif', names = list.files(path = 'data/tura/'), datatype = "INT2S")

Now this object is stored on your computer, ready to be archived for later use.

Simple raster arithmetic

Adding, subtracting, multiplying and dividing SpatRasters

Performing simple raster operations with a SpatRaster is fairly easy. For instance, if you want to subtract two SpatRasters of same extent, r1 and r2; simply doing r1 - r2 will give the expected output, which is, every pixel value of r2 will be subtracted from the matching pixel value of r1. These types of pixel-based operations almost always require a set of conditions to be met in order to be executed; the two SpatRasters need to be identical in term of extent, pixel size, coordinate reference system, etc.

Subsetting layers from SpatRaster

Previously, we used data with already derived NDVI. Now, we will make use of the Landsat 7 spectral bands to create the NDVI SpatRaster ourselves. The different spectral bands of the same satellite scene are often stored in multi-layer objects. This means that you will very likely import them in your R working environment as one multi-layer SpatRaster. As a consequence, to perform calculations between these bands, you will have to write an expression referring to individual layers of the object. Referring to individual layers in a SpatRaster can be done by using double square brackets [[]].

Let’s look for instance at how the famous NDVI index would have to be calculated from the gewata SpatRaster read earlier. The gewata SpatRaster already contains 6 bands of the Landsat 7 sensor. The formula below uses Landsat 7’s NIR and Red bands, corresponding to layers 4 and 3 of the gewata SpatRaster, respectively.

\[ NDVI=\frac{NIR-Red}{NIR+Red} \]

with NIR and Red being band 4 and 3 of Landsat 7 respectively.

ndvi <- (gewata[[4]] - gewata[[3]]) / (gewata[[4]] + gewata[[3]])

The plot() function automatically recognises the objects of terra classes and returns an appropriate spatial plot.

plot(ndvi)

The resulting NDVI can be viewed in the figure above. As expected the NDVI ranges from about 0.2, which corresponds to nearly bare soils, to 0.9, which means that there is some dense vegetation in the area.

Although this is a quick way to perform the calculation, directly adding, subtracting, multiplying, etc, the layers of big raster objects is not recommended. When working with big objects, it is advisable to use the app() function to perform these types of calculations. The reason is that R needs to load all the data first into its internal memory before performing the calculation and then runs everything in one block. It is really easy to run out of memory when doing that. A big advantage of the app() function is that it has a built-in block processing option for any vectorized function, allowing such calculations to be fully “RAM friendly”. The example below illustrates how to calculate NDVI from the same date set using the app() function.

# Define the function to calculate NDVI using app()
ndviApp <- function(x) {
    ndvi <- (x[[4]] - x[[3]]) / (x[[4]] + x[[3]])
    return(ndvi)
}
ndvi2 <- app(x = gewata, fun = ndviApp)

We can verify that the two layers ndvi and ndvi2 are actually identical using the all.equal() function from the terra package.

all.equal(ndvi, ndvi2)

## [1] "Names: 1 string mismatch"                                                 
## [2] "Attributes: < Component \"ptr\": Component \"names\": 1 string mismatch >"

Here we see that only the name of ndvi2 is different. This is because for the first method, which resulted in ndvi, the layer name was copied from gewata[[4]], whereas this was not the case for the second approach.

Simple raster statistics

The global() function allows the extraction of basic global statistics from an entire SpatRaster, such as the mean value, maximum value, the range of values, or the number of NA cells. global will return a dataframe, from which we can consequently select the value we need. See also ?global for more information and documentation. Let’s try some of these options for our ndvi SpatRaster.

# Calculate the mean (this produces a dataframe)
mean <- global(ndvi, fun = 'mean')
mean

##                                              mean
## LE71700552001036SGS00_SR_Gewata_INT1U_4 0.6865039

# Get only the value from the dataframe
mean$mean

## [1] 0.6865039

# Calculate the amount of NA values (which should be zero in this case)
global(ndvi, fun = 'isNA')

##                                         isNA
## LE71700552001036SGS00_SR_Gewata_INT1U_4    0

# Calculate the amount of non-NA values
global(ndvi, fun = 'notNA')

##                                          notNA
## LE71700552001036SGS00_SR_Gewata_INT1U_4 387229

# Which, as we have no NA values, should be equal to
ncell(ndvi)

## [1] 387229

Reprojections

The project() function allows re-projection of raster objects to any projection one can think of. It accepts the following formats to define coordinate reference systems: WKT, PROJ.4 (e.g., +proj=longlat +datum=WGS84), or an EPSG code (e.g., epsg:4326). Note that the PROJ.4 notation has been deprecated, and you can only use it with the WGS84/NAD83 and NAD27 datums. Other datums are silently ignored.

A central place to search for coordinate reference systems is the spatial reference website (http://spatialreference.org/), from this database you will be able to query almost any reference and retrieve it in any format, including its EPSG code. Well-Known Text (WKT) expressions are preferred for scientific correctness and lack of ambiguity.

Instead of specifying a coordinate reference system to convert to, you can also provide project() with another SpatRaster, and it will convert your input to the CRS of that SpatRaster.

# One single line is sufficient to project any raster to any CRS
ndviLL <- project(ndvi, 'epsg:4326')

Note that if re-projecting and mosaicking is really a large part of your project, you may want to consider using the gdalwarp command line utility (gdalwarp) directly. The gdalUtils R package provides utilities to run GDAL commands from R, including gdalwarp, for reprojection, resampling and mosaicking.

Exporting and inspecting

By the way, we still don’t know where this area is. In order to investigate that, we are going to try displaying it in QGIS. Let’s write our NDVI layer in Lat/Long to a .tif file first.

# Since this function will write a file to your working directory
# you want to make sure that it is set where you want the file to be written
# It can be changed using setwd()
getwd()

# Note that we are using the filename argument, contained in the ellipsis (...) of
# the function, since we want to write the output directly to file.
writeRaster(x = ndviLL, filename = 'output/gewataNDVI.tif')

Now open this file in QGIS, and add a base layer (in QGIS for instance OpenStreetMap, using XYZ tiles with the url https://tile.openstreetmap.org/{z}/{x}/{y}.png). Once we zoom out a bit, we see we are all the way in … Ethiopia. More information will come later in the course about that specific area.

Question 1: Could you also have used ndvi SpatRaster instead of ndviLL for this final step? Why (not)?

We are done with this data set for this tutorial. So let’s explore another data set, from the Landsat sensors. This dataset will allow us to find other interesting raster operations to perform.

Performing simple value replacements

Since 2014, the USGS has started releasing Landsat data processed to surface reflectance. This means that they are taking care of important steps such as atmospheric correction and conversion from sensor radiance to reflectance factors. Additionally, they provide a cloud mask with this product. The cloud mask is an extra layer, at the same resolution as the surface reflectance bands, that contains information about the presence or absence of cloud as well as shadowing effects from the clouds. The cloud mask of Landsat surface reflectance product is named cfmask, after the name of the algorithm used to detect the clouds. For more information about cloud detection, see the algorithm page, and the publication by Zhu & Woodcock. In the following section we will use that cfmask layer to mask out remaining clouds in a Landsat scene.

About the area

The area selected for this exercise covers most of the South Pacific island of Tahiti, French Polynesia. It is a mountainous, volcanic island, and according to Wikipedia about 180,000 people live on the island. For convenience, the Landsat scene was subsetted to cover only the area of interest and is stored online.

# Download the data
download.file(url = 'https://github.com/GeoScripting-WUR/IntroToRaster/releases/download/tahiti/tahiti.zip', destfile = 'data/tahiti.zip')
unzip(zipfile = 'data/tahiti.zip', exdir = 'data')

# Load the data as a SpatRaster and investigate its contents
tahiti <- rast('data/LE70530722000126_sub.tif')
tahiti

## class       : SpatRaster 
## dimensions  : 1014, 1322, 7  (nrow, ncol, nlyr)
## resolution  : 30, 30  (x, y)
## extent      : 224380.7, 264040.7, 8029882, 8060302  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=utm +zone=6 +south +ellps=WGS72 +units=m +no_defs 
## source      : LE70530722000126_sub.tif 
## names       : LE705~band1, LE705~band2, LE705~band3, LE705~band4, LE705~band5, LE705~band7, ... 
## min values  :           0,           0,           0,           0,           0,           0, ... 
## max values  :       16000,       16000,       16000,       16000,       16000,       16000, ...

# Display names of each individual layer
names(tahiti)

## [1] "LE70530722000126_sub_band1"  "LE70530722000126_sub_band2" 
## [3] "LE70530722000126_sub_band3"  "LE70530722000126_sub_band4" 
## [5] "LE70530722000126_sub_band5"  "LE70530722000126_sub_band7" 
## [7] "LE70530722000126_sub_cfmask"

# Visualize the data
plotRGB(tahiti, 3, 4, 5, stretch = "lin")

We can also visualize the cloud mask layer (layer 7).

plot(tahiti, 7)

According to the algorithm description, water is coded as 1, cloud as 4 and cloud shadow as 2.

Question 2: Does the cloud mask fit with the visual interpretation of the RGB image we plotted before?

We can also plot the two on top of each other, but before that we need to assign no values (NA) to the ‘clear land pixels’ so that they appear transparent on the overlay plot.

# Extract cloud layer from the SpatRaster
cloud <- tahiti[[7]]

# Replace 'clear land' with 'NA'
cloud[cloud == 0] <- NA

# Plot the stack and the cloud mask on top of each other
plotRGB(tahiti, 3, 4, 5, stretch = "lin")
plot(cloud, add = TRUE, legend = FALSE)

Applying a cloud mask to a dataset simply consists in performing value replacement. In this case, a condition on the 7th layer of the stack (the fmask layer) will determine whether values in the other layers are kept, or replaced by NA, which is equivalent to masking them. It is more convenient to work on the cloud mask as a separate SpatRaster. We will therefore subset the SpatRaster using the subset() function.

# Extract cloud mask layer
fmask <- tahiti[[7]]

# Create a subset of the first six layers
tahiti6 <- subset(tahiti, 1:6)

We will first do the masking using simple vector arithmetic, as if tahiti6 and fmask were simple vectors. We want to keep any value with a ‘clean land pixel’ flag in the cloud mask; or rather, since we are assigning NAs, we want to discard any value of the stack which has a corresponding cloud mask pixel different from 0. This can be done in one line of code.

# Perform value replacement
tahiti6[fmask != 0] <- NA

However, this is possible here because both objects are relatively small and the values can all be loaded in the computer memory without any risk of overloading it. When working with very large raster objects, you will very likely run into problems if you do that. It is then preferable, as presented earlier in this tutorial to use app().

# First define a value replacement function
cloud2NA <- function(x) {
    x[1:6][x[7] != 0] <- NA
    return(x)
}

The value replacement function takes one argument, x. x corresponds to a SpatRaster, where the 7th layer is our cloud mask.

# Apply the function on the two SpatRasters
tahitiCloudFree <- app(x = tahiti, fun = cloud2NA)

# Visualize the output
plotRGB(tahitiCloudFree, 3, 4, 5, stretch = "lin")

There are holes in the image, but at least the clouds are gone. We could use another image from another date, to create a composite image, but that is a little bit too much for today.

Obtaining satellite imagery

So far we have worked on a prepared dataset. However, in actual work (and the project), you will need to obtain your own data. A lot of satellite imagery is free to use, but sometimes downloading the data in a script can be challenging. Thankfully, there are packages developed to help automate it.

One package that is very helpful for downloading, cleaning, performing classification and postprocessing satellite image time series is called sits (for Satellite Image Time Series). We will not go over most of its functionality (though it can be useful to study it for the project, see its documentation here), but we will use its image downloading capabitilites to quickly get our own dataset to work with.

First, let’s get the sits package and its dependencies (this can take hours if you are not using r2u):

if(!"sits" %in% installed.packages()){install.packages("sits")}
if(!"geojsonsf" %in% installed.packages()){install.packages("geojsonsf")}
if(!"stars" %in% installed.packages()){install.packages("stars")}
if(!"tmap" %in% installed.packages()){install.packages("tmap")}
if(!"httr2" %in% installed.packages()){install.packages("httr2")}
library(sits)

## SITS - satellite image time series analysis.

## Loaded sits v1.5.1.
##         See ?sits for help, citation("sits") for use in publication.
##         Documentation avaliable in https://e-sensing.github.io/sitsbook/.

Next, let’s take a look at what kind of satellite data the package lets us download. There are several data providers, but the most useful are AWS (Amazon Web Services), MPC (Microsoft Planetary Computer), CDSE (Copernicus Dataspace Ecosystem), and USGS (US Geological Survey). Let’s see what they have:

sits_list_collections(c("AWS", "MPC", "CDSE", "USGS"))

## AWS:
## - SENTINEL-2-L2A (SENTINEL-2/MSI)
## - grid system: MGRS
## - bands: B01 B02 B03 B04 B05 B06 B07 B08 B8A B09 B11 B12 CLOUD
## - opendata collection 
## 
## - SENTINEL-S2-L2A-COGS (SENTINEL-2/MSI)
## - grid system: MGRS
## - bands: B01 B02 B03 B04 B05 B06 B07 B08 B8A B09 B11 B12 CLOUD
## - opendata collection 
## 
## - LANDSAT-C2-L2 (LANDSAT/TM-ETM-OLI)
## - grid system: WRS-2
## - bands: BLUE GREEN RED NIR08 SWIR16 SWIR22 CLOUD
## - not opendata collection
## 
## MPC:
## - MOD13Q1-6.1 (TERRA/MODIS)
## - grid system: STG
## - bands: NDVI EVI BLUE RED NIR MIR CLOUD
## - opendata collection 
## 
## - MOD10A1-6.1 (TERRA/MODIS)
## - grid system: STG
## - bands: SNOW ALBEDO
## - opendata collection 
## 
## - MOD09A1-6.1 (TERRA/MODIS)
## - grid system: STG
## - bands: BLUE RED GREEN NIR08 LWIR12 SWIR16 SWIR22
## - opendata collection 
## 
## - COP-DEM-GLO-30 (TANDEM-X/X-band-SAR)
## - grid system: Copernicus DEM coverage grid
## - bands: ELEVATION
## - opendata collection 
## 
## - LANDSAT-C2-L2 (LANDSAT/TM-ETM-OLI)
## - grid system: WRS-2
## - bands: BLUE GREEN RED NIR08 SWIR16 SWIR22 CLOUD
## - opendata collection 
## 
## - SENTINEL-2-L2A (SENTINEL-2/MSI)
## - grid system: MGRS
## - bands: B01 B02 B03 B04 B05 B06 B07 B08 B8A B09 B11 B12 CLOUD
## - opendata collection 
## 
## - SENTINEL-1-GRD (SENTINEL-1/C-band-SAR)
## - grid system: MGRS
## - bands: VV VH
## - opendata collection 
## 
## - SENTINEL-1-RTC (SENTINEL-1/C-band-SAR)
## - grid system: MGRS
## - bands: VV VH
## - not opendata collection
## 
## CDSE:
## - SENTINEL-1-RTC (SENTINEL-1/RTC)
## - grid system: MGRS
## - bands: VV VH
## - opendata collection (requires access token)
## 
## - SENTINEL-2-L2A (SENTINEL-2/MSI)
## - grid system: MGRS
## - bands: B01 B02 B03 B04 B05 B06 B07 B08 B8A B09 B11 B12 CLOUD
## - opendata collection (requires access token)
## 
## USGS:
## - LANDSAT-C2L2-SR (LANDSAT/TM/ETM+/OLI)
## - grid system: WRS-2
## - bands: BLUE GREEN RED NIR08 SWIR16 SWIR22 CLOUD
## - not opendata collection

We can see that several providers provide the same data, such as Sentinel-2 imagery. Note that their level of preprocessing may differ, so it is always good to check the details to know what exactly is the data you are downloading!

In this tutorial we were working with Tahiti data, so let’s get some more images over Tahiti! First, let’s find what area our Tahiti images cover. At the moment sits requires either the coordinates in the WGS84 CRS, or a vector object, which you will learn about in the next tutorial. For now, let’s get the WGS84 coordinates by reprojecting the raster.

tahiti84 <- project(tahiti, "EPSG:4326")
(roi <- ext(tahiti84))

## SpatExtent : -149.599930307084, -149.222668482149, -17.8059811069656, -17.5266515551702 (xmin, xmax, ymin, ymax)

We have a Landsat image so far, so let’s now get a few MODIS images to compare with. It’s easiest to get data from Microsoft Planetary Computer, since it does not require authentication.

tahitiMODIS <- sits_cube("MPC", "MOD13Q1-6.1",
  roi = as.vector(roi), # At the moment SpatExtent objects need to be converted to a vector manually
  bands = c("RED", "NIR", "BLUE"),
  start_date = "2023-07-01", end_date = "2023-08-01")

## Warning: no CRS informed, assuming EPSG:4326

##   |                                                                              |                                                                      |   0%  |                                                                              |======================================================================| 100%

Let’s explore what we got. We can see all details in the file_info slot:

tahitiMODIS$file_info[[1]]

## # A tibble: 18 × 14
##    fid        date       band   xres  yres    xmin    ymin    xmax    ymax nrows
##    <chr>      <date>     <chr> <dbl> <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl>
##  1 MYD13Q1.A… 2023-06-18 BLUE   232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  2 MYD13Q1.A… 2023-06-18 NIR    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  3 MYD13Q1.A… 2023-06-18 RED    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  4 MOD13Q1.A… 2023-06-26 BLUE   232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  5 MOD13Q1.A… 2023-06-26 NIR    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  6 MOD13Q1.A… 2023-06-26 RED    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  7 MYD13Q1.A… 2023-07-04 BLUE   232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  8 MYD13Q1.A… 2023-07-04 NIR    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
##  9 MYD13Q1.A… 2023-07-04 RED    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 10 MOD13Q1.A… 2023-07-12 BLUE   232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 11 MOD13Q1.A… 2023-07-12 NIR    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 12 MOD13Q1.A… 2023-07-12 RED    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 13 MYD13Q1.A… 2023-07-20 BLUE   232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 14 MYD13Q1.A… 2023-07-20 NIR    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 15 MYD13Q1.A… 2023-07-20 RED    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 16 MOD13Q1.A… 2023-07-28 BLUE   232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 17 MOD13Q1.A… 2023-07-28 NIR    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## 18 MOD13Q1.A… 2023-07-28 RED    232.  232. -1.67e7 -2.22e6 -1.56e7 -1.11e6  4800
## # ℹ 4 more variables: ncols <dbl>, crs <chr>, path <chr>, cloud_cover <dbl>

We got six dates and three bands, that is 18 items in total. What are the dates of the images we downloaded?

# This is equivalent to unique(tahitiMODIS$file_info[[1]]$date)
sits_timeline(tahitiMODIS)

## [1] "2023-06-18" "2023-06-26" "2023-07-04" "2023-07-12" "2023-07-20"
## [6] "2023-07-28"

Let’s look at one of these images:

plot(tahitiMODIS, red="RED", blue="BLUE", green="NIR", date="2023-07-04")

## Warning: ignoring unrecognized unit: reflectance
## Warning: ignoring unrecognized unit: reflectance
## Warning: ignoring unrecognized unit: reflectance

As you can see, this is way zoomed out! MODIS has a coarser pixel size compared to Landsat, and therefore a tile in MODIS is much larger than in Landsat.

Let’s load the data with terra. First, we will save all the files on our disk. Do not forget to again include the roi, else it will download the entire tile (as shown above), rather than what you actually need, and will waste your time and space!

outdir <- "output"
if (!dir.exists(outdir))
  dir.create(outdir)
sits_cube_copy(tahitiMODIS, output_dir = outdir, roi=c(lon_min=xmin(roi),lat_min=ymin(roi),lon_max=xmax(roi),lat_max=ymax(roi)))

Now load the files in terra:

(MODISfiles <- list.files(outdir, full.names=TRUE))

##  [1] "output/TERRA_MODIS_h3v10_BLUE_2023-06-18.tif"
##  [2] "output/TERRA_MODIS_h3v10_BLUE_2023-06-26.tif"
##  [3] "output/TERRA_MODIS_h3v10_BLUE_2023-07-04.tif"
##  [4] "output/TERRA_MODIS_h3v10_BLUE_2023-07-12.tif"
##  [5] "output/TERRA_MODIS_h3v10_BLUE_2023-07-20.tif"
##  [6] "output/TERRA_MODIS_h3v10_BLUE_2023-07-28.tif"
##  [7] "output/TERRA_MODIS_h3v10_NIR_2023-06-18.tif" 
##  [8] "output/TERRA_MODIS_h3v10_NIR_2023-06-26.tif" 
##  [9] "output/TERRA_MODIS_h3v10_NIR_2023-07-04.tif" 
## [10] "output/TERRA_MODIS_h3v10_NIR_2023-07-12.tif" 
## [11] "output/TERRA_MODIS_h3v10_NIR_2023-07-20.tif" 
## [12] "output/TERRA_MODIS_h3v10_NIR_2023-07-28.tif" 
## [13] "output/TERRA_MODIS_h3v10_RED_2023-06-18.tif" 
## [14] "output/TERRA_MODIS_h3v10_RED_2023-06-26.tif" 
## [15] "output/TERRA_MODIS_h3v10_RED_2023-07-04.tif" 
## [16] "output/TERRA_MODIS_h3v10_RED_2023-07-12.tif" 
## [17] "output/TERRA_MODIS_h3v10_RED_2023-07-20.tif" 
## [18] "output/TERRA_MODIS_h3v10_RED_2023-07-28.tif"

tahitiMOD = rast(MODISfiles)
plotRGB(tahitiMOD, 13, 7, 1, stretch = "lin")

That looks zoomed in to Tahiti, though the island looks quite distorted! This is because MODIS uses a sinusoidal projection that distorts shapes that are far away from the center (Africa), and Tahiti is almost as far away as you can get!

Let’s reproject and crop to our region of interest:

tahitiMOD = project(tahitiMOD, tahiti)
tahitiMOD = crop(tahitiMOD, tahiti)
plotRGB(tahitiMOD, 13, 7, 1, stretch = "lin")

It looks a bit more pixelated than the Landsat image, but now the shape of the island is no longer distorted!

Summary

Today you got a general introduction to the terra package, its basic functions, its object classes and methods. You also learned how to use a few functions from the sits package to retrieve data. The functions can be categorized as follows:

Terra classes

SpatRaster: Single- or multi-layer raster object.
SpatVector: Vector object.

Functions

Reading and writing

rast(): Read a raster object from disk.
ext(): get the extent of the raster
writeRaster(): Write a SpatRaster to disk.
filename = argument: Available for most functions of the terra package that produce SpatRaster objects, write directly the output of the function to disk.

Reformat data

crop(): Modify the extent of a SpatRaster based on another spatial object or a SpatExtent object.
project(): Reproject (and resample) a SpatRaster to a desired coordinate reference system or reference SpatRaster.
subset(): Create a subset of a SpatRaster.
c(): Combine multiple SpatRasters into a single SpatRaster.

Simple visualization

plot(): Plot a SpatRaster. Use add = TRUE to overlay several objects.
plotRGB(): Plot an RGB color composite. Specifying the bands to assign colors as arguments.

Raster calculations

First of all, SpatRasters work just like vectors of numerics (c(1, 2, 3)). They can be subsetted, added, subtracted, etc.
app(): Apply a function to every pixel independently of a single SpatRaster (Single or multi-layer). RAM friendly and can write output directly to disk using the filename = argument.

sits functions

sits_cube(): Download satellite imagery from a cloud platform
sits_cube_copy(): Write the imagery to disk

Extra

The gdalUtilities package provides interesting wrappers facilitating the use of GDAL functions from within R.

Loïc Dutrieux, Jan Verbesselt, Johannes Eberenz, Dainius Masiliūnas, Sabina Rosca

2024-08-30