Linux terminal & Bash
Learning objectives
- Know how to use the terminal
- Run R and Python from the terminal
- Learn the basics of Bash scripting and know how to create a shell script
Using the terminal and Bash
There are two ways to interact with your operating system: a graphical user interface (GUI), where you point and click, and a command-line interface (CLI), where you type commands to make something happen. GUIs are simpler to use, but CLIs are more powerful and faster for some tasks, once you get used to them.
Question 1: What are the advantages of using CLI? Can you think of some examples?
Most Linux distributions come with a terminal, which is a program you use to run CLI programs. You might know the Command Prompt program on Windows: that is a type of terminal. On Linux, there is a variety of terminal applications to choose from. You can start one on your virtual machine by clicking on Show Apps → Terminal.
A terminal is just a gateway to the world of CLIs, but through it you interact with a particular shell (or command interpreter) which speaks a programming language. The default shell on Linux is Bash, and programs written in the Bash language are called Bash scripts. Much like the R console, you can input commands to Bash line by line through the terminal.
Bash scripting, like R or Python, allows multiple commands to be combined, which facilitates automation. A shell script (shell program) is a text file containing commands that are interpreted by the shell (we will learn how to write one below). Commands in a script can be chained, so that the output of one becomes the input of the next. Shell scripts can also contain the constructs common to the majority of programming languages (e.g. variables, logic constructs, looping constructs, functions and comments). The main distinction between shell programs and those written in C, C++ or Java (to name but a few) is that shell programs are not compiled before execution, but are interpreted directly by the shell.
Question 2: What is a shell script?
Bash is not limited to Linux: it is also available on macOS (where it was the default shell until macOS Catalina switched to zsh), and there are versions of Bash that run on Windows too. It is included with Git for Windows, and on Windows 10 it comes with the Windows Subsystem for Linux. However, without the wealth of CLI programs that Linux distributions come with, Bash functionality is fairly limited.
But enough theory: let’s try using the terminal in practice!
Using the terminal
Now, fire up your terminal. You get a line, stating your user name and the machine’s host name. This is called the shell prompt. It means it’s ready for you to enter a command. Let’s try something random. Type in anything, and press enter.
Most likely the system doesn’t have the command you typed in! Random doesn’t work: you need to remember (or look up) commands to use them!
Now, press the up arrow, and you’ll see the previous command reappear. What’s this sorcery? The up arrow key on your keyboard is for accessing the command history. Bash remembers the commands you entered (500 by default, although distributions often raise this limit), so instead of typing them over and over, you can look for them with the up/down arrows. The left and right arrows move the cursor within the current line, so you can edit the text in between. Terminals were designed to work with a keyboard, so you can’t use your mouse to move the cursor, but you can use the Home key to go to the beginning of the line, and the End key to go to the end. There’s another thing: Ctrl+V for pasting text doesn’t work. The paste shortcut is usually something else, e.g. Ctrl+Shift+V or Shift+Insert. You can always paste by right-clicking on the terminal, and the context menu usually shows the keyboard shortcut, so that you don’t need the mouse every time.
Now, so that we don’t get the ‘command not found’ slap in the face, let’s try something simple. Type date.
There you go. Why bother looking at your built-in calendar in the clock, when you can fire up your terminal, type date, and see what day it is! Just kidding, it’s a simple command; the more useful/difficult ones are coming up next. A command related to date is cal: it will display the current month’s calendar.
You may also try free, and it will display the amount of free memory.
Or df (standing for “disk free”), to list free space on your drives.
If you’re already in the type-only mood, you can enter the command exit to get out of the terminal emulator instead of pressing the “x” button.
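As a small sketch of how the output of such simple commands can be reused, the snippet below captures the result of date in a variable and asks df for human-readable sizes. The format string +%Y-%m-%d, the variable name today and the -h option are choices made for this illustration, not something the tutorial prescribes.

```shell
# Capture today's date in ISO format (the format string is an illustrative choice).
today=$(date +%Y-%m-%d)
echo "Today is $today"

# Show free space on the filesystem that holds the current directory,
# with sizes in human-readable units.
df -h .
```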
Command options
Now we know how to move from one directory to another, but how do you know what directories there are for you to move between? ls is a command used to list files and directories in a given directory. You can simply type ls, but you can also add an option, which modifies the behaviour of the command. This comes in useful when you are looking for something specific.
Formally, we can write the general form down like this:
command -option argument
Command is, well, a command we type in (like pwd, ls or anything else we have learned by now).
We already stated the purpose of an option above. Note that it should be written exactly as in the form, with a dash in front of it. So, if the option is l, you should put -l after the command.
An argument is an object upon which the command operates (in this case, it will be directories, as we are learning how to navigate through them).
So, let’s try out ls on the /etc directory in the root of the filesystem, by typing ls /etc. This time, without any options.
There you go, a whole bunch of files. The output is also colour-coded: the blue ones are directories, the white ones are regular files, and the green ones are executable files. There are more colours, as they represent different file types.
Next, you can use the same command, but with the option -l added (ls -l /etc). Option -l lists the same files and directories, but in a long format, in case you need more information.
In the long format, you see much more information, and some crazy-looking signs like -rw-r--r-- at the beginning of every line. The fields in each line mean the following:
File Name is the name of the file. Modification time is the last time the file has been modified. Size is the size of the file in bytes. Group is the name of the group that has file permissions along with the owner, and Owner is the user who owns the file.
The most important one is File Permissions. That’s the gibberish at the beginning of every line in long format. The first character is the file type. If it’s a d, it means the file is actually a directory. If it’s -, it means it’s an ordinary file. The next three characters represent the read, write and execute rights of the file’s owner. The next three are the same rights for the group that also has access to the file, and the last three characters represent the rights of everyone else trying to use the file.
So for example, if a file displays -rw-r--r-- in long format, it means it’s an ordinary file (the first -), the owner of the file can read and write the file but can’t execute it, as it’s not an executable file (the rw- characters after the initial -), and the user group and everyone else can only read the file (you can see the r-- sequence repeating twice). If the user group had rwx instead of r--, it would mean they could read, write and execute the file.
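To see these permission strings change in practice, here is a minimal sketch that works on a throwaway file. It uses chmod, which is only introduced later in this tutorial, and the file name notes.txt and the variable names are made up for the illustration.

```shell
# Work in a throwaway directory so nothing of yours is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/notes.txt"

# 644 = owner may read and write, group and everyone else may only read.
chmod 644 "$demo_dir/notes.txt"
perms=$(ls -l "$demo_dir/notes.txt" | cut -c1-10)
echo "$perms"          # -rw-r--r--

# Give the group read, write and execute rights, as in the rwx case above.
chmod g=rwx "$demo_dir/notes.txt"
perms_group=$(ls -l "$demo_dir/notes.txt" | cut -c1-10)
echo "$perms_group"    # -rw-rwxr--

# A directory shows a leading d; -d lists the directory itself, not its contents.
dir_type=$(ls -ld "$demo_dir" | cut -c1)
echo "$dir_type"       # d

rm -r "$demo_dir"
```

The cut -c1-10 part only keeps the first ten characters of the long listing, which are the file type and the three permission triplets.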
Another useful form is ls -la .. which combines three things. The option -a lists all files, including the hidden ones that a plain ls does not show, the option -l selects the long format, and the argument .. refers to the parent of the working directory.
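The effect of these options is easiest to see on a small directory with known contents. The sketch below creates one visible and one hidden file in a scratch directory (the file names are invented for the example) and lists it in several ways.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"

plain=$(ls "$demo_dir")      # command + argument: hidden files are left out
all=$(ls -a "$demo_dir")     # -a also shows hidden files, plus . and ..
echo "$plain"
echo "$all"

ls -l "$demo_dir"            # long format, visible files only
ls -la "$demo_dir"           # both options combined

rm -r "$demo_dir"
```

Single-letter options can usually be combined behind one dash, which is why -la is equivalent to -l -a.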
Question 3: What is the difference between ls -l, ls -lh and ls -lh --si? Hint: you can run man ls to inspect the meaning of the different options.
Getting information about files
less is a command which will display a text file and let you scroll through it. For example, say you’re looking for the text file os-release in /etc. You have successfully found it there with ls /etc, and now you want to read it. You just use less /etc/os-release.
How do you control less? Easy, with your keyboard! less displays only one page of your text at a time. You can move line by line with the arrow keys. To go forward an entire page, press Page Down. To go back one page, press Page Up. > will take you to the end of the text file, while < will take you to the beginning. /characters will search for characters inside the text (for example, if you write /ubuntu, it will search for occurrences of ubuntu inside your text and mark them). n will go to the next occurrence of the search term, and h will display all your options (h as in help!). You quit less with the letter q.
The name less is a pun on the word more, which is a much more basic tool for displaying a text file and scrolling, because it only allows scrolling down; therefore, less is more than more.
The file command will show what kind of file you are looking at, be it ASCII text, a JPG image, a Bash script etc. As we performed our exercise with /etc/os-release, let’s use it here as well, by running file /etc/os-release.
There you go, now you know what os-release is. Incidentally, it may be either an ASCII text file or a link to one! It depends on your Linux distribution (version). If it’s a link, try to run the command on the linked file. Now try it out with something else, and see the output.
Next, we have the commands type and which. Like file, they give information on the type, but they operate on commands instead of files. which tells you where you can find the executable that is run if you type in a command. Let’s try it on the command file, by running which file.
Now we know that when we run file, Bash executes the program /usr/bin/file. How about cd? Try which cd.
What?! It seems that there is no such executable! This is because cd is built into Bash itself. type is a bit more clever than which and tells you whether a command is an executable file or a command built into Bash. Let’s see what it says about cd, by running type cd.
In some cases, you might have both available. Let’s take a look at the command time, which is used to measure how long a command runs for, by running type time.
It is also part of Bash itself (Bash reports it as a shell keyword). But there is another command called time that is an actual executable, as which time shows.
Because the shell prefers its own keywords and builtins over executables, when you run time you will run the shell’s version rather than the executable version. But you can reach the executable version (which is more feature-rich!) by calling it with the absolute path that which reported.
type and which will come in very handy once we get to Python, as we will have several Python versions installed. They will help determine which version we have active.
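The distinction between builtins and executables can be checked in a script-friendly way with a few options of type. This is a sketch assuming Bash (the options -t, -P and -a are Bash-specific); the variable names are only illustrative. Note that Bash classifies time as a keyword rather than a builtin, which has the same practical effect here: the shell's own version wins over the executable.

```shell
# type -t prints a single word describing how Bash would interpret a name.
kind_cd=$(type -t cd)       # builtin
kind_time=$(type -t time)   # keyword
kind_ls=$(type -t ls)       # file, unless ls is an alias or function in your shell
echo "cd: $kind_cd, time: $kind_time, ls: $kind_ls"

# type -P prints the path of the executable, much like which does.
ls_path=$(type -P ls)
echo "ls lives at $ls_path"

# type -a lists every definition of a name, in the order Bash would try them.
type -a time
```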
File manipulation
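Copying, pasting files, creating directories etc. is probably easier using graphical tools, but, if you’d like to perform more complicated tasks, like copying only .html files from one directory to another, and only copying files that don’t exist in the destination directory, CLI just might come in handy. So, before we start with the commands themselves, let’s take a quick stop at wildcards. They are a set of special characters that help you pick out a set of files based on some simple rules (which characters appear in a file name, how many characters, upper/lower case characters etc.). The most common wildcards are:
- * matches any sequence of characters, including none
- ? matches exactly one character
- [characters] matches one character from the given set
- [!characters] matches one character that is not in the given set
- [[:class:]] matches one character from the given class, such as [[:upper:]] or [[:digit:]]
And here are a few examples:
- *.html matches all file names ending in .html
- g* matches all file names beginning with g
- Data??? matches file names beginning with Data followed by exactly three characters
- [abc]* matches file names beginning with a, b or c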
If you use a command with an argument containing a filename, you can use wildcards with no problem. Bash will go ahead and expand the wildcard into a set of all matching filenames, and the command will actually receive a set of files and not the wildcard string.
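The expansion described above can be observed directly by storing the matches in Bash arrays. The following is a sketch on a scratch directory; the file names and array names are invented for the example.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir"/{index.html,about.html,notes.txt,Data1.csv,data2.csv}

# * matches any run of characters: both .html files.
html_files=("$demo_dir"/*.html)
# ? matches exactly one character; matching is case-sensitive, so Data1.csv is skipped.
lower_data=("$demo_dir"/data?.csv)
# [Dd] matches one character from the set: both CSV files.
any_data=("$demo_dir"/[Dd]ata?.csv)
# [[:upper:]] matches one uppercase letter: only Data1.csv.
upper_first=("$demo_dir"/[[:upper:]]*)

echo "${#html_files[@]} html, ${#lower_data[@]} lowercase data, ${#any_data[@]} data, ${#upper_first[@]} capitalised"

# The command only ever sees the expanded names, never the pattern itself.
printf '%s\n' "${html_files[@]##*/}"

rm -r "$demo_dir"
```

Because Bash expands the pattern before the command starts, printf here receives two separate file names rather than the string *.html.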
cp is used to copy files or directories. It is easy to use: cp file1 file2 copies the single file file1 to file2, and cp file1 file2 ... directory copies one or more files into the directory specified. File names are interpreted relative to your current working directory, unless you give a full path.
We can use mv to rename a file or directory, or to move a file or directory. We can use it this way: mv filename1 filename2 if we want to rename filename1 to filename2, or mv file directory if we want to move file to directory.
The rm command removes/deletes files and directories. Usage is pretty straightforward: rm file or rm -r directory. But do be careful when using rm, as there is no undelete option (the file is erased and doesn’t go to the bin), so be extra careful not to inflict unwanted damage to your system!
mkdir is used for creating directories. Now, create a directory called Bash (i.e. a directory that will contain our Bash scripts) by running mkdir Bash. If you then run ls, the new Bash directory should show up in the listing.
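The file manipulation commands of this section can be tried safely in a scratch directory, so that nothing in your home directory is affected. This is only a sketch: the directory comes from mktemp, and the file names a.txt, b.txt and c.txt are placeholders.

```shell
work=$(mktemp -d)                    # scratch directory, removed at the end
mkdir "$work/Bash"                   # create a directory
echo "draft" > "$work/a.txt"         # create a small file

cp "$work/a.txt" "$work/b.txt"       # copy a.txt to b.txt
mv "$work/b.txt" "$work/c.txt"       # rename b.txt to c.txt
mv "$work/c.txt" "$work/Bash/"       # move c.txt into the Bash directory
rm "$work/a.txt"                     # delete a file (there is no undo!)

ls -R "$work"                        # list what is left, recursively
moved_content=$(cat "$work/Bash/c.txt")
echo "$moved_content"

rm -r "$work"                        # delete the whole directory tree
```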
Now, try out the commands that you learned:
- make a directory and remove it (e.g. mkdir namedirectory and rmdir namedirectory or rm -r namedirectory).
- create an R script via rstudio or rkward, and then remove it via the terminal using rm filename.R.
- create another file, copy it and then remove it, etc.
- use the ls command and its options.
Tip: Bash has a feature called Tab-completion. If you start writing a command or filename, pressing the Tab key a couple of times will give a list of suggestions for auto-completion. This is super-handy, as you rarely need to type filenames in full. In addition, you can recall the last commands you entered by using the up arrow key. Lastly, you can always open multiple terminals, even in tabs, by using File → Open Tab.
To recap so far, here’s a list of the most common commands:
- pwd: show your current working directory
- cd: change directory
- cd ..: move up one directory
- mkdir: create a directory
- rm or rm -R: delete files or directories
- sudo: run programs as root (administrator/super-user), which may ask for your user password
- ls: list files in a directory
- cp: copy files, e.g. for backing things up
We will use these commands in the scripts below.
How to find help
Documentation and manuals
Almost every command has documentation that comes with it. So you’re somewhere doing your CLI thing, with no access to the internet so you can’t bug people on the forums or IRC, and you need to find out exactly how to use a command. You can do it in two ways. The first is the command help. The help command works with shell builtins, and not with executable files. So you can pick a shell builtin, like cd or time, and simply type help cd or help time. You’ll get a helpful page printed out in your terminal, so go ahead and read what it has to offer.
The help page shows in what ways you can use the command and what options you can use. Options in square brackets are optional. If there’s a vertical separator inside the square brackets, the options mentioned are mutually exclusive: don’t use them together!
help works only for the shell builtins. But most executables provide an option --help. As far as usage goes, it’s similar to help, but you have to type --help after the command you want to inspect. For example, ls --help prints a summary of the options of ls.
However, --help is just a convention, which programs are not obliged to follow. Sometimes the option is called -h, and sometimes it is not present at all.
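The two help mechanisms can be compared side by side. The sketch below assumes Bash (for the help builtin) and a GNU version of ls (for --help); head is used only to keep the output short.

```shell
# help is a Bash builtin and documents other builtins such as cd.
help_cd=$(help cd)
echo "$help_cd" | head -n 3

# Most executables answer to --help instead; GNU ls is assumed here.
ls --help | head -n 5
```

The first line printed by help cd is the usage synopsis, where the square brackets and vertical bars described above can be seen in action.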
To get more information about how to use a command, most executables come with a formal documentation page. Distributions often mandate the inclusion of a manual page for every package, so the manual page is the most useful source of information. You can inspect the manual page using the man command. You just enter man program, and see what it prints out. Pick any program on your computer, and try it out. For example, let’s try man which. You get a file opened, split into categories. It gives you information on what the program is, what it does, how you can use it etc., but it doesn’t offer examples, as it’s not a tutorial.
Manual pages are text files displayed in a pager program that allows easy scrolling. The default pager is less, which you have already used in the third exercise. You can also look at its manual page using man less. Also try man intro: the “Introduction to user commands”, a well-written, fairly brief introduction to the Linux command line.
Optional: You can also read the Ubuntu documentation on CLI to learn more, and let us know if you have questions about some commands.
Online resources
Great, now we know how to find help about specific commands! But how do we know how and what to write in the first place? Even the most experienced programmers run into these questions, so it’s important to know how to find answers to them.
There are many places where help can be found on the internet. So in case the documentation is not sufficient for what you are trying to achieve, a search engine like Google is your best friend. Most likely by searching the right key words relating to your problem, the search engine will direct you to online documentation, a tutorial, or to some discussions on Stack Exchange. It is quite likely that the problem you are trying to figure out has already been answered before, and using these resources you should be able to solve your particular problem as well. However, you need to be critical about the information you find on the internet, as it may refer to old versions of the software you are using, or it may provide a workaround but not a real solution to the problem. And, of course, some of the solutions may simply not work for you.
ChatGPT and generative AI
Another type of online resource that has recently been gaining in popularity is generative AI, such as ChatGPT. You interact with generative AI models by asking them questions, including questions about programming. The AI responds by providing examples of code, explanations about what the code does, and how to run it. Of course, most AI solutions are not limited to code and will also answer questions on history, biology, quantum mechanics, and will even play Dungeons and Dragons with you, including throwing dice.
Generative AI models can be a great tool to enhance learning, as they can quickly answer specific questions and give coding suggestions. However, many of the limitations of web search apply to generative AI models as well (in fact, most of these models are something of a smart web search engine, as they are trained on a lot of text found on the internet). Therefore, you need to be very critical of AI-generated answers. The code that the AI generates may seem like it would solve your problem, but it may also do something incorrectly, such as calling functions that are no longer available, or even making them up altogether. Previously, many generative AI solutions were unable to provide references for their statements, and when asked provided a list of references and links that did not exist in reality, though this has improved in the past months. They may also answer questions completely wrong, but the explanation that they provide usually sounds quite convincing, therefore it may mislead you or make you second-guess yourself. When generative AI models are confronted about a wrong answer, they often insist that it is correct, and the longer you talk with a generative AI, the more it will get facts mixed up with its own previous answers, as it remembers and learns from its own output.
Generative AI tools can be chatbots, like ChatGPT, but they can also be tools that suggest code snippets as you write code, such as GitHub Copilot. The AI code suggestions are based on the same models and have the same pitfalls. But in addition, they may suggest code that was taken from software whose license is incompatible with the license of your own code, which could cause copyright issues. Some of the newer code suggestion models are able to provide references to where the code is sourced from, and the license it is under.
Some of the currently active generative AI tools are:
- ChatGPT - the original chatbot that started the generative AI trend. Made by a team of top AI researchers that formed into a company. The GPT-4 version is able to provide sources and references if you specify this in the prompt. It also allows the usage of custom GPTs, better suited to specific tasks, like the Python and R Wizard GPTs.
- Perplexity - an alternative chatbot built using the GPT-4o and Anthropic’s Claude 3.5 models that is able to provide references for its statements (and you can even pick which ones it uses to give you answers). However, it still gives biased output and may get confused with its own answers.
- Microsoft Copilot - Microsoft’s counterpart to ChatGPT, also built on the GPT-4 model. It can also provide references for you if you ask. It generally allows you to access the newest GPT version for free, instead of having to pay for a Plus subscription on the ChatGPT website.
- Google Gemini - Google’s counterpart to ChatGPT. It has recently improved a lot compared to Google’s older models. It will also provide sources or references, if you ask for them. It can integrate with your Google Workspace (Gmail, Drive, YouTube); for example, it’s a great way to summarize a YouTube tutorial into bullet points or clear steps.
- Amazon CodeWhisperer - code suggestion AI, free to use, but works only with some code editors.
Note that all of these generative AI tools are built on proprietary models, but there are open source alternatives such as Meta’s Llama 3.1 that you can use for your own applications.
Every day, more generative AI tools become available, increasingly embedded in the tools we use, like Google Gemini replacing ‘Ok Google’ on your phone and some web-browsers shipping with built-in models, like Brave browser shipping with Leo AI based on MistralAI’s Mixtral model. Despite the fact that generative AI is increasingly embedded, it remains important to check whether the results it provides are factual, accurate and if the output is compatible with the license of your own code.
Question and answer forums
However, it may also happen that you discover a bug or something that you would qualify as abnormal behavior, or that you really have a question that no one has ever asked (and that has therefore never been answered). In that case, you may submit a question to an appropriate Stack Exchange site (e.g. Unix & Linux for Bash questions), or contact the author of the package you are using (often by filing an issue on the package’s GitHub page).
Stack Exchange has a few rules, and it’s important to respect them in order to ensure that:
- no one gets offended by your question,
- people who are able to answer the question are actually willing to do so,
- you get the best quality answer.
So, when posting to Stack Exchange:
- Be courteous.
- Provide a brief description of the problem and why you are trying to do that.
- Provide a reproducible example that illustrates the problem, reproducing the eventual error.
- Do not expect an immediate answer (although well presented questions often get answered fairly quickly).
Reproducible examples (reprex)
Indispensable when asking a question to the online community, being able to write a reproducible example has many advantages:
- It may ensure that when you present a problem, people are able to answer your question without guessing what you are trying to do.
- Reproducible examples are not only for asking questions; they may help you in your thinking, developing or debugging process when writing your own functions.
- For instance, when developing a function to do a certain type of raster calculation, start by testing it on a small subset file, and not directly on your actual data that might be covering the whole world.
One could define a reproducible example by:
- A piece of code that can be executed by anyone who can run the programming language you are using, independently of the data present on their machine or any preloaded variables.
- The computation time should not exceed a few seconds and if the code automatically downloads data, the data volume should be as small as possible.
So basically, if you can quickly start a terminal on your neighbour’s computer while they are on a break, copy-paste the code without making any adjustments and see almost immediately what you want to demonstrate: congratulations, you have created a reproducible example.
Let’s illustrate this by an example.
I want to move all directories with Star Wars film subtitles to the directory ../starwars, but not move any of the Star Trek directories. Here is a piece of code that can recreate my directory structure:
mkdir -p films/{"the phantom menace","attack of the clones","revenge of the sith","a new hope","the empire strikes back","return of the jedi",\
"the motion picture","the wrath of khan","the search for spock","the voyage home","the final frontier","the undiscovered country","generations","first contact","insurrection","nemesis"} starwars
cd films
# I tried this, but it did not move the phantom menace, a new hope and the empire strikes back
mv *\ t* ../starwars
As you can see from this example, the problem is reproduced on any computer that is running Bash, and the changes are restricted to creating two directories, namely films and starwars, which are easy to clean up afterwards.
Package installation and management
One of the greatest advantages of Linux distributions over other OSs is the package manager. Even if you never used Linux before, you are probably already using a package manager on your mobile device: The App Store, Google Play Store and Windows Store are all package managers, modelled after the Linux ones. A package manager is a central system for downloading, installing and removing software.
Each major Linux distribution has its own package manager, which is aware of all packages maintained by the distribution. These packages are tested and are known to work with that particular distribution version, so the package manager is the first place to look for installing additional software. The package manager is typically a command-line program, although some distributions also have GUI interfaces for it.
Ubuntu uses APT (Advanced Package Tool) as the package manager. Here is a short list of the most useful package manager commands on Ubuntu:
- apt search packagename: Search for a package called “packagename”.
- apt list package*: List all packages starting with “package”.
- sudo apt install packagename: Install or update a package. This changes system files and therefore requires administrator privileges (sudo).
- sudo apt remove packagename: Uninstall a package.
- See man apt for more.
For instance, if you run apt list chrom*, one of the results will be chromium-browser. It’s Chromium, the open-source version of Google Chrome. You can install it by running sudo apt install chromium-browser. Similarly, the Ubuntu package repository contains a lot (but not all) of R packages (they are prefixed with r-cran-) and Python packages (prefixed with python3-; the ones prefixed with python- are for Python 2, which is deprecated). If a package is available in the distribution repository, it is almost always better to use that instead of using a package manager built into the language (install.packages in R and easy_install/pip/conda in Python).
The aforementioned commands are specific to the Debian family of Linux distributions (of which Ubuntu is a member). In other distributions, the package manager syntax is different, but the result is the same. For instance, in openSUSE the equivalent commands would be zypper search, sudo zypper install and sudo zypper remove.
Whenever a package is not included in the distribution repository, one option is to look for additional software sources. Ubuntu allows users to maintain their own packages through a system called Personal Package Archives (PPA). However, these packages are not tested and are not guaranteed to work, or could even cause problems in the system, so you have to be careful. Other distributions also have their own third-party repository systems: openSUSE uses Open Build Service, Fedora uses Copr etc.
If a package doesn’t exist in third party repositories either, there is often the possibility to download the source code of a package and compile it. It is common for cross-platform software vendors to provide installers for Windows and source code for Linux. However, compiling from source yourself should only be done as the very last resort; in fact, it is often easier and safer to create a package yourself than to try to build it from source!
Starting R or Python from the terminal
Starting and stopping R from the terminal (this is the same as the R console you know from RStudio/RKWard):
Starting and stopping Python from the terminal:
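As a sketch of both cases: an interactive session is typically started by typing the interpreter's name at the prompt and ended with its quit function, while a single expression can also be evaluated without entering the session at all. The interpreter names R, Rscript and python3 are assumed to be the ones installed on your system; the checks with command -v simply skip an interpreter that is missing.

```shell
# Interactive use (type these at the prompt):
#   R          starts R;       q()     quits R again
#   python3    starts Python;  exit()  quits Python again

# Non-interactive equivalents that evaluate one expression and return to Bash.
if command -v python3 >/dev/null 2>&1; then
    py_out=$(python3 -c 'print(1 + 1)')
    echo "Python says: $py_out"
fi
if command -v Rscript >/dev/null 2>&1; then
    r_out=$(Rscript -e 'cat(1 + 1)')
    echo "R says: $r_out"
fi
```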
Scripting in the terminal
Hello, world Bash script
Bash is primarily a scripting language, so it would be a crime not to talk about scripting. Let’s dive straight in with a Bash script. More precisely, the infamous “Hello World” script.
You can create a Bash script by opening your favorite text editor, writing your script and saving it. Typically the .sh file extension is used for your own reference, but it is not required; in our example, we will be using the .sh extension.
So let’s get started. First, create a simple text file, call it HelloWorld.sh and save it in the Bash directory you just created. Add two lines to it: the shebang line #!/bin/bash, followed by a line with echo "Hello, World". You can use the gedit editor, or use rstudio or rkward as a sort of text editor. In fact, rstudio makes it rather convenient to edit Bash scripts, exactly the same way as R scripts, including the ability to run commands line by line. It is also worth noting that there are even command-line text editors, like nano, which are useful for editing files that require administrative privileges.
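Judging from the shebang and the greeting mentioned in this section, the contents of HelloWorld.sh are presumably along the following lines. The comment line is an addition for illustration.

```shell
#!/bin/bash
# HelloWorld.sh: print a greeting to the terminal.
echo "Hello, World"
```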
The first line of the script just defines which interpreter to use (and where it is located). That’s it, simple as that!
Note: There must be no leading whitespace before #!/bin/bash, and you cannot add any comments before it. This shebang should be the very first thing in the file.
To find out where your bash interpreter is located, type which bash in the terminal (this also works in a Mac terminal!).
Second, to run a Bash script, you have two options. The first is to set the correct file permissions. We do this with the chmod (change mode) command in the terminal, by running chmod u+x HelloWorld.sh. This needs to be done only once per file.
Optional: More info about chmod for your future reference. Note: today is just an introduction to let you know what is possible, so that you can find your way more easily in the future.
In this case, we can then proceed to run the script directly, by typing ./HelloWorld.sh.
Alternatively, we can specify which interpreter to use, and pass the file name to the interpreter, as in bash HelloWorld.sh. This option does not require changing file permissions.
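Both ways of running a script can be tried on a copy in a scratch directory, which leaves your own Bash directory untouched. In this sketch the script is written with printf instead of a text editor, purely so that the example is self-contained; the variable names are made up.

```shell
work=$(mktemp -d)
script="$work/HelloWorld.sh"
printf '%s\n' '#!/bin/bash' 'echo "Hello, World"' > "$script"

# Passing the file to the interpreter works without execute permission.
via_bash=$(bash "$script")
echo "$via_bash"

# Running the file directly requires execute permission for its owner first.
chmod u+x "$script"
direct=$("$script")
echo "$direct"

rm -r "$work"
```

If you try to run the file directly before the chmod step, Bash refuses with a "Permission denied" message.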
Below is a summary of what we have done in the terminal:
echo "Go to the Bash directory"
cd Bash
echo "Check that the file is there using the ls command:"
ls
echo "Then change the permissions:"
chmod u+x HelloWorld.sh
echo "We can now run our first Bash script:"
./HelloWorld.sh
Hopefully you have seen it print Hello, World onto your screen. If so, well done! That is your first Bash script.
Question 4: In the first option above, why do we add ./ in front of the Bash script name? What happens if you don’t? Why?
Note: we can also run Bash code from R using the system() function, which can invoke an OS command:
# R code
setwd("Bash/") # Set the working directory in R
print(system("./HelloWorld.sh", intern = TRUE)) # Execute this command in Bash
Note: And vice versa, we can run an R script from the terminal, typically by passing the script’s file name to the Rscript command.
Bash script with a variable
Variables basically store information. You set a variable by typing, for example, var="FOO" in the terminal.
var can be anything you want as long as it doesn’t begin with a number. “FOO” can be anything you want. There cannot be any space on either side of the = sign! To access the information stored in the variable, you need to put a ‘$’ in front of its name, as in echo $var.
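A minimal sketch of setting and reading a variable, using the names var and FOO from the text. The braces and the single-quote line go slightly beyond what is described above and are included to show two common pitfalls; combined is a name made up for the example.

```shell
var="FOO"             # no spaces around the = sign
echo "$var"           # prints FOO
echo "${var}bar"      # braces mark where the name ends: prints FOObar
echo '$var'           # single quotes suppress expansion: prints $var

combined="${var}bar"  # variables can be used to build other variables
echo "$combined"
```

Writing var = "FOO" with spaces would make Bash look for a command called var, which is why the spaces are not allowed.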
Now create the following script, called e.g. variables.sh, in the Bash directory, and apply the chmod u+x variables.sh command to it using the terminal.
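#!/bin/bash
# Clear the screen first, so that the messages below stay visible
clear
echo "Now with the read function"
echo "Please enter your name"
# read stores the line typed by the user in the given variable
read -r name
echo "Please enter your age"
read -r age
echo "So you're a $age year old, called $name"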
You can run the script once it is executable, by typing ./variables.sh.
Question 5: Try it out yourself, and try to do a calculation of e.g. a + b as input variables. Hint: Shell-tips
Optional: If you want to learn more about Bash scripting: https://help.ubuntu.com/community/Beginners/BashScripting
For the next section, let’s download a file from the Intro to raster tutorial. Manually download the gewata.zip file from GitHub (link) and unzip it in a data directory that you create; you should then have a .TIF file. Then navigate to this directory in your terminal.
Using the GDAL library from the terminal
GDAL is a very powerful and fast processing library written in C/C++ for raster and vector geospatial data formats. Now via the terminal we can access GDAL directly! E.g. we can check out what the current version of GDAL is that is installed on our Linux OS. We will learn more about GDAL in the later tutorials.
Type the following in the data directory: (Note: You can write a shell script to run the commands below, but first type in the commands via the terminal to understand what is happening.)
One of the easiest and most useful commands in GDAL is gdalinfo. When given an image as an argument, it retrieves and prints all relevant information that is known about the file. This is especially useful if the image contains additional tag data, as is the case with TIF files.
Using gdalinfo:
You should now see some information about the raster file, for example the coordinate system, the cell size, and some statistics about the raster bands.
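The commands for this step were probably along the following lines; treat this as a sketch rather than the tutorial's exact listing. It first checks that gdalinfo is installed, prints the GDAL version, and then inspects every .tif file found in the current directory. The variable gdal_status is only there to record which branch was taken.

```shell
if command -v gdalinfo >/dev/null 2>&1; then
    gdal_status="available"
    # Print the installed GDAL version.
    gdalinfo --version
    # Inspect every GeoTIFF in the current directory, if there is one.
    for tif in *.tif; do
        [ -e "$tif" ] || continue   # skip the literal pattern when nothing matches
        gdalinfo "$tif"
    done
else
    gdal_status="missing"
    echo "gdalinfo not found; GDAL does not seem to be installed"
fi
```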
Now let’s calculate the NDVI via the terminal, using the gdal_calc.py script that comes with GDAL. See GDAL_calc for more information.
cd data
cp LE71700552001036SGS00_SR_Gewata_INT1U.tif input.tif
echo "* all files in the directory"
ls
echo "* now apply gdal_calc: Command line raster calculator with numpy syntax"
gdal_calc.py -A input.tif --A_band=4 -B input.tif --B_band=3 --outfile=ndvi.tif --calc="(A.astype(float)-B)/(A.astype(float)+B)" --type='Float32'
echo "* remove the input temporary file"
rm input.tif
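The --calc expression above computes, for every pixel, the difference between band 4 (near-infrared on Landsat 7) and band 3 (red), divided by their sum. The conversion with astype(float) matters because the input holds unsigned 8-bit integers, for which a subtraction could otherwise wrap around instead of becoming negative. As a worked example of the same formula for a single pixel, here is a sketch with made-up band values, using awk because Bash itself only does integer arithmetic.

```shell
nir=50   # hypothetical band 4 (near-infrared) value
red=30   # hypothetical band 3 (red) value

# NDVI = (NIR - Red) / (NIR + Red); LC_ALL=C keeps the decimal point a dot.
ndvi=$(LC_ALL=C awk -v nir="$nir" -v red="$red" \
    'BEGIN { printf "%.2f", (nir - red) / (nir + red) }')
echo "NDVI = $ndvi"   # (50 - 30) / (50 + 30) = 0.25
```

NDVI values always fall between -1 and 1, which is a useful sanity check on the output raster.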
Question 6: Try to calculate the NDVI using the lines above in a nice and short shell script.
- Hint 1: use cd .. to move to the parent directory
- Hint 2: No spaces in file names are allowed and try to use variables, e.g. fn=$(ls *.tif)
Let’s now check if the range of the NDVI values makes sense, and make a nice script in a separate file (this will work only if you have one .TIF file in the data directory):
#!/bin/bash
echo "teamname"
echo "Current date"
echo "Calculate Landsat NDVI"
mkdir -p output
# Resolve the pattern to the actual file name (assumes exactly one .tif file in data)
fn=$(ls data/*.tif)
echo "The input file(s): $fn"
outfn=output/ndvi.tif
echo "The output file: $outfn"
echo "calculate ndvi"
# Band 4 is near-infrared and band 3 is red; quoting protects the file names
gdal_calc.py -A "$fn" --A_band=4 -B "$fn" --B_band=3 --outfile="$outfn" --calc="(A.astype(float)-B)/(A.astype(float)+B)" --type='Float32'
echo "look at some histogram statistics"
gdalinfo -hist -stats "$outfn"
More info here on the power of GDAL via the terminal: GDAL_website and gdalinfo
Handy functions are (See the examples at the bottom):
Optional: More info about Bash basics from GNU.
More information
- Ryan’s Linux & Bash Tutorial goes into more depth but is also very handy to reference, and includes a cheat sheet with commands.
- Hands-on introduction to bash basics for beginners
- A great bash scripting tutorial
- Basic terminal usage and installing software
- Beginners guide to nano, the linux command line text editor
- Learning the shell
- How to use pwd command in Linux
- For macOS users and introduction to use the terminal on macOS: