Linux terminal & Bash
Learning objectives
- Know how to use the terminal
- Run R and Python from the terminal
- Learn the basics of Bash scripting and know how to create a shell script
Using the terminal and Bash
There are two ways to interact with your operating system: a graphical user interface (GUI), where you point and click, and a command-line interface (CLI), where you type commands to make something happen. GUIs are simpler to use, but CLIs are more powerful and faster for some tasks, once you get used to them.
Question 1: What are the advantages of using CLI? Can you think of some examples?
Most Linux distributions come with a terminal, which is a program you use to run CLI programs. You might know the Command Prompt program on Windows: that is a type of terminal. On Linux, there is a variety of terminal applications to choose from. You can start one on your virtual machine by clicking on Show Apps → Terminal. This will look like:
A terminal is just a gateway to the world of CLIs, but through it you interact with a particular shell (or command interpreter) which speaks a programming language. The default shell on Linux is Bash, and programs written in the Bash language are called Bash scripts. Much like the R console, you can input commands to Bash line by line through the terminal.
Bash scripting, like R or Python, lets you combine multiple commands and so automate tasks. A shell script (or shell program) is a text file containing commands that are interpreted by the shell (we will learn how to write one below). Within a script, commands can be chained so that the output of one becomes the input of the next. Shell scripts can also use the building blocks common to most programming languages, e.g. variables, logic constructs, looping constructs, functions and comments. The main distinction between shell programs and programs written in C, C++ or Java (to name but a few) is that shell programs are not compiled before execution, but are interpreted directly by the shell.
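To make the idea of chaining commands concrete, here is a minimal sketch of a pipe, assuming only the standard printf, sort and head tools; the fruit names are made-up sample data. The | symbol sends the output of the command on its left to the command on its right.

```shell
# printf produces three lines of made-up sample text,
# sort orders them alphabetically, and head keeps only the first line.
first_fruit=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "$first_fruit"
```

The $( ) around the pipeline captures its output in a variable, so the result can be reused later in a script.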
Question 2: What is a shell script? What is Bash? And what is a terminal?
Bash is the default shell on most Linux distributions. It also ships with macOS, where it was the default shell until macOS Catalina switched to zsh, and there are versions of Bash that run on Windows too. It is included with Git for Windows, and in Windows 10 Bash is even included by default with the Windows Subsystem for Linux. However, without the wealth of CLI programs that Linux distributions come with, Bash functionality is fairly limited.
But enough theory: let’s try using the terminal in practice!
Using the terminal
Now, fire up your terminal. You get a line, stating your user name and the machine’s host name. This is called the shell prompt. It means it’s ready for you to enter a command. Let’s try something random. Type in anything, and press enter.
Most likely the system doesn’t have the command you typed in! Random doesn’t work: you need to remember (or look up) commands to use them!
Now, press the up arrow, and you’ll see the previous command reappear. What’s this sorcery? The up arrow key accesses the command history. Bash remembers the commands you entered (by default the last several hundred; the exact limit is configurable), so rather than typing them over and over, you can step through them with the up/down arrows. The left and right arrows move the cursor within the current line, so you can edit the text in between. Terminals were designed to work with a keyboard, so you can’t use your mouse to move the cursor, but you can use the Home key to go to the beginning of the line and the End key to go to the end. There’s another thing: Ctrl+V for pasting text doesn’t work. You can set it up as a shortcut, but the default is usually something else, e.g. Ctrl+Shift+V or Shift+Insert. You can always paste by right-clicking in the terminal, and the context menu usually shows the keyboard shortcut, so that you don’t need the mouse every time.
Now, for us not to get the ‘command not found’ slap to the face, let’s try something simple. Type date.
There you go. Why bother looking at your built-in calendar in the clock, when you can fire up your terminal, type date, and see what day it is! Just kidding, it’s a simple command; the more useful/difficult ones are coming up next. A command related to date is cal, which displays the current month’s calendar.
You may also try free, and it will display the amount of free memory.
Or df (standing for “disk free”), to list free space on your drives.
If you’re already in the type-only mood, you can enter the command exit to get out of the terminal emulator instead of pressing the “x” button.
Command options
Now we know how to move from one directory to another, but how do you know what directories there are for you to move between? ls is a command used to list files and directories in a given directory. It can be used in various ways, which you select by adding an option to the command. To make things clearer: you can simply type ls. But you can also add an option, which will modify the behaviour of the command. This comes in useful when you are looking for something specific.
That’s what an option is. And formally we can write it down like this:
command -option argument
Command is, well, a command we type in (like pwd, ls or anything else we have learned by now).
We already stated above the purpose of an option. But note that it should be written exactly as in the form above, with a dash in front of it. So, if the option is l, you should put -l after the command.
An argument is an object upon which the command operates (in this case, it will be directories, as we are learning how to navigate through them).
So, let’s try out ls, and use it on the /etc directory in the root of the filesystem. This time, without any options.
There you go, a whole bunch of files. The listing is also colour-coded by file type. With the default colour scheme, directories are blue, regular files are white and executable files are green. There are more colours, as they represent different file types.
Next, you can use the same command, but with the option -l added. The option -l will list the same files and directories, but in a long format, in case you need more information:
So, using the long format, you see much more information, and some crazy looking signs like -rw-r--r-- at the beginning of all lines. Actually, here’s a scheme, representing what all of the given information actually means:
File Name is the name of the file. Modification time is the last time the file has been modified. Size is the size of the file in bytes. Group is the name of the group that has file permissions along with the owner, and Owner is the user who owns the file.
The most important one is File Permissions. That’s the gibberish at the beginning of every line in long format. The first character is the file type. If it’s a d, it means the file is actually a directory. If it’s -, it means it’s an ordinary file. The next three characters represent the read, write and execute rights of the file’s owner. The next three are the same rights for the group that also has access to the file, and the last three characters represent the rights of everyone else trying to use the file.
So for example, if a file displays -rw-r--r-- in long format, it means it’s an ordinary file (the first -), the owner of the file can read and write the file but cannot execute it, as it’s not an executable file (the rw- characters after the initial -), and the user group and everyone else can only read the file (you can see the r-- sequence repeating twice). If the user group had rwx instead of r--, it would mean they could read, write and execute the file.
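The permission string can be tried out safely on a throwaway file. The sketch below assumes the chmod command, which is only introduced later in this tutorial, and uses the numeric mode 644 as a shorthand for rw-r--r--; the variable names are made up for the illustration.

```shell
# Create a throwaway file so that nothing important is touched
perm_file=$(mktemp)

chmod 644 "$perm_file"                        # owner: rw-, group: r--, others: r--
before=$(ls -l "$perm_file" | cut -c1-10)     # keep only the file type and permission characters
echo "$before"

chmod u+x "$perm_file"                        # additionally allow the owner to execute the file
after=$(ls -l "$perm_file" | cut -c1-10)
echo "$after"

rm "$perm_file"                               # clean up
```

Only the fourth character changes between the two listings: the owner’s - becomes an x.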
Another useful combination is ls -la .. where the -a option lists all files, including the hidden files that a plain ls does not show, and .. stands for the parent of the working directory. So this command lists all files in the parent of the working directory in long format.
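A small sketch of what “hidden” means in practice: on Linux, a file is hidden simply because its name starts with a dot. The directory and file names below are made up, and everything happens in a temporary directory.

```shell
list_dir=$(mktemp -d)                                  # throwaway directory
touch "$list_dir/visible.txt" "$list_dir/.hidden.txt"  # one normal and one hidden file

plain_listing=$(ls "$list_dir")       # the hidden file is left out
full_listing=$(ls -a "$list_dir")     # the hidden file is included, along with . and ..

echo "$plain_listing"
echo "$full_listing"

rm -r "$list_dir"                     # clean up
```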
Question 3: What is the difference between ls -l, ls -lh and ls -lh --si? Hint: try running each command in the terminal and observe the differences in the output. You can also run man ls to inspect the meaning of different options.
Getting information about files
less is a command which will display a text file and let you scroll through it. For example, say you’re looking for the text file os-release in /etc. You have successfully found it there with ls /etc, and now you want to read it. You just use less /etc/os-release.
How do you control less? Easy, with your keyboard! less will display only one page of your text at a time. You can move line by line with the arrow keys. To go forward an entire page, you can press Page Down. To go back one page, you can use Page Up. > will take you to the end of the text file, while < will take you to the beginning of the text. /characters will search for characters inside the text (for example, if you write /ubuntu, it will search for occurrences of ubuntu inside your text and mark them). n will go to the next occurrence of the search term, and h will display all your options (h as in help!). You quit less with the letter q.
The name less is a pun on the word more, which is a much more basic tool for displaying a text file and scrolling, because it only allows scrolling down; therefore, less is more than more.
The file command will show what kind of file it is that you’re looking at, be it ASCII text, a JPEG image, a Bash script etc. As we performed our exercise with /etc/os-release, let’s use it here also.
There you go, now you know what os-release is. Incidentally, it may be either an ASCII text file or a link to one! It depends on your Linux distribution (version). If it’s a link, try to run the command on the linked file. Now try it out with something else, and see the output.
Next, we have the commands type and which. Like file, they give information on the type, but they operate on commands instead of files. which tells you where you can find the executable that is run if you type in a command. Let’s try it on the command file:
Now we know that when we run file, Bash executes the program /usr/bin/file. How about cd?
What?! It seems that there is no such executable! This is because cd is built into Bash itself: it has to change the working directory of the very shell you are typing in, which a separate program could not do. type is a bit more clever than which and tells you whether a command is an executable file, or a command built into Bash itself. Let’s see what it says about cd:
In some cases, you might have both available. Let’s take a look at the command time that is used to measure how long a command runs for:
It is also built into Bash itself. But there is another command called time that is an actual executable:
Because the shell prefers its built-in version over executables, when you run time you will run the built-in version, rather than the executable version. But you can reach the executable version (which is more feature-rich!) by calling it with its absolute path:
type and which will come in very handy once we get to Python, as we will have several Python versions installed. They will help determine which version we have active.
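The distinction between built-in commands and executables can be checked in a script as well. This is a hedged sketch that relies on two Bash-specific options of type: -t, which prints a single word describing the kind of command, and -P, which prints the path of the executable (much like which). Note that Bash labels time as a “keyword”, its term for words built into the shell syntax itself.

```shell
# type -t prints one word describing how Bash will treat a name
cd_kind=$(type -t cd)        # cd is a builtin
time_kind=$(type -t time)    # time is a reserved word handled by Bash itself
ls_kind=$(type -t ls)        # ls is normally an executable file (unless an alias is defined)
echo "cd: $cd_kind, time: $time_kind, ls: $ls_kind"

# type -P prints the path of the executable that would be run
ls_path=$(type -P ls)
echo "$ls_path"
```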
File manipulation
Copying, pasting files, creating directories etc. is probably easier using graphical tools, but, if you’d like to perform more complicated tasks, like copying only .html files from one directory to another, and only copying files that don’t exist in the destination directory, CLI just might come in handy. So, before we start with the commands themselves, let’s take a quick stop at wildcards. They are a set of special characters that help you pick out a set of files based on some simple rules (which characters appear in a file name, how many characters, upper/lower case characters etc.). Here’s the table:
And here are a few examples:
If you use a command with an argument containing a filename, you can use wildcards with no problem. Bash will go ahead and expand the wildcard into a set of all matching filenames, and the command will actually receive a set of files and not the wildcard string.
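As a hedged illustration of wildcard expansion, the sketch below creates a few made-up files in a temporary directory and counts what three common wildcards match: * (any run of characters), ? (exactly one character) and [a-d] (one character from the given range). Bash arrays are used here only as a convenient way to count the matches.

```shell
glob_dir=$(mktemp -d)                       # throwaway directory
touch "$glob_dir/index.html" "$glob_dir/about.html" "$glob_dir/notes.txt" \
      "$glob_dir/data1.csv" "$glob_dir/data2.csv" "$glob_dir/data10.csv"

html_files=("$glob_dir"/*.html)             # * matches any run of characters
single_digit=("$glob_dir"/data?.csv)        # ? matches exactly one character, so data10.csv is skipped
starts_a_to_d=("$glob_dir"/[a-d]*)          # [a-d] matches one character in that range

echo "${#html_files[@]} files end in .html"
echo "${#single_digit[@]} files match data?.csv"
echo "${#starts_a_to_d[@]} files start with a, b, c or d"

rm -r "$glob_dir"                           # clean up
```

In each case the command (here, the array assignment) receives the list of matching file names, not the wildcard string itself.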
cp is used to copy files or directories. It is pretty easy to use: navigate to the directory you’d like to copy the files from, and simply do cp file1 file2 to copy a single file to a new name, or cp file1 file2 ... directory to copy several files from your current working directory to the directory specified.
We can use mv to rename a file or directory, or to move a file or directory. We can use it this way: mv filename1 filename2 if we want to rename filename1 to filename2, or mv file directory if we want to move file to directory.
The rm command removes/deletes files and directories. Usage is pretty straightforward: rm file or rm -r directory. But do be careful when using rm, as there is no undelete option (the file is erased and doesn’t go to the bin), so be extra careful not to inflict unwanted damage to your system!
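The three commands can be practised without risk in a temporary directory. The following is a minimal sketch with made-up file names; it also uses touch and mkdir, which are covered just below.

```shell
sandbox=$(mktemp -d)             # throwaway directory for the experiment
cd "$sandbox"

touch report.txt                 # create an empty file
mkdir archive                    # and a directory to move things into

cp report.txt copy.txt           # copy: both files now exist
mv copy.txt renamed.txt          # rename copy.txt to renamed.txt
mv renamed.txt archive           # move renamed.txt into the archive directory
rm report.txt                    # delete a single file (there is no undo!)

remaining=$(ls)                  # what is left in the sandbox
moved=$(ls archive)              # what ended up in archive
echo "$remaining"
echo "$moved"

cd - > /dev/null                 # go back to where we started
rm -r "$sandbox"                 # remove the sandbox and everything in it
```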
mkdir is used for creating directories. Now, create a directory called Bash (i.e. a directory that will contain our Bash scripts):
It should now look like this:
Now, try typing the following commands in the terminal, run them, and observe what they do:
- make a directory and remove it (e.g. mkdir namedirectory and rmdir namedirectory or rm -r namedirectory).
- create an R script via touch (e.g. touch filename.R; you can also use rstudio or rkward to start R and create an R script).
- then copy it (e.g. cp filename.R newname.R).
- then remove it (e.g. rm newname.R).
- use the ls command and its options to check the content of the current directory.
Tip: Bash has a feature called Tab-completion. If you start writing a command or filename, pressing the Tab key a couple of times will give a list of suggestions for auto-completion. This is super-handy, as you rarely need to type a full filename. In addition, you can recall the last commands you entered by using the up arrow key. Lastly, you can always open multiple terminals, even in tabs, by using File → Open Tab, or the small plus icon in the top left corner of the terminal.
Question 4: What command line would you need if you want to move all R files in a directory into its parent directory?
To recap so far, here’s a list of the most common commands:
- pwd: show your current working directory
- cd: change directory
- cd ..: move up one directory
- mkdir: create directory
- touch: basic command used to change file timestamps or create an empty file if it doesn’t exist
- rm or rm -R: delete files or directories
- sudo: run programs as root (administrator/super-user), which may ask for your user password
- ls: list files in a directory
- cp: copy files, e.g. for backing up things or just copying.
We will use these commands in the scripts below.
Find help with documentation and manuals
Almost every command has documentation that comes with it. So you’re somewhere doing your CLI thing, with no access to the internet so you can’t bug people on the forums or IRC, and you need to find out exactly how to use a command. You can do it in two ways. The first is the command help. The help command works with shell builtins, and not executable files. So you can pick a shell builtin, like cd or time, and simply type help cd or help time. You’ll get a helpful page printed out in your terminal, so go ahead and read what they have to offer. Here’s another example:
The help page shows in what ways you can use the command and what options you can use. Options in square brackets are optional. If there’s a vertical separator inside the square brackets, the options mentioned are mutually exclusive: don’t use them together!
help works only for the shell builtins. But most executables provide an option --help. As far as usage goes, it’s similar to help, but you have to type --help after the command you want to inspect. For example:
However, --help is just a convention, which programs are not obliged to follow. Sometimes the option is called -h, and sometimes it is not present at all.
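The two kinds of built-in help can be compared side by side. This sketch assumes Bash (for help) and the GNU version of ls found on Ubuntu (for --help); it captures both texts and prints only their first lines.

```shell
# help is itself a Bash builtin and documents other builtins
cd_help=$(help cd)

# ls is an executable, so it documents itself through the --help convention
ls_help=$(ls --help 2>&1)

printf '%s\n' "$cd_help" | head -n 1     # the usage line, with [ ] and | as described above
printf '%s\n' "$ls_help" | head -n 3     # the first lines of the ls help text
```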
To get more information about how to use a command, most executables come with a formal documentation page. Distributions often mandate the inclusion of a manual page for every package, so the manual page is usually the most useful source of information. You can inspect the manual page using the man command. You just enter man program, and see what it prints out. Pick any program on your computer, and try it out. For example, let’s try man which. You get a file opened, split into sections. It tells you what the program is, what it does, how you can use it etc., but it often doesn’t offer examples, as it’s not a tutorial.
Manual pages are text files displayed in a pager program that allows easy scrolling. The default pager is less, which you have already used earlier in this tutorial. You can also look at its manual page using man less. Also try man intro: the “Introduction to user commands”, a well-written, fairly brief introduction to the Linux command line.
Optional: You can also read the Ubuntu documentation on CLI to learn more, and let us know if you have questions about some commands.
Package installation and management
One of the greatest advantages of Linux distributions over other OSs is the package manager. Even if you never used Linux before, you are probably already using a package manager on your mobile device: The App Store, Google Play Store and Windows Store are all package managers, modelled after the Linux ones. A package manager is a central system for downloading, installing and removing software.
Each major Linux distribution has its own package manager, which is aware of all packages maintained by the distribution. These packages are tested and are known to work with that particular distribution version, so the package manager is the first place to look for installing additional software. The package manager is typically a command-line program, although some distributions also have GUI interfaces for it.
Ubuntu uses APT (Advanced Package Tool) as the package manager. Here is a short list of the most useful package manager commands on Ubuntu:
- apt search packagename: Search for a package called “packagename”.
- apt list package*: List all packages starting with “package”.
- sudo apt install packagename: Install or update a package. This changes system files and therefore requires administrator privileges (sudo).
- sudo apt remove packagename: Uninstall a package.
- See man apt for more.
For instance, if you run apt list chrom*, one of the results will be chromium-browser. It’s Chromium, the open-source version of Google Chrome. You can install it by running sudo apt install chromium-browser. Similarly, the Ubuntu package repository contains a lot (but not all) of R packages (they are prefixed with r-cran-) and Python packages (prefixed with python3-; the ones prefixed with python- are for Python 2, which is deprecated). If a package is available in the distribution repository, it is almost always better to use that instead of using a package manager built into the language (install.packages in R and easy_install/pip/conda in Python).
Now, let’s install a package from the terminal: the gedit text editor, which we will later use to create our Bash script (on Ubuntu the command is sudo apt install gedit):
You will be prompted to enter your password — this is the same as your login password for the virtual machine. The installation may take a moment but should complete quickly!
The aforementioned commands are specific to the Debian family of Linux distributions (of which Ubuntu is a member). In other distributions, the package manager syntax is different, but the result is the same. For instance, in openSUSE the equivalent commands would be zypper search, sudo zypper install and sudo zypper remove.
Whenever a package is not included in the distribution repository, one option is to look for additional software sources. Ubuntu allows users to maintain their own packages through a system called Personal Package Archives (PPA). However, these packages are not tested and are not guaranteed to work, or could even cause problems in the system, so you have to be careful. Other distributions also have their own third-party repository systems: openSUSE uses Open Build Service, Fedora uses Copr etc.
If a package doesn’t exist in third party repositories either, there is often the possibility to download the source code of a package and compile it. It is common for cross-platform software vendors to provide installers for Windows and source code for Linux. However, compiling from source yourself should only be done as the very last resort; in fact, it is often easier and safer to create a package yourself than to try to build it from source!
Starting R or Python from the terminal
Starting and stopping R from the terminal (this is the same as the R console you know from RStudio/RKWard):
Starting and stopping Python from the terminal:
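In an interactive session you start R by typing R and leave it with q(), and you start Python by typing python3 and leave it with exit(). Both interpreters can also run a one-line program without opening a session, which is handy inside scripts. The sketch below is hedged: it first checks whether python3 and Rscript are installed, since that depends on your system.

```shell
# Run a one-line Python program, if Python 3 is available
if command -v python3 > /dev/null 2>&1; then
    py_result=$(python3 -c 'print(1 + 1)')
else
    py_result="python3 is not installed"
fi

# Run a one-line R program, if R is available
if command -v Rscript > /dev/null 2>&1; then
    r_result=$(Rscript -e 'cat(1 + 1)')
else
    r_result="Rscript is not installed"
fi

echo "Python says: $py_result"
echo "R says: $r_result"
```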
Scripting in the terminal
So far, you’ve been running commands directly in the terminal, one at a time. But what if you want to automate a sequence of commands or reuse them later? That’s where Bash scripts come in — they allow you to store a series of commands in a file and run them all at once, just like a simple program.
Hello, world Bash script
Bash is primarily a scripting language, so it would be a crime not to talk about scripting. Let’s dive straight in with a Bash script. More precisely, the famous “Hello World” script.
You can create a Bash script by opening your favorite text editor, writing your script and then saving it (typically the .sh file extension is used for your reference, but it is not required; in our example, we will be using the .sh extension).
So let’s get started. First, create a simple text file and call it HelloWorld.sh, save it in the Bash directory you just created, and add the following text. While gedit, the package we just installed, is used as the main example for editing files, you can also use rstudio or rkward as an alternative text editor if you prefer. In fact, rstudio makes it rather convenient to edit Bash scripts, exactly the same way as R scripts, including the ability to run commands line by line. It is also worth noting that there are even command-line text editors, like nano, which are useful for editing files that require administrative privileges.
Paste this piece of code into an editor you choose and save it:
The first line of the script just defines which interpreter to use (and where it is located). That’s it, simple as that!
Note: There must be no leading whitespace before #!/bin/bash, and you cannot add any comments before it. This line, called the shebang, should be the very first thing in the file.
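If you prefer to stay in the terminal, the same file can be created without an editor. This is a sketch under two assumptions: the script content is the two-line version implied by this section (the shebang plus one echo that prints Hello, World), and a temporary directory stands in for your Bash directory.

```shell
script_dir=$(mktemp -d)        # stand-in for your Bash directory

# Everything between the two EOF markers is written to the file exactly as typed
cat > "$script_dir/HelloWorld.sh" << 'EOF'
#!/bin/bash
echo "Hello, World"
EOF

# Check that the shebang really is the first line of the file
first_line=$(head -n 1 "$script_dir/HelloWorld.sh")
echo "$first_line"
```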
To find out where your bash interpreter is located, type the following in the terminal (this also works in a macOS terminal!):
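One way to look up the interpreter from within a script is sketched below. It uses command -v, which is part of the shell itself; running which bash in the terminal normally reports the same path.

```shell
# command -v prints the path of the program that a name resolves to
bash_path=$(command -v bash)
echo "$bash_path"

# Ask that same program for its version (first line only)
"$bash_path" --version | head -n 1
```

If the path printed here differs from /bin/bash, the shebang line of your scripts should use the path reported on your system.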
Second, to run a Bash script, you have two options. The first is to set the correct file permissions. We do this with the chmod (change mode) command in the terminal as follows; this needs to be done only once per file:
ls -l # Check the current permissions
chmod u+x HelloWorld.sh # Gives your user execute permissions
Optional: More info about chmod for your future reference. Note: today is just an introduction to let you know what is possible, so that you can find your way more easily in the future.
In this case, we can then proceed to run the script directly in the terminal:
Alternatively, we can specify which interpreter to use specifically, and then pass the file name to the interpreter. This option does not require changing file permissions:
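Both options can be compared in one go. The sketch below recreates the two-line script in a temporary directory (an assumption made so that the example is self-contained) and runs it first through the interpreter and then directly.

```shell
run_dir=$(mktemp -d)                                   # throwaway directory
printf '#!/bin/bash\necho "Hello, World"\n' > "$run_dir/HelloWorld.sh"
cd "$run_dir"

# Second option: name the interpreter; this works without execute permission
via_interpreter=$(bash HelloWorld.sh)

# First option: make the file executable once, then run it by its path
chmod u+x HelloWorld.sh
via_path=$(./HelloWorld.sh)

echo "$via_interpreter"
echo "$via_path"

cd - > /dev/null                                       # go back to where we started
```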
Below is a summary of what we have done in the terminal:
echo "Go to the Bash directory"
cd Bash
echo "Check that the file is there using the ls command:"
ls -l
echo "Then change the permissions:"
chmod u+x HelloWorld.sh
echo "We can now run our first Bash script:"
./HelloWorld.sh
Hopefully you have seen it print Hello, World onto your screen. If so, well done! That is your first Bash script (see below for a screenshot):
Question 5: In the first option above, why do we add ./ in front of the Bash script name? What happens if you don’t? Why?
Note: optionally, we can also run Bash code from R using the system() function, which can invoke an OS command:
# R code
setwd("Bash/") # Set the working directory in R
print(system("./HelloWorld.sh", intern = TRUE)) # Execute this command in Bash
Note: And vice versa, we can run an R script from the terminal using Bash:
Note: In this lesson, to keep things simple, we’ll use gedit and RStudio as text editors to edit scripts, and run them only from the terminal.
Bash script with a variable
Variables store information. You set a variable by writing its name, an = sign and its value, e.g. var="FOO" (you can type this in the terminal; note that there are no spaces around the = sign).
var can be any name made of letters, digits and underscores, as long as it doesn’t begin with a digit. “FOO” can be anything you want. There cannot be any spaces around the = sign! To access the information from the variable you need to put a ‘$’ in front of its name, e.g. echo $var (again, this can be done after the previous line in a script or in the terminal).
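Here is a short sketch of setting and reading variables, using made-up names and values. It also shows the difference between double quotes, inside which a variable is replaced by its value, and single quotes, which keep the text exactly as typed.

```shell
fruit="apple"                     # no spaces around the = sign
sentence="I like $fruit pie"      # inside double quotes, $fruit is replaced by its value
literal='I like $fruit pie'       # inside single quotes, the text is kept as typed

echo "$sentence"
echo "$literal"

# With spaces around =, Bash would instead look for a command called fruit:
# fruit = "apple"    (this line would fail with "command not found")
```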
Now create the following script, e.g. variables.sh, in the Bash directory and apply the chmod u+x variables.sh command to this script using the terminal.
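#!/bin/bash
clear                                   # Start from an empty screen
echo "Now with the read function"
echo "Please enter your name"
read -r name                            # Store the typed line in the variable name
echo "Please enter your age"
read -r age                             # Store the typed line in the variable age
echo "So you're a $age year old, called $name"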
You can run the script once it is executable:
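Once it is executable, the script is started with ./variables.sh and waits for you to type each answer. The read command simply takes the next line of input, so the answers can also be supplied by another command. The sketch below illustrates this with the same prompts wrapped in a function; the function name ask_name_and_age and the sample answers are made up for the example.

```shell
# The same prompts as in variables.sh, wrapped in a function (hypothetical name)
ask_name_and_age() {
    echo "Please enter your name"
    read -r name
    echo "Please enter your age"
    read -r age
    echo "So you're a $age year old, called $name"
}

# Feed the two answers on standard input instead of typing them,
# and keep only the last line of the output
reply=$(printf 'Ada\n36\n' | ask_name_and_age | tail -n 1)
echo "$reply"
```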
Question 6: Try it out yourself, and try to do a calculation of e.g. a + b as input variables. Hint: Shell-tips
Optional: If you want to learn more about Bash scripting: https://help.ubuntu.com/community/Beginners/BashScripting
For the next section, let’s download a file from the Intro to raster tutorial. Manually download the gewata.zip file from GitHub (link). Create a data directory and unzip it there; you should have a .tif file. Then navigate to this directory in your terminal.
Using the GDAL library from the terminal
GDAL is a very powerful and fast processing library written in C/C++ for raster and vector geospatial data formats. Now via the terminal we can access GDAL directly! E.g. we can check out what the current version of GDAL is that is installed on our Linux OS. We will learn more about GDAL in the later tutorials.
Type the following in the data directory. (Note: you can write a shell script to run the commands below, but first type the commands in the terminal to understand what is happening.)
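A hedged sketch of checking the installed GDAL version: gdalinfo accepts a --version option, and the test with command -v is there only so that the example does not fail on a machine where GDAL is missing.

```shell
# Report the GDAL version if the gdalinfo program can be found
if command -v gdalinfo > /dev/null 2>&1; then
    gdal_version=$(gdalinfo --version)
else
    gdal_version="gdalinfo was not found on the PATH"
fi
echo "$gdal_version"
```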
One of the easiest and most useful commands in GDAL is gdalinfo. When given an image as an argument, it retrieves and prints all relevant information that is known about the file. This is especially useful if the image contains additional tag data, as is the case with TIF files.
Using gdalinfo:
You should now see some information about the raster file, for example the coordinate system, the cell size, and some statistics about the raster bands.
Now let’s calculate the NDVI by running the following commands line by line in the terminal. The calculation is done with GDAL, using the gdal_calc.py script. See GDAL_calc for more information.
cd data
cp LE71700552001036SGS00_SR_Gewata_INT1U.tif input.tif
echo "* all files in the directory"
ls
echo "* now apply gdal_calc: Command line raster calculator with numpy syntax"
gdal_calc.py -A input.tif --A_band=4 -B input.tif --B_band=3 --outfile=ndvi.tif --calc="(A.astype(float)-B)/(A.astype(float)+B)" --type='Float32'
echo "* remove the input temporary file"
rm input.tif
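The --calc expression above is the NDVI ratio (A - B) / (A + B), with A read from band 4 (near-infrared on Landsat 7) and B from band 3 (red). To get a feeling for the numbers, the sketch below evaluates the same ratio for a single pair of made-up pixel values. Bash’s own arithmetic only handles integers, so awk is used for the floating-point version.

```shell
nir=120   # made-up band 4 (near-infrared) pixel value
red=40    # made-up band 3 (red) pixel value

# Bash arithmetic is integer-only, so the ratio is truncated
integer_ndvi=$(( (nir - red) / (nir + red) ))

# awk computes with floating-point numbers
float_ndvi=$(awk -v a="$nir" -v b="$red" 'BEGIN { printf "%.2f", (a - b) / (a + b) }')

echo "integer: $integer_ndvi, float: $float_ndvi"
```

NDVI values always lie between -1 and 1, which is a useful sanity check for the output raster.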
Question 7: Try to write a nice and short shell script that calculates the NDVI using the lines above.
- Hint 1: use cd .. to move to the parent directory
- Hint 2: No spaces are allowed in file names, and try to use variables, e.g. fn=$(ls *.tif)
Let’s now check if the range of the NDVI values makes sense, and make a nice script from the following code block in a separate file (this will work only if you have one .tif file in the data directory, because the data/*.tif wildcard that sets fn matches every .tif file in that directory):
#!/bin/bash
echo "teamname"
echo "Current date"
echo "Calculate LandSat NDVI"
mkdir -p output                       # Create the output directory if it is missing
fn=$(ls data/*.tif)                   # Expand the wildcard now, so fn holds the real file name
echo "The input file(s): $fn"
outfn=output/ndvi.tif
echo "The output file: $outfn"
echo "calculate ndvi"
# NDVI = (band 4 - band 3) / (band 4 + band 3), computed as floating point
gdal_calc.py -A "$fn" --A_band=4 -B "$fn" --B_band=3 --outfile="$outfn" --calc="(A.astype(float)-B)/(A.astype(float)+B)" --type='Float32'
echo "look at some histogram statistics"
gdalinfo -hist -stats "$outfn"
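A note on wildcards in variables, which matters for how a script picks up its input file: Bash does not expand a wildcard at the moment it is assigned to a variable, only later when the variable is used without quotes. The sketch below demonstrates this with an empty stand-in file in a temporary directory; the file name scene.tif is made up, and the array at the end is one possible way to expand the wildcard immediately and count the matches.

```shell
glob_demo=$(mktemp -d)
mkdir "$glob_demo/data"
touch "$glob_demo/data/scene.tif"         # empty stand-in, not a real raster

pattern=$glob_demo/data/*.tif             # the * is stored literally in the variable
expanded=$(echo $pattern)                 # unquoted use: Bash expands it to the matching file(s)

tif_files=("$glob_demo"/data/*.tif)       # an array expands the wildcard immediately
first_tif=${tif_files[0]}

echo "stored:   $pattern"
echo "expanded: $expanded"
echo "matches:  ${#tif_files[@]}, first: $first_tif"

rm -r "$glob_demo"                        # clean up
```

Checking the number of matches before calling gdal_calc.py is a simple way to catch the case where the data directory holds more than one .tif file.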
More info here on the power of GDAL via the terminal: GDAL_website and gdalinfo
Handy functions are (See the examples at the bottom):
Optional:
More info about Bash basics from GNU.
More information
- Ryan’s Linux & Bash Tutorial goes into more depth but is also very handy to reference, and includes a cheat sheet with commands.
- Hands-on introduction to bash basics for beginners
- A great bash scripting tutorial
- Basic terminal usage and installing software
- Beginners guide to nano, the linux command line text editor
- Learning the shell
- How to use pwd command in Linux
- For macOS users, an introduction to using the terminal on macOS: