Introduction to GIS

Intro

A GIS consists of:

  • Digital Data: the geographical information that you will view and analyse using computer hardware and software.
  • Computer Hardware: computers used for storing data, displaying graphics and processing data.
  • Computer Software: computer programs that run on the computer hardware and allow you to work with digital data. A software program that forms part of the GIS is called a GIS Application.

Intro

GIS is a relatively new field that started in the 1970s. These days, anyone with a personal computer or laptop can use GIS software. GIS is more than just software: it refers to all aspects of managing and using digital geographical data. In the tutorials that follow we will be focusing on GIS software.

There is no single agreed definition of GIS software. One I like is "a toolbox to support spatial thinking".

Where it began

Intro

GIS Applications are normally programs with a GUI that can be manipulated using the mouse and keyboard. The application provides menus near the top of the window (File, Edit etc.) which, when clicked using the mouse, show a panel of actions.

Note: there are other ways to work. Python, R, etc. offer scriptable GIS tools and workflows.

apps

Intro

A common feature of GIS applications is that they allow you to associate information (non-geographical data) with places (geographical data).

For example, a health care worker could store each patient’s age and gender in her table. When the GIS Application draws the layer, you can tell it to draw the layer based on gender, or based on disease type, and so on.

The big idea

Intro

  • Geoscience: why not?
  • web integration: the future
  • Open source: my bias
  • Mapmaking

This course focusses on:

This course doesn't cover:

  • Databases
  • Spatial statistics

Thinking spatially

Observe as much as you can in the room and think about how these things are related to, or interacting with, one another spatially.

Write down:

1) How you would describe the spatial relationships you see

2) What those spatial relationships might mean

Spatial thinking exercise

Thinking spatially

Always try to:

  • ask questions about spatial patterns and relationships
  • form hypotheses about patterns and relationships and their underlying process
  • question assumptions about what we think we know

Spatial thinking should be critical

Thinking spatially

We are generally interested in patterns that are non-random

There is typically an underlying process that results in the establishment of the pattern

Spatial configuration

Thinking spatially

  • Random
  • Clustered
  • Regular
  • Periodic
  • Self-similar (scale invariant)
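
One rough way to tell random, clustered and regular point patterns apart numerically is a nearest-neighbour ratio: the mean distance from each point to its nearest neighbour, divided by the value expected for a random pattern of the same density. The sketch below is illustrative only. The function name and the sample points are made up for this example, and edge effects are ignored. Values near 1 suggest a random pattern, values below 1 suggest clustering, and values above 1 suggest regular spacing.

```python
import math

def nearest_neighbour_ratio(points, area):
    """Observed mean nearest-neighbour distance divided by the
    distance expected for a random pattern of the same density."""
    n = len(points)
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        # Distance from this point to its closest other point.
        total += min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if i != j
        )
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# Hypothetical 10 x 10 study area with 25 points in each pattern.
regular = [(x, y) for x in range(1, 10, 2) for y in range(1, 10, 2)]
clustered = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(5) for j in range(5)]

regular_ratio = nearest_neighbour_ratio(regular, area=100)
clustered_ratio = nearest_neighbour_ratio(clustered, area=100)
print(regular_ratio > 1, clustered_ratio < 1)
```

The periodic and self-similar patterns in the list need other tools and are not covered by this ratio.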

Types of patterns

Thinking spatially

Point classification (machine learning)

Thinking spatially

Connectivity

  • nodes / vertices / adjacency
  • Orientation
  • Self-similarity (scale invariant)

 

 

"For a stone, when it is examined, will be found a mountain in miniature." (John Ruskin)

Thinking spatially

Developing our ability to think spatially and conduct geospatial analysis has required us to develop methods of creating and storing geospatial data.

Where it led to

GIS data models

Vector data provide a way to represent real world features within the GIS environment. A feature is anything you can see on the landscape. Imagine you are standing on the top of a hill. Looking down you can see houses, roads, trees, rivers, and so on (see figure_landscape). Each one of these things would be a feature when we represent them in a GIS Application.

Vector data

A vector feature has its shape represented using geometry. The geometry is made up of one or more interconnected vertices. A vertex describes a position in space using an X, a Y and optionally a Z coordinate. Geometries whose vertices have a Z coordinate are often referred to as 2.5D, since they describe height or depth at each vertex, but not both.

Vector data

GIS data models

When a feature’s geometry consists of only a single vertex, it is referred to as a point feature. Where the geometry consists of two or more vertices and the first and last vertex are not equal, a polyline feature is formed. Where three or more vertices are present, and the last vertex is equal to the first, an enclosed polygon feature is formed.
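
The rule above can be sketched as a small function. This is a minimal illustration, not how any particular GIS stores geometry. The function name is hypothetical, vertices are assumed to be (x, y) tuples, and a closed ring is assumed to repeat its first vertex at the end, so a triangle is stored as four coordinate pairs.

```python
def classify_geometry(vertices):
    """Return 'point', 'polyline' or 'polygon' for a list of (x, y) vertices."""
    if not vertices:
        raise ValueError("a geometry needs at least one vertex")
    if len(vertices) == 1:
        return "point"
    closed = vertices[0] == vertices[-1]
    if not closed:
        return "polyline"
    # Three distinct vertices plus the repeated first vertex.
    if len(vertices) >= 4:
        return "polygon"
    # Closed but too few vertices to enclose an area.
    return "degenerate"

well = [(3.0, 4.0)]
road = [(0, 0), (2, 1), (5, 1)]
field = [(0, 0), (4, 0), (4, 3), (0, 0)]

print(classify_geometry(well), classify_geometry(road), classify_geometry(field))
```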

Points lines polygons

GIS data models

The first thing we need to realise when talking about point features is that what we describe as a point in GIS is a matter of opinion, and often dependent on scale. Let’s look at cities for example. If you have a small scale map (which covers a large area), it may make sense to represent a city using a point feature.

Points in detail

GIS data models

The shapefile format is a popular geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri as a (mostly) open specification for data interoperability among Esri and other GIS software products.

Mandatory files:

  • .shp: shape format; the feature geometry itself
  • .shx: shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
  • .dbf: attribute format; columnar attributes for each shape, in dBase IV format

Other files:

  • .prj: projection format; the coordinate system and projection information, a plain text file describing the projection using well-known text format
  • .sbn and .sbx: a spatial index of the features
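
To make the .shp file a little less mysterious, here is a minimal sketch that reads the fixed 100-byte header found at the start of every .shp file. The layout follows the published shapefile specification (file code 9994, file length counted in 16-bit words, version, shape type, bounding box). The function name is hypothetical, the header bytes are built in memory purely so the example is self-contained, and only a few of the defined shape types are listed.

```python
import struct

# A subset of the shape type codes defined in the specification.
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}

def parse_shp_header(header):
    """Decode the main fields of a 100-byte .shp file header."""
    if len(header) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    (file_code,) = struct.unpack(">i", header[0:4])        # big-endian
    if file_code != 9994:
        raise ValueError("not a shapefile")
    (length_words,) = struct.unpack(">i", header[24:28])   # big-endian
    version, shape_type = struct.unpack("<ii", header[28:36])  # little-endian
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "length_bytes": length_words * 2,
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, "other"),
        "bbox": (xmin, ymin, xmax, ymax),
    }

# Synthetic header for demonstration only (an empty polygon shapefile).
demo_header = (
    struct.pack(">i", 9994) + bytes(20) + struct.pack(">i", 50)
    + struct.pack("<ii", 1000, 5)
    + struct.pack("<4d", 0.0, 0.0, 10.0, 5.0)
    + bytes(32)  # Z and M ranges, unused here
)
info = parse_shp_header(demo_header)
print(info)
```

For a real file you would pass the first 100 bytes read from it in binary mode. In practice a GIS library does this for you.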

Shapefiles

GIS data models

Now that we have described what vector data is, let’s look at how vector data is managed and used in a GIS environment. Most GIS applications group vector features into layers. Features in a layer have the same geometry type (e.g. they will all be points) and the same kinds of attributes (e.g. information about what species a tree is for a trees layer).

Features and Layers

GIS data models

vector attributes

If every line on a map was the same colour, width, thickness, and had the same label, it would be very hard to make out what was going on. The map would also give us very little information. 

Overview

vector attributes

Attribute data is the power behind GIS

Why is it important

vector attributes

All attribute data can be categorized as one of:

  • nominal - descriptive, categorical
  • ordinal - imply a ranking (but not scale)
  • interval - numeric values
  • cyclic - some kind of continuous data, that repeats

attributes abstractly

vector attributes

It is very common to want to classify attribute data.

Interval data is commonly divided into classes using:

  • natural breaks - an algorithm looks for natural gaps in the data
  • quantile breaks - each group has the same number of observations
  • equal interval breaks - each group spans the same range of values
  • standard deviation - groups are defined by distance from the mean
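
As a rough illustration of how two of these schemes differ, the sketch below computes equal interval and quantile breaks for a small made-up list of values containing one outlier. The function names and the data are hypothetical, and real GIS software may handle ties and rounding differently.

```python
from bisect import bisect_left
from collections import Counter

def equal_interval_breaks(values, k):
    """Upper bounds of the first k-1 classes, each spanning the same range."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]

def quantile_breaks(values, k):
    """Upper bounds of the first k-1 classes, each holding about the same count."""
    ordered = sorted(values)
    n = len(ordered)
    # ceil(n * i / k) - 1 is the index of the last value in class i.
    return [ordered[(n * i + k - 1) // k - 1] for i in range(1, k)]

def class_counts(values, breaks):
    """Count how many values fall in each class (a value equal to a break
    belongs to the lower class)."""
    counts = Counter(bisect_left(breaks, v) for v in values)
    return [counts[i] for i in range(len(breaks) + 1)]

values = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # made-up data with one outlier

ei = equal_interval_breaks(values, 3)
qu = quantile_breaks(values, 3)
print(ei, class_counts(values, ei))
print(qu, class_counts(values, qu))
```

With an outlier present, equal intervals put almost every observation in the first class, while quantile breaks keep the classes evenly filled. That trade-off is the main reason to choose one scheme over the other.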

Classification

vector attributes

A table is the two-dimensional set of rows and columns that stores our data:

  • Each row represents a unique data point or record.
  • Each column represents a field, or an attribute.
  • Tables are also called flat files, particularly when compared to other database types.

Tables

vector attributes

The fact that features have attributes as well as geometry in a GIS Application opens up many possibilities. For example, we can use the attribute values to tell the GIS what colours and style to use when drawing features. The process of setting colours and drawing styles is often referred to as setting feature symbology.

Symbology

We've discussed spatial data models - vectors and rasters.

It's going to be really important to learn how to visualise these data types 

 

Brainstorm: what are the basic ways we could represent points with different attributes?

(or styling)

Symbology

A really easy way to visualise data (particularly non-numeric data) is to generate a classification scheme, linking unique attribute values to a symbol

 

How many buckets before the information you're conveying becomes overwhelming?

Points: classifications

Symbology

If a feature is symbolised without using any attribute table data, it can only be drawn in a simple way. For example with point features you can set the colour and marker (circle, square, star etc.) but that is all. You cannot tell the GIS to draw the features based on one of its properties in the attribute table.

Symbols

Symbology

Sometimes the attributes of features are not numeric, but instead strings are used. ‘String’ is a computer term meaning a group of letters, numbers and other writing symbols. String attributes are often used to classify things by name. We can tell the GIS Application to give each unique string or number its own colour and symbol.

Unique symbols

Symbology

Sometimes it is useful to draw features in a colour range from one colour to another. The GIS Application will use a numerical attribute value from a feature (e.g. contour heights or pollution levels in a stream) to decide which colour to use. 
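
A continuous colour range can be thought of as linear interpolation between two colours. The sketch below is an assumption about the general idea, not the algorithm any particular GIS uses. The function name, the RGB tuples and the pollution values are all hypothetical.

```python
def colour_ramp(value, vmin, vmax, start_rgb, end_rgb):
    """Linearly interpolate an RGB colour for value between vmin and vmax."""
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp values outside the range
    return tuple(round(a + (b - a) * t) for a, b in zip(start_rgb, end_rgb))

WHITE = (255, 255, 255)
RED = (255, 0, 0)

# Hypothetical pollution readings on a 0 to 100 scale.
low = colour_ramp(0, 0, 100, WHITE, RED)
some = colour_ramp(20, 0, 100, WHITE, RED)
high = colour_ramp(100, 0, 100, WHITE, RED)
print(low, some, high)
```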

Continuous colour symbols

Symbology

Setting colours based on discrete groups of attribute values is called Graduated Symbology in QGIS.

Contour lines are a good example of this. Each contour usually has an attribute value called ‘height’ that contains information about what height that contour represents. 

Graduated colour symbols

Topology

Topology has several meanings.

Within mathematics, topology is the mathematical study of the properties that are preserved through deformations, twistings, and stretchings of objects.

 

 

 

 

Definitions

Topology

Within GIS, topology refers to the spatial relationships between connecting or adjacent features (points, polylines and polygons) in a geographic data layer. 

...or, rules that describe how geographic elements interact with one another

Many phenomena are subject to topological constraints: for example, two counties cannot overlap, two contours cannot cross, and the boundary of an area cannot cross itself.

 

 

 

 

Definitions

Topology

Elements may interact with one another via:

  • adjacency 
  • containment
  • overlap
  • intersection
  • proximity

 

 

 

Definitions

Topology

Many GIS applications provide tools for topological editing. For example, in QGIS you can enable topological editing to make it easier to edit and maintain common boundaries in polygon layers. A GIS such as QGIS ‘detects’ a shared boundary in a polygon map, so you only have to move the edge vertex of one polygon boundary and QGIS will update the other polygon boundaries to match.

 

 

 

Tools

Topology

On a sightseeing tour of London you plan to visit St. Paul’s Cathedral first and in the afternoon Covent Garden Market for some souvenirs. To travel between them on the Underground you need to find connecting trains. This requires topological information (data) about where it is possible to change trains. Looking at a map of the Underground, the topological relationships are illustrated by circles that show connectivity.
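
Connectivity of this kind is naturally stored as a graph: stations are nodes and direct links are edges. The sketch below uses a small hand-typed, simplified excerpt of the network (treat the station list as illustrative rather than authoritative) and a breadth-first search to find a route with the fewest stops. Note that no coordinates are needed, only which stations connect to which.

```python
from collections import deque

# Simplified, illustrative adjacency list: station -> directly linked stations.
UNDERGROUND = {
    "Covent Garden": ["Leicester Square", "Holborn"],
    "Leicester Square": ["Covent Garden"],
    "Holborn": ["Covent Garden", "Chancery Lane"],
    "Chancery Lane": ["Holborn", "St. Paul's"],
    "St. Paul's": ["Chancery Lane"],
}

def shortest_route(graph, start, goal):
    """Breadth-first search: return the route with the fewest stops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        station = path[-1]
        if station == goal:
            return path
        for neighbour in graph[station]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

route = shortest_route(UNDERGROUND, "St. Paul's", "Covent Garden")
print(" -> ".join(route))
```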

 

 

 

 

Topic

Topology

There are different types of topological errors and they can be grouped according to whether the vector feature types are polygons or polylines. Topological errors with polygon features can include unclosed polygons, gaps between polygon borders or overlapping polygon borders. A common topological error with polyline features is that they do not meet perfectly at a point (node).

 

 

 

 

Errors

Databases & queries

We have discussed feature attribute tables - which are a kind of database, and are commonly called flat files. 

For larger projects, it may be useful to create a proper database.

Databases & queries

Hierarchical files store data in more than one type of record. This method is usually described as a "parent-child, one-to-many" relationship. One field is key to all records, but data in one record does not have to be repeated in another.

Databases & queries

Relational database: A relational database is a database that groups data using common attributes found in the data set. The resulting "clumps" of organized data are much easier for people to understand.

 

A data structure in which collections of tables are logically associated with each other by shared fields.

 

Databases & queries

Queries will be covered in a separate lecture, but we need to know a little bit now.

The QGIS Query Builder allows you to define a subset of a table using a SQL-like WHERE clause and display the result in the main window. The query result then can be saved as a new vector layer.

Queries / filtering

Databases & queries

Queries / filtering

The Fields list contains all attribute columns of the attribute table to be searched. To add an attribute column to the SQL where clause field, double click its name in the Fields list. Generally you can use the various fields, values and operators to construct the query or you can just type it into the SQL box.

 

The Values list lists the values of an attribute table. To list all possible values of an attribute, select the attribute in the Fields list and click the [all] button. To list the first 25 unique values of an attribute column, select the attribute column in the Fields list and click the [Sample] button. To add a value to the SQL where clause field, double click its name in the Values list.

 

The Operators section contains all usable operators. To add an operator to the SQL where clause field, click the appropriate button. Relational operators ( = , > , ...), string comparison operator (LIKE), logical operators (AND, OR, ...) are available.
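
To get a feel for what such a WHERE clause does, here is a small sketch using Python's built-in sqlite3 module rather than QGIS itself. The table, field names and values are made up for illustration. The clause combines a relational operator, a LIKE string comparison and a logical AND, as described above.

```python
import sqlite3

# In-memory table standing in for a layer's attribute table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE towns (name TEXT, region TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO towns VALUES (?, ?, ?)",
    [
        ("Newtown", "North", 25000),
        ("Newbridge", "South", 8000),
        ("Oldfield", "North", 40000),
        ("Newport", "South", 15000),
    ],
)

# The kind of expression you would type into a query builder's SQL box.
where_clause = "population > 10000 AND name LIKE 'New%'"

rows = conn.execute(
    f"SELECT name FROM towns WHERE {where_clause} ORDER BY name"
).fetchall()
selected = [row[0] for row in rows]
print(selected)
conn.close()
```

Only the rows that satisfy every condition remain, which is the same subset a filtered layer would display.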

Databases & queries

Spatial queries are core to many types of GIS analysis. In QGIS, this functionality is available via the Spatial Query plugin. This allows you to make a spatial query (i.e., select features) in a target layer with reference to another layer.



Spatial queries

  • Contains
  • Equals
  • Overlap
  • Crosses
  • Intersects
  • Is disjoint
  • Touches
  • Within

Joins

Table joins

Not every dataset you want to use comes as a shapefile, or in a spatial format. Often the data would come as a table or a spreadsheet and you would need to link it with your existing spatial data for use in your analysis. This operation is known as a Table Join and this tutorial will cover how to carry out table joins in QGIS.
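
Conceptually, a table join matches rows through a shared key field. The sketch below shows the idea with plain Python dictionaries. The field names, district codes and population figures are hypothetical, and it behaves like a left join: every feature is kept, and features without a match simply gain no extra fields.

```python
def table_join(features, table_rows, key):
    """Attach fields from table_rows to features where the key values match."""
    lookup = {row[key]: row for row in table_rows}
    joined = []
    for feature in features:
        merged = dict(feature)  # copy so the input is not modified
        extra = lookup.get(feature[key], {})
        for field, value in extra.items():
            if field != key:
                merged[field] = value
        joined.append(merged)
    return joined

# Attribute table of a (hypothetical) districts layer.
districts = [
    {"district_id": "D1", "name": "North"},
    {"district_id": "D2", "name": "South"},
    {"district_id": "D3", "name": "East"},
]
# Non-spatial spreadsheet data sharing the district_id field.
census = [
    {"district_id": "D1", "population": 1200},
    {"district_id": "D2", "population": 850},
]

joined = table_join(districts, census, key="district_id")
for row in joined:
    print(row)
```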

Joins

Spatial joins

Spatial Join is a classic GIS problem - transferring attributes from one layer to another based on their spatial relationship. In QGIS, this functionality is available through the Join Attributes by Location tool.

A spatial join would enable you to sum a field stored on points, on a per-polygon basis. Alternatively, you might want to transfer polygon attributes to the points they enclose.

 

Spatial analysis

Now that you have edited a few features, you must want to know what else one can do with them. Having features with attributes is nice, but when all is said and done, this doesn’t really tell you anything that a normal, non-GIS map can’t.

The key advantage of a GIS is this: a GIS can answer questions.

 

Overview

Spatial analysis

Spatial analysis uses spatial information to extract new and additional meaning from GIS data. Usually spatial analysis is carried out using a GIS Application. GIS Applications normally have spatial analysis tools for feature statistics (e.g. how many vertices make up this polyline?) or geoprocessing such as feature buffering. 

 

Overview

Spatial analysis

Geometry statistics can easily be computed on shapefiles: 

  • length
  • area
  • perimeter 
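
These statistics follow directly from the vertex coordinates. The sketch below is a minimal illustration that assumes planar (projected) coordinates and polygons stored as closed rings with the first vertex repeated at the end. The function names and the sample rectangle are hypothetical. Area uses the standard shoelace formula.

```python
import math

def line_length(vertices):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))

def ring_area(ring):
    """Area of a closed ring using the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2

def ring_perimeter(ring):
    """Perimeter is the length of the closed boundary."""
    return line_length(ring)

# A 4 x 3 rectangle, closed by repeating the first vertex.
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
path = [(0, 0), (3, 4), (3, 10)]

print(ring_area(rectangle), ring_perimeter(rectangle), line_length(path))
```

With geographic (latitude/longitude) coordinates these formulas give values in degrees, which is rarely what you want. Reproject the data first.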

 

Geometry stats.

Spatial analysis

Buffering usually creates two areas: one area that is within a specified distance to selected real world features and the other area that is beyond. The area that is within the specified distance is called the buffer zone.
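
Instead of constructing the buffer polygon, the sketch below answers the equivalent question: is a point within the specified distance of a line feature? It assumes planar coordinates. The function names, the road and the test points are hypothetical.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)  # a and b coincide
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_buffer(p, line, distance):
    """True if p lies within distance of any segment of the line."""
    return any(
        distance_to_segment(p, a, b) <= distance
        for a, b in zip(line, line[1:])
    )

road = [(0, 0), (10, 0)]
print(in_buffer((5, 2), road, 3))   # beside the road
print(in_buffer((5, 4), road, 3))   # too far away
print(in_buffer((12, 0), road, 3))  # beyond the end, inside the rounded cap
```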

Buffers

Spatial analysis

Spatial overlay is a process that allows you to identify the relationships between two polygon features that share all or part of the same area. The output vector layer is a combination of the input features’ information.

Spatial overlay

Spatial analysis

The power of GIS lies in analysing multiple data sources together. Often the answer you are seeking lies in many different layers and you need to do some analysis to extract and compile this information. One such type of analysis is Points-in-Polygon. When you have a polygon layer and a point layer - and want to know how many or which of the points fall within the bounds of each polygon, you can use this method of analysis.
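
A common way to implement this test is ray casting: count how many polygon edges a horizontal ray from the point crosses, and an odd count means the point is inside. The sketch below is a minimal version with hypothetical polygon names and coordinates. It assumes simple polygons stored as closed rings without holes, and points lying exactly on a boundary may be classified either way.

```python
def point_in_polygon(point, ring):
    """Ray-casting test for a point against a closed ring of (x, y) vertices."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_points(polygons, points):
    """Number of points falling inside each named polygon."""
    return {
        name: sum(point_in_polygon(p, ring) for p in points)
        for name, ring in polygons.items()
    }

polygons = {
    "west": [(0, 0), (5, 0), (5, 5), (0, 5), (0, 0)],
    "east": [(5, 0), (10, 0), (10, 5), (5, 5), (5, 0)],
}
points = [(1, 1), (2, 4), (7, 2), (12, 1)]

counts = count_points(polygons, points)
print(counts)
```

Replacing the count with a sum over a point attribute gives the per-polygon totals that a spatial join produces.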

 

Points in polygon

Spatial analysis

The Toolbox is the main element of the processing GUI, and the one that you are more likely to use in your daily work. It shows the list of all available algorithms grouped in different blocks, and it is the access point to run them, whether as a single process or as a batch process involving several executions of the same algorithm on different sets of inputs.

The toolbox

Spatial analysis

The toolbox contains all the available algorithms, divided into so-called “Providers”.

Providers can be (de)activated in the settings dialog. A label in the bottom part of the toolbox will remind you of that whenever there are inactive providers. Use the link in the label to open the settings window and set up providers. We will discuss the settings dialog later.

The toolbox

Spatial analysis

Once you double-click on the name of the algorithm that you want to execute, a dialog similar to that in the figure below is shown (in this case, the dialog corresponds to the ‘Polygon centroids’ algorithm).

algorithm dialog

Spatial analysis

Data objects generated by an algorithm can be of any of the following types:

  • A raster layer
  • A vector layer
  • A table
  • An HTML file (used for text and graphical outputs)

algorithm output

Spatial analysis

Polygon centroids can be useful for many reasons.

Here the centroids were a useful indication of topology errors. Can you identify these?
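
For reference, the centroid of a simple polygon can be computed from its vertices with the standard area-weighted formula. The sketch below is an illustration under stated assumptions (planar coordinates, a closed ring with the first vertex repeated, no holes) and is not necessarily how the ‘Polygon centroids’ algorithm is implemented. The function name and the sample polygon are hypothetical.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a closed ring of (x, y) vertices."""
    twice_area = 0.0
    cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    if twice_area == 0:
        raise ValueError("ring has zero area, centroid is undefined")
    return (cx / (3 * twice_area), cy / (3 * twice_area))

# A 4 x 2 rectangle, closed by repeating the first vertex.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2), (0, 0)]
centroid = polygon_centroid(rectangle)
print(centroid)
```

For a concave polygon the centroid can fall outside the polygon itself, which is worth remembering when centroids are used as label or sample points.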

Polygon centroids

GIS lecture 1

By Dan Sandiford