Plotting with ggplot
Learning objectives
Questions:
How to plot in R using the ggplot2 package?
Objectives:
Learn advanced plotting using ggplot2
Keypoints:
ggplot2
ggplot
ggplot2 is a graphics package written by Hadley Wickham while he was a grad student at Iowa State, based on the ideas from the book “The Grammar of Graphics” by Leland Wilkinson. Let’s install it (this also installs several additional packages that it requires):
install.packages("ggplot2")
library(ggplot2)
Basic components of ggplot
A data frame
aes: aesthetic mappings showing how data are mapped to visual properties such as color and size
geoms: geometric objects like points, lines, shapes.
facets: for conditional plots.
stats: statistical transformations like binning, quantiles, smoothing.
scales: what scale an aesthetic mapping uses
coordinate system
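As a rough sketch of how these components fit together, the snippet below builds one plot from the built-in mpg data and labels each piece in a comment. The specific choices (the variables, the log scale, facetting by drv, the y limits) are illustrative assumptions, not part of the lesson.

```r
library(ggplot2)

p <- ggplot(mpg,                                   # data frame
            aes(x = displ, y = hwy)) +             # aesthetic mappings
  geom_point(aes(color = factor(cyl))) +           # geom: points, colored by cylinders
  stat_smooth(method = "lm", formula = y ~ x,
              se = FALSE) +                        # stat: fitted linear trend per panel
  facet_wrap(~ drv) +                              # facets: one panel per drive type
  scale_x_log10() +                                # scale: log10 x axis
  coord_cartesian(ylim = c(10, 45))                # coordinate system: zoom on y
p
```

Each `+` adds one component, so any of them can be dropped or swapped without touching the rest.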
Types of ggplot
Basic qplot
Works like plot in base R
Nicer graphics than base R plots
Difficult to customize
Advanced ggplot
Flexible, with many built-in functions
A quick way to get familiar with ggplot2 is the qplot function, which stands for quick plot. Let’s do a quick scatter plot from the iris dataset, plotting sepal length versus petal length:
qplot(Sepal.Length, Petal.Length, data=iris)
Notice that it already looks nicer than the basic R plots we did in the last chapter. Now, let’s plot different species of iris with different colors and shapes:
qplot(Sepal.Length, Petal.Length, data=iris,
color=factor(Species),
shape=factor(Species))
Now, let’s add smooth lines which show trends in the data:
qplot(Sepal.Length, Petal.Length, data=iris,
color=factor(Species),
shape=factor(Species),
geom=c("point","smooth"))
Let’s make these lines straight – that is, let’s fit a linear model to each species’ data:
qplot(Sepal.Length, Petal.Length, data=iris,
color=factor(Species),
shape=factor(Species),
geom=c("point","smooth"), method="lm")
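Because qplot() has been marked as deprecated since ggplot2 3.4.0, it may help to see how the last qplot() call could be written with ggplot() layers instead. This is a minimal sketch of an equivalent plot; the object name p_iris is made up for illustration.

```r
library(ggplot2)

# Points and per-species straight-line fits, written as explicit layers
p_iris <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                           color = Species, shape = Species)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)   # one linear fit per species
p_iris
```

Mapping color and shape inside ggplot() makes both layers inherit them, which is why geom_smooth() fits a separate line for each species.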
Let’s make a density plot (a smoothed histogram) of sepal lengths of each species:
qplot(Sepal.Length,data=iris,geom="density",
color=Species)
There are many more ways to use ggplot2. Some useful (and beautiful) examples of code are here: http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html
Basic qplot: Facets
qplot(Sepal.Length,Petal.Length,facets=.~Species, data=iris)
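The same facetting can be sketched with ggplot() layers. The example below assumes the iris data again; facet_grid() reproduces the one-row layout above, while facet_wrap() with ncol = 2 is an arbitrary alternative layout shown for comparison.

```r
library(ggplot2)

base <- ggplot(iris, aes(Sepal.Length, Petal.Length)) +
  geom_point()

# One row of panels, one column per species (same layout as facets = . ~ Species)
p_grid <- base + facet_grid(. ~ Species)

# facet_wrap() wraps the panels into a grid; ncol = 2 is an arbitrary choice
p_wrap <- base + facet_wrap(~ Species, ncol = 2)

p_grid
```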
Advanced ggplot
Sample plot
gp <- ggplot(mpg, aes(hwy, cty))
gp+geom_point(aes(color=cyl))
gp+geom_point(aes(color=factor(cyl)))
gp+geom_point(aes(color=factor(cyl)))+geom_smooth(method="lm")
gp+geom_point(aes(color=factor(cyl)))+geom_smooth(method="lm")+
facet_grid(.~cyl)
# Save plot to file
ggsave("plot.png",width=5,height=5)
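By default ggsave() writes the last plot that was displayed. In a script it can be safer to pass the plot object explicitly, as in the sketch below. The file name and the use of a temporary directory are assumptions made to keep the example self-contained.

```r
library(ggplot2)

p <- ggplot(mpg, aes(hwy, cty)) +
  geom_point(aes(color = factor(cyl)))

# Hypothetical output path; replace with your own folder and file name
out_file <- file.path(tempdir(), "hwy_vs_cty.pdf")

# plot = p saves this specific object rather than the last plot displayed;
# the file extension (.pdf here) selects the output format
ggsave(out_file, plot = p, width = 5, height = 5, units = "in")
```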
Annotation
Labels: xlab(), ylab(), labs(), ggtitle()
Global annotation: use theme()
Standard appearance: theme_bw()
gp+geom_point(aes(color=factor(cyl),
size=factor(cyl)))+
geom_smooth(method="lm")+
xlab("Highway miles per gallon")+
ylab("City miles per gallon")+
ggtitle("Scatter plot for cty & hwy")+
xlim(10,40)+ylim(10,40)+
theme_bw(base_size = 15)
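The labelling functions listed above can also be combined in a single labs() call, with theme() used for global tweaks. The sketch below is one possible arrangement; the label text, legend position, and centered title are illustrative choices.

```r
library(ggplot2)

p_annot <- ggplot(mpg, aes(hwy, cty)) +
  geom_point(aes(color = factor(cyl))) +
  labs(title = "City vs highway mileage",
       x = "Highway miles per gallon",
       y = "City miles per gallon",
       color = "Cylinders") +                      # legend title
  theme_bw() +                                     # standard appearance first
  theme(legend.position = "bottom",                # then move the legend
        plot.title = element_text(hjust = 0.5))    # and center the title
p_annot
```

theme_bw() is a complete theme, so it is added before the theme() call; added afterwards, it would replace the custom settings.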
Some nice ggplot examples
Boxplot
ggplot(mpg,aes(x=manufacturer,y=hwy,
fill=factor(manufacturer)))+
geom_boxplot()+
geom_jitter()+
labs(title="Boxplot for Hwy per manufacturer",x="Manufacturer",y="Highway mileage")+
theme_bw()+coord_flip()+
theme(legend.position = "none")
Violin plot
g <- ggplot(mpg, aes(class, cty))
g + geom_violin(aes(fill=class)) +
labs(title="Violin plot",
subtitle="City Mileage vs Class of vehicle",
caption="Source: mpg",
x="Class of Vehicle",
y="City Mileage")
Histogram
g <- ggplot(mpg, aes(displ)) + scale_fill_brewer(palette = "Spectral")
g + geom_histogram(aes(fill=class),
bins=10, # change number of bins
col="black",
size=.1) +
labs(title="Histogram with Fixed Bins",
subtitle="Engine Displacement across Vehicle Classes",
x="Engine displacement (litres)",
y="Frequency count")
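The histogram above fixes the number of bins. As a small sketch of the alternative, geom_histogram() also accepts binwidth, which fixes the width of each bar instead. The value 0.5 is an arbitrary choice in the units of displ.

```r
library(ggplot2)

h <- ggplot(mpg, aes(displ))

# bins fixes how many bars there are; their width follows from the data range
h_bins <- h + geom_histogram(bins = 10, color = "black")

# binwidth fixes the width of each bar; the number of bars follows from the range
h_width <- h + geom_histogram(binwidth = 0.5, color = "black")

h_width
```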
Scatter plot
data("midwest")
gg <- ggplot(midwest, aes(x=area, y=poptotal)) +
geom_point(aes(col=state, size=popdensity)) +
geom_smooth(method="loess", se=FALSE) +
xlim(c(0, 0.1)) +
ylim(c(0, 500000)) +
labs(subtitle="Area Vs Population",
y="Population",
x="Area",
title="Scatterplot",
caption = "Source: midwest")
plot(gg)
Density
g <- ggplot(mpg, aes(cty))
g + geom_density(aes(fill=factor(cyl)), alpha=0.8) +
labs(title="Density plot",
subtitle="City Mileage Grouped by Number of cylinders",
caption="Source: mpg",
x="City Mileage",
fill="# Cylinders")+
theme_bw()
Density 2D
gg <- ggplot(faithful,aes(x=eruptions,y=waiting))
gg + stat_density_2d(aes(fill=..level..),
geom="polygon",color="black")+
geom_smooth(method="lm",linetype=2,color="red")+
scale_fill_continuous(low="green",high="red")+
geom_point() +
theme_bw()
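The ..level.. notation used above is deprecated from ggplot2 3.4.0 onward in favor of after_stat(), which is available from ggplot2 3.3.0. A minimal sketch of the same density layer in the newer form follows; it uses scale_fill_gradient() for the color ramp and leaves out the regression line for brevity.

```r
library(ggplot2)

p_2d <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
  stat_density_2d(aes(fill = after_stat(level)),   # level is computed by the stat
                  geom = "polygon", color = "black") +
  scale_fill_gradient(low = "green", high = "red") +
  geom_point() +
  theme_bw()
p_2d
```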
Geographic visualization with ggplot
library(maps)
states <- map_data("state")
ggplot(data = states)+
geom_polygon(aes(x=long,y=lat,group=group,fill=region),
color="black")+
coord_fixed(1.3)+
guides(fill="none")
counties <- map_data("county")
SC_counties <- subset(counties,region == "south carolina")
ggplot(data = SC_counties)+
geom_polygon(aes(x=long,y=lat,group=group,fill=subregion),
color="black")+
coord_fixed(1.3)+
guides(fill="none")
some.eu.countries <- c(
"Portugal", "Spain", "France", "Switzerland", "Germany",
"Austria", "Belgium", "UK", "Netherlands",
"Denmark", "Poland", "Italy",
"Croatia", "Slovenia", "Hungary", "Slovakia",
"Czech republic"
)
# Retrieve the map data
some.eu.maps <- map_data("world", region = some.eu.countries)
ggplot(some.eu.maps, aes(x = long, y = lat)) +
geom_polygon(aes( group = group, fill = region))+
scale_fill_viridis_d()+
theme_void()+
theme(legend.position = "none")
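The group aesthetic matters whenever one region is made of several separate polygons. The sketch below illustrates this with a made-up data frame (the toy region, its coordinates, and the helper square() are all hypothetical) that mimics the long, lat, group, and region columns returned by map_data().

```r
library(ggplot2)

# Hypothetical helper: the four corners of a unit square starting at (x0, y0)
square <- function(x0, y0, id) {
  data.frame(long = x0 + c(0, 1, 1, 0),
             lat  = y0 + c(0, 0, 1, 1),
             group = id,
             region = "toyland")
}

# One region made of two separate squares
toy <- rbind(square(0, 0, 1), square(3, 2, 2))

# group = group keeps each square a separate polygon; without it, all eight
# vertices would be joined into a single shape
p_toy <- ggplot(toy, aes(x = long, y = lat)) +
  geom_polygon(aes(group = group, fill = region), color = "black") +
  coord_fixed()
p_toy
```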
Plot Shapefile for geography study
Download shape file data here
Store it in your folder: c:/R/GIS/ in Windows or /user/R/GIS in macOS
Unzip it and rename all files to Countries_WGS84.* under c:/R/GIS/
Install additional packages:
install.packages("rgdal")
install.packages("colorspace")
Perform plotting
library(rgdal)
library(colorspace)
library(maps)
setwd('c:/R/GIS/')
gfile <- readOGR(dsn="Countries_WGS84.shp")
names(gfile)
gfile$CNTRY_NAME
plot(gfile)
plot(gfile,col=rainbow_hcl(50))
llgridlines(gfile,lty=5)
Plot raster
Here we will plot raster data using the Global Land Cover data set. The data can be downloaded from here.
Unzip it and put the raster data in your working directory:
install.packages("raster")
library(raster)
library(rgdal)
setwd('c:/R/GIS/')
#import raster
Gcover <- raster("GLOBCOVER_L4_200901_200912_V2.3.tif")
#plot raster
plot(Gcover,main="Global land cover")