ggplot keeps reassigning colors for combined plots
I am trying to create a map of all ethnic groups in the world, based on a SpatialPolygonsDataFrame (the shapefiles can be downloaded here). My problem is that ggplot seems to reassign colors after every successive call to geom_polygon. The following code for two countries works fine, and all regions / ethnic groups can be distinguished from each other.
library(rgeos)
library(maptools)
library(rms)
library(igraph)
library(foreign)
library(sp)
library(spdep)
library(ggplot2)
setwd("yourdirectory")
# load GREG dataset
greg <- readShapePoly("GREG.shp", proj4string=CRS("+proj=longlat +datum=WGS84"))
# exclude very small polygons (<= 5 square km)
greg <- greg[greg$AREA > 1000e+06,]
dev.off()
temp <- greg[greg$COW==325,]
g <- ggplot(temp, aes(x = long, y = lat)) +
  geom_polygon(data = temp, aes(group = group, fill = group, size = 1))
temp <- greg[greg$COW==225,]
g +
  geom_polygon(data = temp, aes(group = group, fill = group, size = 1)) +
  theme(legend.position = "none")
However, when I run this code in a loop over a large number of polygons (in this case, countries), the colors of many polygons (apart from those in Italy and Switzerland) become indistinguishable from each other, because ggplot assigns a unique color to each polygon (apparently 6011 of them). Is there a way to keep the "unique" colors of each polygon in the composite plot? In other words, the plot should allow colors to be reused.
dev.off()
temp <- greg[greg$COW==0,]
g <- ggplot(temp, aes(x = long, y = lat)) +
  geom_polygon(data = temp, aes(group = group, fill = group, size = 1))
for (cow in unique(greg$COW)) {
  if (cow == 0) next
  temp <- greg[greg$COW==cow,]
  g <- g +
    geom_polygon(data = temp, aes(group = group, fill = group, size = 1))
}
g <- g + theme(legend.position = "none")
PS: you might have to export the second graph (e.g. to PNG) to see it.
So, as I mentioned earlier, you can have only one scale per aesthetic. That means the fill colors are not reset for each country, even if you add the countries as separate layers. To get that kind of coloring, you need to create your own variable that behaves this way. What I did was use interaction() to find the unique country / ethnicity combinations. Then I took those values and mapped them onto 1:12. I did this with
greg$ceid <- (as.numeric(interaction(greg$G1ID, greg$FIPS_CNTRY, drop = TRUE)) %% 12) + 1
This assumes that FIPS_CNTRY is a better country indicator than COW. It also appears that G1ID is a better identifier for a specific ethnic group than GROUP1 in this dataset. If there is documentation for the dataset, you will probably want to read it carefully to verify this. Most countries have fewer than 10 ethnic groups, but there is one with 206, and the next highest has 87.
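To make the recoding step concrete, here is a minimal sketch on a made-up data frame. The `toy` table and its values are hypothetical stand-ins for the attribute table of `greg`; only the `interaction()` / modulo expression comes from the line above.

```r
# Hypothetical toy data standing in for the attribute table of greg;
# the country codes and group ids are invented for illustration.
toy <- data.frame(
  FIPS_CNTRY = c("IT", "IT", "IT", "SZ", "SZ"),
  G1ID       = c(101, 102, 101, 101, 205)
)

# One factor level per country/ethnicity combination that actually occurs.
combo <- interaction(toy$G1ID, toy$FIPS_CNTRY, drop = TRUE)

# Fold the integer level codes into 1..12 so that colors are reused
# instead of every combination getting its own color.
toy$ceid <- (as.numeric(combo) %% 12) + 1

print(toy)
```

Rows that share both a country and a `G1ID` end up with the same `ceid`, while the same `G1ID` in a different country gets a different one. With more than 12 combinations the values wrap around, which is what allows colors to repeat.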
So this tries to spread the colors out across countries. The next trick is to call fortify explicitly to tell ggplot how to group the regions. We do this with
fortify(greg, region="ceid")
which creates something similar to
       long      lat order  hole piece group id
1 -158.7752 63.22207     1 FALSE     1   1.1  1
2 -158.7752 63.36345     2 FALSE     1   1.1  1
3 -158.4783 63.54724     3 FALSE     1   1.1  1
4 -158.4359 63.64621     4 FALSE     1   1.1  1
5 -158.3228 63.83000     5 FALSE     1   1.1  1
6 -158.0262 63.98471     6 FALSE     1   1.1  1
where group specifies the grouping of the polygons, while id corresponds to the regions specified in fortify. So the id values are the numbers 1:12. Now we build the whole plot with
g <- ggplot(fortify(greg, region = "ceid"), aes(x = long, y = lat)) +
  geom_polygon(aes(group = group, fill = id), size = 1) +
  scale_fill_brewer(type = "qual", palette = "Set3") +
  theme(legend.position = "none")
Here I have used a ColorBrewer qualitative color palette.
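Since the shapefile may not be at hand, the roles of group and id can be tried out on a tiny hand-built table. This is only a sketch: `make_square` and `squares` are hypothetical names, and the table merely imitates the columns that fortify returns. The plotting part runs only if ggplot2 is installed.

```r
# Hypothetical stand-in for the output of fortify(): four unit squares.
make_square <- function(x0, y0, group, id) {
  data.frame(
    long  = x0 + c(0, 1, 1, 0),
    lat   = y0 + c(0, 0, 1, 1),
    group = group,  # one value per polygon: controls how vertices are joined
    id    = id,     # recycled label: controls the fill color
    stringsAsFactors = FALSE
  )
}

squares <- rbind(
  make_square(0, 0, "a.1", "1"),
  make_square(1, 0, "b.1", "2"),
  make_square(0, 1, "c.1", "2"),
  make_square(1, 1, "d.1", "1")
)

# Draw the squares only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(squares, aes(x = long, y = lat)) +
    geom_polygon(aes(group = group, fill = id)) +
    scale_fill_brewer(type = "qual", palette = "Set3") +
    theme(legend.position = "none")
  print(p)
}
```

There are four polygons but only two fill levels, so colors repeat across polygons while each polygon is still drawn separately.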
If you instead filled by the actual ethnic group identifier (G1ID) and kept the default colors, you could use
g <- ggplot(fortify(greg, region = "G1ID"), aes(x = long, y = lat)) +
  geom_polygon(aes(group = group, fill = id), size = 1) +
  theme(legend.position = "none")
The last plot is definitely "smoother", but it really depends on what you want the plot to communicate.
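The "smoother" look, like the indistinguishable colors in the question, comes from a single discrete fill scale spreading its colors over all levels at once. Here is a rough sketch of that effect, assuming the default palette consists of evenly spaced hues at fixed chroma and luminance. `default_hues` and `hue_step` are hypothetical helpers written for this illustration, not ggplot2 functions, and the count 6011 is taken from the question.

```r
# Approximation of a default discrete hue palette (an assumption):
# n hues evenly spaced around the color wheel at fixed chroma and luminance.
default_hues <- function(n) {
  h <- seq(15, 375, length.out = n + 1)[seq_len(n)]
  grDevices::hcl(h = h, c = 100, l = 65)
}

# Angular distance in degrees between neighbouring levels.
hue_step <- function(n) 360 / n

pal12 <- default_hues(12)
print(pal12)

print(hue_step(12))    # spacing when only 12 recycled ids are used
print(hue_step(6011))  # spacing when every polygon is its own level
```

With 12 levels, neighbouring hues are far apart. With thousands of levels they differ by a small fraction of a degree, so adjacent polygons look almost identical.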