Over the last few days I pretty much dove head first into using R for mapping. I have used R extensively for modelling etc., but never for this kind of work. I have some questions and issues regarding shapefiles, how they're read and so on.
I have downloaded the shapefiles from the Australian Bureau of Statistics; there are numerous files with state borders, postcodes, cities and so on. The shapefiles are massive: the Australian state borders file has about 1.8 million coordinate points in it, and the other file I tried, the statistical areas, has over 8 million. I didn't do anything with the latter as it is just too big for my R setup.
I read the shapefile in with readShapePoly and converted it like so:
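library(ggplot2) # provides fortify()
library(plyr) # provides join()
AUS@data$id = rownames(AUS@data) # one unique id per polygon
AUS.points = fortify(AUS, region="id") # flatten the polygons into a data frame of points
AUS.df = join(AUS.points, AUS@data, by="id") # re-attach the attribute columns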
Once I had converted the state borders shapefile from a SpatialPolygonsDataFrame to a regular data frame, I plotted it successfully, but it took forever and the detail was far too great.
I thought to use thinnedSpatialPoly to simplify it, but it gives the error:
Error in stopifnot(length(Sr@polygons) == nrow(data)) : trying to get slot "polygons" from an object of a basic class ("NULL") with no slots
Googling the message has not helped.
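To be clear about what I mean by "simplify": as far as I understand the maptools documentation, thinnedSpatialPoly applies Douglas-Peucker style line generalization. Below is a minimal base-R sketch of that idea on a plain coordinate matrix. It is not the maptools implementation, and perp_dist, dp_simplify and the toy data are names and values made up purely for illustration.

```r
# Perpendicular distance from each row of pts to the line through a and b.
perp_dist <- function(pts, a, b) {
  d <- b - a
  len <- sqrt(sum(d^2))
  if (len == 0) {
    # a and b coincide (e.g. a closed ring): use plain distance from a
    return(sqrt((pts[, 1] - a[1])^2 + (pts[, 2] - a[2])^2))
  }
  abs(d[1] * (pts[, 2] - a[2]) - d[2] * (pts[, 1] - a[1])) / len
}

# Recursive Douglas-Peucker: always keep the two end points, and keep the
# farthest interior point only if it lies more than `tol` from the chord.
dp_simplify <- function(xy, tol) {
  n <- nrow(xy)
  if (n < 3) return(xy)
  dists <- perp_dist(xy[2:(n - 1), , drop = FALSE], xy[1, ], xy[n, ])
  i <- which.max(dists)
  if (dists[i] <= tol) return(xy[c(1, n), , drop = FALSE])
  k <- i + 1  # row index of the farthest point within xy
  left  <- dp_simplify(xy[1:k, , drop = FALSE], tol)
  right <- dp_simplify(xy[k:n, , drop = FALSE], tol)
  rbind(left, right[-1, , drop = FALSE])  # drop the duplicated split point
}

# Toy line: the point (1, 0.01) is a tiny wiggle that should be thinned away.
line <- cbind(x = c(0, 1, 2, 3, 4), y = c(0, 0.01, 0, 2, 0))
simplified <- dp_simplify(line, tol = 0.1)
simplified
```

On real boundaries this would be applied to each ring separately, and the recursion is only meant to show the idea, not to cope with millions of points.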
My next strategy was to read it into SAS and use proc greduce, which takes the file and creates a density field so that you can choose how dense the polygons are.
proc mapimport out=states datafile='\Digital Boundaries\States\Shape file\STE_2011_AUST.shp';
id ste_code11; run;
proc greduce data = states out = reduced_states;
id ste_code11; run;
SAS has crap graphics and couldn't even plot the thing for me, so I exported the dataset and read it back into R with the new density field, which I was hoping to use to subset the data frame for my plots.
My problem now is that when I go to plot in R, I get this:
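The subsetting I have in mind looks roughly like the sketch below. It uses a toy data frame rather than my real export, and the column names (ID, SEGMENT, X, Y, DENSITY) and the threshold are assumptions for illustration. I am also assuming that lower density values mark the points most important to the outline, which is how I read the greduce documentation.

```r
# Toy stand-in for the data frame exported from SAS.
# Column names and values are assumptions, not taken from the real file.
states.toy <- data.frame(
  ID      = rep("1", 6),
  SEGMENT = rep(1, 6),
  X       = c(0, 1, 2, 2, 1, 0),
  Y       = c(0, 0.1, 0, 2, 2.1, 2),
  DENSITY = c(0, 4, 0, 0, 5, 0)
)

# Keep only points at or below the chosen density level
# (hypothetical threshold; assumed: lower value = more essential point).
max_density <- 2
states.thin <- states.toy[states.toy$DENSITY <= max_density, ]

# Logical subsetting keeps the original row order, so the remaining
# points are still in drawing sequence.
nrow(states.thin)  # 4 of the 6 toy points remain
```

The thinned data frame would then go into the same ggplot call in place of the full one.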
ggplot(data=states.df, aes(X, Y, group=SEGMENT)) +
geom_polygon(colour='black', fill='white') + theme_bw()
I guess it is because the polygons are out of order or have been broken up? I used these functions to try to rejoin my polygons, but still no luck:
RegroupElements <- function(df, longcol, idcol){
  g <- rep(1, length(df[,longcol]))
  if (diff(range(df[,longcol])) > 300) { # check if longitude within group differs more than 300 deg, ie if element was split
    d <- df[,longcol] > mean(range(df[,longcol])) # we use the mean to help us separate the extreme values
    g[!d] <- 1 # some marker for parts that stay in place (we cheat here a little, as we do not take into account concave polygons)
    g[d] <- 2 # parts that are moved
  }
  g <- paste(df[, idcol], g, sep=".") # attach to id to create unique group variable for the dataset
  df$group.regroup <- g
  df
}
### Function to close regrouped polygons
# Takes dataframe, checks if 1st and last longitude value are the same, if not, inserts first as last and reassigns order variable
ClosePolygons <- function(df, longcol, ordercol){
  if (df[1,longcol] != df[nrow(df),longcol]) {
    tmp <- df[1,]
    df <- rbind(df,tmp)
  }
  o <- seq_len(nrow(df)) # reassign the order variable
  df[,ordercol] <- o
  df
}
So, finally, my questions! How do people deal with large, overly detailed shapefiles? Why wasn't thinnedSpatialPoly working (I'd like to avoid SAS if possible)? How can I get my plot to not look like crap?
Finally my R specs:
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] gridExtra_0.9 gpclib_1.5-1 ggmap_2.1 maptools_0.8-16
[5] lattice_0.20-6 rgeos_0.2-7 plyr_1.7.1 stringr_0.6
[9] ggplot2_0.9.1 sp_0.9-99 shapefiles_0.6 foreign_0.8-50
[13] fastshp_0.1-0
loaded via a namespace (and not attached):
[1] colorspace_1.1-1 dichromat_1.2-4 digest_0.5.2 labeling_0.1
[5] MASS_7.3-18 memoise_0.1 munsell_0.3 png_0.1-4
[9] proto_0.3-9.2 RColorBrewer_1.0-5 reshape2_1.2.1 RgoogleMaps_1.2.0
[13] rjson_0.2.8 scales_0.2.1 tools_2.15.1