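I have a list of locations that contains the city, state, ZIP code, latitude, and longitude of each location.

I separately have a list of county-level economic indicators. I have experimented with the zipcode package, the ggmap package, and several other free geocoding sources, including the US Gazetteer files, but I cannot find a way to match the two.

Are there currently any packages or other sources that can do this?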

4 Answers

I ended up using the suggestion from Josh O'Brien mentioned above, found here.

I changed his code from state to county, as shown below:

library(sp)
library(maps)
library(maptools)

# The single argument to this function, pointsDF, is a data.frame in which:
#   - column 1 contains the longitude in degrees (negative in the US)
#   - column 2 contains the latitude in degrees

latlong2county <- function(pointsDF) {
    # Prepare SpatialPolygons object with one SpatialPolygon
    # per county
    counties <- map('county', fill=TRUE, col="transparent", plot=FALSE)
    IDs <- sapply(strsplit(counties$names, ":"), function(x) x[1])
    counties_sp <- map2SpatialPolygons(counties, IDs=IDs,
                     proj4string=CRS("+proj=longlat +datum=WGS84"))

    # Convert pointsDF to a SpatialPoints object 
    pointsSP <- SpatialPoints(pointsDF, 
                    proj4string=CRS("+proj=longlat +datum=WGS84"))

    # Use 'over' to get _indices_ of the Polygons object containing each point 
    indices <- over(pointsSP, counties_sp)

    # Return the county names of the Polygons object containing each point
    countyNames <- sapply(counties_sp@polygons, function(x) x@ID)
    countyNames[indices]
}

# Test the function using points in Wisconsin and Oregon.
testPoints <- data.frame(x = c(-90, -120), y = c(44, 44))

latlong2county(testPoints)
[1] "wisconsin,juneau" "oregon,crook" # IT WORKS
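
To finish the original task, the county names returned above can be joined to a county-level indicator table with a plain merge() on a normalized key. The following is only a sketch: the `locations` and `indicators` data frames, their column names, and the indicator values are made up for illustration, and it assumes the indicator table stores state and county names in separate columns. County names containing punctuation or suffixes may need extra normalization before they match.

```r
# Hypothetical locations table; in practice the county column would come from
# latlong2county(locations[, c("lon", "lat")]).
locations <- data.frame(
  id     = c(1, 2),
  lon    = c(-90, -120),
  lat    = c(44, 44),
  county = c("wisconsin,juneau", "oregon,crook"),
  stringsAsFactors = FALSE
)

# Hypothetical county-level indicators (values are made up).
indicators <- data.frame(
  state       = c("Wisconsin", "Oregon"),
  county_name = c("Juneau", "Crook"),
  indicator   = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

# Build the same lowercase "state,county" key that the maps package uses.
indicators$county <- tolower(paste(indicators$state, indicators$county_name, sep = ","))

# Left join: keep every location even if no indicator row matches.
merged <- merge(locations, indicators[, c("county", "indicator")],
                by = "county", all.x = TRUE)
merged
```

Using all.x = TRUE keeps unmatched locations in the result with NA indicators, which makes key mismatches easy to spot.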
answered 2012-11-20T13:36:45.470

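Matching ZIP codes to counties is difficult. (Some ZIP codes span more than one county, and sometimes more than one state; for example, 30165.)

I am not aware of any specific R package that can match these for you.

However, you can get a nice table from the Missouri Census Data Center.
You can use the following for the data extraction: http://bit.ly/S63LNU

Sample output might look like this: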

    state,zcta5,ZIPName,County,County2
    01,30165,"Rome, GA",Cherokee AL,
    01,31905,"Fort Benning, GA",Russell AL,
    01,35004,"Moody, AL",St. Clair AL,
    01,35005,"Adamsville, AL",Jefferson AL,
    01,35006,"Adger, AL",Jefferson AL,Walker AL
    ...

Note County2. The metadata explanation can be found here.

    county 
    The county in which the ZCTA is all or mostly contained. Over 90% of ZCTAs fall entirely within a single county.

    county2 
    The "secondary" county for the ZCTA, i.e. the county which has the 2nd largest intersection with it. Over 90% of the time this value will be blank.

See also the ANSI county codes: http://www.census.gov/geo/www/ansi/ansi.html
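
As a rough sketch of how such an extract could be used from R, the snippet below reads the sample rows shown above from an inline string (a real extract would be read from the downloaded file instead). The variable names `csv_text`, `zcta`, and `multi_county` are illustrative. Reading every column as character is a deliberate choice so that leading zeros in the state code are not lost.

```r
# Sample rows copied from the extract above.
csv_text <- 'state,zcta5,ZIPName,County,County2
01,30165,"Rome, GA",Cherokee AL,
01,31905,"Fort Benning, GA",Russell AL,
01,35004,"Moody, AL",St. Clair AL,
01,35005,"Adamsville, AL",Jefferson AL,
01,35006,"Adger, AL",Jefferson AL,Walker AL'

# Read all columns as character to preserve codes such as "01".
zcta <- read.csv(text = csv_text, colClasses = "character")

# Flag ZCTAs that have a secondary county.
zcta$multi_county <- zcta$County2 != ""

# Look up the primary and secondary county for one ZCTA.
zcta[zcta$zcta5 == "35006", c("zcta5", "County", "County2")]
```

Rows flagged in `multi_county` are the ones where assigning a single county is an approximation.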

answered 2012-11-09T22:19:01.303

I think the package "noncensus" is helpful.

Below is what I use to match a ZIP code with its county:

### Code to get the county based on ZIP code

library(noncensus)
data(zip_codes)
data(counties)

state_fips  = as.numeric(as.character(counties$state_fips))
county_fips = as.numeric(as.character(counties$county_fips))    
counties$fips = state_fips*1000+county_fips    
zip_codes$fips =  as.numeric(as.character(zip_codes$fips))

# test
temp = subset(zip_codes, zip == "30329")    
subset(counties, fips %in% temp$fips)  # %in% also works if several rows match
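
The numeric key above (state_fips*1000 + county_fips) works for joins inside R, but county indicator files often store FIPS as a zero-padded five-character string. A minimal sketch of building that form, assuming the state and county codes are whole numbers or digit strings; `make_fips` is a hypothetical helper, not part of noncensus:

```r
# Hypothetical helper: combine state and county codes into a 5-character FIPS string.
make_fips <- function(state_fips, county_fips) {
  sprintf("%02d%03d",
          as.integer(as.character(state_fips)),
          as.integer(as.character(county_fips)))
}

make_fips(1, 1)
make_fips("13", "089")
```

Keeping the key as a string avoids losing the leading zero for states with codes below 10.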
answered 2014-12-01T21:37:02.700

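A simple option is to use the geocode() function in ggmap, with the option output="more" or output="all".

This can take flexible input, such as an address or latitude/longitude, and returns the address, city, county, state, country, ZIP code, etc., as a list.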

require("ggmap")
address <- geocode("Yankee Stadium", output="more")

str(address)
$ lon                        : num -73.9
$ lat                        : num 40.8
$ type                       : Factor w/ 1 level "stadium": 1
$ loctype                    : Factor w/ 1 level "approximate": 1
$ address                    : Factor w/ 1 level "yankee stadium, 1 east 161st street, bronx, ny 10451, usa": 1
$ north                      : num 40.8
$ south                      : num 40.8
$ east                       : num -73.9
$ west                       : num -73.9
$ postal_code                : chr "10451"
$ country                    : chr "united states"
$ administrative_area_level_2: chr "bronx"
$ administrative_area_level_1: chr "ny"
$ locality                   : chr "new york"
$ street                     : chr "east 161st street"
$ streetNo                   : num 1
$ point_of_interest          : chr "yankee stadium"
$ query                      : chr "Yankee Stadium"

Another solution is to use a census shapefile, and the same over() command from the question. I ran into a problem using the maptools base map: because it uses the WGS84 datum, in North America, points that were within a few miles of the coast were mapped incorrectly and about 5% of my data set did not match up.

Try this, using the sp package and Census TIGER/Line shapefiles:

library(sp)
library(maptools)  # provides readShapeSpatial()

counties <- readShapeSpatial("maps/tl_2013_us_county.shp", proj4string=CRS("+proj=longlat +datum=NAD83"))

# Convert pointsDF to a SpatialPoints object 
pointsSP <- SpatialPoints(pointsDF, proj4string=CRS("+proj=longlat +datum=NAD83"))

countynames <- over(pointsSP, counties)
countynames <- countynames$NAMELSAD
answered 2013-11-21T20:56:44.803