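I am comfortable with straightforward work in R, but I am new both to talking to SQL from R and to parallel programming (I had no experience with either before today). I put together the following code using tips from blogs, forums, and so on.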
library(doParallel)
library(RMySQL)
library(DBI)
library(foreach)
cl <- makeCluster(12)
registerDoParallel(cl)
Postcodecsv <- read.csv("C:/Users/Henry Crosby/Desktop/PostcodeLatLong.csv")
mydb = dbConnect(MySQL(), user='****', password="******* ****",
                 dbname='population_distance', host='****.**.*.*')
dbListFields(mydb,'Postcodes')
foreach (a = 1:120000, .combine = "rbind") %dopar% {
  Done <- dbGetQuery(mydb, paste(
    "select FID, Postcode2, (6371 * acos(cos(radians(", Postcodecsv[a, 6], "))",
    "* cos(radians(latitude)) * cos(radians(Longitude) - radians(", Postcodecsv[a, 5], "))",
    "+ sin(radians(", Postcodecsv[a, 6], ")) * sin(radians(latitude)))) AS distance",
    "from Postcodes having distance < 2 ORDER BY distance",
    sep = " "))
  write.table(Done, file = "C:/Users/Henry Crosby/Desktop/2km.csv", append = TRUE, col.names = FALSE, sep = ",")
}
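For anyone checking the query by hand: the distance expression is the spherical law of cosines with an Earth radius of 6371 km. A minimal sketch of the same calculation in plain R follows. The function name `sphere_dist_km` and the sample coordinates are hypothetical and not taken from my data, and the clamping step is an addition that the SQL does not have.

```r
# Spherical law of cosines, mirroring the SQL expression (Earth radius 6371 km).
# `sphere_dist_km` is a hypothetical helper name; inputs are decimal degrees.
sphere_dist_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  cos_angle <- cos(lat1 * to_rad) * cos(lat2 * to_rad) * cos((lon2 - lon1) * to_rad) +
    sin(lat1 * to_rad) * sin(lat2 * to_rad)
  # Clamp to [-1, 1] so rounding error cannot make acos() return NaN
  # (an extra safeguard that is not part of the SQL query)
  radius_km * acos(pmin(1, pmax(-1, cos_angle)))
}

# Illustrative coordinates only (not from the postcode file)
same_point <- sphere_dist_km(52.0, -1.5, 52.0, -1.5)  # identical points
one_degree <- sphere_dist_km(0, 0, 0, 1)              # one degree of longitude on the equator
print(same_point)
print(one_degree)
```

The function is vectorised, so a whole column of latitudes and longitudes can be passed in at once to spot-check a few rows against what the database returns.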
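This calculation works in a plain for loop, but it takes forever (and I need to apply it to a LARGE dataset!). When I run the code above, I get the error below. Can anyone tell me why this error occurs and how to fix it?
Error in { : task 1 failed - "could not find function "dbGetQuery""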
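My working guess, which I have not confirmed, is that the worker processes do not have the database packages attached. Each worker started by `makeCluster` is a separate R session, so a `library()` call in my main session may not carry over. The sketch below tests that idea using only the `parallel` package that ships with R. It uses `tools` as a stand-in for RMySQL (an assumption made purely for illustration) and does not touch the database.

```r
library(parallel)  # ships with R

cl <- makeCluster(2)

# Attach a package in the main session only; `tools` stands in for RMySQL here
library(tools)

# Ask each worker whether the package is on its search path
before <- unlist(clusterEvalQ(cl, "package:tools" %in% search()))

# Attach it on every worker, then ask again
invisible(clusterEvalQ(cl, library(tools)))
after <- unlist(clusterEvalQ(cl, "package:tools" %in% search()))

stopCluster(cl)

print(before)
print(after)
```

If this is indeed the cause, the `.packages` argument of `foreach` (for example `.packages = c("DBI", "RMySQL")`) looks like the place to request the packages on the workers. The connection would presumably also have to be opened on each worker, since `mydb` was created in the main session.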