
I am using RMySQL to send data from R to a MySQL database. The problem is that the database never receives any data. I am using doParallel() since I am running over 4,500 iterations. Could the problem be that I am trying to send the data to the database inside the pullSpread() function?

library(RMySQL)
library(doParallel)
library(stringr)
library(foreach)

detectCores() # returns 4 on my machine
cl <- makeCluster(4, type = "SOCK") # also tried PSOCK & FORK with the same problem
registerDoParallel(cl)

# Now use foreach() and %dopar% to pull data...
# apply(t(stock1), 2, pullSpread) works, but it is not parallelized
# I have also tried clusterApply(), without success
system.time(
foreach(a=t(stock1)) %dopar% pullSpread(a)
)

When I look in my working directory, every file has been successfully written to a .csv file as expected, but when I check MySQL Workbench, or even query the tables from R, they do not exist...

Here is the stock1 character vector and the objects used by pullSpread()...

# This list contains more than 4500 iterations.. so I am only posting a few
stock1<-c(
  "SGMS.O","SGNL.O","SGNT.O",
  "SGOC.O","SGRP.O", ...) 

Important dates needed by the function:

Friday <- Sys.Date()-10

# Get Previous 5 days
Thursday <- Friday - 1
Wednesday <- Thursday -1
Tuesday <- Wednesday -1
Monday <- Tuesday -1

#Make Them readable for NetFonds 
Friday <- format(Friday, "%Y%m%d")
Thursday<- format(Thursday, "%Y%m%d")
Wednesday<- format(Wednesday, "%Y%m%d")
Tuesday<- format(Tuesday, "%Y%m%d")
Monday<-format(Monday, "%Y%m%d")

Here is the pullSpread() function:

pullSpread <- function(stock1) {
  AAPL_FRI <- read.delim(header = TRUE, stringsAsFactors = FALSE,
                         paste0("http://www.netfonds.no/quotes/posdump.php?date=",
                                Friday, "&paper=", stock1, "&csv_format=txt"))

  tryit <- try(AAPL_FRI[, c(1:7)])

  if (inherits(tryit, "try-error")) {

    rm(AAPL_FRI)

  } else {

    AAPL_THURS <- read.delim(header = TRUE, stringsAsFactors = FALSE,
                             paste0("http://www.netfonds.no/quotes/posdump.php?date=",
                                    Thursday, "&paper=", stock1, "&csv_format=txt"))

    AAPL_WED <- read.delim(header = TRUE, stringsAsFactors = FALSE,
                           paste0("http://www.netfonds.no/quotes/posdump.php?date=",
                                  Wednesday, "&paper=", stock1, "&csv_format=txt"))

    AAPL_TUES <- read.delim(header = TRUE, stringsAsFactors = FALSE,
                            paste0("http://www.netfonds.no/quotes/posdump.php?date=",
                                   Tuesday, "&paper=", stock1, "&csv_format=txt"))

    AAPL_MON <- read.delim(header = TRUE, stringsAsFactors = FALSE,
                           paste0("http://www.netfonds.no/quotes/posdump.php?date=",
                                  Monday, "&paper=", stock1, "&csv_format=txt"))

    SERIES <- rbind(AAPL_MON, AAPL_TUES, AAPL_WED, AAPL_THURS, AAPL_FRI)

    # Write .csv file
    write.csv(SERIES, paste0(stock1, "_", Friday, ".csv"), row.names = FALSE)
    dbWriteTable(con2, str_sub(stock1, start = 1L, end = -3L),
                 paste0("~/Desktop/R/", stock1, "_", Friday, ".csv"), append = TRUE)
  }
}

1 Answer


Retrieve the last Friday using something like this:

Friday <- Sys.Date()
while(weekdays(Friday) != "Friday") 
{
  Friday <- Friday - 1
}
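The loop above can be wrapped in a small helper that works for any start date. This is a sketch of that idea: it uses the numeric `$wday` field from `as.POSIXlt()` (0 = Sunday, ..., 5 = Friday) instead of `weekdays()`, so the check does not depend on the locale's weekday names. The function name `prevFriday` is just an illustration.

```r
# Return the most recent Friday on or before date d.
# as.POSIXlt()$wday is numeric (0 = Sunday ... 6 = Saturday),
# so the comparison is locale-independent, unlike weekdays().
prevFriday <- function(d = Sys.Date()) {
  while (as.POSIXlt(d)$wday != 5) {
    d <- d - 1
  }
  d
}

prevFriday(as.Date("2014-07-14"))  # a Monday -> the preceding Friday
```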

As a matter of good practice, when retrieving data from the internet, separate the act of downloading it from the act of processing it. That way, when the processing fails, you don't waste time and bandwidth re-downloading things.

lastWeek <- format(Friday - 0:4, "%Y%m%d")
stockDatePairs <- expand.grid(Stock = stock1, Date = lastWeek)
urls <- with(
  stockDatePairs,
  paste0(
    "http://www.netfonds.no/quotes/posdump.php?date=",
    Date,
    "&paper=",
    Stock,
    "&csv_format=txt"
  )
)
for(url in urls)
{
  # or whatever file name you want
  download.file(url, paste0("data from ", make.names(url), ".txt"))
}

Make sure that you know which directory those files are being saved to. (Either provide an absolute path, or set your working directory.)

Now try reading in those files and rbind-ing them together.
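That step might look like the sketch below, assuming the files were saved by the download loop above (names matching "data from *.txt" in the working directory) and all share the same columns:

```r
# Read back every downloaded file and stack them into one data frame.
# Assumes all files have identical column layouts.
files  <- list.files(pattern = "^data from .*\\.txt$")
tables <- lapply(files, read.delim, stringsAsFactors = FALSE)
SERIES <- do.call(rbind, tables)
```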

If that works, then you can try doing it in parallel.
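A parallel version might look like the sketch below. Two assumptions worth flagging: the `.packages` argument must name every package the workers use, and a MySQL connection object cannot be shared across workers, so anything writing to the database inside the loop would have to call dbConnect() on the worker itself rather than reuse a connection like `con2` from the master session.

```r
library(foreach)
library(doParallel)

cl <- makeCluster(4)
registerDoParallel(cl)

# Read each downloaded file on a worker. If you also write to MySQL
# inside the loop, open a fresh connection per task with dbConnect()
# -- connection objects cannot be serialized to workers.
results <- foreach(url = urls, .packages = "RMySQL") %dopar% {
  read.delim(paste0("data from ", make.names(url), ".txt"),
             stringsAsFactors = FALSE)
}

stopCluster(cl)
SERIES <- do.call(rbind, results)
```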

Also note that many online data services will limit the rate at which you can download, unless you pay for the service. So parallel downloads may just mean that you hit the limit sooner.
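If rate limiting is a concern, a simple alternative to parallelism is to throttle the sequential loop. This is a sketch; the one-second pause is an arbitrary example, not a documented NetFonds limit, and the `fetch` argument exists only so the downloader can be swapped out:

```r
# Download each URL in turn, pausing between requests to stay
# under a server's rate limit. `delay` is in seconds.
politeDownload <- function(urls, delay = 1, fetch = download.file) {
  for (url in urls) {
    fetch(url, paste0("data from ", make.names(url), ".txt"))
    Sys.sleep(delay)  # pause before the next request
  }
}
```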

Answered 2014-07-14T12:53:59.390