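I am trying to download some data using API calls, but I am sure the code can be optimized considerably. So far I have made only 47 such calls, but in the future that number could reach 20,000. Here is the code. Edit: since not everyone can access the URL, I have saved the raw_data R object to a file instead. End of edit.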

library(RJSONIO)
library(RCurl)
library(data.table)
url = "http://172.31.101.107:11000/wantedapi-v4.0/segments/occ4?usecache=true&responsetype=json&engine=sphinx&country=JP&showrepost=false&msa=5685-id&date=2013-10-20-2017-05-04&passkey=wanted&showstaffing=false&showanonymous=false&showbulk=false&showfree=true&showduplicate=false&showexpired=true&showaggregator=true&showactive=true&usestemming=false&market=country%2C116&methodology=available&pagesize=1000"
raw_data <- getURL(url)
# Then convert from JSON into a list in R
data1 <- fromJSON(raw_data)
data2 <- do.call(rbind, data1[[1]]$segments)
# data2 <- rbindlist(data1[[1]]$segments) #produces error
data3 <- transpose(data2)
data4 <- data.table(
                    count = data3[[1]],
                    id = data3[[2]],
                    official_Occ_code = data3[[3]],
                    translation = data3[[4]],
                    official_occ_name = data3[[5]]
                    )

1 Answer

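I did not try to open the file, since it has no extension and cannot be previewed (my company's IT security group should be proud), but I use the pipeline below for what I think may be a similar problem: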

library(magrittr)
library(data.table)
library(jsonlite)
library(curl)

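# Create a curl handle and attach the API key header to it
DI <- curl::new_handle()
curl::handle_setheaders(DI, "X-API-KEY" = "my_key_for_the_API")
DI_Request <- "https://api.somewebsite.com/v1/direct-access/foobar?format=json&page=1&pagesize=10000"
# Fetch the raw response, decode it to text, parse the JSON, and convert it to a data.table
curl_fetch_memory(DI_Request, DI)$content %>% 
  rawToChar() %>% 
  fromJSON(flatten = TRUE) %>% 
  setDT() -> Output_Table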
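
Since the question mentions scaling to around 20,000 calls, the same pipeline could be wrapped in a loop that reuses one handle and stacks the results once at the end. This is only a rough sketch: the helper names `parse_response`, `combine_responses`, and `fetch_all` are made up for illustration, and it assumes each response body is a JSON array of flat records, which may not match the real API.

```r
library(jsonlite)
library(data.table)

# Hypothetical helper: parse one JSON response body (a string) into a data.table.
# Assumes the body is a JSON array of records, so fromJSON() returns a data.frame.
parse_response <- function(json_text) {
  as.data.table(fromJSON(json_text, flatten = TRUE))
}

# Hypothetical helper: parse many response bodies and stack them in one step.
# idcol records which request each row came from; fill = TRUE tolerates
# responses that are missing some columns.
combine_responses <- function(bodies) {
  rbindlist(lapply(bodies, parse_response), fill = TRUE, idcol = "request")
}

# Hypothetical helper: fetch every URL with a single reused curl handle,
# then combine all the bodies into one table.
fetch_all <- function(urls, handle = curl::new_handle()) {
  bodies <- vapply(urls, function(u) {
    rawToChar(curl::curl_fetch_memory(u, handle)$content)
  }, character(1), USE.NAMES = FALSE)
  combine_responses(bodies)
}
```

Collecting the parsed pieces in a list and calling `rbindlist()` once tends to be cheaper than growing a table inside the loop. Something like `fetch_all(my_urls, DI)` would reuse the handle (and its header) from above, where `my_urls` is a character vector you build yourself.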
answered 2017-10-27T21:01:04.240