Using the Discogs API, I retrieve the list of releases for a given jazz musician as follows:

releases <- list()
artists <- list()
artistURL <- "https://api.discogs.com/artists/"
library(jsonlite)
a <- function(artistcode){
  for(i in 0:3){
    artistset <- fromJSON(paste0(artistURL, artistcode, "/releases?page=", i))
    message("Retrieving page ", i)

    releases[[i+1]] <- (as.data.frame(artistset$releases.main_release))
      }
  return(artistset)
  message("Total rows=", dim(artistset[[2]])[1] )
}

temp<-a('265634') # art tatum 265634
temp$releases$title # shows first 50 albums...where's the rest?

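On inspection, you will see that temp contains two lists, the second of which is called releases. releases holds 50 albums. However, I asked for three pages of output in the fromJSON call, and temp reports 22 pages of results: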

str(temp$pagination)  # there are 22 pages of 50 lines per page

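How can I extract all the titles and other data for this artist (22 pages' worth) into a data frame? I fiddled with purrr to no avail. Thanks for your help!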

1 Answer

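This should work better. The assignment to releases inside your function only modified a copy local to the function, and that copy was never returned to the global environment; the function returned artistset instead, which holds only the last page fetched. I also changed the function to use the pages value from the JSON pagination block to construct the loop: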

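library(jsonlite)
artistURL <- "https://api.discogs.com/artists/"

a <- function(artistcode){
  releases <- list()
  # Fetch page 1 once to read the total page count from the pagination block
  metadata <- fromJSON(paste0(artistURL, artistcode, "/releases?page=", 1))
  for (i in seq_len(metadata$pagination$pages)){
    message("Retrieving page ", i)
    Sys.sleep(2) # added as I was being rate limited
    # Keep only the releases data frame from each page
    releases[[i]] <- fromJSON(paste0(artistURL, artistcode, "/releases?page=", i))$releases
  }
  return(releases) # a list with one data frame per page
}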

temp<-a('265634') # art tatum 265634

temp[[1]] # page 1
temp[[2]] # page 2
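
The function above returns a list with one data frame per page, while the question asks for a single data frame. One way to combine them is rbind_pages() from jsonlite, which stacks per-page data frames and fills columns missing from a page with NA. The sketch below uses two small made-up pages (page1 and page2 are hypothetical stand-ins, not real Discogs output) so that it runs without calling the API:

```r
library(jsonlite)

# Hypothetical stand-ins for two pages as returned by a(); real pages come
# from the Discogs API and carry more columns than shown here.
page1 <- data.frame(id = c(1, 2), title = c("Album A", "Album B"),
                    stringsAsFactors = FALSE)
page2 <- data.frame(id = 3, title = "Album C", year = 1953,
                    stringsAsFactors = FALSE)
pages <- list(page1, page2)

# Stack the per-page data frames into one; the year column is missing
# from page1, so its rows get NA there.
all_releases <- rbind_pages(pages)

nrow(all_releases)   # one row per release across all pages
all_releases$title   # titles from every page in a single vector
```

With the real result, the equivalent call would presumably be all_releases <- rbind_pages(temp), after which all_releases$title should list the titles from all 22 pages.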
answered 2019-08-09T18:01:37.147