r - How to loop through a JSON array inside a JSON object

Question

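I have been trying to learn R, and I have a JSON file containing single-line JSON objects, each of which holds an array of account data. What I want to do is parse each line, get the JSON array out of the parsed JSON object, and extract the account type and amount. My problem is that I don't know the best way to pull out those two attributes.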

I tried using the dplyr package to extract "accountHistory" from each of my JSON lines, but I get a console error. When I try:

select(JsonAcctData, "accountHistory.type", "accountHistory.amount")

what happens is that my code returns only the type and amount of the last account on each line.

Right now my code writes to a csv file, and I can see all the data I need in it, but I just want to remove the ext

library("rjson")
library("dplyr")

parseJsonData <- function (sourceFile, outputFile) 
{
  #Get all total lines in the source file provided
  totalLines <- readLines(sourceFile)

  #Clean up old output file
  if(file.exists(outputFile)){
    file.remove(outputFile)
  }

  #Loop over each line in the sourceFile, 
  #parse the JSON and append to DataFrame
  JsonAcctData <- NULL
  for(i in 1:length(totalLines)){
    jsonValue <- fromJSON(totalLines[[i]])
    frame <- data.frame(jsonValue)
    JsonAcctData <- rbind(JsonAcctData, frame)
  }

  #Try to get filtered data
  filteredColumns <- 
    select(JsonAcctData, "accountHistory.type", "accountHistory.amount")
  print(filteredColumns)

  #Write the DataFrame to the output file in CSV format
  write.csv(JsonAcctData, file = outputFile)

}

Test JSON file data:

{"name":"Test1", "accountHistory":[{"amount":"107.62","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyA","name":"Home Loan Account 
  6220","type":"payment","account":"11111111"}, 
  {"amount":"650.88","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyF","name":"Checking Account 
  9001","type":"payment","account":"123123123"}, 
  {"amount":"878.63","date":"2012-02- 
  02T06:00:00.000Z","business":"CompanyG","name":"Money Market Account 
  8743","type":"deposit","account":"123123123"}]}
  {"name":"Test2", "accountHistory":[{"amount":"199.29","date":"2012-02-            
  02T06:00:00.000Z","business":"CompanyB","name":"Savings Account 
  3580","type":"invoice","account":"12312312"}, 
  {"amount":"841.48","date":"2012-02- 
  02T06:00:00.000Z","business":"Company","name":"Home Loan Account 
  5988","type":"payment","account":"123123123"}, 
  {"amount":"116.55","date":"2012-02- 
  02T06:00:00.000Z","business":"Company","name":"Auto Loan Account 
  1794","type":"withdrawal","account":"12312313"}]}

I expect to get a csv containing only the account type and the amount held in each account.

score 1 · Accepted Answer

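Here is one approach using regex (in base R). Note that it relies on each "amount" and each "type" field sitting on its own line of the file, as in the sample data above, because the greedy `.*` in the pattern keeps only the last match on a line.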

# read json 
json <- readLines('test.json', warn = FALSE)
# extract with regex
amount <- grep('\"amount\":\"\\d+\\.\\d+\"', json, value = TRUE)
amount <- as.numeric(gsub('.*amount\":\"(\\d+\\.\\d+)\".*', '\\1', amount, perl = TRUE))
type   <- grep('\"type\":\"\\w+\"', json, value = TRUE)
type   <- gsub('.*type\":\"(\\w+)\".*', '\\1', type, perl = TRUE)
# output
data.frame(type, amount)
#         type amount
# 1    payment 107.62
# 2    payment 650.88
# 3    deposit 878.63
# 4    invoice 199.29
# 5    payment 841.48
# 6 withdrawal 116.55
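
If the file really holds one complete JSON object per line, as the question describes, parsing each line and walking the "accountHistory" array may be more robust than matching text. The following is only a minimal sketch under that assumption, using the rjson package the question already loads. The helper name `extract_accounts`, the shortened sample records (only the "amount" and "type" fields are kept), and the temporary output path are introduced here for illustration.

```r
library(rjson)

# Assumed input: one complete JSON object per line.
# These records are shortened from the test data in the question.
json_lines <- c(
  '{"name":"Test1","accountHistory":[{"amount":"107.62","type":"payment"},{"amount":"650.88","type":"payment"},{"amount":"878.63","type":"deposit"}]}',
  '{"name":"Test2","accountHistory":[{"amount":"199.29","type":"invoice"},{"amount":"841.48","type":"payment"},{"amount":"116.55","type":"withdrawal"}]}'
)

# Hypothetical helper: parse one line and return a data frame
# with one row per entry of its accountHistory array.
extract_accounts <- function(line) {
  parsed  <- fromJSON(line)
  history <- parsed[["accountHistory"]]
  data.frame(
    type   = vapply(history, function(entry) entry[["type"]], character(1)),
    amount = as.numeric(vapply(history, function(entry) entry[["amount"]], character(1))),
    stringsAsFactors = FALSE
  )
}

# Stack the per-line data frames into a single table.
accounts <- do.call(rbind, lapply(json_lines, extract_accounts))

# Write only the two wanted columns; the path is a placeholder.
output_file <- tempfile(fileext = ".csv")
write.csv(accounts, file = output_file, row.names = FALSE)
```

With a real file, `json_lines` would come from `readLines(sourceFile)` instead of the inline vector. Building one small data frame per line and binding them at the end avoids the wide, flattened frame that `data.frame(jsonValue)` produces in the question's loop.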
