I need to extract a table from a PDF. Here is the link:
https://www.acea.be/uploads/statistic_documents/ACEA_Report_Vehicles_in_use-Europe_2018.pdf
I want the first table in this PDF.
Here is my code:
Sys.setenv(JAVA_HOME='C:\\Program Files\\Java\\jre1.8.0_201') # for 64-bit version
# install.packages("devtools")
library(tabulizer)
library(tabulizerjars)
library(tidyverse)
# extract_tables() returns a list with one element per detected table
tab <- extract_tables("https://www.acea.be/uploads/statistic_documents/ACEA_Report_Vehicles_in_use-Europe_2018.pdf")
# Inspect the first detected table
tab[[1]]
head(tab[[1]])
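But in the output, the columns for 2012, 2013, 2015, and 2016 are being merged into a single column. I want the table as it appears in the PDF file.
Output of my code: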
     [,1]             [,2]                                                    [,3]
[1,] "Croatia"        "1,445,0001,433,5631,458,1491,489,3381,540,2603.4"      ""
[2,] "Czech Republic" "4,698,8004,787,8494,893,5625,115,3165,368,6605.0"      ""
[3,] "Denmark"        "2,225,1642,265,3492,320,9822,391,7552,477,4783.6"      ""
[4,] "Estonia"        "602,133628,562652,949676,592703,1513.9"                ""
[5,] "Finland"        "2,560,1902,575,9512,595,8672,612,9222,629,4320.6"      ""
[6,] "France"         "31,600,00031,650,00031,799,00031,915,49331,999,9530.3" ""
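
One possible workaround, if the extraction keeps fusing the columns, is to split the merged string after the fact. The sketch below is plain base R, not a tabulizer feature. It assumes that every count uses comma thousands separators (so each count contains at least one comma), and that each row holds exactly five counts followed by one decimal figure, as in the output above. The names `fused`, `split_table`, `count_1` to `count_5`, and `trailing` are made up for illustration; the real column headers would have to come from the PDF.

```r
# Rows copied from the output above: country names and the fused second column.
country <- c("Croatia", "Czech Republic", "Denmark", "Estonia", "Finland", "France")
fused <- c(
  "1,445,0001,433,5631,458,1491,489,3381,540,2603.4",
  "4,698,8004,787,8494,893,5625,115,3165,368,6605.0",
  "2,225,1642,265,3492,320,9822,391,7552,477,4783.6",
  "602,133628,562652,949676,592703,1513.9",
  "2,560,1902,575,9512,595,8672,612,9222,629,4320.6",
  "31,600,00031,650,00031,799,00031,915,49331,999,9530.3"
)

# Assumption: a count is 1-3 leading digits followed by one or more ",ddd" groups.
count_pattern <- "\\d{1,3}(?:,\\d{3})+"
counts <- regmatches(fused, gregexpr(count_pattern, fused, perl = TRUE))

# Whatever follows the last ",ddd" group is the trailing decimal figure.
trailing <- sub(".*,\\d{3}", "", fused, perl = TRUE)

# Remove thousands separators and convert to numeric.
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

# vapply() stops with an error if any row does not yield exactly five counts.
split_table <- cbind(
  t(vapply(counts, to_number, numeric(5))),
  to_number(trailing)
)

# Hypothetical row and column names for illustration only.
rownames(split_table) <- country
colnames(split_table) <- c(paste0("count_", 1:5), "trailing")

print(split_table)
```

This only works because the comma grouping makes the boundaries between the fused numbers unambiguous. A row containing a count below 1,000 (no comma) would not match the pattern, and `vapply()` would fail loudly rather than return a silently misaligned row.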