
I read in a file, and R returns a list like this:
list 1: "1)" "Seo " "agad " "na " "ciad " "faclan " "a "
list 2: "cannteil " "(canntail) " "a-staigh" "dhan" "
list 3: "2)" "Seo" "sinn" "sin" "direach" "fuirich…"

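What I want is a vector built as follows: if the first element of [[i]] contains a number, every other element of [[i]] gets that same number; if the first element of [[i]] has no number, all elements of [[i]] get the number from the previous row, like this: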

"1\t1)" "1\tSeo " "1\tagad " "1\tna " "1\tciad " "1\tfaclan " "1\ta " .... "2\t2)" "2\tSeo " "2\tsinn" ....

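Can anyone show me the code for this? Also, is there a way to get a vector that contains only the number for each word, instead of pasting the number in front of each word?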

Thanks

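My code is below, but it does not give me what I want: every element gets the number 1, even the elements of list entries that start with the number 2. Which part of the code is wrong?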

word="" 
temp="" 
for (i in 1:length(file)) { 
       if (grepl('\\d+\)',file[[i]][1])) {       
       snum=grep('\\d+',file[[i]][1]) 
       temp=paste(snum, file[[i]], sep="\t") 
         } else { 
       temp=paste(snum,utter.short[[i]],sep="\t") 
         }
       word=c(word,temp) 
     }

2 Answers


Assuming you have a list of lists, for example...

 list1 = list("1)", "Seo ", "agad ", "na ", "ciad ", "faclan ", "a ")
 list2 = list("cannteil ", "(canntail) ", "a-staigh ", "dhan ")
 list3 = list("2)", "Seo ", "sinn, ", "sin ", "direach ", "fuirich…")

 biglist = list(list1, list2, list3)

Here is an inelegant, inefficient solution using this setup:

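 counter = 1
 result = vector("list", length(biglist))
 for (i in seq_along(biglist)) {
   # digits in the first item of this row, or "" if it has none
   digits = gsub("\\D", "", biglist[[i]][[1]])
   if (nzchar(digits)) {
     counter = digits
     # drop the number marker itself; remove this line to keep it in the output
     biglist[[i]] = biglist[[i]][-1]
   }
   # prefix every remaining item of the row with the current number
   result[[i]] = paste(counter, unlist(biglist[[i]]), sep = "\t")
 }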

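This handles any number of rows and any row length, as long as the first item of a numbered row contains exactly one number and the rows appear in order, one after another.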
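The digit extraction is also where the loop in the question appears to go wrong. A minimal sketch, assuming a made-up `first_items` vector that stands in for the first item of each row: `grep()` returns the positions of the matching elements rather than the matched text, so on a single string it is 1 whenever there is a match, while `gsub()` with `"\\D"` keeps only the digits.

```r
# Hypothetical first items of three rows, modelled on the question's data
first_items <- c("1)", "cannteil ", "2)")

# grep() gives the index of the matching element, not the number inside it
grep("\\d+", "2)")
# → 1

# gsub() with "\\D" deletes every non-digit and leaves the digits as text
digits <- gsub("\\D", "", first_items)
# → "1" "" "2"

# An empty string means the row has no number of its own
has_number <- nzchar(digits)
# → TRUE FALSE TRUE
```

This would explain why every word ends up labelled 1: `snum` holds a match position, not the number in front of the bracket.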

Depending on what this is for, there may be better ways to read in and store the data.
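One possible alternative, sketched here under the assumption that a long table with one row per word suits the downstream use: keep the words and their numbers in a data frame. The names `rows`, `row_num` and `words` are illustrative only, and the sample rows are a shortened version of the question's data.

```r
# Shortened sample rows, stored as character vectors instead of lists
rows <- list(
  c("1)", "Seo ", "agad "),
  c("cannteil ", "(canntail) "),
  c("2)", "Seo ", "sinn, ")
)

# Number found in the first item of each row (NA when there is none)
first <- vapply(rows, function(r) r[1], character(1))
row_num <- as.integer(gsub("\\D", "", first))

# Carry the last seen number forward over rows without one
for (i in seq_along(row_num)) {
  if (is.na(row_num[i]) && i > 1) row_num[i] <- row_num[i - 1]
}

# One row per word: the number it belongs to, and the word itself
words <- data.frame(
  num  = rep(row_num, lengths(rows)),
  word = unlist(rows),
  stringsAsFactors = FALSE
)
```

With this layout the tab-separated strings are one call away, `paste(words$num, words$word, sep = "\t")`, and the numbers stay usable on their own.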

answered 2012-10-07T10:44:11.667

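More flexible and easier to understand (elegant?). It handles numbers in any order and rows at the start that have no number, and it is easy to change and maintain.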

# sample data
list0a = list("cannteil " ,"(canntail) ", "a-staigh " ,"dhan ")
list0b = list("cannteil " ,"(canntail) ", "a-staigh " ,"dhan ")
list1 = list("3)","Seo ","agad ","na ", "ciad ", "faclan " ,"a ")
list2 = list("cannteil " ,"(canntail) ", "a-staigh " ,"dhan ")
list3 = list("2)", "Seo ", "sinn, ", "sin " ,"direach ", "fuirich…")

# separate lists to test on
biglist = list(list1,list2,list3)
biglist2 = list(list0a,list0b,list1, list2, list3)

# get number vector
numlist <- sapply(biglist,function(x){
  as.numeric(gsub('[^0-9]','',x[1]))
})

# fill in gaps with indexing, drops leading items without numbers
numorder <- cumsum(!is.na(numlist))
numreplaced <- na.omit(numlist)[numorder]

# handle missing first numbers however you want. omit if guaranteed first element has number
numfinal <- c(rep('0',times = sum(numorder == 0)),numreplaced)

# make the strings as desired
Map(function(x,num){
  paste0(num,'\t',x)
},x = biglist,num = numfinal)
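
For the second part of the question, a vector holding only the number for each word, a minimal sketch is to repeat each row's number once per word in that row. The values in `row_numbers` are assumed to be what a fill-in step like the one above would give for these rows; `row_numbers`, `rows`, `word_numbers` and `words` are illustrative names.

```r
# Hypothetical per-row numbers after the gaps have been filled in
row_numbers <- c(3, 3, 2)

# Rows those numbers belong to (shortened sample data)
rows <- list(
  list("3)", "Seo "),
  list("cannteil "),
  list("2)", "Seo ", "sinn, ")
)

# Repeat each row's number once for every word in that row
word_numbers <- rep(row_numbers, times = lengths(rows))
# → 3 3 3 2 2 2

# The words themselves, flattened in the same order
words <- unlist(rows)
```

`word_numbers` and `words` have the same length and line up position by position.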
answered 2019-04-03T15:44:34.397