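I have a data frame defined as shown below, and I am trying to build sequence-labeling input for a deep learning problem. Each sentence token has a tag. For each sentence I create a vector of WordIndex values and zero-pad it to length 10, then do the same for the tags (a vector of TagIndex values, zero-padded to length 10). I then need to convert the TagIndex vectors to categorical (one-hot) variables, and that is where the error occurs. Is this the right approach? Any help would be appreciated.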
library(keras)  # provides to_categorical()
SentenceID = c(1,1,1,1,2,2,2,3,3,3,3,3,3,3,3)
Tokens = c("I","went","to","school","nobody","can","find","some","people","know","what","they","are","doing","now")
WordIndex = c(3,4,7,8,9,10,12,54,34,66,33,89,87,23,22)
TagIndex = c(1,3,2,4,1,3,4,1,2,4,3,4,2,3,4)
df = data.frame(SentenceID, Tokens, WordIndex, TagIndex)
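# Split the word indices by sentence and zero-pad each vector to length 10
lst <- split(df$WordIndex, f = df$SentenceID)
lstWord2 <- lapply(lst, function(x){
  if (length(x) < 10){
    x <- c(x, rep(0, 10 - length(x)))
  }
  return(x)  # vectors of length 10 or more are returned unchanged
})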
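# Same for the tag indices: split by sentence and zero-pad to length 10
lstTag <- split(df$TagIndex, f = df$SentenceID)
lstTag2 <- lapply(lstTag, function(x){
  if (length(x) < 10){
    x <- c(x, rep(0, 10 - length(x)))
  }
  return(x)  # vectors of length 10 or more are returned unchanged
})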
is.vector(lstTag2)
y <- to_categorical(lstTag2, num_classes = NULL)
This is the error I get:
Error in py_call_impl(callable, dots$args, dots$keywords) :
TypeError: int() argument must be a string, a bytes-like object or a number, not 'dict'
Detailed traceback:
File "C:\Users\balak\AppData\Local\conda\conda\envs\R-TENS~1\lib\site-packages\keras\utils\np_utils.py", line 22, in to_categorical
y = np.array(y, dtype='int')
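
One thing that may be worth checking, going by the 'dict' in the message: lstTag2 is a named R list, and reticulate converts named lists to Python dicts, which np.array(y, dtype='int') cannot turn into integers. The following is only a minimal sketch of stacking the padded vectors into a plain numeric matrix first. The helper name pad_to and the choice of num_classes = 5 (tags 1 to 4 plus the padding value 0) are assumptions for illustration and are not part of the code above.

```r
# Hypothetical helper (not in the original post): pad with `value`,
# or truncate, so that every vector has exactly `len` elements.
pad_to <- function(x, len = 10, value = 0) {
  x <- x[seq_len(min(length(x), len))]
  c(x, rep(value, len - length(x)))
}

SentenceID <- c(1,1,1,1,2,2,2,3,3,3,3,3,3,3,3)
TagIndex   <- c(1,3,2,4,1,3,4,1,2,4,3,4,2,3,4)

lstTag  <- split(TagIndex, f = SentenceID)
lstTag2 <- lapply(lstTag, pad_to)

# Stack the padded vectors row-wise into a 3 x 10 matrix and drop the
# row names so that a plain numeric matrix is left.
tagMat <- unname(do.call(rbind, lstTag2))
dim(tagMat)

# With keras loaded, the matrix (rather than the list) would be passed on.
# num_classes = 5 is an assumption: tags 1..4 plus 0 for padding.
# y <- to_categorical(tagMat, num_classes = 5)
```

The to_categorical call is left commented out so the sketch runs in base R; whether one-hot encoding the padding value as its own class is appropriate depends on how the model masks padded positions.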