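I'm new to R and am writing my thesis, which involves training a neural network. I first use rminer for data mining and then nnet for training. I don't know which function to use to split the dataset into training and validation sets (that is, for k-fold cross-validation) and then run nnet on each split. Sorry for my poor English. Thanks in advance.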

2 Answers

When you don't know where to start, here is a way to get help on a new topic or package in R:

library(help=package.name)

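This gives you an overview of all the functions and datasets defined in the package, with a short title for each. Once you have identified the function you need, you can consult its documentation like this: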

?function.name

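In the documentation, also pay attention to the See Also section, which usually lists functions that are useful together with the one you are looking at. Work through the examples as well. You can also use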

example(function.name)

to run a demonstration of the function's usage and the common idioms for it.

Finally, if you are lucky, the package author may have written a vignette for the package. You can list all vignettes in a package like this:

vignette(package="package.name")

Hope this gets you started with the rminer and nnet packages.
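
As a concrete illustration, the same steps applied to nnet might look like the sketch below. It assumes nnet is installed (it normally ships with R as a recommended package); the names `pkg_info`, `nnet_doc` and `vigs` are only illustrative. The results are stored in variables here so the script also runs non-interactively; at the prompt you would simply type the calls.

```r
# Overview of the functions and datasets in the nnet package.
# At the prompt, library(help = "nnet") displays the overview directly.
pkg_info <- library(help = "nnet")
class(pkg_info)

# Documentation of the nnet() function; at the prompt, ?nnet does the same.
nnet_doc <- help("nnet", package = "nnet")

# Run the examples from the nnet() help page.
example("nnet", package = "nnet", ask = FALSE)

# List the vignettes of the package; the result may well be empty.
vigs <- vignette(package = "nnet")
nrow(vigs$results)
```

The same calls should work for rminer once it is installed, with `"rminer"` in place of `"nnet"`.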

answered 2013-06-21T11:17:47.073

It may be too late, but I found this question while looking for an answer to a question of my own. You can use something like this:

    # Split into training, cross-validation (CV) and test datasets.
    # Of all observations, 60% go to the training set, 20% to the CV set and 20% to the test set.
    # Call set.seed() beforehand if the split has to be reproducible.
    n_obs <- nrow(DF.mergedPredModels)
    train_ind <- sample(seq_len(n_obs), size = floor(0.6 * n_obs))
    trainDF.mergedPredModels <- DF.mergedPredModels[train_ind, ]

    # The CV and test sets are built from the observations that are not in the training set.
    # The CV rows are sampled once and the test set takes whatever is left, so the two sets
    # cannot overlap. Change 0.5 to move observations between the CV and test sets.
    rest_ind <- setdiff(seq_len(n_obs), train_ind)
    cv_ind <- rest_ind[sample.int(length(rest_ind), size = floor(0.5 * length(rest_ind)))]
    test_ind <- setdiff(rest_ind, cv_ind)

    # Cross-validation dataset
    cvDF.mergedPredModels <- DF.mergedPredModels[cv_ind, ]

    # Testing dataset
    testDF.mergedPredModels <- DF.mergedPredModels[test_ind, ]

    # Temporal and other columns are kept separately and added back after the predictions are
    # made, because the models should not be built on the dates. With them you can plot the
    # real values of the predicted parameter and the respective predictions against your time
    # variables (half-hour, hour, day, week, month, quarter, season, year, etc.).
    # aa = explicitly specify the columns to be used in the temporal datasets
    aa <- c("date", "period", "publish_date", "quarter", "month", "Season")
    temporaltrainDF.mergedPredModels <- trainDF.mergedPredModels[, aa]
    temporalcvDF.mergedPredModels <- cvDF.mergedPredModels[, aa]
    temporaltestDF.mergedPredModels <- testDF.mergedPredModels[, aa]

    # bb = explicitly specify the columns to be used in the training, CV and testing datasets
    bb <- c("quarter", "month", "Season", "period", "temp.mean", "wind_speed.mean", "solar_radiation", "realValue")
    trainDF.mergedPredModels.Orig <- trainDF.mergedPredModels[, bb]
    trainDF.mergedPredModels <- trainDF.mergedPredModels[, bb]
    # A 10-row subset to check quickly that the models converge without errors
    smalltrainDF.mergedPredModels.Orig <- trainDF.mergedPredModels.Orig[1:10, ]
    cvDF.mergedPredModels <- cvDF.mergedPredModels[, bb]
    testDF.mergedPredModels <- testDF.mergedPredModels[, bb]
    # /Split into training, cross-validation and test datasets
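
Note that the code above produces a single hold-out split, not the k-fold cross-validation the question asks about. A minimal k-fold sketch with nnet could look like the following. It is only an illustration under stated assumptions: the built-in `iris` data stands in for your own data frame, and `k`, `size`, `decay` and `maxit` are arbitrary values chosen for the example, not recommendations.

```r
library(nnet)

set.seed(42)   # only so that the sketch is reproducible
k <- 5         # number of folds (arbitrary choice)
dat <- iris    # stand-in dataset; replace with your own data frame

# Assign every row to one of the k folds, in random order
fold_id <- sample(rep(seq_len(k), length.out = nrow(dat)))

accuracy <- numeric(k)
for (i in seq_len(k)) {
  # Fold i is held out for validation, the other k - 1 folds are used for training
  train_set <- dat[fold_id != i, ]
  valid_set <- dat[fold_id == i, ]

  # Hyperparameters are illustrative assumptions
  fit <- nnet(Species ~ ., data = train_set, size = 3, decay = 0.01,
              maxit = 200, trace = FALSE)

  # Share of correctly classified observations in the held-out fold
  pred <- predict(fit, newdata = valid_set, type = "class")
  accuracy[i] <- mean(pred == valid_set$Species)
}

# Cross-validated estimate of the classification accuracy
mean(accuracy)
```

For a regression target such as `realValue`, you would replace the accuracy with an error measure such as the mean squared error, and probably pass `linout = TRUE` to `nnet()`.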
answered 2016-01-15T10:14:34.420