Questions tagged "training-data"

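0 votes

7 answers

82418 views

neural-network - Datasets for neural network training

I'm looking for some relatively simple datasets for testing and comparing different training methods for artificial neural networks. I'd like data that won't take much preprocessing to turn into my input format of lists of inputs and outputs (normalized to 0-1). Any links appreciated.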
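
For reference, the 0-1 normalization mentioned in the question can be sketched in a few lines of Python. This is only a minimal sketch, assuming each sample is a list of numbers and each column is scaled independently (min-max scaling); the function name `min_max_normalize` and the sample values are made up for illustration.

```python
def min_max_normalize(rows):
    """Scale each column of a list of numeric rows into the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        scaled = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # A constant column has no spread; map it to 0.0 (an assumption).
            scaled.append((value - low) / span if span else 0.0)
        normalized.append(scaled)
    return normalized


# Hypothetical samples: two features per row.
samples = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
print(min_max_normalize(samples))  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

The same minimum and maximum values would have to be reused when scaling any held-out test data, otherwise the two sets end up on different scales.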

neural-network training-data

2009-06-07T23:41:37.127

0 votes

2 answers

817 views

artificial-intelligence - Is a neural network's response on its training data guaranteed?

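I'm trying to train an ANN (I use this library: http://leenissen.dk/fann/) and the results are somewhat puzzling. Basically, if I run the trained network on the same data used for training, the output is not what was specified in the training set, but some random number.

For example, the first entry in the training file is something like

The first line is the input values, and the second line is the desired value of the output neuron. But when I feed the exact same data to the trained network, I get different results on each training attempt, and they are nowhere near 1, e.g.:

Then on another attempt:

I realize the training set may be too small (I only have about 100 input/output pairs so far), but shouldn't the training data at least trigger the right output value? The same code works fine for the "getting started" XOR function described on the FANN website (I've already used up my 1 link limit)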

2009-09-02T18:00:55.373

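0 votes

5 answers

3859 views

image-processing - Writing an image processing application for analyzing satellite images

I have to start on an application that analyzes satellite images to identify certain man-made structures. I'd like to use C or Java for this.

For the satellite imagery, I plan to use Google Maps data.

I have three questions here:

What is the best source of GIS data besides Google Maps/Earth?
Which language is best for writing such an application, considering that I will have to use third-party APIs?
Is there an open image processing engine that can recognize man-made structures?

That's a lot of questions, but I hope the smart people here can help me.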

image-processing gis training-data satellite-image

2009-10-23T16:05:19.687

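0 votes

5 answers

900 views

neural-network - Finding weather data for training a neural network

I'm looking for some downloadable weather data that I can use to train a neural network for forecasting. Where can I find some? Basically temperature, humidity, wind speed/direction, and anything else that might help a neural network make simple predictions.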

neural-network forecasting training-data

2009-11-02T07:22:44.657

0 votes

2 answers

376 views

machine-learning - General frameworks for preparing training data?

As a student of computational linguistics, I frequently do machine learning experiments where I have to prepare training data from all kinds of different resources like raw or annotated text corpora or syntactic tree banks. For every new task and every new experiment I write programs (normally in Python and sometimes Java) to extract the features and values I need and transform the data from one format to the other. This usually results in a very large number of very large files and a very large number of small programs which process them in order to get the input for some machine learning framework (like the arff files for Weka).

One needs to be extremely well organised to deal with that, and program with great care so as not to miss any important peculiarities, exceptions or errors in the tons of data. Many principles of good software design, like design patterns or refactoring paradigms, are of little use for these tasks, because things like security, maintainability or sustainability are of no real importance: once the program has successfully processed the data, one doesn't need it any longer. This has gone so far that I have even stopped bothering to use classes or functions at all in my Python code, and program in a simple procedural way. The next experiment will require different data sets with unique characteristics and in a different format, so their preparation will likely have to be programmed from scratch anyway. My experience so far is that it's not unusual to spend 80-90% of a project's time on preparing training data. Hours and days go by just thinking about how to get from one data format to another. At times, this can become quite frustrating.

Well, you probably guessed that I'm exaggerating a bit, on purpose even, but I'm positive you understand what I'm trying to say. My question, actually, is this:

Are there any general frameworks, architectures, best practices for approaching these tasks? How much of the code I write can I expect to be reusable given optimal design?
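
To make the Weka target format from the question concrete, here is a minimal sketch of a small, reusable ARFF writer in Python, the kind of piece that could be shared across experiments. The function name `to_arff`, the attribute names, and the sample rows are all hypothetical. It assumes values need no quoting or escaping (no spaces, commas, or missing values), which real corpora will often violate.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes is a list of (name, kind) pairs, where kind is either a
    type string such as "numeric" or a list of nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            lines.append(f"@attribute {name} {kind}")
        else:
            # Nominal attribute: list the allowed values in braces.
            lines.append(f"@attribute {name} {{{','.join(kind)}}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical features extracted from a tagged corpus.
attributes = [("token_length", "numeric"), ("pos", ["NOUN", "VERB"])]
rows = [[3, "NOUN"], [5, "VERB"]]
print(to_arff("tokens", attributes, rows))
```

Keeping the output step separate from feature extraction like this means only the extraction code has to change when the next experiment brings a new corpus format.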

machine-learning nlp code-reuse training-data

2010-01-14T17:11:03.077

0 votes

3 answers

426 views