9

I am trying to load the MNIST Original dataset in Python. The sklearn.datasets.fetch_openml function doesn't seem to work for this.

This is the code I am using:

from sklearn.datasets import fetch_openml
dataset = fetch_openml("MNIST Original") 

I get this error:

  File "generateClassifier.py", line 11, in <module>
    dataset = fetch_openml("MNIST Original")
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 526, in fetch_openml
    data_info = _get_data_info_by_name(name, version, data_home)
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 302, in _get_data_info_by_name
    data_home)
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 169, in _get_json_content_from_openml_api
    raise error
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 164, in _get_json_content_from_openml_api
    return _load_json()
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 52, in wrapper
    return f()
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 160, in _load_json
    with closing(_open_openml_url(url, data_home)) as response:
  File "/home/inglorion/.local/lib/python3.5/site-packages/sklearn/datasets/openml.py", line 109, in _open_openml_url
    with closing(urlopen(req)) as fsrc:
  File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

How can I fix this? Or is there another way to load the MNIST dataset into Python?

I am using scikit-learn version 0.20.2.

I am fairly new to programming in general, so I would appreciate a simple answer. Thanks!

5 Answers

17

Try:

mnist = fetch_openml('mnist_784')

I found it at https://www.openml.org/d/554 by searching https://www.openml.org/.
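
If it is unclear which dataset name the server accepts, one option is to try several candidates and keep the first that loads. The sketch below is only an illustration: `fetch_first_available` is a hypothetical helper (not part of scikit-learn), and it assumes a rejected name surfaces as `urllib.error.HTTPError`, as in the traceback in the question. Other scikit-learn versions may raise a different exception type.

```python
from urllib.error import HTTPError


def fetch_first_available(fetch, names):
    """Return (name, dataset) for the first name that `fetch` can load.

    `fetch` is any callable that takes a dataset name, for example
    sklearn.datasets.fetch_openml. Hypothetical helper, not a sklearn API.
    """
    errors = {}
    for name in names:
        try:
            return name, fetch(name)
        except HTTPError as exc:
            # Record the HTTP status code so the final error is informative.
            errors[name] = exc.code
    raise LookupError("no dataset could be fetched: {}".format(errors))
```

With scikit-learn installed and network access, it could be called as `fetch_first_available(fetch_openml, ["MNIST Original", "mnist_784"])`.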

Answered 2019-03-05T15:40:28.930
3

You can use:

from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
Answered 2020-03-22T12:21:03.663
1

fetch_mldata has been deprecated since scikit-learn v0.20.

Check your sklearn version:

import sklearn
print(sklearn.__version__)

Import the dataset:

from sklearn.datasets import fetch_openml
X, y = fetch_openml('mnist_784', version=1, return_X_y=True)

Example
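
A minimal sketch of working with the returned `X, y`, under two assumptions: that `mnist_784` returns its labels as strings such as `'5'`, and that the first 60,000 rows are conventionally treated as the training set. The helper names `to_int_labels`, `split_train_test` and `load_mnist` are made up for this illustration.

```python
def to_int_labels(labels):
    """Convert string labels ('0'..'9') to Python ints."""
    return [int(label) for label in labels]


def split_train_test(X, y, n_train=60000):
    """Split by position: the first n_train rows train, the rest test."""
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]


def load_mnist():
    """Download mnist_784 (needs scikit-learn >= 0.20 and network access)."""
    # Imported lazily so the helpers above work without scikit-learn.
    from sklearn.datasets import fetch_openml
    X, y = fetch_openml('mnist_784', version=1, return_X_y=True)
    return split_train_test(X, to_int_labels(y))
```

Calling `X_train, X_test, y_train, y_test = load_mnist()` triggers the download. The two helpers only slice and convert, so they can be tried on small lists first.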

Answered 2020-02-28T10:41:08.687
1

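The older fetch_mldata() downloaded datasets from mldata.org, which was unstable and often unreachable; fetch_openml() downloads from openml.org instead. Another option is to download the dataset manually. You can download the data from Kaggle (mnist data) and run the following code: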

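from scipy.io import loadmat

# Path to the .mat file downloaded from Kaggle; adjust it to your location
mnist = loadmat("../input/mnist-original.mat")
mnist_data = mnist["data"].T     # transpose so that each row is one image
mnist_label = mnist["label"][0]  # take the single row of the label array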
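
After the transpose, each row of `mnist_data` should be one flattened image. Assuming the usual MNIST layout of 28x28 pixels (784 values per row), a row can be turned back into a grid as sketched below; `row_to_image` is a hypothetical helper written in plain Python so it runs without NumPy.

```python
def row_to_image(row, width=28):
    """Reshape a flat sequence of pixels into a list of rows of `width` pixels."""
    if len(row) % width != 0:
        raise ValueError("row length {} is not a multiple of {}".format(len(row), width))
    # Cut the flat sequence into consecutive chunks of `width` values.
    return [list(row[i:i + width]) for i in range(0, len(row), width)]
```

For example, `row_to_image(mnist_data[0])` would give 28 lists of 28 pixel values. On a NumPy array, `mnist_data[0].reshape(28, 28)` does the same job more directly.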
Answered 2021-05-23T05:33:55.780
0

I faced a similar issue. Updating sklearn worked for me.

I just ran the following command:

conda update scikit-learn

Then, to verify the version, you can do something like this:

import nltk
import sklearn

print('nltk version: {}.'.format(nltk.__version__))
print('scikit-learn version: {}.'.format(sklearn.__version__))

Don't forget to restart the kernel after updating sklearn.
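
The version string can also be checked in code rather than by eye. The sketch below assumes that `fetch_openml` first appeared in scikit-learn 0.20, the same release that deprecated `fetch_mldata`. `version_tuple` and `has_fetch_openml` are hypothetical helper names, not part of scikit-learn.

```python
import re


def version_tuple(version):
    """Parse the leading numeric parts of a version: '0.20.2' -> (0, 20, 2)."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first non-numeric part, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def has_fetch_openml(version):
    """Assumption: fetch_openml is available from scikit-learn 0.20 onwards."""
    return version_tuple(version) >= (0, 20)
```

After `import sklearn`, the check would be `has_fetch_openml(sklearn.__version__)`.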

Answered 2021-04-05T01:05:59.253