
I have several JSON files in a local repository. When I run the following command:

from datasets import load_dataset
dataset = load_dataset('json', data_files=['./100009.json'])

I get the following error:

OSError: [Errno 36] File name too long: '/home/infinity/.cache/huggingface/datasets/_home_infinity_.cache_huggingface_datasets_json_default-80a93068b3a4a494_0.0.0_83d5b3a2f62630efc6b5315f00f20209b4ad91a00ac586597caee3a4da0bef02.lock'

Maybe this is obvious, but I don't know how to fix it. Can you help me?

Edit

Here is the content of the JSON file:

{
    "id": "68af48116a252820a1e103727003d1087cb21a32",
    "article": [
        "by mark duell .",
        "published : .",
        "05:58 est , 10 september 2012 .",
        "| .",
        "updated : .",
        "07:38 est , 10 september 2012 .",
        "a pet owner starved her two dogs so badly that one was forced to eat part of his mother 's dead body in a desperate attempt to survive .",
        "the mother died a ` horrendous ' death and both were in a terrible state when found after two weeks of starvation earlier this year at the home of katrina plumridge , 31 , in grimsby , lincolnshire .",
        "the barely-alive dog was ` shockingly thin ' and the house had a ` nauseating and overpowering ' stench , grimsby magistrates court heard .",
        "warning : graphic content .",
        "horrendous : the male dog , scrappy -lrb- right -rrb- , was so badly emaciated that he ate the body of his mother ronnie -lrb- centre -rrb- to try to survive at the home of katrina plumridge in grimsby , lincolnshire .",
        "the suffering was so serious that the female staffordshire bull terrier , named ronnie , died of starvation , nigel burn , prosecuting , told the court last friday .",
        "suspended jail term : the dogs were in a terrible state when found after two weeks of starvation at the home of katrina plumridge , 31 -lrb- pictured -rrb- .",
        "the male dog , her son scrappy , was so badly emaciated that he ate her body to try to survive .",
        "` the degree of suffering caused to both dogs was extreme and prolonged , ' mr burn said . ` it was as severe and extreme as it can get . '",
        "the alarm was raised when a letting agent visited her home and saw dog mess on the steps , stairs , an upstairs floor and a bed .",
        "a painfully thin dog jumped past him . he said its ribs , spine and hip bones could all be seen and it was the thinnest dog he had ever witnessed .",
        "he tried to go into the kitchen but it was blocked from the inside by the dead body of the mother dog . the letting agent then called the royal society for the prevention of cruelty to animals .",
        "mr burn said : ` every single bone in its frame was visible and the stomach was curved in . the empty dog bowls were bone dry . '",
        "a decorator who went into the house said the stench made him feel physically sick , ronnie was like a skeleton and scrappy was ` shockingly thin ' .",
        "a veterinary surgeon estimated that the dogs would have been suffering from starvation for at least two weeks .",
        "plumridge moved out of the house on march 28 but the dogs were n't found until april 19 . she had claimed a friend was supposed to be finding new homes for the dogs and left them without going back to check on them .",
    ],
    "abstract": [
        "neglect by katrina plumridge saw staffordshire bull terrier ronnie die .",
        "dog 's son scrappy was forced to eat her to survive at grimsby house .",
        "alarm raised by letting agent shocked by ` thinnest dog he 'd ever seen '",
    ]
}

2 Answers


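This looks like a bug in the huggingface library. It is trying to read or write a file whose name is too long for the underlying file system (probably ext4 in the case of Ubuntu). I opened an issue here.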
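To check whether a lock file name like the one in the error really exceeds what the file system accepts, you can compare the name's byte length with the per-directory limit the OS reports. This is a minimal sketch, assuming a Unix-like system (`os.pathconf` is not available on Windows); `check_filename` is a hypothetical helper for illustration, not part of `datasets`.

```python
import os


def check_filename(path, directory):
    """Compare the last component of `path` with the name limit of `directory`.

    `directory` must be an existing directory on the file system of interest.
    Returns (name_length_in_bytes, limit, fits).
    """
    name = os.path.basename(path)
    # File systems limit the encoded byte length, not the character count.
    length = len(os.fsencode(name))
    # PC_NAME_MAX is the longest file name allowed inside `directory`.
    limit = os.pathconf(directory, "PC_NAME_MAX")
    return length, limit, length <= limit
```

For example, calling `check_filename(lock_path, os.path.expanduser("~/.cache/huggingface/datasets"))` with the path from the traceback shows how far over the limit the name is. If it is over, one thing that may be worth trying (untested here) is passing a shorter `cache_dir` to `load_dataset`, since the lock name appears to embed the cache path.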

Answered 2021-09-15T18:18:48.027

When working with large datasets, using a pandas DataFrame is appropriate.

import pandas as pd
df = pd.read_json(r'Path where you saved the JSON file\File Name.json')
print(df)
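One caveat for the file shown in the question: it holds a single record whose "article" and "abstract" lists have different lengths, so it may not map cleanly onto table columns as it stands. A rough sketch of one way around that, using only the standard `json` module and assuming the file is valid JSON with the keys shown in the question; `load_record` is a hypothetical helper name.

```python
import json


def load_record(path):
    """Read one JSON file and flatten its sentence lists into single strings."""
    with open(path, encoding="utf-8") as f:
        record = json.load(f)
    return {
        "id": record["id"],
        # Join the list of sentences into one text field per record.
        "article": " ".join(record["article"]),
        "abstract": " ".join(record["abstract"]),
    }
```

A list of such records, one per file, can then be passed to `pd.DataFrame(...)` to get one row per document.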
Answered 2021-05-26T12:06:35.427