Questions tagged [huggingface-datasets]

63 questions

0 votes

0 answers

9 views

csv - Huggingface datasets, how to avoid the string NA from being interpreted as nan when using load_dataset

I have looked at this at several points in the process. I create a CSV file with pandas beforehand, and some cells contain the string NA, which is meant to be exactly that: a string with no numerical meaning. I think the appropriate thing is to keep the CSV file as it is.

So if I do have a CSV file with NA, how can I make load_dataset treat it as the string NA rather than the Python None that I am seeing? My code downstream breaks because of a few unexpected None values.
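The behaviour described above can be reproduced with pandas alone, which the datasets CSV loader builds on. The sketch below uses a hypothetical two-row CSV and the pandas `keep_default_na` option. It is an assumption, worth checking against the installed datasets version, that the same keyword can be passed through as `load_dataset("csv", data_files=..., keep_default_na=False)`.

```python
import io

import pandas as pd

# Hypothetical two-row CSV; "NA" in the second column is a real string value.
CSV_TEXT = "name,code\nNamibia,NA\nUnited States,US\n"

# Default parsing: "NA" is on pandas' default list of missing-value markers,
# so the first row's code becomes NaN.
default_df = pd.read_csv(io.StringIO(CSV_TEXT))

# keep_default_na=False switches that default list off, so "NA" stays a string.
literal_df = pd.read_csv(io.StringIO(CSV_TEXT), keep_default_na=False)

print(default_df["code"].isna().tolist())  # [True, False]
print(literal_df["code"].tolist())         # ['NA', 'US']
```

With the default list switched off, genuinely empty cells are also kept as empty strings, so any real missing-value markers would need to be listed explicitly through `na_values`.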

csv na huggingface-datasets

2022-02-22T19:48:36.507

0 votes

0 answers

15 views

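python - Is there a way to select the device when running a Python script on Google Colab?

I am trying to run run_language_modeling.py, a Python script from Hugging Face. However, when I run it, I notice that only my CPU is being used and not the GPU (even though the environment is set to use it). So I am looking for a way to tell the script to use the graphics card.

Here is what I have...

To verify that I am using a GPU: !nvidia-smi

Which shows:

Then I run the following script, which calls the .py file:

This keeps going until CPU usage climbs to 100%. I thought there might be something like --device, but I have not been able to find it. Some other posts I have seen online mention that I can do:

to select the GPU I want, but as far as I can tell it does not actually do anything. I have also tried:

Any suggestions?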
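One common way to control which GPU a child process sees is the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below is not the code from the post; the helper names `gpu_env` and `list_gpus` are hypothetical. It only restricts which devices are visible. It cannot make a script use the GPU if the installed framework build has no CUDA support or the script never moves its model to the device.

```python
import os
import shutil
import subprocess


def gpu_env(device_index=0):
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env


def list_gpus():
    """Return the lines printed by `nvidia-smi -L`, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    print("GPUs reported by the driver:", list_gpus())
    print("CUDA_VISIBLE_DEVICES for the child:", gpu_env(0)["CUDA_VISIBLE_DEVICES"])
    # A training script would then be launched with this environment, e.g.
    # subprocess.run([sys.executable, "run_language_modeling.py", ...], env=gpu_env(0))
    # where "..." stands for the script's own arguments, which are not shown here.
```

If `list_gpus()` returns entries but the script still runs on the CPU, the next thing to check is usually whether the installed deep-learning framework itself reports CUDA as available.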

python google-colaboratory training-data huggingface-transformers huggingface-datasets

2022-02-25T14:20:00.080

0 votes

0 answers

6 views

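python - How to disable seqeval label formatting for POS tagging

I am trying to use the huggingface implementation of the seqeval metric to evaluate my POS tagger, but since my tags are not made for NER, they are not formatted the way the library expects. As a result, when I read the classification report, the labels of the per-class results are always missing their first character (or their last character if I pass suffix=True).

Is there a way to disable entity recognition in the tags, or do I have to pass all my tags with a leading space to work around this? (Given that the library is supposed to be suitable for POS tagging, I would expect a built-in solution.)

SSCCE:

Output: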

	precision	recall	f1-score	support
DV	1.00	1.00	1.00	2
ER:压力	1.00	1.00	1.00	1
NT	1.00	1.00	1.00	1
RO	1.00	1.00	1.00	1
RP	1.00	1.00	1.00	1
micro avg	1.00	1.00	1.00	6
macro avg	1.00	1.00	1.00	6
weighted avg	1.00	1.00	1.00	6
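One possible workaround, in the spirit of the leading-character idea in the question, is to add a chunk prefix to every tag before scoring. This is a minimal sketch with a hypothetical helper name and hypothetical tags. It assumes, from the truncated labels in the report, that seqeval consumes the leading chunk marker of each tag, so that a tag such as `B-ADV` would be reported as `ADV`.

```python
def add_chunk_prefix(tag_sequences, prefix="B-"):
    """Prefix every tag so that a chunk-style parser strips the prefix, not the tag."""
    return [[prefix + tag for tag in tags] for tags in tag_sequences]


# Hypothetical gold and predicted POS sequences for one sentence.
gold = [["ADV", "NOUN", "VERB"]]
pred = [["ADV", "NOUN", "NOUN"]]

gold_prefixed = add_chunk_prefix(gold)
pred_prefixed = add_chunk_prefix(pred)

print(gold_prefixed)  # [['B-ADV', 'B-NOUN', 'B-VERB']]
print(pred_prefixed)  # [['B-ADV', 'B-NOUN', 'B-NOUN']]
```

The prefixed sequences would then be passed to the metric in place of the raw ones. Because every token starts its own chunk, the entity-level scores should coincide with token-level scores, provided the tagset contains no tag that the library treats specially, such as `O`.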

python nlp pos-tagger huggingface-datasets

2022-03-02T18:43:54.730
