Questions tagged [jsonlines]

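121 questions

0 votes

1 answer

125 views

python - How to extract elements from a JSONL file with varying elements?

I want to extract the "text" from the tokens in a JSONL file. If a label is present, I want to extract it as well. If it is not present, I want to insert "O" as the label value.

The code that extracts the text and id from the tokens when no label is present is as follows (thanks to @DeveshKumarSingh in my previous question):

Expected output: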

python jsonlines

2019-05-26T15:50:37.720
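
For the token-extraction question above, a minimal sketch of the default-label idea. It assumes each line is a JSON object with a "tokens" list whose entries carry "text" and an optional "label"; those field names are illustrative, since the real record layout is not shown in the excerpt.

```python
import json

def extract_tokens(lines, default_label="O"):
    """Yield (text, label) pairs; tokens without a label get default_label.

    Assumes records shaped like {"tokens": [{"text": ..., "label": ...}]}.
    These field names are hypothetical, not taken from the question.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        for token in record.get("tokens", []):
            # dict.get supplies the fallback when "label" is absent
            yield token["text"], token.get("label", default_label)

sample = [
    '{"tokens": [{"text": "Paris", "label": "LOC"}, {"text": "is"}]}',
    '{"tokens": [{"text": "nice"}]}',
]
pairs = list(extract_tokens(sample))
```

Any iterable of lines works as input, so an open file object can be passed in place of the sample list.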

0 votes

1 answer

97 views

scrapy - JsonLinesItemExporter outputs an array in each field

I'm using JsonLinesItemExporter to export some data and instead of

scrapy is writing the following to file:

(From debugging) it seems I'm passing a correct value (not a list), and that both item.add_value and item.replace_value replace my strings with single-element lists of strings.

Is this configurable? If not, how can I get different behaviour? Should I extend JsonLinesItemExporter, or is there a better approach?

scrapy jsonlines

2019-06-20T13:19:07.627

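0 votes

3 answers

17423 views

json - Filtering null and/or empty values with jq

I have a file with jsonlines and would like to find the empty values.

I would like to output the empty and/or null values together with their keys:

I think it should be something like cat myexample | jq '. | select(. == "")', but that does not work.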
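
The same check can be sketched in Python rather than jq (a swapped-in technique, not a jq answer). This assumes each line is a flat JSON object and treats both null and the empty string as "empty"; the function name is illustrative.

```python
import json

def empty_fields(lines):
    """Yield, per line, a dict of only the fields whose value is null or ""."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # JSON null becomes None in Python; 0 and false are deliberately not matched
        hits = {k: v for k, v in record.items() if v is None or v == ""}
        if hits:
            yield hits

sample = ['{"a": "", "b": 1, "c": null}', '{"a": "x", "b": 2, "c": "y"}']
found = list(empty_fields(sample))
```

Lines without any empty field produce no output, so the result lists only the offending keys.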

json null jq key-value jsonlines

2019-06-20T18:37:38.347

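0 votes

1 answer

4381 views

python-3.x - Creating JSONL with Python

I can't figure out how to create JSONL using Python 3.

I have tried using the indent option when dumping, but it doesn't seem to make a difference, and the separators option doesn't seem to fit this use case. Not sure what I'm missing here?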
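
JSON Lines is one JSON document per line, so indent is not needed here (it would spread a record over several lines). A minimal sketch, assuming the records are plain dicts; write_jsonl, read_jsonl and any file name passed to them are illustrative.

```python
import json

def write_jsonl(path, records):
    """Write each record as one compact JSON document per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            # json.dumps without indent never emits a raw newline inside a record
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")

def read_jsonl(path):
    """Read the file back, one record per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Writing line by line also means the whole dataset never has to be held as a single JSON array.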

python-3.x jsonlines

2019-07-17T08:12:11.427

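0 votes

0 answers

153 views

python - Writing NumPy arrays to a jsonlines file

I want to save numpy arrays to a jsonlines file. Using the code below:

But I get this error:

TypeError: Object of type 'ndarray' is not JSON serializable

I would like to know whether there is a way to save numpy arrays in jsonl format.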
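
json.dumps accepts a default callable for objects it cannot serialize, and NumPy arrays expose tolist(), which returns nested Python lists. A minimal sketch, assuming every unsupported object in a record offers tolist(); the helper names are illustrative, and the block itself does not import NumPy.

```python
import json

def to_jsonable(obj):
    """Fallback for json.dumps: convert array-like objects via tolist()."""
    if hasattr(obj, "tolist"):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def dump_line(record):
    """Serialize one record as a single JSON Lines row."""
    return json.dumps(record, default=to_jsonable) + "\n"

# Usage, assuming NumPy is installed:
#   import numpy as np
#   line = dump_line({"id": 1, "vec": np.arange(3)})
```

Reading the file back yields plain lists, so the arrays would need to be rebuilt by the caller if array semantics are required.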

python numpy-ndarray jsonlines

2019-08-22T20:56:20.723

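0 votes

2 answers

132 views

python - Jsonlines file causes KeyError in Python

I have a json file that I am loading in order to filter by a key named "sender_id". I seem to be able to filter on any other key, but filtering on "sender_id" results in KeyError: 'sender_id'

My Python script is as follows:

A sample of my jsonlines file is as follows: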

python json python-3.x python-2.7 jsonlines

2019-08-28T07:47:32.647
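
For the KeyError question above: one common cause is that at least one line lacks the key at the top level. That is a guess, since the sample file is not shown here. A minimal sketch that tolerates missing keys with dict.get, using illustrative names:

```python
import json

def filter_by_sender(lines, sender_id):
    """Return records whose top-level "sender_id" equals sender_id.

    Records without the key are skipped instead of raising KeyError.
    """
    matches = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # .get returns None for a missing key, so the comparison simply fails
        if record.get("sender_id") == sender_id:
            matches.append(record)
    return matches

sample = [
    '{"sender_id": "a1", "status": "sent"}',
    '{"status": "failed"}',
]
result = filter_by_sender(sample, "a1")
```

If the key turns out to be nested under another object, the lookup would have to follow that path instead.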

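0 votes

2 answers

635 views

python - Writing filtered json values to csv

I am looping over a json lines file, and I just filter on sender ID and status and output them to the terminal. There are multiple sender IDs in a list, while sender is just a string. I would like to be able to write the output to a csv file with STATUS as the first column and SENDER_ID as the second. I tried this at the top of my script, but I'm not sure whether it is the right approach.

My script is below. At this point I need to write it to csv. I have read the documentation but am still a little unsure.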

python json python-2.7 csv jsonlines

2019-08-28T12:45:41.607
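
For the CSV question above, a minimal Python 3 sketch using the standard csv module. It assumes each record has top-level "status" and "sender_id" string fields; those names, and the function name, are assumptions rather than details from the question.

```python
import csv
import io
import json

def jsonl_to_csv(lines, sender_ids, out):
    """Write STATUS,SENDER_ID rows for records whose sender_id is in sender_ids."""
    writer = csv.writer(out)
    writer.writerow(["STATUS", "SENDER_ID"])  # header row
    wanted = set(sender_ids)
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("sender_id") in wanted:
            writer.writerow([record.get("status", ""), record["sender_id"]])

buffer = io.StringIO()
jsonl_to_csv(
    ['{"sender_id": "a1", "status": "sent"}',
     '{"sender_id": "b2", "status": "failed"}'],
    ["a1"],
    buffer,
)
```

To write to disk instead of an in-memory buffer, pass a file opened with open(path, "w", newline=""), as the csv module documentation recommends.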

0 votes

1 answer

2230 views

json - Google Apps Script - How to stream JSON data into BigQuery?

In this reference https://developers.google.com/apps-script/advanced/bigquery,

to load CSV data into BigQuery, they use:

As I understand it, they send BigQuery a blob, file.getBlob().setContentType('application/octet-stream');, which is not friendly

How can I send JSON to BigQuery from Apps Script?

With the library @google-cloud/bigquery (used in a project outside Apps Script), I can do something like this:

https://cloud.google.com/bigquery/streaming-data-into-bigquery#streaminginsertexamples

json google-apps-script google-bigquery google-apps-script-addon jsonlines

2019-09-20T09:46:37.327

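0 votes

1 answer

221 views

python - If statement based on a value existing in a jsonlines file

My code pulls 400+ PDFs off a website via Beautiful Soup. PyPDF2 converts the PDFs to text, which is then saved as a jsonlines file named "output.jsonl".

When I save new PDFs in future updates, I want PyPDF to convert only the new PDFs to text and append that new text to the jsonlines file, which is where I am struggling.

The jsonlines file looks like this:

The PDFs are named "1234", "1235", etc., and are saved in file_path_PDFs. I'm trying to identify whether the "id" is already a value in the jsonlines file, in which case PyPDF2 does not need to convert it to text. If it is not there, it should be processed as usual.

As it stands, I believe this code finds no values and converts all the text every time I run it. Obviously, that is a fairly lengthy process, with each document spanning 200 or 300 pages.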
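
One way to skip already-converted documents is to load the stored ids into a set first and append only the missing ones. A minimal sketch, assuming each stored line is an object with "id" and "text" keys (the "text" key is a guess) and that convert is a caller-supplied, hypothetical function wrapping the PDF-to-text step:

```python
import json
import os

def load_seen_ids(jsonl_path):
    """Collect the "id" of every record already stored in the JSONL file."""
    seen = set()
    if os.path.exists(jsonl_path):
        with open(jsonl_path, encoding="utf-8") as fh:
            for line in fh:
                if line.strip():
                    seen.add(str(json.loads(line)["id"]))
    return seen

def append_new(jsonl_path, pdf_ids, convert):
    """Convert only unseen ids and append them; return the ids that were added.

    convert maps an id to its extracted text (hypothetical helper).
    """
    seen = load_seen_ids(jsonl_path)
    added = []
    with open(jsonl_path, "a", encoding="utf-8") as fh:  # append mode keeps old lines
        for pdf_id in pdf_ids:
            key = str(pdf_id)
            if key in seen:
                continue  # already converted on an earlier run
            fh.write(json.dumps({"id": key, "text": convert(pdf_id)}) + "\n")
            seen.add(key)
            added.append(key)
    return added
```

Comparing ids as strings avoids a silent mismatch between, say, the integer 1234 stored in the file and the file name "1234".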

python jsonlines

2019-09-26T21:29:33.993

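0 votes

1 answer

62 views

json - Get all values of a specific key based on another key's specific value

I have a file in jsonlines format with more than 1 million lines (say, BIG.json). I want to filter this file based on some key/value dependencies (explained below).

Of course, every line has the same structure; here are 5 lines of this file:

The file is the result of parsing multiple XML files and extracting data from them.

I want to filter some lines based on the "person" key value and put them in another jsonlines file, preferably named after the "person" key value. For example, a file named "Senator Andrzej Szczypiorski.json" should contain every line of BIG.json that has exactly the value "Senator Andrzej Szczypiorski" under the "person" key.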
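
The split can be sketched in Python instead of jq (a swapped-in technique, not a jq answer). This assumes every "person" value is usable as a file name and that the number of distinct values is small enough to keep one open handle per value; split_by_key and out_dir are illustrative names.

```python
import json
import os

def split_by_key(lines, out_dir, key="person"):
    """Append each line to <out_dir>/<value of key>.json; return the values seen."""
    handles = {}
    try:
        for line in lines:
            if not line.strip():
                continue
            value = json.loads(line)[key]
            if value not in handles:
                path = os.path.join(out_dir, f"{value}.json")
                handles[value] = open(path, "a", encoding="utf-8")
            # copy the original line unchanged, normalizing the trailing newline
            handles[value].write(line.rstrip("\n") + "\n")
    finally:
        for fh in handles.values():
            fh.close()
    return sorted(handles)
```

Passing an open file object as lines streams the input one line at a time, so a million-line file does not need to fit in memory.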

json jq jsonlines

2019-10-01T09:27:28.810
