I have been cleaning up a table in OpenRefine. It currently looks like this:

REF                 Handle      Size        Price
2002, 2003          t-shirt1    M, L        23
3001, 3002, 3003    t-shirt2    S, M, L     24

I need to split the multi-valued cells in both REF and Size so that I get:

REF                 Handle      Size        Price
2002                t-shirt1    M           23
2003                t-shirt1    L           23  
3001                t-shirt2    S           24  
3002                t-shirt2    M           24
3003                t-shirt2    L           24

Is it possible to do this in OpenRefine? The "Split multi-valued cells..." command only handles one column at a time. Thank you, Annarita

2 Answers

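Yes, it is possible:

  • Split the first column using ", " (comma followed by a space) as the separator.
  • Move column 2 to the first position (index 0).
  • Display your project as records (not rows).
  • Split column 3 using ", " as the separator.
  • Fill down columns 4 and 2.
  • Reorder the columns.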
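
To see what result these steps are aiming for, here is a minimal Python sketch that produces the same table outside OpenRefine. It does not show how OpenRefine works internally; the function name `split_paired` and the dict-based rows are assumptions made for illustration. It pairs the i-th REF with the i-th Size and copies the remaining columns onto every new row.

```python
from itertools import zip_longest


def split_paired(rows, columns, separator=", "):
    """Expand each row so the i-th values of the multi-valued columns line up."""
    result = []
    for row in rows:
        # Split every multi-valued column of this row into a list of values.
        parts = [row[col].split(separator) for col in columns]
        # zip_longest pads with "" if one column has fewer values than another.
        for values in zip_longest(*parts, fillvalue=""):
            new_row = dict(row)                   # copies Handle, Price, ...
            new_row.update(zip(columns, values))  # overwrites the split columns
            result.append(new_row)
    return result


rows = [
    {"REF": "2002, 2003", "Handle": "t-shirt1", "Size": "M, L", "Price": "23"},
    {"REF": "3001, 3002, 3003", "Handle": "t-shirt2", "Size": "S, M, L", "Price": "24"},
]

for r in split_paired(rows, ["REF", "Size"]):
    print(r["REF"], r["Handle"], r["Size"], r["Price"])
```

The sample rows are the two rows from the question, so the printed lines can be compared with the desired table there.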

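Here is my recipe, exported as an OpenRefine operation history (JSON). The first operation, which removes starred rows, is not part of the steps above and can be dropped: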

[
  {
    "op": "core/row-removal",
    "description": "Remove rows",
    "engineConfig": {
      "facets": [
        {
          "invert": false,
          "expression": "row.starred",
          "selectError": false,
          "omitError": false,
          "selectBlank": false,
          "name": "Starred Rows",
          "omitBlank": false,
          "columnName": "",
          "type": "list",
          "selection": [
            {
              "v": {
                "v": true,
                "l": "true"
              }
            }
          ]
        }
      ],
      "mode": "row-based"
    }
  },
  {
    "op": "core/multivalued-cell-split",
    "description": "Split multi-valued cells in column Column 1",
    "columnName": "Column 1",
    "keyColumnName": "Column 1",
    "separator": ", ",
    "mode": "plain"
  },
  {
    "op": "core/column-move",
    "description": "Move column Column 2 to position 0",
    "columnName": "Column 2",
    "index": 0
  },
  {
    "op": "core/multivalued-cell-split",
    "description": "Split multi-valued cells in column Column 3",
    "columnName": "Column 3",
    "keyColumnName": "Column 2",
    "separator": ", ",
    "mode": "plain"
  },
  {
    "op": "core/fill-down",
    "description": "Fill down cells in column Column 4",
    "engineConfig": {
      "facets": [],
      "mode": "record-based"
    },
    "columnName": "Column 4"
  },
  {
    "op": "core/fill-down",
    "description": "Fill down cells in column Column 2",
    "engineConfig": {
      "facets": [],
      "mode": "record-based"
    },
    "columnName": "Column 2"
  },
  {
    "op": "core/column-reorder",
    "description": "Reorder columns",
    "columnNames": [
      "Column 1",
      "Column 2",
      "Column 3",
      "Column 4"
    ]
  }
]
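
The recipe uses generic names ("Column 1" to "Column 4"), while the table in the question uses REF, Handle, Size and Price, so the names have to match the project before the JSON is pasted into Undo / Redo > Apply.... A small Python sketch of that renaming follows. The mapping (Column 1 = REF, and so on) is an assumption based on the listed steps, and `rename_columns` is a hypothetical helper, not part of OpenRefine.

```python
import json

# Assumed mapping from the recipe's generic names to the question's headers.
MAPPING = {
    "Column 1": "REF",
    "Column 2": "Handle",
    "Column 3": "Size",
    "Column 4": "Price",
}


def rename_columns(node, mapping):
    """Recursively replace column names in a parsed operation history."""
    if isinstance(node, dict):
        return {key: rename_columns(value, mapping) for key, value in node.items()}
    if isinstance(node, list):
        return [rename_columns(item, mapping) for item in node]
    if isinstance(node, str):
        # Plain substring replacement: updates columnName, keyColumnName and
        # columnNames entries as well as the description text.
        for old, new in mapping.items():
            node = node.replace(old, new)
        return node
    return node  # numbers, booleans and null are left unchanged


# One operation copied from the recipe, used here as sample input.
recipe = json.loads("""
[
  {
    "op": "core/multivalued-cell-split",
    "description": "Split multi-valued cells in column Column 1",
    "columnName": "Column 1",
    "keyColumnName": "Column 1",
    "separator": ", ",
    "mode": "plain"
  }
]
""")

print(json.dumps(rename_columns(recipe, MAPPING), indent=2))
```

Because the replacement is a simple substring match, it would need tightening for projects with ten or more generic columns, where "Column 1" is also a prefix of "Column 10".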

Hervé

answered 2015-09-28T22:08:23.047

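I just found a nice free OpenRefine plugin that provides an "unpaired pivot": the VIB-Bits plugin.

From their documentation:

3.2.1 Unpaired pivot... An unpaired pivot transforms data that is organized in rows into data that is represented in separate columns. A simple example is transforming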

Category    Value
a           1
a           2
b           3
c           2

into

Value a     Value b     Value c
1           3           2
2
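
The transformation in that excerpt can be sketched in plain Python to make the idea concrete. This only illustrates the result shown in the example and is not the plugin's implementation; `unpaired_pivot` and the "Value <category>" column naming are assumptions.

```python
from itertools import zip_longest


def unpaired_pivot(rows):
    """Group values by category, then lay the groups out side by side as columns."""
    groups = {}
    for category, value in rows:
        groups.setdefault(category, []).append(value)  # keeps first-seen order
    header = [f"Value {category}" for category in groups]
    # Columns can have different lengths, so short ones are padded with None.
    body = [list(line) for line in zip_longest(*groups.values(), fillvalue=None)]
    return header, body


header, body = unpaired_pivot([("a", 1), ("a", 2), ("b", 3), ("c", 2)])
print(header)
for line in body:
    print(line)
```

The values in each column are independent of their neighbours in the same output row, which is what "unpaired" refers to here.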
answered 2021-04-05T21:14:47.610