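I have a JSON object that looks like this:

{
  "name":"bacon"
  "category":["food","meat","good"]
  "calories":"huge"
}

I'm trying to flatten it into a set of unique rows. I need to build a fact table for Tableau, which can't work directly with cross-tabulated data or JSON data.

I'm not picky about whether I do this in Python or Ruby, but so far I've been trying it in Ruby. I can easily parse the JSON and get a Ruby hash out of it, which seemed like the right first step.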

{"name"=>"bacon", "category"=>["food", "meat", "good"], "calories" => "huge"}

I need to produce this:

name,category,calories
bacon,food,huge
bacon,meat,huge
bacon,good,huge

So I think I need to walk through that hash and un-nest it. I've been experimenting with something like this:

def Flatten(inHash)
    inHash.each do |key,value|
        if value.kind_of?(Hash)
            Flatten(value)
        else
            puts "#{value}"
        end 
    end 
end

However, this just prints each value once; it doesn't repeat the earlier values for every category. So I get output that looks like

bacon
food
meat
good
huge

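Is there a built-in method, gem, or library that does this, or am I looking at building it from scratch? Any ideas on how to get the output I want? I speak both Ruby and Python, so if you have a Python answer, please share.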

3 Answers

>>> # Assuming your JSON data is correctly formatted, as follows
>>> data = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
>>> import json
>>> from collections import namedtuple
>>> from itertools import product
>>> # Let's call our JSON parser foo (I am bad with names)
>>> def foo(data):
    # First parse the JSON text into a Python object
    json_data = json.loads(data)
    # Now create a namedtuple with the given keys of the dictionary
    food_matrix = namedtuple('food_matrix', json_data.keys())
    # And create a tuple out of the values
    data_tuple = food_matrix(*json_data.values())
    # Now with itertools.product create a cross product
    data_matrix = list(product([data_tuple.name], data_tuple.category, [data_tuple.calories]))
    # Now display the heading
    print("{:15}{:15}{:15}".format("name", "category", "calories"))
    # Now display the values
    for e in data_matrix:
        print("{:15}{:15}{:15}".format(*e))


>>> #Now call it
>>> foo(data)
name           category       calories                  
bacon          food           huge           
bacon          meat           huge           
bacon          good           huge           
>>> 
answered 2013-03-06T15:30:27.513

This would be my solution:

require 'json'

# Given a json object
json = JSON.parse('{"name":"bacon", "category":["food","meat","good"], "calories":"huge"}')

# First, normalize all the values to arrays
hash = Hash[json.map{|k, v| [k, [v].flatten]}]

# We now have a hash like {"name" => ["bacon"], ...}

# Then we'll make the product of the first array of values 
# (in this case, ["bacon"]) with the other values
permutations = hash.values[0].product(*hash.values[1..-1])

# Now just need to output
puts hash.keys.join(",")
permutations.each{ |group| puts group.join(",") }
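
The question's Flatten attempt recurses into nested hashes, which the normalize-then-product steps above do not cover. As a rough Python sketch of the same idea extended to nested objects: the `flatten` and `rows` helpers, the dotted column naming, and the nested `nutrition` key are all made up for illustration and are not part of the original data.

```python
from itertools import product


def flatten(record, prefix=""):
    """Return {column_name: [values]}, flattening nested dicts to dotted names."""
    columns = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse into nested objects; the "a.b" naming is an assumption.
            columns.update(flatten(value, prefix=name + "."))
        elif isinstance(value, list):
            columns[name] = value
        else:
            # Promote scalars to one-element lists so every column is a list.
            columns[name] = [value]
    return columns


def rows(record):
    """Return (header, body), where body is the cross product of all columns."""
    columns = flatten(record)
    header = list(columns)
    body = [list(combo) for combo in product(*columns.values())]
    return header, body


# Hypothetical record: "nutrition" is nested here only to exercise the recursion.
record = {"name": "bacon",
          "category": ["food", "meat", "good"],
          "nutrition": {"calories": "huge"}}
header, body = rows(record)
print(",".join(header))
for row in body:
    print(",".join(row))
```

One thing to watch for with this approach: a key whose value is an empty list makes the product empty, so that record would produce no rows at all. Column order here relies on dicts preserving insertion order (Python 3.7+).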
answered 2013-03-06T19:01:06.280

Assuming your JSON has commas (making it valid JSON), you can use itertools.product to enumerate all the possible combinations:

import itertools as IT
import json

text = '{ "name":"bacon", "category":["food","meat","good"], "calories":"huge" }'
data = json.loads(text)

# Sort the keys in the order they appear in `text`
keys = sorted(data.keys(), key = lambda k: text.index(k))

# Promote the values to lists if they are not already lists
values = [data[k] if isinstance(data[k], list) else [data[k]] for k in keys]

print(','.join(keys))
for row in IT.product(*values):
    print(','.join(row))

yields

name,category,calories
bacon,food,huge
bacon,meat,huge
bacon,good,huge
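
Since the goal is a fact table file for Tableau, the same product idea could be combined with the standard csv module so that quoting is handled and several records land in one file. This is only a sketch under assumptions: the `explode` and `write_fact_table` helpers and the second ("kale") record are hypothetical, every record is assumed to share the keys of the first one, and the input list is assumed to be non-empty. It relies on dicts keeping insertion order (Python 3.7+), so no key sorting is needed.

```python
import csv
import io
import itertools
import json


def explode(record):
    """Yield one flat dict per combination of the record's list values."""
    values = [v if isinstance(v, list) else [v] for v in record.values()]
    for combo in itertools.product(*values):
        yield dict(zip(record.keys(), combo))


def write_fact_table(records, out):
    """Write every exploded row to `out` as CSV; columns come from the first record."""
    records = list(records)
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()))
    writer.writeheader()
    for record in records:
        writer.writerows(explode(record))


# The "kale" record is invented purely to show more than one input object.
text = ('[{"name":"bacon","category":["food","meat","good"],"calories":"huge"},'
        ' {"name":"kale","category":["food"],"calories":"tiny"}]')
buffer = io.StringIO()
write_fact_table(json.loads(text), buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")` as `out`.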
answered 2013-03-06T15:36:27.067