python - Python list class that indexes the attributes of its object elements

Question

I'm looking for the best way to create a list in Python that builds hashed indexes (dicts) for all attributes of the objects put into the list.

>>> foo = IndexingList([{ 'id': 1, 'name': 'cat' }, { 'id': 2, 'name': 'dog' }])
>>> foo[0]
{'id': 1, 'name': 'cat'}

>>> foo.findall('id', 2)
[{'id': 2, 'name': 'dog'}]

>>> foo += {'id': 3, 'name': 'dog'}
>>> foo.findall('name', 'dog')
[{'id': 2, 'name': 'dog'}, {'id': 3, 'name': 'dog'}]

I imagine the data structure of IndexingList would look like this:

{
    'items': [
        { 'id': 1, 'name': 'cat' },
        { 'id': 2, 'name': 'dog' },
        { 'id': 3, 'name': 'dog' }
    ],
    'indexes': {
        'id': {
            1: [{ 'id': 1, 'name': 'cat' }],
            2: [{ 'id': 2, 'name': 'dog' }]
        },
        'name': {
            'cat': [{ 'id': 1, 'name': 'cat' }],
            'dog': [
                { 'id': 2, 'name': 'dog' },
                { 'id': 3, 'name': 'dog' }
            ]
        }
    }
}

where the objects under the "indexes" node refer to the same objects as those in "items".
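
To make the "same objects" requirement concrete, here is a minimal sketch (the variable names are only illustrative) showing that an index entry can hold a reference to the list item rather than a copy:

```python
# Build the items list once, then point the index entries at the same objects.
items = [{'id': 1, 'name': 'cat'}, {'id': 2, 'name': 'dog'}]
indexes = {'name': {}}
for item in items:
    indexes['name'].setdefault(item['name'], []).append(item)

# The index entry is the very same object, not a copy.
assert indexes['name']['dog'][0] is items[1]

# A change made through one reference is visible through the other,
# but the entry is still filed under the old key 'dog' until it is re-indexed.
items[1]['name'] = 'wolf'
assert indexes['name']['dog'][0]['name'] == 'wolf'
```

The last two lines also show the cost of sharing references: mutating an indexed attribute in place leaves the index stale.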

I suppose attribute values that are themselves objects could be handled by using str(property) to obtain a unique key to store in "indexes".
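
As a rough sketch of that idea, assuming a hypothetical helper named index_key (not part of any library), hashable values could be used as keys directly and only unhashable ones converted with str():

```python
def index_key(value):
    """Return a hashable key for value (hypothetical helper)."""
    try:
        hash(value)
        return value
    except TypeError:
        # Unhashable values such as lists or dicts fall back to their string form.
        return str(value)

print(index_key('dog'))    # hashable, returned unchanged
print(index_key([1, 2]))   # unhashable, becomes the string '[1, 2]'
```

Note that str() does not guarantee uniqueness: the list [1, 2] and the string '[1, 2]' end up under the same key, and objects that rely on the default repr get a key that depends on their memory address.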

score 3 · Accepted Answer

This is actually quite easy to do with a few collections.defaultdict() instances, although if you use it a lot you may want to consider an actual database.

from collections import defaultdict
from functools import partial

class IndexingList:
    def __init__(self, items):
        self.items = []
        self.indices = defaultdict(partial(defaultdict, list))
        self.extend(items)

    def append(self, item):
        try:
            for index, value in item.items():
                self.indices[index][value].append(item)
        except AttributeError as e:
            raise ValueError("All children of an IndexingList must be "
                             "dict-like. '{0}' is not.".format(item)) from e
        self.items.append(item)

    def extend(self, iterable):
        for item in iterable:
            self.append(item)

    def __iadd__(self, other):
        self.extend(other)
        return self

    def __getitem__(self, item):
        return self.items[item]

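    def __setitem__(self, position, value):
        # Integer positions only (no slices). Keep the indices in sync:
        # unindex the item being replaced, then index the new one.
        self._unindex(self.items[position])
        for index, val in value.items():
            self.indices[index][val].append(value)
        self.items[position] = value

    def __delitem__(self, position):
        # Fetch the item before deleting it; the position itself has no .items().
        item = self.items[position]
        del self.items[position]
        self._unindex(item)

    def _unindex(self, item):
        # Remove the item from every index list it was added to.
        for index, value in item.items():
            self.indices[index][value].remove(item)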

    def find_all(self, index, value):
        return self.indices[index][value]

    def __repr__(self):
        return repr(self.items)

Used like this:

>>> foo = IndexingList([{ 'id': 1, 'name': 'cat' }, { 'id': 2, 'name': 'dog' }])
>>> foo[0]
{'id': 1, 'name': 'cat'}
>>> foo.find_all("id", 2)
[{'id': 2, 'name': 'dog'}]
>>> foo += [{'id': 3, 'name': 'dog'}]
>>> foo.find_all('name', 'dog')
[{'id': 2, 'name': 'dog'}, {'id': 3, 'name': 'dog'}]
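
One thing to be aware of: because the indices are defaultdicts, looking up a value that is not present inserts an empty list for it, and find_all hands back the index's own list. If that matters, a lookup along these lines avoids both. This is only a sketch, written as a standalone function over the same nested-defaultdict layout rather than as the answer's method:

```python
from collections import defaultdict
from functools import partial

indices = defaultdict(partial(defaultdict, list))
indices['name']['dog'].append({'id': 2, 'name': 'dog'})

def find_all(indices, index, value):
    """Hypothetical lookup that creates no empty entries and returns a copy."""
    if index not in indices:
        return []
    # .get() does not trigger the defaultdict's default factory.
    return list(indices[index].get(value, []))

print(find_all(indices, 'name', 'bird'))   # []
print('bird' in indices['name'])           # False: nothing was inserted
```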

score 0 · Accepted Answer

I must say that Lattyware provides a really nice solution. I'll still offer my own quick-and-dirty approach, because when indexing on unique items it is a simple one-liner. Instead of building a full wrapper container, I sometimes just create an index on a certain column:

my_list = [('aap', 123), ('noot', 234), ('mies', 345), ('mies', 456)]

Provided the key in that column is unique, and we neither add new elements to the list nor modify the value we indexed on, we may use:

def mk_unique_index(data, col):
  g = ((elem[col], elem) for elem in data)
  return dict(g)

So we can use it like:

>>> idx = mk_unique_index(my_list, 1)
>>> idx[123]
('aap', 123)
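
A caveat worth spelling out: if the column is not actually unique, dict() silently keeps only the last element for each key. A defensive variant could raise instead. The sketch below uses a hypothetical name, mk_checked_unique_index, and is just one way to do it:

```python
def mk_checked_unique_index(data, col):
    """Like mk_unique_index, but raises if a key repeats (hypothetical variant)."""
    idx = {}
    for elem in data:
        key = elem[col]
        if key in idx:
            raise ValueError("duplicate key {0!r} in column {1}".format(key, col))
        idx[key] = elem
    return idx

my_list = [('aap', 123), ('noot', 234), ('mies', 345), ('mies', 456)]

# Plain dict(): the earlier 'mies' row is lost without any warning.
print(dict((elem[0], elem) for elem in my_list)['mies'])  # ('mies', 456)
```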

However, if we wish to index on the 0th column, where 'mies' appears twice, we must use a defaultdict so that each key maps to a list of elements:

from collections import defaultdict
def mk_index(data, col):
  d = defaultdict(list)
  for elem in data:
    d[elem[col]].append(elem)
  return d

Usage:

>>> idx = mk_index(my_list, 0)
>>> idx['mies']
[('mies', 345), ('mies', 456)]

If you're using dictionaries instead of tuples (provided all elements have the field you're indexing on), you can just pass the field name as the column. Named tuples still work with an integer position; indexing them by field name would need getattr(elem, col) rather than elem[col]. Obviously, one might also choose to use a temporary in-memory sqlite database.
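
For completeness, here is roughly what the in-memory sqlite route could look like with the standard-library sqlite3 module. The table, column, and index names are made up for this example:

```python
import sqlite3

my_list = [('aap', 123), ('noot', 234), ('mies', 345), ('mies', 456)]

# ":memory:" creates a temporary database that lives only as long as the connection.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE items (name TEXT, number INTEGER)')
conn.execute('CREATE INDEX idx_items_name ON items (name)')
conn.executemany('INSERT INTO items VALUES (?, ?)', my_list)

rows = conn.execute(
    'SELECT name, number FROM items WHERE name = ? ORDER BY number', ('mies',)
).fetchall()
print(rows)  # [('mies', 345), ('mies', 456)]
```

The database then takes care of keeping the index up to date on inserts, updates, and deletes, at the price of copying the data into tables.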
