python - Python：用于存储键/值的数组与数据库

Question

问：在这种情况下哪个更快？

我的场景：我的应用程序将在数组或 postgresql db 中存储链接列表，因此它可能如下所示：

1) mysite.com
   a)    /users/login
   b)    /users/registration/
   c)    /contact/
   d)    /locate/search
   e)    /priv/admin-login

上面的条目1)- 我将string在这些 url 上进行搜索，以查找例如包含以下内容的任何路径：

'login'

例如。

对于给定的域，上述字母 a) 到 e) 可能有 5-100 个以上的条目。

*用法： *这种数据结构每天都可能发生变化，但每天只能变化一次。一些键/值将被删除，其他的将被修改。一个单独的集合，如：

dict2 = { 'thesite.com': 123, 98.6: 37 };

每个key将代表 1 个且仅代表 1 个域。

我已经尝试过对此进行一些搜索，但似乎找不到一个真正好的答案：什么时候应该array使用，什么时候应该使用类似的数据库postgresql？

我一直使用 db 来处理数据（使用 mysql，而不是 postgresql），但我现在正试图从现在开始做得更好，所以我想知道数组或其他数据结构是否会在循环中更好地工作，并且在循环时尝试匹配给定的字符串。

一如既往，谢谢！

score 2 · Accepted Answer

一个完整的 SQL 数据库可能是矫枉过正。如果您可以将所有内容都放入内存中，请将其全部放入 dict 中，然后使用pickle模块对其进行序列化并将其写入磁盘。

另一个不错的选择是使用其中一个 dbm 模块（/ 或dbm）dbm.ndbm将数据存储在磁盘绑定哈希表中。它将具有 O(1) 查找时间，而无需像在更大的数据库中那样连接和形成查询。gdbmanydbm

编辑：如果每个键有多个值并且您不想要一个成熟的数据库，SQLite 将是一个不错的选择。已经有一个内置模块，sqlite3（如评论中所述）

score 0 · Accepted Answer

Test it. It's your dataset, your hardware, your available disk and network IO, your usage pattern. There's no one true answer here. We don't even know how many queries are you planning - are we talking about one per minute or thousands per second?
If your data fits nicely in memory and doesn't take a massive amount of time to load the first time, sticking it into a dictionary in memory will probably be faster.
If you're always looking for full words (like in the login case), you will gain some speed too from splitting the url into parts and indexing those separately.

2 回答 2