
I have three pieces of information: a group name, a group type, and a group ranking.

As a quick example:

"Mom's cats", "Cats", "Top10"

The example is way off from what I'm doing with this, but you get the basic idea.

The group name has a large number of possible values (around 20k), while the group type and group ranking each have far fewer (around 10 each).

I'm trying to find a better way to come up with a short unique identifier for each such group, rather than having to use a SHA-1 hash, which makes for a huge, ugly URL.

Any better ideas?

I'm open to solutions in any language, so I'm tagging several languages here since I can't think of a better tag to assign to this.

Thanks.

EDIT: One solution that I found elsewhere a while back suggested taking the last few characters of the SHA-1 and converting them to a decimal value. I'm not sure how reliable this idea is, or what the chance of a collision would be.
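
To get a rough feel for the collision risk of the truncation idea, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `short_id` and `collision_probability` helpers and the `\x1f` separator are hypothetical choices, not part of any library, and the count of about 20,000 × 10 × 10 = 2,000,000 combinations is just the upper bound implied by the numbers above. The estimate uses the standard birthday approximation.

```python
import hashlib
import math

def short_id(name, gtype, ranking, n_hex=10):
    """Return the last n_hex hex characters of the SHA-1 as a decimal integer."""
    # Assumption: the separator byte never occurs inside any of the fields.
    key = "\x1f".join([name, gtype, ranking]).encode("utf-8")
    digest = hashlib.sha1(key).hexdigest()
    return int(digest[-n_hex:], 16)

def collision_probability(n_items, n_hex):
    """Birthday approximation: P(at least one collision) ~ 1 - exp(-n(n-1) / 2N)."""
    space = 16 ** n_hex  # number of distinct truncated values
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))

n_combinations = 20000 * 10 * 10  # rough upper bound taken from the question
for n_hex in (8, 10, 12, 16):
    print(n_hex, collision_probability(n_combinations, n_hex))

print(short_id("Mom's cats", "Cats", "Top10"))
```

Under these assumptions, 8 hex characters are almost certain to produce a collision somewhere in the set, 10 still give odds well above one half, and 16 bring the estimate down to roughly one in ten million. If every combination is known in advance, a collision can also simply be checked for once, up front.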

EDIT2: I'm using MongoDB and currently store this SHA-1 value in the DB along with the members to make querying easy. I'm trying to find an alternative to creating an auto-increment field in a separate table/collection, which would mean a lot more queries when running update scripts.


1 Answer


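For a Python mapping, you can use (grouptype, groupranking, groupname) as the dictionary key, or you can reduce the size of the dictionary by splitting it into nested dictionaries keyed grouptype -> groupranking -> groupname.

For generating a unique URL, what's wrong with grouptype.rank.name, or the same thing with / as the separator? You can use a URL-quoting function to replace each invalid character with its %nn escape.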
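
As a minimal sketch of the two dictionary layouts (the member values are hypothetical placeholders, and the example key is the one from the question):

```python
from collections import defaultdict

# Flat mapping: one entry per (grouptype, groupranking, groupname) tuple.
flat = {}
flat[("Cats", "Top10", "Mom's cats")] = ["member-1", "member-2"]

# Nested mapping: grouptype -> groupranking -> groupname -> members.
nested = defaultdict(lambda: defaultdict(dict))
nested["Cats"]["Top10"]["Mom's cats"] = ["member-1", "member-2"]

print(flat[("Cats", "Top10", "Mom's cats")])
print(sorted(nested["Cats"]["Top10"]))  # all group names of that type and ranking
```

The nested form also makes it cheap to list every group of a given type and ranking, which the flat form can only do by scanning all keys.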

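You could generate such a path with baseurl + '/' + urllib.quote('/'.join([grouptype, groupranking, groupname])), keeping baseurl outside the quote call so that a scheme such as http: is not escaped, or even with baseurl + '?' + urllib.urlencode({'grouptype': grouptype, 'groupranking': groupranking, 'groupname': groupname}). The latter gives the typical query format baseurl?grouptype=Whatever&... (in Python 3 these functions live in urllib.parse).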
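
A minimal Python 3 sketch of both URL styles follows. The `path_url` and `query_url` helpers and the `http://example.com/groups` base URL are hypothetical names used only for illustration. Each field is quoted on its own with `safe=""`, an assumption made here so that a `/` inside a group name cannot be mistaken for a separator.

```python
from urllib.parse import quote, urlencode

def path_url(baseurl, grouptype, groupranking, groupname):
    # safe="" also escapes "/" inside a field, so the separators stay unambiguous.
    parts = [quote(p, safe="") for p in (grouptype, groupranking, groupname)]
    return baseurl.rstrip("/") + "/" + "/".join(parts)

def query_url(baseurl, grouptype, groupranking, groupname):
    # A list of pairs keeps the parameter order fixed.
    params = [("grouptype", grouptype),
              ("groupranking", groupranking),
              ("groupname", groupname)]
    return baseurl + "?" + urlencode(params)

base = "http://example.com/groups"  # hypothetical base URL
print(path_url(base, "Cats", "Top10", "Mom's cats"))
# http://example.com/groups/Cats/Top10/Mom%27s%20cats
print(query_url(base, "Cats", "Top10", "Mom's cats"))
# http://example.com/groups?grouptype=Cats&groupranking=Top10&groupname=Mom%27s+cats
```

Both forms are reversible, so the original three values can be recovered from the URL without storing a separate identifier.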

answered 2013-07-10T07:03:42.013