5

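Are MongoDB _id fields random/unguessable enough to serve as secret data?

For example: if I'm building server-side OAuth, can I use the _id as the user's OAuth token? I'd like to do this because of the cleanliness and indexability it gives the database (e.g. "tokens._id" => oauth_token).

Looking at the structure of MongoDB _id objects, they appear fairly random, but I do have a lingering concern about a malicious entity brute-force guessing one.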

4

2 Answers

12

In short, no. Mongo ObjectIds are easy to guess. In particular, under high load they are often consecutive numbers, because the timestamp, machine id and process id don't change. If you look at the structure of an ObjectId, it is composed of

a 4-byte timestamp, 
a 3-byte machine identifier, 
a 2-byte process id, and 
a 3-byte counter, starting with a random value.
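To make the layout above concrete, here is a minimal sketch that splits a 24-character hex ObjectId into those four fields. The function name `parse_legacy_objectid` and the sample id are illustrative assumptions, and the sketch follows the layout exactly as listed in this answer; drivers that use a different layout for the middle bytes would need a different split.

```python
from datetime import datetime, timezone


def parse_legacy_objectid(hex_id: str) -> dict:
    """Split an ObjectId into the four fields listed above (assumed layout)."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[0:4], "big")  # 4-byte timestamp, in seconds
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),                   # 3-byte machine identifier
        "pid": int.from_bytes(raw[7:9], "big"),      # 2-byte process id
        "counter": int.from_bytes(raw[9:12], "big"), # 3-byte counter
    }


# Illustrative id, not taken from a real deployment.
fields = parse_legacy_objectid("507f1f77bcf86cd799439011")
print(fields)
```

Only the counter changes between two ids created by the same process within the same second.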

Hence, they have very little randomness. I often see consecutive ids in the database, for instance when a controller action writes a domain object and a log entry in quick succession.
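As a rough illustration of why consecutive ids matter, the sketch below takes one observed id and lists the ids the same process would hand out next, assuming the last three bytes are a plain incrementing counter as described above. `neighbouring_ids` is a hypothetical helper, not part of any MongoDB driver.

```python
from typing import List


def neighbouring_ids(observed_hex: str, count: int = 3) -> List[str]:
    """Return the next `count` ids that share the observed id's first 9 bytes."""
    raw = bytes.fromhex(observed_hex)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    prefix = raw[:9]                         # timestamp + machine id + process id
    counter = int.from_bytes(raw[9:], "big") # 3-byte counter
    return [
        (prefix + ((counter + step) % (1 << 24)).to_bytes(3, "big")).hex()
        for step in range(1, count + 1)
    ]


print(neighbouring_ids("507f1f77bcf86cd799439011", 2))
```

Anyone who legitimately receives one id (their own token, say) gets a good starting point for guessing the ids issued right after it.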

If the timestamp can be guessed and the machine id is determinable (which it is unless you have a huge cluster), there are only five bytes left: the process id and the counter. By looking at a number of generated ids, I can probably narrow the process id down to something like 50 candidates, so the effective entropy is roughly 28 to 30 bits (a 24-bit counter plus a few bits for the process id). This is still hard to guess, but it's way too risky for an access token.
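A back-of-the-envelope check of that estimate, assuming the attacker already knows the timestamp and machine id, has narrowed the process id down to about 50 candidates, and must search the full 24-bit counter. The guess rate is an arbitrary assumption for illustration.

```python
import math


def remaining_bits(candidate_pids: int, counter_bits: int = 24) -> float:
    """Bits of uncertainty left once timestamp and machine id are known."""
    return math.log2(candidate_pids) + counter_bits


def seconds_to_enumerate(bits: float, guesses_per_second: float) -> float:
    """Worst-case time to try every remaining candidate."""
    return 2 ** bits / guesses_per_second


bits = remaining_bits(50)
print(f"remaining entropy: {bits:.1f} bits")
# Assumed rate of 10,000 guesses per second against the token endpoint.
print(f"worst case: {seconds_to_enumerate(bits, 10_000) / 3600:.1f} hours")
```

Under these assumptions the search space is 50 × 2^24, about 839 million candidates, which is the same order of magnitude as the figure above and can be exhausted in about a day at that rate. A 128-bit random token is out of reach of the same attack.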

Use a cryptographically strong pseudo-random number generator instead and create a token from that. For example, in .NET, the RNGCryptoServiceProvider lets you generate random data of arbitrary length.

As a side note, I suggest adding a cryptographic wrapper around your OAuth tokens, for two reasons:
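The same idea in Python, as a minimal sketch: the standard library's `secrets` module draws from the operating system's CSPRNG. The function name `new_access_token` and the 32-byte default are assumptions chosen for illustration.

```python
import secrets


def new_access_token(num_bytes: int = 32) -> str:
    """Return a URL-safe token built from `num_bytes` of CSPRNG output."""
    return secrets.token_urlsafe(num_bytes)


token = new_access_token()
print(token)
```

The token can still be stored in an indexed field of its own, so the lookup stays as cheap as a lookup by `_id`.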

a) You want to be able to determine invalid tokens quickly. A valid cryptographic shell might still include an invalid token (a revoked or expired grant), but you don't have to hit the database on brute force attacks every time. Also, the client

b) Clients can request tokens over and over. While it's not a requirement, almost all systems I know return different tokens every time (no matter if they are self-validating or not). Usually, that's because the token itself has a limited validity period. That is not the same validity period that the OAuth grant has.
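One possible shape for such a wrapper, sketched with an HMAC signature: the server can reject forged or expired tokens without touching the database, and every issued token differs because it carries its own expiry and nonce. `SERVER_KEY`, `issue_token`, `check_token` and the dot-separated format are all assumptions made for this example, not a standard; grant ids are assumed to contain no dots.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical signing key; a real service would load it from configuration.
SERVER_KEY = secrets.token_bytes(32)


def _sign(payload: str) -> str:
    return hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(grant_id: str, ttl_seconds: int = 3600, now=None) -> str:
    """Wrap a grant id in a signed token with its own validity period."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    nonce = secrets.token_urlsafe(16)  # makes every issued token different
    payload = f"{grant_id}.{expires}.{nonce}"
    return f"{payload}.{_sign(payload)}"


def check_token(token: str, now=None):
    """Return the grant id if the shell is valid and unexpired, else None.

    The caller still has to confirm in the database that the grant exists
    and has not been revoked.
    """
    now = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
        grant_id, expires, _nonce = payload.split(".")
    except ValueError:
        return None
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return None  # forged or corrupted: rejected without a database hit
    if int(expires) < now:
        return None
    return grant_id
```

Only tokens that pass `check_token` cost a database lookup, and that lookup is for the grant, not for the token itself.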

In the database, what you really want to store is the grant, i.e. the permission that was given by some user to some client. If this grant is removed, all tokens become invalid. Inserting a new token every time is impractical because the user would have to delete all of them to effectively revoke the application's grant.

answered 2013-03-15T15:25:54.450
1

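Guessing it is possible, though very difficult. In practice, an attacker would probably have better luck guessing most other OAuth tokens.

One example of what makes it difficult is that it contains the PID. If you are using a language like PHP, the PID changes with every PHP process that is spawned, which means each _id may carry a different PID that is hard to predict. Of course this depends on the mode you run PHP in; in fcgi mode the PID may well be constant.

The machine id is another. Most websites have auto-scaling server clusters for their database, so even under heavy load the only truly predictable variable in many cases is the timestamp, and even that is not trivial to guess (although the ObjectId timestamp has one-second resolution, not millisecond resolution). You would need to know exactly how much traffic the site gets in order to build an algorithm that computes the time in a sensible way. Of course, the only way to obtain that information in the first place is to actually get hold of an _id: a catch-22.

With that in mind, the only way I can think of to brute-force an _id is to get enough computing power to iterate over every ObjectId that could exist (given how much the ObjectId fields can vary, potentially trillions or more) and then try each of them against every App ID in the database (you should always require both the OAuth token and an App ID), which adds yet another layer of obscurity.

So in my opinion, yes, someone could brute-force it, but it would take an enormous amount of computing power and possibly years to crack something that isn't worth the effort.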

answered 2013-03-15T15:22:32.550