In my case I need to ensure the uniqueness of files via their SHA1 (stored as the filename):

db = pymongo.MongoClient('localhost', 27017).test
gfs = gridfs.GridFS(db)

# How may I create a unique index in GridFS?
gfs.files.create_index([('filename', 1)], unique=True)

I also need to find the file by its SHA1 if it has already been stored:

sha1 = hashlib.sha1(file_content).hexdigest()
try:
    return gfs.put(file_content, filename=sha1)
except pymongo.errors.DuplicateKeyError:

    # How may I find files via criterion?
    return gfs.find( { 'filename': sha1 } )['_id']

Could anybody tell me how to do those things? Thanks in advance.

1 Answer

Instead of creating an index, you can manually supply the file's own hash as its _id key.

import hashlib

import gridfs
import pymongo

db = pymongo.MongoClient('localhost', 27017).test
gfs = gridfs.GridFS(db)

def file_hash(content):
    # SHA1 hex digest of the file content (bytes), as in the question
    return hashlib.sha1(content).hexdigest()

def put_unique(content):
    digest = file_hash(content)
    if gfs.exists(_id=digest):
        # file exists!
        return digest
    # file does not exist in the database.
    return gfs.put(content, _id=digest)  # or do something else..
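
If the files are large, you may not want to read the whole content into memory just to hash it. Below is a rough sketch of a chunked hash helper. The name sha1_of_stream and the 1 MiB chunk size are my own choices, not something from GridFS, and it assumes the stream is binary and seekable so it can be rewound and passed to gfs.put() afterwards.

```python
import hashlib
import io


def sha1_of_stream(stream, chunk_size=1024 * 1024):
    """Return the SHA-1 hex digest of a binary file-like object, read in chunks."""
    digest = hashlib.sha1()
    # iter() with a sentinel keeps calling read() until it returns b'' (end of stream)
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        digest.update(chunk)
    # Assumption: the stream is seekable. Rewind so it can be reused afterwards.
    stream.seek(0)
    return digest.hexdigest()


# Small in-memory example; a real file opened with open(path, 'rb') works the same way
example = io.BytesIO(b'hello world')
print(sha1_of_stream(example))
```

The digest can then be used as the _id in the same way as above, since gfs.put() accepts a file-like object with a read() method as well as bytes.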

http://api.mongodb.org/python/current/api/gridfs/
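
If you would rather keep the hash in filename, as in the question, here is a minimal sketch of that variant. It assumes the default GridFS collection prefix fs, so the metadata lives in db.fs.files. The function names ensure_filename_index and store_unique are only illustrative. The lookup uses find_one, which returns a single file object or None, instead of indexing into the cursor returned by find.

```python
import hashlib


def ensure_filename_index(db):
    """Create a unique index on filename in the GridFS files collection.

    Assumption: the default GridFS prefix 'fs' is used, so the collection
    holding the file metadata is db.fs.files.
    """
    return db.fs.files.create_index([('filename', 1)], unique=True)


def store_unique(fs, content):
    """Store content (bytes) under its SHA-1 filename, or return the existing _id."""
    sha1 = hashlib.sha1(content).hexdigest()
    # find_one returns one matching file object, or None if nothing matches
    existing = fs.find_one({'filename': sha1})
    if existing is not None:
        return existing._id
    return fs.put(content, filename=sha1)
```

With the objects from the question this would be ensure_filename_index(db) once at startup, then store_unique(gfs, file_content) for each file. The check and the put are two separate steps, so the unique index is still what stops two concurrent writers from both storing the same file.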

Answered 2013-04-17T04:06:54.030