
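I have written a function for Django that lets the user enter a word or phrase and retrieves all instances of a specified model in which all of those words appear, in any order, across a set of specified fields. I chose to use the objects.raw method and write custom SQL for this because I had trouble building the correct query with Django Q objects.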
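
To make the intended matching rule concrete, here is a minimal in-memory sketch. It assumes the rule implied by the LIKE patterns in the function below: every search word must start a token (optionally preceded by an opening parenthesis) in at least one of the listed fields. The helper name `matches_all_words`, the case-insensitive comparison, and the sample rows are assumptions made for illustration; this is not the Django query itself.

```python
def matches_all_words(record, columns, q):
    """Return True if every word of q starts a token in at least one column."""
    # Mirror the function's rule of ignoring single-character words.
    words = [w.lower() for w in q.strip().split() if len(w) >= 2]
    if not words:
        return False
    for word in words:
        found = False
        for col in columns:
            value = str(record.get(col, "")).lower()
            # A token begins at the start of the value or after a space,
            # and may be preceded by an opening parenthesis.
            for token in value.split(" "):
                if token.startswith(word) or token.startswith("(" + word):
                    found = True
        if not found:
            return False
    return True


# Hypothetical sample rows, used only to exercise the helper.
rows = [
    {"code": "E11", "long_label": "Type 2 diabetes mellitus"},
    {"code": "E23.2", "long_label": "Diabetes insipidus"},
]
hits = [r["code"] for r in rows if matches_all_words(r, ["code", "long_label"], "mel diab")]
print(hits)
```

The word order in the query does not matter: "mel diab" and "diab mel" select the same rows.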

def fuzzy_search(objmodel,columns,q='',limit=None,offset=0):
    """
        TEMPORARY PATCH version for fuzzy_search, gets around a native Django bug.
    """ 
    if len(q)<3:
        return [] #No results until you reach 3 chars
    words = q.strip().split(" ")
    #Get model table name:
    print "All results: %s" % objmodel.objects.all() 
    db_table = objmodel._meta.db_table
    print("DB_table = %s" % db_table)
    #Construct fields into list of kwarguments!
    sql = "SELECT * FROM %s" % (db_table,)
    userparams = []
    whereands = []
    #Construct the SQL as 
    for word in words:
        if len(word)<2:
            continue #Ignore computationally expensive single char strings
        whereors = []
        for col in columns:
            whereors.append('`%s`.`%s` LIKE "%s##*##"' % (db_table,col,"##P##"))    #STARTSWITH word... The third param will be converted via injection proof means 
            whereors.append('`%s`.`%s` LIKE "(%s##*##"' % (db_table,col,"##P##")) #STARTSWITH (word... The third param will be converted via injection proof means
            whereors.append('`%s`.`%s` LIKE "##*## %s##*##"' % (db_table,col,"##P##"))  #CONTAINS word... The third param will be converted via injection proof means 
            whereors.append('`%s`.`%s` LIKE "##*## (%s##*##"' % (db_table,col,"##P##")) #CONTAINS (word... The third param will be converted via injection proof means
        if whereors:    #Skip words that produced no clauses
            whereorstr= "(" + " OR ".join(whereors) + ")"
            for i in range(0,len(whereors)):
                userparams.append(word) #Need to match the number of supplied params to the number of clauses
            whereands.append(whereorstr)    #Build into an SQL string
        else:
            continue
    #Build the final sql:
    results = []
    if whereands:
        sql+= " WHERE " + " AND ".join(whereands)
        sql = sql.replace("##P##","%s") #Necessary to get around %s persistence restriction
        sql = sql.replace("##*##","%%") #Makes everything a bit clearer!
        #For big datasets, it is more efficient to apply LIMITS and OFFSETS at SQL level:
        if limit:
            sql+= " LIMIT %s" % int(limit)  #This is injection proof as only ints are accepted
        if offset:
            sql+= " OFFSET %s" % int(offset)    #This is injection proof as only ints are accepted  
        #Perform the raw query, but with params carefully passed in via SQLi escaped method:
        ### objects.raw method ###
        resultsqset = objmodel.objects.raw(sql,userparams)
        print("Fuzzy SQL: \n%s\n" % resultsqset.query.__str__())    #View SQL
        results = list(resultsqset)
        print("Results: %s" % results)
        ### direct cursor method ###
        #cursor = connection.cursor()
        #cursor.execute(sql,userparams)
        #results = dictfetchall(cursor) #Ensures the results are fetched as a dict of fieldname => value
        return results
    return results

The function is called like this:

from modules.documents.models import Data_icd10_en
results = fuzzy_search(Data_icd10_en,["code","long_label"],"diab mel",30)

The model is:

class Data_icd10_en(models.Model):
    code = models.CharField(max_length=10)
    short_label = models.CharField(max_length=100)
    long_label = models.CharField(max_length=100)

When I call the function, I can see the actual SQL dumped in the console:

print("Fuzzy SQL: \n%s\n" % resultsqset.query.__str__())    #View SQL
Fuzzy SQL: 
<RawQuery: u'SELECT * FROM documents_data_icd10_en WHERE (`documents_data_icd10_en`.`code` LIKE "diabetes%" OR `documents_data_icd10_en`.`code` LIKE "(diabetes%" OR `documents_data_icd10_en`.`code` LIKE "% diabetes%" OR `documents_data_icd10_en`.`code` LIKE "% (diabetes%" OR `documents_data_icd10_en`.`long_label` LIKE "diabetes%" OR `documents_data_icd10_en`.`long_label` LIKE "(diabetes%" OR `documents_data_icd10_en`.`long_label` LIKE "% diabetes%" OR `documents_data_icd10_en`.`long_label` LIKE "% (diabetes%") AND (`documents_data_icd10_en`.`code` LIKE "mell%" OR `documents_data_icd10_en`.`code` LIKE "(mell%" OR `documents_data_icd10_en`.`code` LIKE "% mell%" OR `documents_data_icd10_en`.`code` LIKE "% (mell%" OR `documents_data_icd10_en`.`long_label` LIKE "mell%" OR `documents_data_icd10_en`.`long_label` LIKE "(mell%" OR `documents_data_icd10_en`.`long_label` LIKE "% mell%" OR `documents_data_icd10_en`.`long_label` LIKE "% (mell%") LIMIT 30'>

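If I copy and paste this SQL directly into the database backend (MySQL), it returns the correct results (30 rows of variants of the diagnosis "diabetes"). However, the Python function itself returns nothing (results is just an empty list). I tried print(resultsqset), and that only reveals this RawQuerySet: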

Results: <RawQuerySet: u'SELECT * FROM documents_data_icd10_en WHERE (`documents_data_icd10_en`.`code` LIKE "diab%" OR `documents_data_icd10_en`.`code` LIKE "(diab%" OR `documents_data_icd10_en`.`code` LIKE "% diab%" OR `documents_data_icd10_en`.`code` LIKE "% (diab%" OR `documents_data_icd10_en`.`long_label` LIKE "diab%" OR `documents_data_icd10_en`.`long_label` LIKE "(diab%" OR `documents_data_icd10_en`.`long_label` LIKE "% diab%" OR `documents_data_icd10_en`.`long_label` LIKE "% (diab%") AND (`documents_data_icd10_en`.`code` LIKE "mel%" OR `documents_data_icd10_en`.`code` LIKE "(mel%" OR `documents_data_icd10_en`.`code` LIKE "% mel%" OR `documents_data_icd10_en`.`code` LIKE "% (mel%" OR `documents_data_icd10_en`.`long_label` LIKE "mel%" OR `documents_data_icd10_en`.`long_label` LIKE "(mel%" OR `documents_data_icd10_en`.`long_label` LIKE "% mel%" OR `documents_data_icd10_en`.`long_label` LIKE "% (mel%") LIMIT 30'>

I have also tried converting the RawQuerySet to a list, and iterating over it manually and printing each row. Neither produces anything.

Finally, to check that the model object really is what I think it is, running print "All results: %s" % objmodel.objects.all() gives me a list of 40 or so <Data_icd10_en: Data_icd10_en object> entries, which is what I would expect.

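So, what is going on here? Why does my code produce no results when run through modelname.objects.raw(), when the exact same SQL fetches results in the database shell, and when the same model also fetches all of its rows correctly inside that function?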

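---- EDIT ---- Testing confirms that yes, I am indeed accessing the same database through both the Django app and the shell. Also, a simple one-line raw query does work.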


2 Answers


After further investigation, turning on MySQL logging, and emailing the Django developers, it turns out there is nothing wrong with the code as such.

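Instead, there is a small bug native to QuerySet.query.__str__(): when it outputs the actual SQL to the console, it fails to print the quotation marks that wrap the user-supplied parameters.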

So when the console states:

<RawQuery: u'SELECT * FROM documents_data_icd10_en WHERE (`documents_data_icd10_en`.`code` LIKE "(diabetes%"...

what is actually executed is:

"<RawQuery: u'SELECT * FROM documents_data_icd10_en WHERE (`documents_data_icd10_en`.`code` LIKE "("diabetes%""...

...which is invalid.
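
The effect can be reproduced without a database by imitating client-side parameter substitution. The sketch below is a simplified model, not the driver's real code: `quote_literal` and `simulate_interpolation` are hypothetical helpers, and the use of single quotes for the inserted literal is an assumption about how a typical MySQL driver quotes strings (the exact quote character may differ from the output shown above).

```python
def quote_literal(value):
    """Simplified stand-in for a driver's string escaping (not production-safe)."""
    return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"


def simulate_interpolation(sql, params):
    """Mimic client-side substitution: each %s receives a quoted literal, %% becomes %."""
    return sql % tuple(quote_literal(p) for p in params)


# Placeholder wrapped in hand-written quotes, as in the question's SQL.
broken = 'SELECT * FROM t WHERE `code` LIKE "(%s%%"'
# Bare placeholder; the parenthesis and wildcard travel inside the parameter.
fixed = "SELECT * FROM t WHERE `code` LIKE %s"

broken_sql = simulate_interpolation(broken, ["diabetes"])
fixed_sql = simulate_interpolation(fixed, ["(diabetes%"])

print(broken_sql)  # SELECT * FROM t WHERE `code` LIKE "('diabetes'%"
print(fixed_sql)   # SELECT * FROM t WHERE `code` LIKE '(diabetes%'
```

In the first result the pattern contains the literal's own quote characters, so it can only match values that themselves contain quotes, which is consistent with the empty result set.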

Moral of the story: do not trust what QuerySet.query.__str__() tells you, and do not wrap user-supplied strings in quotes for Model.objects.raw(sql, PARAMS), because that will be done for you.
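
One way to apply this is to keep every placeholder bare and move the parenthesis, space, and % wildcards into the parameter values. The following is a minimal sketch under that assumption; `build_fuzzy_where` is a hypothetical helper, not part of Django, and it assumes the table and column names come from trusted code because they are interpolated as identifiers.

```python
def build_fuzzy_where(db_table, columns, words):
    """Return (where_sql, params) using bare %s placeholders.

    db_table and columns must be trusted identifiers; only the search
    words are passed as parameters.
    """
    where_ands = []
    params = []
    for word in words:
        if len(word) < 2:
            continue  # ignore single-character words, as in the question
        where_ors = []
        for col in columns:
            # The same four patterns as the question, but built into the
            # parameter value instead of the SQL text.
            for pattern in (word + "%", "(" + word + "%",
                            "% " + word + "%", "% (" + word + "%"):
                where_ors.append("`%s`.`%s` LIKE %%s" % (db_table, col))
                params.append(pattern)
        if where_ors:
            where_ands.append("(" + " OR ".join(where_ors) + ")")
    return " AND ".join(where_ands), params


where_sql, params = build_fuzzy_where("documents_data_icd10_en", ["code"], ["diab"])
print(where_sql)
print(params)
```

The resulting pair can be passed as "SELECT * FROM ... WHERE " + where_sql together with params to Model.objects.raw(sql, params). Note that a % or _ typed by the user still acts as a LIKE wildcard in this sketch.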

answered 2013-03-26T18:19:26.660

# This will execute any kind of raw query; it can be a subquery, etc.

from django.db import connection

cursor = connection.cursor()
query = """this will be the raw query"""
cursor.execute(query)
answers = cursor.fetchall()
print("answers---", answers)
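
When the query includes user input, the same cursor approach can take parameters instead of a pre-built string. The sketch below uses the standard-library sqlite3 module purely as a stand-in so that it runs without a Django project; the table `icd` and its rows are made up for illustration, and sqlite3 uses ? placeholders where Django's cursor uses %s.

```python
import sqlite3

# In-memory database standing in for the real backend.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE icd (code TEXT, long_label TEXT)")
cursor.executemany(
    "INSERT INTO icd VALUES (?, ?)",
    [("E11", "Type 2 diabetes mellitus"), ("E23.2", "Diabetes insipidus")],
)

# Placeholders are left unquoted; wildcards live in the parameter values.
query = "SELECT code FROM icd WHERE long_label LIKE ? AND long_label LIKE ?"
cursor.execute(query, ("%diab%", "% mel%"))
answers = cursor.fetchall()
print("answers---", answers)
conn.close()
```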
answered 2021-06-12T20:22:34.337