I'm currently looking into how to integrate Sphinx into an existing website.

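I'm working for a client that rents out holiday homes. They have a website where customers can book online. On the home page there is a search engine that searches all of their homes (20k) and sorts them on various fields. Until now, we ran a MySQL query for every search. The database has since grown a lot, which makes the queries much slower than they used to be. For this reason we're looking at ways to improve the search engine. I'm currently experimenting with Sphinx to see whether it suits our needs.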

I installed Sphinx and have the following source and index:

source huisjesSource
{
  type = mysql
  sql_host = localhost
  sql_user = user
  sql_pass = password
  sql_db = database

   sql_query= SELECT a.huis_id as huis_id, a.huis_code as huis_code, a.land_code,    a.regi_code, a.huis_naam, a.huis_plaats,a.multimedia,a.foto_a, a.foto_w,a.foto_a_full, a.foto_w_full, a.huis_van, a.huis_tm, a.hd, a.a_hd, st,a.sl, a.beds, a.bathr, a.baths,a.airport, a.huis_enqe_vr13_aantal, a.huis_enqe_vr13_punten, a.huis_longitude, a.huis_latitude, a.huis_catering_verplicht, 1w_min, 1w_max, 2w_min, 2w_max, 3w_min, 3w_max, wk_min, wk_max, lw_min, lw_max, mw_min, mw_max, age,(CASE WHEN vz200p239 = '1' or vz238 = '1' or vz235 = '1' THEN 1 ELSE 0 END) as relax,ph.plaatsnaam as hplaats,ph.hplaatsid,ps.subplaatsid,ps.plaatsnaam as splaats,ty20,ty30,ty40,ty50,ty60,ty70,ty90,ty160,becdir,huis_hbes_small, a.regi_oms_nl as regi_oms \
   FROM as_search a \
   left join bv_myisam.huis_plaats hpl on a.huis_code = hpl.huis_code \
   left join bv_myisam.plaatsen_head ph on hpl.hplaatsid = ph.hplaatsid and ph.lang = 'nl' \
   left join bv_myisam.plaatsen_sub ps on hpl.subplaatsid = ps.subplaatsid and hpl.subplaatsid != 'null' and ps.lang = 'nl' \
   left join huis_oms o on a.huis_code = o.huis_code AND o.lang = 'nl' \
   inner join huis_sort so on a.huis_code = so.huis_code \
   inner join bbpr_n b on a.huis_code = b.huis_code \
   WHERE a.avail = '1' AND a.demo = '0' AND a.bvdir = '1' \
   GROUP BY a.huis_code 

   # (not needed) sql_attr_uint = huis_id # int(11)
   sql_attr_string = huis_code # varchar(14)
   sql_attr_string = land_code # char(2)
   sql_attr_string = regi_code # varchar(10)
   sql_attr_string = huis_naam # varchar(50)
   sql_attr_string = huis_plaats # varchar(40)
   sql_attr_bool = multimedia # enum('1','0')
   sql_attr_string = foto_a # varchar(90)
   sql_attr_string = foto_w # varchar(90)
   sql_attr_string = foto_a_full # varchar(90)
   sql_attr_string = foto_w_full # varchar(90)
   sql_attr_uint = huis_van # tinyint(4)
   sql_attr_uint = huis_tm # tinyint(4)
   sql_attr_string = hd # char(1)
   sql_attr_uint = a_hd # tinyint(4)
   sql_attr_string = st # char(1)
   sql_attr_uint = sl # tinyint(3) unsigned
   sql_attr_uint = beds # tinyint(3) unsigned
   sql_attr_uint = bathr # tinyint(3) unsigned
   sql_attr_uint = baths # tinyint(3) unsigned
   sql_attr_string = airport # char(3)
   sql_attr_uint = huis_enqe_vr13_aantal # smallint(6)
   sql_attr_uint = huis_enqe_vr13_punten # smallint(6)
   sql_attr_float = huis_longitude # double(8,5)
   sql_attr_float = huis_latitude # double(8,5)
   sql_attr_bool = huis_catering_verplicht # enum(0,1)
   sql_attr_float = 1w_min # decimal(8,2)
   sql_attr_float = 1w_max # decimal(8,2)
   sql_attr_float = 2w_min # decimal(8,2)
   sql_attr_float = 2w_max # decimal(8,2)
   sql_attr_float = 3w_min # decimal(8,2)
   sql_attr_float = 3w_max # decimal(8,2)
   sql_attr_float = wk_min # decimal(8,2)
   sql_attr_float = wk_max # decimal(8,2)
   sql_attr_float = lw_min # decimal(8,2)
   sql_attr_float = lw_max # decimal(8,2)
   sql_attr_float = mw_min # decimal(8,2)
   sql_attr_float = mw_max # decimal(8,2)
   sql_attr_uint = age # tinyint(3) unsigned
   sql_attr_uint = relax # boolean
   sql_attr_string = hplaats # varchar(100)
   sql_attr_uint = hplaatsid # int(11)
   sql_attr_uint = subplaatsid # int(11)
   sql_attr_string = splaats # varchar(100)
   sql_attr_bool = ty20 # enum(1,0)
   sql_attr_bool = ty30 # enum(1,0)
   sql_attr_bool = ty40 # enum(1,0)
   sql_attr_bool = ty50 # enum(1,0)
   sql_attr_bool = ty60 # enum(1,0)
   sql_attr_bool = ty70 # enum(1,0)
   sql_attr_bool = ty90 # enum(1,0)
   sql_attr_bool = ty160 # enum(1,0)
   sql_attr_bool = becdir # enum(0,1)
   sql_attr_string = huis_hbes_small # varchar(2000)
   sql_attr_string = regi_oms # varchar(50)

   }

   #############################################################################
   ## index definition
   #############################################################################

   index huisjesIndex
   {
     type = plain
     source = huisjesSource
     path = /var/lib/sphinxsearch/data/huisjes
     charset_type = utf-8
     preopen = 1
   }

The index builds without errors:

# indexer --all
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file '/etc/sphinxsearch/sphinx.conf'...
indexing index 'huisjesIndex'...
collected 17059 docs, 0.0 MB
total 17059 docs, 0 bytes
total 98.422 sec, 0 bytes/sec, 173.32 docs/sec
total 1 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 27 writes, 0.008 sec, 312.9 kb/call avg, 0.3 msec/call avg

# indextool --check huisjesIndex
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file '/etc/sphinxsearch/sphinx.conf'...
checking index 'huisjesIndex'...
checking dictionary...
checking data...
checking kill-list...
check passed, 0.0 sec elapsed

But when I run SELECT * FROM huisjesIndex, I get an empty set, even though there should be more than 17k records. Am I doing something wrong?

# mysql -h localhost -P 9306 --protocol=tcp
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 2.0.4-release (r3135)

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SELECT * FROM huisjesIndex;
Empty set (0.00 sec)

Any help is appreciated! :)

1 Answer

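I haven't checked carefully, but it looks like you have defined every column in sql_query as an attribute.

Sphinx doesn't like that. It is a full-text search engine, so it needs at least some full-text fields.

A simple solution (if you really do want all of those attributes) is to use sql_field_string to make at least one of your columns both a field and an attribute.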
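As a minimal sketch of that change, the fragment below swaps one sql_attr_string line for sql_field_string. Picking huis_naam is only an assumption for illustration; any text column you actually want to search on would do. The "0 bytes" reported by the indexer in the question is consistent with no full-text fields having been indexed.

```
source huisjesSource
{
  # ... connection settings, sql_query and all other attributes unchanged ...

  # Was: sql_attr_string = huis_naam
  # sql_field_string indexes the column as a full-text field
  # and also stores it as a string attribute.
  sql_field_string = huis_naam # varchar(50)
}
```

After editing the config you would need to rebuild the index with indexer --all (adding --rotate if searchd is already running). A query such as SELECT * FROM huisjesIndex WHERE MATCH('villa'); should then be able to match on the new field; the keyword 'villa' is just a placeholder.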


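Another oddity, though it may well still work, is that you use a.huis_id as the document_id but group by a.huis_code. If the two are not a 1:1 mapping (and unique), you will run into problems. I think grouping by the document_id is more common.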
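If you go that way, only the last line of sql_query changes. This is a sketch that assumes huis_id identifies exactly one house in as_search, which I can't verify from the question:

```
   WHERE a.avail = '1' AND a.demo = '0' AND a.bvdir = '1' \
   GROUP BY a.huis_id
```

The first selected column (huis_id here) is what Sphinx uses as the document ID, so grouping on it means the query cannot return the same ID twice.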

answered 2012-08-08T19:41:43.323