您好,我在搜索用户(来自用户表)的查询中遇到执行时间问题,这些用户至少具有来自一个指定兴趣集的一个兴趣和来自指定位置集的一个位置。所以我有这个测试数据库:
CREATE TABLE IF NOT EXISTS `interests` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;
--
-- Dumping data for table `interests`
--
INSERT INTO `interests` (`id`, `name`) VALUES
(1, 'auto'),
(2, 'moto'),
(3, 'health'),
(4, 'garden'),
(5, 'house'),
(6, 'music'),
(7, 'video'),
(8, 'games'),
(9, 'it');
-- --------------------------------------------------------
--
-- Table structure for table `locations`
--
CREATE TABLE IF NOT EXISTS `locations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `locations`
--
INSERT INTO `locations` (`id`, `name`) VALUES
(1, 'engalnd'),
(2, 'austia'),
(3, 'germany'),
(4, 'france'),
(5, 'belgium'),
(6, 'italy'),
(7, 'russia'),
(8, 'poland'),
(9, 'norway'),
(10, 'romania');
-- --------------------------------------------------------
--
-- Table structure for table `users`
--
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `users`
--
INSERT INTO `users` (`id`, `email`) VALUES
(1, 'email1@test.com'),
(2, 'email2@test.com'),
(3, 'email3@test.com'),
(4, 'email4@test.com'),
(5, 'email5@test.com'),
(6, 'email6@test.com'),
(7, 'email7@test.com'),
(8, 'email8@test.com'),
(9, 'email9@test.com'),
(10, 'email10@test.com');
-- --------------------------------------------------------
--
-- Table structure for table `users_interests`
--
CREATE TABLE IF NOT EXISTS `users_interests` (
`user_id` int(11) NOT NULL,
`interest_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`interest_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_interests`
--
INSERT INTO `users_interests` (`user_id`, `interest_id`) VALUES
(1, 1),
(1, 2),
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
-- --------------------------------------------------------
--
-- Table structure for table `users_locations`
--
CREATE TABLE IF NOT EXISTS `users_locations` (
`user_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`location_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_locations`
--
INSERT INTO `users_locations` (`user_id`, `location_id`) VALUES
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
有没有比这更好的查询方法:
SELECT email,
GROUP_CONCAT( DISTINCT ui.interest_id ) AS interests,
GROUP_CONCAT( DISTINCT ul.location_id ) AS locations
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
GROUP BY u.id
HAVING IF( interests IS NOT NULL , FIND_IN_SET( 2, interests )
OR FIND_IN_SET( 3, interests ) , 1 )
AND IF( locations IS NOT NULL , FIND_IN_SET( 2, locations )
OR FIND_IN_SET( 3, locations ) , 1 )
这是我找到的最佳解决方案,但在关系表(位置和兴趣)中的 500k 和 100 万行上它仍然很慢。尤其是当您匹配大量值时(假设超过 50 个位置和兴趣)。
所以我试图实现这个查询产生的结果,但要快一点:
email interests locations
email1@test.com 1,2 [BLOB - 0B]
email5@test.com 1,2,8 1,2,8
email6@test.com 3,7,8 3,7,8
email9@test.com [BLOB - 0B] [BLOB - 0B]
email10@test.com [BLOB - 0B] [BLOB - 0B]
我还尝试加入一个 SELECT UNION 表 - 用于匹配集 - 但它甚至更慢。像这样:
SELECT *
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is` ON ui.interest_id = is.interest
LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls` ON ul.location_id = ls.location
WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1) AND
IF(ul.user_id IS NOT NULL, ls.location IS NOT NULL,1)
GROUP BY u.id
我将其用于基本定位系统。我将非常感激,任何建议!谢谢!