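I'm working on a video archive, wfsu.org/dimensions, and I'm trying to come up with a good query/algorithm combination for finding related videos using MySQL and PHP. The database has title, keywords, description, and category fields, plus a separate normalized one-to-many (1:m) table of producers. I have a simple algorithm, but anyone sharp who looks at it will see that it produces a pretty poor set of "related" videos. Any ideas or help would be greatly appreciated!
As requested, here is the simple algorithm I'm using: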
// If the segment isn't a generic Dimensions segment, use this query, which makes sure the results are in the same category.
// Cast to int before interpolating into SQL (this assumes segment_type and post_id are integer columns).
$segType = (int) $segType;
$id = (int) $id;
if ($segType != 2)
{
    $query = "SELECT `title`, `description`, `air_date`, `keywords`, `post_id`, `img_filename`
        FROM `archive_post`
        WHERE `segment_type` = $segType
        AND `post_id` != $id
        AND NOW() > ADDTIME(`air_date`, '20:0:0')
        ORDER BY `air_date` DESC LIMIT 5";
}
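else // otherwise we want a query that checks whether any keywords are similar
{
    // Drop empty pieces: an empty keyword would become LIKE '%%' and match every row.
    $kwArray = preg_split("/[\s,-]+/", $keywords, -1, PREG_SPLIT_NO_EMPTY);
    $kwClauses = array();
    foreach ($kwArray as $kw)
    {
        // Escape the LIKE wildcards (% and _) first, then escape for the SQL string literal.
        $kw = mysql_real_escape_string(addcslashes($kw, '%_'));
        $kwClauses[] = "`keywords` LIKE '%$kw%'";
    }
    // With no usable keywords, match nothing instead of emitting an invalid empty "()".
    $kwWhere = $kwClauses ? implode(' OR ', $kwClauses) : 'FALSE';
    $query = "SELECT `title`, `description`, `air_date`, `keywords`, `post_id`, `img_filename`
        FROM `archive_post`
        WHERE ($kwWhere)
        AND `post_id` != $id
        AND NOW() > ADDTIME(`air_date`, '20:0:0')
        ORDER BY `air_date` DESC LIMIT 5";
}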
$result = $dbConnection->runQuery($query);
if (mysql_num_rows($result) == 0) // if we can't find any 'related' videos, what should we do?
{
}
else
{
    while ($row = mysql_fetch_array($result))
    {
        $moreTitle = $row['title'];
        $moreID = $row['post_id'];
        $moreDescription = cleanDescription($row['description']);
        $moreDescription = substr($moreDescription, 0, 50).'...';
        $moreDate = strtotime( $row['air_date'], time() );
        $moreDate = date( "F j, Y" , $moreDate );
        $relatedVideos .= "<li> <a href='viewvideo.php?num=$moreID'></a><h3>$moreTitle</h3>
            <div class='featuredStory'><span class='featuredDate'>$moreDate</span> ⋅ $moreDescription</div></li>";
    }
}
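Part of why the results feel weak is probably that both queries order purely by `air_date`, so a video that shares one keyword ranks the same as one that shares five. One direction I've been considering is scoring candidates by how many keywords they share and sorting on that first. This is only a rough sketch, not something that's in the site: `splitKeywords` and `rankByKeywordOverlap` are hypothetical names, and it assumes the candidate rows have already been fetched as associative arrays with `keywords` and `air_date` keys.

```php
<?php
// Hypothetical helper: split a keyword string on the same separators the
// query code uses (whitespace, commas, hyphens), lowercased and de-duplicated.
function splitKeywords(string $keywords): array
{
    $parts = preg_split('/[\s,-]+/', strtolower($keywords), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($parts));
}

// Hypothetical helper: rank candidate rows by the number of keywords they
// share with the current video; ties fall back to the newest air_date.
function rankByKeywordOverlap(string $keywords, array $candidates, int $limit = 5): array
{
    $wanted = splitKeywords($keywords);
    $scored = [];
    foreach ($candidates as $row) {
        $score = count(array_intersect($wanted, splitKeywords($row['keywords'])));
        if ($score > 0) {
            // Keep the original columns and add the computed score.
            $scored[] = ['score' => $score] + $row;
        }
    }
    usort($scored, function ($a, $b) {
        if ($a['score'] !== $b['score']) {
            return $b['score'] - $a['score'];            // higher score first
        }
        return strcmp($b['air_date'], $a['air_date']);   // then newest first
    });
    return array_slice($scored, 0, $limit);
}
```

The idea would be to pull a larger candidate set with the keyword query (without the `LIMIT 5`) and pass those rows through `rankByKeywordOverlap($keywords, $rows)`; the date tie-break assumes `air_date` is a `Y-m-d H:i:s` string, so plain string comparison sorts it correctly.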