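My question is about using SQL, driven by a script, to assign group IDs to multiple sets of duplicate values. I have been doing this by hand for a while and have realized that, as the database grows (to a few thousand elements), this is going to take a very long time.

Here is my database structure: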

id  | db_question           | db_keywords           | answer_id | db_answer                    |
------------------------------------------------------------------------------------------------
 0  | Why is Mars red?      | [why,mars,red]        | 0         | Mars is red because blah     |
 1  | How is Mars red?      | [how,mars,red]        | 0         | Mars is red because blah     |
 2  | What makes Mars red?  | [what,makes,mars,red] | 0         | Mars is red because blah     |
 3  | Is Mars very rocky?   | [is,mars,rocky]       | 0         | Yes Mars is rocky blahbla    |
 4  | Does Mars have rocks? | [mars,have,rocks]     | 0         | Yes Mars is rocky blahbla    |
 5  | What is the Sun?      | [what,is,sun]         | 0         | The Sun is our solar blah    |
 6  | What is a star?       | [what,is,star]        | 0         | A star is a ball of hot blah |

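Now, as you can see, one answer can belong to several questions, so the db_answer column will contain duplicates. I would like each distinct db_answer to have a single answer_id, which is repeated wherever that answer is used more than once. To illustrate, I would like my database to look like this: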

id  | db_question           | db_keywords           | answer_id | db_answer                    |
------------------------------------------------------------------------------------------------
 0  | Why is Mars red?      | [why,mars,red]        | 1         | Mars is red because blah     |
 1  | How is Mars red?      | [how,mars,red]        | 1         | Mars is red because blah     |
 2  | What makes Mars red?  | [what,makes,mars,red] | 1         | Mars is red because blah     |
 3  | Is Mars very rocky?   | [is,mars,rocky]       | 2         | Yes Mars is rocky blahbla    |
 4  | Does Mars have rocks? | [mars,have,rocks]     | 2         | Yes Mars is rocky blahbla    |
 5  | What is the Sun?      | [what,is,sun]         | 3         | The Sun is our solar blah    |
 6  | What is a star?       | [what,is,star]        | 4         | A star is a ball of hot blah |

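I have searched extensively for a script that does this, without any luck. To show what I have been doing so far, this is the SQL I have been running by hand for each group of answers I want to give an id: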

UPDATE elements SET answer_id = '1' WHERE db_answer = 'Mars is red because blah' 

3 Answers


This is very easy to do with a PHP script:

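// Note: the mysql_* functions were removed in PHP 7.0; on current PHP, port this to mysqli or PDO.
$query = mysql_query("SELECT DISTINCT db_answer FROM elements") or die(mysql_error());
$i = 1;
while ($row = mysql_fetch_row($query))
{
    // Escape the answer text so that quotes inside it cannot break the UPDATE.
    $answer = mysql_real_escape_string($row[0]);
    mysql_query("UPDATE elements SET answer_id = {$i} WHERE db_answer = '{$answer}'") or die(mysql_error());
    $i++;
}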

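However, I think it would be wise to store the answers in a separate table and keep only the answer_id in the elements table. That way you avoid needlessly duplicated information.


Edit:

As @mdoyle suggested, I think it is best to use four tables: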

-- answers is created first because questions has a foreign key that references it.
CREATE TABLE answers (
    answerID INT NOT NULL AUTO_INCREMENT,
    answer VARCHAR(128),
    PRIMARY KEY (answerID)
);

CREATE TABLE questions (
    questionID INT NOT NULL AUTO_INCREMENT,
    question VARCHAR(128),
    answerID INT,
    PRIMARY KEY (questionID),
    FOREIGN KEY (answerID) REFERENCES answers (answerID)
);

CREATE TABLE keywords (
    keywordID INT NOT NULL AUTO_INCREMENT,
    keyword VARCHAR(16),
    PRIMARY KEY (keywordID)
);

CREATE TABLE question_keywords (
    questionID INT,
    keywordID INT,
    FOREIGN KEY (questionID) REFERENCES questions (questionID),
    FOREIGN KEY (keywordID) REFERENCES keywords (keywordID)
);

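The relationship between the answers table and the questions table is one-to-many (one answer may apply to many questions), so you have two tables. This assumes that each question has one and only one answer. If that is not the case, and a question may have two acceptable answers, then the relationship becomes many-to-many (read on for how to set up tables for a many-to-many relationship).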
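As a small illustration of the one-to-many link, the following sketch lists each answer together with every question that points to it. It assumes the answers and questions tables exactly as defined above; the aliases a and q are only for readability.

```sql
-- One row per question, grouped visually by the answer it shares.
SELECT a.answerID, a.answer, q.question
FROM answers AS a
JOIN questions AS q ON q.answerID = a.answerID
ORDER BY a.answerID, q.questionID;
```

An answer used by three questions shows up on three consecutive rows, while the answer text itself is stored only once in answers.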

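The relationship between the questions table and the keywords table is many-to-many (many questions may use many keywords), so you have three tables: one holding the questions (one row per question), one holding the keywords (one row per keyword), and a third that ties the two together. The question_keywords table will contain multiple rows with the same questionID and multiple rows with the same keywordID. So if questionID 5 has 3 keywords, there will be 3 entries with a questionID of 5 in the question_keywords table.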
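To make the junction table concrete, here is a sketch of how the keywords of one question could be read back. It assumes the keywords and question_keywords tables as defined above, and questionID 5 is just the example value used in the text.

```sql
-- Fetch every keyword attached to question 5 by going through the junction table.
SELECT k.keyword
FROM question_keywords AS qk
JOIN keywords AS k ON k.keywordID = qk.keywordID
WHERE qk.questionID = 5;
```

Swapping the filter to a keywordID (and joining questions instead of keywords) would answer the reverse question: which questions use a given keyword.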

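For any one-to-one relationship, it is usually safe to simply add a column to the same table, so you end up with a single table for that relationship.

Note: feel free to change the lengths of the VARCHAR columns. I chose values that are probably fine based on your example, but if questions and/or answers could be longer, you may need to increase these sizes.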


Once these tables are created, you can populate them with something like this:

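// Note: the mysql_* functions were removed in PHP 7.0; on current PHP, port this to mysqli or PDO.
$elements = mysql_query("SELECT * FROM elements") or die(mysql_error());
echo "About to enter while-loop<br />";
$i = 1;
while ($row = mysql_fetch_assoc($elements))
{
    echo "loop ". $i++ ."<br />";

    // Escape every value before embedding it in SQL, so quotes in the text cannot break a query.
    $id = (int) $row["id"];
    $question = mysql_real_escape_string($row["db_question"]);
    $answer = mysql_real_escape_string($row["db_answer"]);

    // Look up the answer; insert it first if it does not exist yet.
    $querystr = "SELECT answerID FROM answers WHERE answer = '{$answer}'";
    echo "Getting answerID. query: {$querystr}<br />";
    // A separate result variable keeps the outer loop's result set ($elements) intact.
    $result = mysql_query($querystr) or die(mysql_error());
    if ($found = mysql_fetch_row($result))
    {
        $answerID = (int) $found[0];
    }
    else
    {
        $querystr = "INSERT INTO answers (answer) VALUES ('{$answer}')";
        echo "Answer did not exist, inserting now. query: {$querystr}<br />";
        mysql_query($querystr) or die(mysql_error());
        $answerID = mysql_insert_id();
    }

    // Caution: MySQL treats an explicit questionID of 0 as "generate the next AUTO_INCREMENT value"
    // unless the NO_AUTO_VALUE_ON_ZERO SQL mode is enabled.
    $querystr = "INSERT INTO questions (questionID, question, answerID) VALUES ({$id}, '{$question}', {$answerID})";
    echo "Inserting question. query: {$querystr}<br />";
    mysql_query($querystr) or die(mysql_error());

    // db_keywords looks like "[why,mars,red]": strip the brackets, then split on commas.
    $keywords = explode(",", trim($row["db_keywords"], "[]"));
    echo "keywords = ". print_r($keywords, true) ."<br />";
    foreach ($keywords as $keyword)
    {
        $keyword = mysql_real_escape_string(trim($keyword));

        // Look up the keyword; insert it first if it does not exist yet.
        $querystr = "SELECT keywordID FROM keywords WHERE keyword = '{$keyword}'";
        echo "Getting keywordID. query: {$querystr}<br />";
        $result = mysql_query($querystr) or die(mysql_error());
        if ($found = mysql_fetch_row($result))
        {
            $keywordID = (int) $found[0];
        }
        else
        {
            $querystr = "INSERT INTO keywords (keyword) VALUES ('{$keyword}')";
            echo "Keyword did not exist, inserting now. query: {$querystr}<br />";
            mysql_query($querystr) or die(mysql_error());
            $keywordID = mysql_insert_id();
        }

        // Link the question to the keyword in the junction table.
        $querystr = "INSERT INTO question_keywords (questionID, keywordID) VALUES ({$id}, {$keywordID})";
        echo "Inserting question keyword. query: {$querystr}<br />";
        mysql_query($querystr) or die(mysql_error());
    }
}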

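Once this is done and you have verified that the four tables are populated correctly, you no longer need the elements table. Just use the four tables (questions, answers, keywords, question_keywords).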
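If you would rather not loop row by row, the answers and questions tables could probably be filled with two set-based statements instead. This is only a sketch: it assumes the elements columns from the question (id, db_question, db_answer) and empty target tables, and it does not cover the keywords, which still need to be split outside SQL.

```sql
-- One row per distinct answer text; answerID is generated by AUTO_INCREMENT.
INSERT INTO answers (answer)
SELECT DISTINCT db_answer
FROM elements;

-- Copy each question and attach the ID of its matching answer.
-- Caution: an id of 0 is replaced by a generated value unless the
-- NO_AUTO_VALUE_ON_ZERO SQL mode is enabled.
INSERT INTO questions (questionID, question, answerID)
SELECT e.id, e.db_question, a.answerID
FROM elements AS e
JOIN answers AS a ON a.answer = e.db_answer;
```

The join back on the answer text only works while the full text fits in the VARCHAR(128) column, so widen that column first if your answers are longer.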

answered 2012-04-18T19:54:03.103

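Staying within MySQL, you can assign an id to each answer like this:

select db_answer, min(id) as answer_id
from elements
group by db_answer

Each distinct answer gets the smallest id among its rows, so the values are unique per answer but not consecutive (0, 3, 5 and 6 for the sample data).

So the complete solution is to create an answer_id column in the table (if it does not exist yet) and then run the update. In MySQL the grouped query can be joined into the UPDATE as a derived table:

update elements
join
(
  select db_answer, min(id) as answer_id
  from elements
  group by db_answer
) as aid on elements.db_answer = aid.db_answer
set elements.answer_id = aid.answer_id

answered 2012-04-18T20:33:48.203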

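What you need in order to do this in a single query is something like the SQL Server function ROW_NUMBER(). Unfortunately, MySQL did not have it at the time of writing (window functions only arrived in MySQL 8.0). However, you can emulate the function by taking advantage of inline assignment to variables. Here is an article explaining the logic involved: http://www.explodybits.com/2011/11/mysql-row-number/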
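For readers on MySQL 8.0 or later, window functions are available, so the numbering can be done without the variable trick. The following is a sketch under that assumption, using the elements table and column names from the question; the names firsts, ranked, first_id and new_answer_id are made up for illustration.

```sql
-- Requires MySQL 8.0+ (window functions).
UPDATE elements AS e
JOIN (
    -- Number the groups 1, 2, 3, ... in order of their first row.
    SELECT id, DENSE_RANK() OVER (ORDER BY first_id) AS new_answer_id
    FROM (
        -- Tag every row with the smallest id that shares its answer text.
        SELECT id, MIN(id) OVER (PARTITION BY db_answer) AS first_id
        FROM elements
    ) AS firsts
) AS ranked ON ranked.id = e.id
SET e.answer_id = ranked.new_answer_id;
```

DENSE_RANK() is used rather than ROW_NUMBER() because rows with the same answer must share one number. For the sample rows this should give answer_id values 1, 1, 1, 2, 2, 3, 4.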

answered 2012-04-18T20:00:33.257