2

我有一个有很多字段的表,但我想计算该表的任何三个字段中的每个单词

在一个表中找到不止一次存在的所有标题......所以我可以发出这个声明

SELECT title, COUNT(title) AS NumOccurrences FROM users
GROUP BY titleHAVING ( COUNT(title) > 1 )

假设我的表有三个名为title,url,description的字段。基本上我不知道哪个单词已存储在该表的哪个 3 个字段中的最大时间。

我想发出一条 sql 语句,它可以告诉我哪个单词找到了最长时间……比如

word-name          occurance
---------          -------
sqlserver           300
jquery              120
ajax                110

请指导我使用 sql server 2000/2005 的示例 sql 谢谢

这是我更新的完整代码.....请看

    IF OBJECT_ID('tempdb..#tempSearch') IS NOT NULL
BEGIN
    DROP TABLE #tempSearch
END

CREATE TABLE #tempSearch(
    ID INT,
    Title nvarchar(4000),
    Description ntext,
    Url nvarchar(4000),
    Type char(1))

INSERT INTO #tempSearch 
    SELECT * from vwProductSearch

INSERT INTO #tempSearch 
    SELECT * from vwContentSearch

    SELECT  Word, 
    COUNT(Word) AS TotalOccurrences,
    COUNT(CASE WHEN Field = 'Title' THEN Word END) AS OccurancesInTitle,
    COUNT(CASE WHEN Field = 'URL' THEN Word END) AS OccurancesInURL,
    COUNT(CASE WHEN Field = 'Description' THEN Word END) AS OccurancesInDescription
    FROM    (   SELECT  CONVERT(NTEXT, Title) AS Word, 'Title' AS Field
        FROM    #tempSearch
        UNION ALL
        SELECT  CONVERT(NTEXT, URL), 'URL' AS Field
        FROM    #tempSearch
        UNION ALL
        SELECT  CONVERT(NTEXT, Description), 'Description' AS Field
        FROM    #tempSearch
    ) As Fields
    GROUP BY Word
    HAVING  COUNT(Word) > 1

   DROP TABLE #tempSearch
4

2 回答 2

2

您需要使用 UNION 将您的 3 个字段组合成一列,以便您可以使用它来分组。我还添加了更多计数,以防您需要深入了解单词出现最多的位置。

SELECT  Word, 
        COUNT(Word) AS TotalOccurrences,
        COUNT(CASE WHEN Field = 'Title' THEN Word END) AS OccurancesInTitle,
        COUNT(CASE WHEN Field = 'URL' THEN Word END) AS OccurancesInURL,
        COUNT(CASE WHEN Field = 'Description' THEN Word END) AS OccurancesInDescription
FROM    (   SELECT  Title AS Word, 'Title' AS Field
            FROM    Users
            UNION ALL
            SELECT  URL, 'URL' AS Field
            FROM    Users
            UNION ALL
            SELECT  Description, 'Description' AS Field
            FROM    Users
        ) As Fields
GROUP BY Word
HAVING  COUNT(Word) > 1

编辑

我知道您询问过 SQL_Server 2005 和 2000,但如果您要升级到 2008 或更高版本,有一个更简洁的解决方案:

SELECT  Word, 
        COUNT(Word) AS TotalOccurrences,
        COUNT(CASE WHEN Field = 'Title' THEN Word END) AS OccurancesInTitle,
        COUNT(CASE WHEN Field = 'URL' THEN Word END) AS OccurancesInURL,
        COUNT(CASE WHEN Field = 'Description' THEN Word END) AS OccurancesInDescription
FROM    Users
        CROSS APPLY 
        (   VALUES 
                (Title, 'Title'), 
                (URL, 'URL'), 
                (Description, 'Description')
        ) AS T (Word, Field)
GROUP BY Word
HAVING  COUNT(Word) > 1

编辑 2

如果所有列都是不同的数据类型,则需要显式转换它们:

SELECT  Word, 
        COUNT(Word) AS TotalOccurrences,
        COUNT(CASE WHEN Field = 'Title' THEN Word END) AS OccurancesInTitle,
        COUNT(CASE WHEN Field = 'URL' THEN Word END) AS OccurancesInURL,
        COUNT(CASE WHEN Field = 'Description' THEN Word END) AS OccurancesInDescription
FROM    (   SELECT  CONVERT(NTEXT, Title) AS Word, 'Title' AS Field
            FROM    Users
            UNION ALL
            SELECT  CONVERT(NTEXT, URL), 'URL' AS Field
            FROM    Users
            UNION ALL
            SELECT  CONVERT(NTEXT, Description), 'Description' AS Field
            FROM    Users
        ) As Fields
GROUP BY Word
HAVING  COUNT(Word) > 1

编辑 3

无法解决您遇到的错误,您无法按 NTEXT 分组。我能想出的最佳解决方案感觉很肮脏,我对它并不是特别满意......

SELECT  COALESCE(Title, URL, Description) AS Word,
        COALESCE(Title.Occurances, 0) + COALESCE(URL.Occurances, 0) + COALESCE(Description.Occurances, 0) AS TotalOccurances,
        COALESCE(Title.Occurances, 0) AS TitleOccurances,
        COALESCE(URL.Occurances, 0) AS URLOccurances,
        COALESCE(Description.Occurances, 0) AS DescriptionOccurances
FROM    (   SELECT  CONVERT(NTEXT, Title) AS Title, COUNT(*) AS Occurances
            FROM    #tempSearch
            GROUP BY Title
        ) AS Title
        FULL JOIN 
        (   SELECT  CONVERT(NTEXT, URL) AS URL, COUNT(*) AS Occurances
            FROM    #tempSearch
            GROUP BY URL
        ) AS URL
            ON URL LIKE Title
        FULL JOIN
        (   SELECT  Description, 1 AS Occurances
            FROM    #tempSearch
        ) AS Description
            ON Description LIKE Title

这行得通,但就像我说的那样,它并不完美,可能不会表现得很好。强烈考虑升级到更高版本的 SQL-Server!

于 2012-07-04T06:56:07.330 回答
1

您可以使用差异来计算单词在三个字段中出现的LEN()次数,以说明该单词在同一字段中出现多次:

SELECT
    a.wordname,
    SUM(
        ((LEN(b.title) - LEN(REPLACE(b.title, a.wordname, ''))) / LEN(a.wordname)) +
        ((LEN(b.url) - LEN(REPLACE(b.url, a.wordname, ''))) / LEN(a.wordname)) +
        ((LEN(b.description) - LEN(REPLACE(b.description, a.wordname, ''))) / LEN(a.wordname))
    ) AS occurrence
FROM
    (
        SELECT 'sqlserver' AS wordname UNION ALL
        SELECT 'jquery' AS wordname UNION ALL
        SELECT 'ajax' AS wordname
    ) a
CROSS JOIN
    users b
GROUP BY
    a.wordname
ORDER BY
    occurrence DESC

如果您有一个词名表,只需将该表名代替SELECT ... UNION ALL我为了示例而放置的许多语句。

于 2012-07-04T07:12:25.563 回答