
I can't seem to adjust my query correctly; any help would be greatly appreciated.

Here is my query:

SELECT 
  wordlist.Word,
  SUM( worddocfreq.Freq ) AS wordFreq
FROM sourceparsed
  LEFT JOIN worddocfreq ON sourceparsed.ParsedID = worddocfreq.ParsedID
  LEFT JOIN wordlist ON worddocfreq.WordID = wordlist.WordID
WHERE
  sourceparsed.SrcID = 30032
GROUP BY
  wordlist.Word

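This works as expected. As a sample result set, I get two columns: the first is a list of distinct words, and the second is the frequency of each word.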

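However, I would rather adjust the query so that the second column is a proportion instead (i.e. the summed occurrences of each word divided by the total number of words). The total number of words is the sum of the second column as output by the query written above.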
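To make the target concrete, here is a small sketch in Python of the arithmetic the adjusted query should perform. The word counts are made up for illustration; they are not taken from the real tables.

```python
# Hypothetical per-word totals, standing in for the second column
# of the query's current output.
word_freq = {"apple": 6, "pear": 2}

# The grand total is the sum of that column over all words.
total = sum(word_freq.values())

# Each proportion is the word's own total divided by the grand total.
proportions = {word: freq / total for word, freq in word_freq.items()}

print(proportions)
```

The proportions add up to 1 by construction, which is a handy sanity check for whatever SQL ends up producing them.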

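So my problem is that I'm not sure how to compute the total over all words, because the GROUP BY at the end of the query forces the sum to be computed per word. I don't know how to divide my second column by an overall sum that is computed independently of the GROUP BY clause.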

I suspect a nested SELECT is needed, but I'm not sure how best to integrate it.

Thanks in advance for any suggestions.

Cheers,

Brian


3 Answers


I'm not sure this is the most efficient approach, but give it a try:

SELECT 
  wordlist.Word,
  SUM( worddocfreq.Freq ) / ( SELECT SUM( Freq ) 
                              FROM worddocfreq 
                                JOIN sourceparsed ON 
                                      sourceparsed.SrcID = sp1.SrcID
                                  AND sourceparsed.ParsedID = worddocfreq.ParsedID
                            ) AS proportion
FROM sourceparsed sp1
  LEFT JOIN worddocfreq ON sp1.ParsedID = worddocfreq.ParsedID
  LEFT JOIN wordlist ON worddocfreq.WordID = wordlist.WordID
WHERE
  sp1.SrcID = 30032
GROUP BY
  wordlist.Word
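
To try the scalar-subquery idea on a small scale, here is a sketch that uses Python's built-in sqlite3 module in place of the original database (an assumption; the question does not say which engine is in use). The three tables mirror the ones in the query, but the rows are invented, and the subquery filters on the same source id through a bound parameter instead of correlating on sp1. SQLite divides integers as integers, so the sketch multiplies by 1.0; an engine such as MySQL, whose / operator already returns a fractional result, would not need that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sourceparsed (ParsedID INTEGER, SrcID INTEGER);
    CREATE TABLE worddocfreq  (ParsedID INTEGER, WordID INTEGER, Freq INTEGER);
    CREATE TABLE wordlist     (WordID INTEGER, Word TEXT);
    -- Hypothetical data: source 30032 owns documents 1 and 2.
    INSERT INTO sourceparsed VALUES (1, 30032), (2, 30032), (3, 999);
    INSERT INTO wordlist     VALUES (10, 'apple'), (11, 'pear');
    INSERT INTO worddocfreq  VALUES (1, 10, 3), (2, 10, 3), (2, 11, 2), (3, 10, 50);
""")

sql = """
SELECT
  wordlist.Word,
  -- "* 1.0" forces floating-point division in SQLite.
  SUM(worddocfreq.Freq) * 1.0 / (
    SELECT SUM(wdf.Freq)
    FROM worddocfreq wdf
      JOIN sourceparsed sp2 ON sp2.ParsedID = wdf.ParsedID
    WHERE sp2.SrcID = :src
  ) AS proportion
FROM sourceparsed sp1
  LEFT JOIN worddocfreq ON sp1.ParsedID = worddocfreq.ParsedID
  LEFT JOIN wordlist ON worddocfreq.WordID = wordlist.WordID
WHERE sp1.SrcID = :src
GROUP BY wordlist.Word
ORDER BY wordlist.Word
"""

rows = conn.execute(sql, {"src": 30032}).fetchall()
for word, proportion in rows:
    print(word, proportion)
```

The subquery does not depend on the grouped rows, so it yields one total for the whole source, and every group is divided by that same number.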
answered 2012-07-19T21:01:38.030

A CROSS JOIN to a subquery may (or may not) be more efficient than SetFreeByTruth's approach:

SELECT 
  wordlist.Word,
  SUM( worddocfreq.Freq ) / TotalFreq.TotalFreq AS wordFreq
FROM sourceparsed
  LEFT JOIN worddocfreq ON sourceparsed.ParsedID = worddocfreq.ParsedID
  LEFT JOIN wordlist ON worddocfreq.WordID = wordlist.WordID
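  -- Restrict the total to the same source, so the proportions sum to 1.
  CROSS JOIN ( SELECT SUM( wdf.Freq ) AS TotalFreq
               FROM sourceparsed sp
                 JOIN worddocfreq wdf ON sp.ParsedID = wdf.ParsedID
               WHERE sp.SrcID = 30032 ) AS TotalFreq
WHERE
  sourceparsed.SrcID = 30032
GROUP BY
  wordlist.Word, TotalFreq.TotalFreq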
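
One thing worth checking with the CROSS JOIN approach is which rows feed the total. The sketch below uses Python's sqlite3 module and invented rows (both are assumptions, not the asker's setup) to compare a total taken over the whole worddocfreq table with one restricted to the source being queried. The "* 1.0" is only there because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sourceparsed (ParsedID INTEGER, SrcID INTEGER);
    CREATE TABLE worddocfreq  (ParsedID INTEGER, WordID INTEGER, Freq INTEGER);
    CREATE TABLE wordlist     (WordID INTEGER, Word TEXT);
    -- Hypothetical data: document 3 belongs to a different source.
    INSERT INTO sourceparsed VALUES (1, 30032), (2, 30032), (3, 999);
    INSERT INTO wordlist     VALUES (10, 'apple'), (11, 'pear');
    INSERT INTO worddocfreq  VALUES (1, 10, 3), (2, 10, 3), (2, 11, 2), (3, 10, 50);
""")

# Total over every source: includes the 50 from the unrelated document.
unrestricted = conn.execute("SELECT SUM(Freq) FROM worddocfreq").fetchone()[0]

sql = """
SELECT
  wordlist.Word,
  SUM(worddocfreq.Freq) * 1.0 / TotalFreq.TotalFreq AS proportion
FROM sourceparsed
  LEFT JOIN worddocfreq ON sourceparsed.ParsedID = worddocfreq.ParsedID
  LEFT JOIN wordlist ON worddocfreq.WordID = wordlist.WordID
  -- One-row derived table holding the total for this source only.
  CROSS JOIN (
    SELECT SUM(wdf.Freq) AS TotalFreq
    FROM sourceparsed sp
      JOIN worddocfreq wdf ON sp.ParsedID = wdf.ParsedID
    WHERE sp.SrcID = :src
  ) AS TotalFreq
WHERE sourceparsed.SrcID = :src
GROUP BY wordlist.Word, TotalFreq.TotalFreq
ORDER BY wordlist.Word
"""

rows = conn.execute(sql, {"src": 30032}).fetchall()
print("unrestricted total:", unrestricted)
print(rows)
```

With this data the unrestricted total is 58 while the source's own total is 8, so dividing by the former would give proportions that no longer sum to 1.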
answered 2012-07-19T21:03:06.857

Watch out for division-by-zero errors. There may be a better way, but you could try the following:

select c.Word, wordFreq, sum_all, wordFreq/sum_all as proportion from
(

    (

        select wordlist.Word,
        sum(worddocfreq.Freq) as wordFreq
        from sourceparsed
        left join worddocfreq on sourceparsed.ParsedID = worddocfreq.ParsedID
        left join wordlist on worddocfreq.WordID = wordlist.WordID
        where sourceparsed.SrcID = 30032
        group by wordlist.Word

    ) c
   LEFT OUTER JOIN
   (select SUM(worddocfreq.Freq) sum_all  
    from sourceparsed
    left join worddocfreq on sourceparsed.ParsedID = worddocfreq.ParsedID
    left join wordlist on worddocfreq.WordID = wordlist.WordID 
    where sourceparsed.SrcID = 30032
   ) t
   ON 1=1
) 
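
The division-by-zero concern can be handled explicitly with NULLIF, which turns a zero total into NULL so the proportion comes back as NULL instead of depending on how the engine treats x/0. Below is a sketch using Python's sqlite3 module with invented rows (both assumptions); the helper name proportions_for is hypothetical, and the "* 1.0" is only needed because SQLite divides integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sourceparsed (ParsedID INTEGER, SrcID INTEGER);
    CREATE TABLE worddocfreq  (ParsedID INTEGER, WordID INTEGER, Freq INTEGER);
    CREATE TABLE wordlist     (WordID INTEGER, Word TEXT);
    -- Hypothetical data: source 999 has a word whose frequency is 0.
    INSERT INTO sourceparsed VALUES (1, 30032), (2, 30032), (3, 999);
    INSERT INTO wordlist     VALUES (10, 'apple'), (11, 'pear');
    INSERT INTO worddocfreq  VALUES (1, 10, 3), (2, 10, 3), (2, 11, 2), (3, 10, 0);
""")

SQL = """
SELECT c.Word, c.wordFreq, t.sum_all,
       -- NULLIF(x, 0) is NULL when x = 0, so the division yields NULL.
       c.wordFreq * 1.0 / NULLIF(t.sum_all, 0) AS proportion
FROM (
    SELECT wordlist.Word, SUM(worddocfreq.Freq) AS wordFreq
    FROM sourceparsed
      LEFT JOIN worddocfreq ON sourceparsed.ParsedID = worddocfreq.ParsedID
      LEFT JOIN wordlist ON worddocfreq.WordID = wordlist.WordID
    WHERE sourceparsed.SrcID = :src
    GROUP BY wordlist.Word
) c
LEFT OUTER JOIN (
    SELECT SUM(worddocfreq.Freq) AS sum_all
    FROM sourceparsed
      LEFT JOIN worddocfreq ON sourceparsed.ParsedID = worddocfreq.ParsedID
    WHERE sourceparsed.SrcID = :src
) t ON 1 = 1
ORDER BY c.Word
"""

def proportions_for(src_id):
    """Return (Word, wordFreq, sum_all, proportion) rows for one source."""
    return conn.execute(SQL, {"src": src_id}).fetchall()

normal = proportions_for(30032)
zero_total = proportions_for(999)
print(normal)
print(zero_total)
```

For the source whose total is 0, the proportion column is None in Python (NULL in SQL), which the caller can then display or filter as it sees fit.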
answered 2012-07-19T21:03:26.940