2

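I have a query like the one below that looks at a large number of different URLs and groups them by hostname. It is very ugly, but it seems fast enough to use.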

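How can I write the ugly substring expression (the one that grabs the host part of the URL) more concisely? I am generating the query from a list of social media sites, so more sites may be added later.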

SELECT substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) AS referer_domain,
       count(USER) AS hits,
       r.id
FROM core c,
     referer r
WHERE c.site_url = 12
    AND r.name LIKE '%/%'
    AND c.referer = r.id
    AND (substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) = "www.delicious.com"
    OR substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) = "www.facebook.com"
    OR substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) = "m.facebook.com"
    OR substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) = "www.reddit.com"
    OR substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) = "twitter.com"
    OR substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) = "news.ycombinator.com")
GROUP BY substring(r.name, 8, locate("/",substring(r.name FROM 8))-1)
ORDER BY hits DESC

4 Answers

2

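In your case you have already created an output column, referer_domain, and you can reference it in the GROUP BY.

To use it in the WHERE clause, though, you need something like a view, because WHERE cannot see column aliases.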

CREATE VIEW ref_domain_view AS SELECT *,substring(name, 8, locate("/",substring(name FROM 8))-1) as referer_domain FROM referer;

SELECT r.referer_domain,
   count(USER) AS hits,
   r.id
FROM core c,
 ref_domain_view r
WHERE c.site_url = 12
AND r.name LIKE '%/%'
AND c.referer = r.id
AND (referer_domain = "www.delicious.com"
  OR referer_domain = "www.facebook.com"
  OR referer_domain = "m.facebook.com"
  OR referer_domain = "www.reddit.com"
  OR referer_domain = "twitter.com"
  OR referer_domain = "news.ycombinator.com")
GROUP BY referer_domain
ORDER BY hits DESC
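
If creating a view is not an option, a derived table (a subquery in FROM) can play the same role inline. The following is only a sketch under that assumption and is not part of the original answer: the alias d and the IN list are illustrative choices, and r.id is left out of the select list because it is not part of the grouping.

```sql
-- Sketch: same idea as the view, but inlined as a derived table.
-- The alias "d" and the IN list are assumptions made for illustration.
SELECT d.referer_domain,
       count(USER) AS hits
FROM core c
JOIN (
    -- Compute the host once per referer row, exactly as the view does.
    SELECT referer.*,
           substring(name, 8, locate('/', substring(name FROM 8)) - 1) AS referer_domain
    FROM referer
    WHERE name LIKE '%/%'
) AS d ON c.referer = d.id
WHERE c.site_url = 12
  AND d.referer_domain IN ('www.delicious.com',
                           'www.facebook.com',
                           'm.facebook.com',
                           'www.reddit.com',
                           'twitter.com',
                           'news.ycombinator.com')
GROUP BY d.referer_domain
ORDER BY hits DESC;
```

The IN list is equivalent to the chain of OR comparisons and is easier to generate from a list of sites.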
answered 2012-06-02T13:01:08.867
0

You can use CREATE FUNCTION for this.

answered 2012-06-02T12:56:44.110
0

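One way is to write a user-defined function (UDF) that does the string extraction. UDFs are fast, but they are more of a pain to write. Another option is a stored function, which is a kind of stored procedure that is easier to write and closer to SQL syntax.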
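
As a rough illustration of the stored-function route, here is a minimal sketch for MySQL. The name referer_host and the parameter size are hypothetical, and the body swaps the question's SUBSTRING/LOCATE expression for nested SUBSTRING_INDEX calls, which assumes every value in referer.name starts with a scheme such as http:// or https://.

```sql
-- Sketch only: referer_host is a hypothetical name, not from the original post.
-- Assumes every URL looks like scheme://host/..., so the host is the third
-- '/'-separated piece.
CREATE FUNCTION referer_host(url VARCHAR(2048))
RETURNS VARCHAR(255)
DETERMINISTIC
NO SQL
RETURN SUBSTRING_INDEX(SUBSTRING_INDEX(url, '/', 3), '/', -1);

-- Possible use in the original query (r.id dropped from the select list
-- because it is not part of the grouping).
SELECT referer_host(r.name) AS referer_domain,
       count(USER) AS hits
FROM core c
JOIN referer r ON c.referer = r.id
WHERE c.site_url = 12
  AND referer_host(r.name) IN ('www.delicious.com',
                               'www.facebook.com',
                               'm.facebook.com',
                               'www.reddit.com',
                               'twitter.com',
                               'news.ycombinator.com')
GROUP BY referer_domain
ORDER BY hits DESC;
```

Because the function body is a single RETURN statement, no BEGIN ... END block or DELIMITER change is needed. Note that the function is still evaluated for every row, so this tidies the query rather than speeding it up.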

answered 2012-06-02T13:12:05.903
0

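While both of the answers above make good sense, I have to ask you something.

Do you really need all of those substring function calls?

Have you tried something like this: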

SELECT substring(r.name, 8, locate("/",substring(r.name FROM 8))-1) AS referer_domain,
   count(USER) AS hits,
   r.id
FROM core c,
 referer r
WHERE c.site_url = 12
AND r.name LIKE '%/%'
AND c.referer = r.id
AND (r.name = "http://www.delicious.com"
  OR r.name = "http://www.facebook.com"
  OR r.name = "http://m.facebook.com"
  OR r.name = "http://www.reddit.com"
  OR r.name = "http://twitter.com"
  OR r.name = "http://news.ycombinator.com")
GROUP BY substring(r.name, 8, locate("/",substring(r.name FROM 8))-1)
ORDER BY hits DESC

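The only problem with the query above is that you have to change it whenever you need to track more domains. If you put the domains into a table and join against that table, you could probably get rid of the substrings entirely and never have to change the query.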
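
The lookup-table idea might look roughly like the sketch below. The table name tracked_domain and its column are hypothetical, and the join uses a prefix match with LIKE rather than equality on the assumption that stored referers carry a path after the host and use the http:// scheme, as the offset of 8 in the question implies.

```sql
-- Sketch only: tracked_domain is a hypothetical table, not from the original post.
CREATE TABLE tracked_domain (
    domain VARCHAR(255) NOT NULL PRIMARY KEY
);

INSERT INTO tracked_domain (domain) VALUES
    ('www.delicious.com'),
    ('www.facebook.com'),
    ('m.facebook.com'),
    ('www.reddit.com'),
    ('twitter.com'),
    ('news.ycombinator.com');

-- Group by the tracked domain itself, so no substring extraction is needed.
-- Assumes referers look like http://<domain>/...; a domain containing '_'
-- would act as a LIKE wildcard and would need escaping.
SELECT t.domain AS referer_domain,
       count(USER) AS hits
FROM core c
JOIN referer r ON c.referer = r.id
JOIN tracked_domain t ON r.name LIKE CONCAT('http://', t.domain, '/%')
WHERE c.site_url = 12
GROUP BY t.domain
ORDER BY hits DESC;
```

Tracking a new site then becomes a single INSERT into tracked_domain instead of a change to the query.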

answered 2012-06-02T13:33:19.327