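In testOne() I use a regular expression to check whether a string contains any of several specific substrings.

In testTwo() I use if/else statements to do the same thing.

I want to know why testTwo() is always faster than testOne() in my test cases.

Are regular expressions a poor fit for this problem, or is my regular expression badly written?

My test code is below. Many thanks!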

import java.util.regex.Pattern;

import org.junit.Test;

public class TestReg {

    static final Pattern PATT = Pattern
            .compile("(tudou|video.sina|v.youku|v.ku6|tv.sohu|v.163|tv.letv|v.ifeng|v.qq|iqiyi|(5)?6)\\.(com|cn)");

    @Test
    public void testOne() {
        int count = 0;
        for (int i = 0; i < 10000; i++) {
            for (String vurl : TESTCASES) {
                if (PATT.matcher(vurl).find())
                    count++;
            }
        }
        System.out.println("testOne:" + count);
    }

    @Test
    public void testTwo() {
        int count = 0;
        for (int i = 0; i < 10000; i++) {
            for (String vurl : TESTCASES) {
                if (vurl.indexOf("tudou.com") != -1
                        || vurl.indexOf("video.sina.com") != -1
                        || vurl.indexOf("v.youku.com") != -1
                        || vurl.indexOf("v.ku6.com") != -1
                        || vurl.indexOf("56.com") != -1
                        || vurl.indexOf("tv.sohu.com") != -1
                        || vurl.indexOf("v.163.com") != -1
                        || vurl.indexOf("tv.letv.com") != -1
                        || vurl.indexOf("v.ifeng.com") != -1
                        || vurl.indexOf("v.qq.com") != -1
                        || vurl.indexOf("iqiyi.com") != -1
                        || vurl.indexOf("6.cn") != -1) {
                    count++;
                }
            }
        }
        System.out.println("testTwo:" + count);
    }

    static final String[] TESTCASES = {
            "http://blog.csdn.net/v_july_v/article/details/7624837",
            "http://jobs.douban.com/intern/apply/?type=dev&position=intern_sf",
            "https://class.coursera.org/ml/lecture/index",
            "http://blog.csdn.net/v_july_v/article/details/7624837",
            "http://jobs.douban.com/intern/apply/?type=dev&position=intern_sf",
            "https://class.coursera.org/ml/lecture/index",
            "http://blog.csdn.net/v_july_v/article/details/7624837",
            "http://jobs.douban.com/intern/apply/?type=dev&position=intern_sf",
            "https://class.coursera.org/ml/lecture/index",
            "http://blog.csdn.net/v_july_v/article/details/7624837",
            "http://jobs.douban.com/intern/apply/?type=dev&position=intern_sf",
            "https://class.coursera.org/ml/lecture/index",
            "http://www.56.com/u38/v_NjYyNTUyMjc.html",
            "http://video.sina.com.cn/v/b/69614895-2128825751.html",
            "http://www.tudou.com/programs/view/xcPewAoJ26M",
            "http://v.youku.com/v_show/id_XMzQ0OTI0MTgw.html",
            "http://www.56.com/u87/v_NjMzMjEzNTY.html",
            "http://tv.sohu/u87/v_NjMzMjEzNTY.html",
            "http://tv.letv/u38/v_NjYyNTUyMjc.html",
            "http://v.ifeng/v/b/69614895-2128825751.html",
            "http://v.qq/programs/view/xcPewAoJ26M",
            "http://v.163/v_show/id_XMzQ0OTI0MTgw.html",
            "http://iqiyi/u87/v_NjMzMjEzNTY.html",
            "http://v.6.cn/u87/v_NjMzMjEzNTY.html" };

}

1 Answer

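I wouldn't use either:

  • Regular expressions are meant to match patterns; they're overkill for exact matches.
  • The chain of || conditions is a bit painful.

I'd just use a HashSet<String>. For each URL, first extract the host name using something like the URL class, then check whether it's in the set of hosts you're interested in.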

Apart from anything else, this will prevent false positives: your current approach would match

http://www.someotherhost.com/something/tudou.com

...which you don't actually want.
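
A minimal sketch of the HashSet idea, under a few assumptions of mine: the class name VideoHostMatcher and the method isVideoHost are made up for illustration, java.net.URI stands in for the URL class, and the host list is adapted from the substrings in the question (with video.sina.com.cn guessed as the full Sina host). Because the test URLs use hosts such as www.tudou.com, the sketch also checks each parent domain of the host rather than only the exact host name.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class VideoHostMatcher {

    // Hosts of interest, adapted from the question's substrings.
    // "video.sina.com.cn" is an assumption: the question only lists "video.sina.com".
    private static final Set<String> VIDEO_HOSTS = new HashSet<>(Arrays.asList(
            "tudou.com", "video.sina.com.cn", "v.youku.com", "v.ku6.com",
            "56.com", "tv.sohu.com", "v.163.com", "tv.letv.com",
            "v.ifeng.com", "v.qq.com", "iqiyi.com", "6.cn"));

    // Returns true if the URL's host, or any parent domain of it, is in the set.
    static boolean isVideoHost(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // not a syntactically valid URI
        }
        if (host == null) {
            return false; // no host component could be parsed
        }
        host = host.toLowerCase(Locale.ROOT);
        while (true) {
            if (VIDEO_HOSTS.contains(host)) {
                return true;
            }
            int dot = host.indexOf('.');
            if (dot < 0) {
                return false; // ran out of parent domains
            }
            host = host.substring(dot + 1); // drop the leftmost label and retry
        }
    }

    public static void main(String[] args) {
        // A real video host, reached through the "www" subdomain.
        System.out.println(isVideoHost("http://www.tudou.com/programs/view/xcPewAoJ26M")); // true
        // "tudou.com" appears only in the path, so this is not a match.
        System.out.println(isVideoHost("http://www.someotherhost.com/something/tudou.com")); // false
    }
}
```

Each lookup is a handful of hash probes on the host name only, so the path and query string never influence the result.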

Answered 2013-06-15T12:28:59.083