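I am analyzing log files that contain a variety of domain names. I want to exclude any domain containing the word "macys" from the output report. Here is a sample of the output: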

l.macys.com        87516
www.google.com     3016
search.yahoo.com   584
www.bing.com       166
macys-L0135874392.htm   1

I would like to produce an output file in which no domain containing the word "macys" appears.

1 Answer

This sounds like a perfect use case for a Cascading filter.

You would set this up with a RegexFilter, applied through an Each pipe:

// A Filter takes no output selector, so only the argument field is named.
Pipe pipe = new Each(incomingPipe, new Fields("UrlColumn"),
     new RegexFilter(".*macys.*", true));

Tailor the regex to your matching use case. Because the second argument (removeMatch) is true, the one above removes all tuples (rows) whose UrlColumn value contains the word "macys".
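
To check a pattern against the sample rows before wiring it into a flow, the same logic can be tried in plain Java. This is only a rough sketch: it uses java.util.regex rather than Cascading, and the class name MacysFilterSketch and the method dropMatching are made up for illustration. It assumes the domain is the first whitespace-separated column of each row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class MacysFilterSketch {
    // Same pattern string as in the answer above.
    private static final Pattern MACYS = Pattern.compile(".*macys.*");

    // Hypothetical helper: keeps only rows whose first column does not match.
    public static List<String> dropMatching(List<String> rows) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            // Assumption: the domain is the first whitespace-separated token.
            String domain = row.trim().split("\\s+")[0];
            if (!MACYS.matcher(domain).find()) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "l.macys.com        87516",
            "www.google.com     3016",
            "search.yahoo.com   584",
            "www.bing.com       166",
            "macys-L0135874392.htm   1");
        // Prints the google, yahoo and bing rows only.
        for (String row : dropMatching(rows)) {
            System.out.println(row);
        }
    }
}
```

The pattern is case-sensitive as written, so a row containing "Macys" would be kept; prefixing the pattern with (?i) is one way to change that if the logs mix case.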

Answered 2013-06-14T13:02:50.830