I wrote a HiveQL script like this:
create temporary function getdomainips as 'com.is.mail.domainspf.IpFromDomainExtract';
select Domain,getdomainips(Domain) as ips from tmp_domain;
The IpFromDomainExtract class is as follows:
package com.is.mail.domainspf;

import org.apache.hadoop.hive.ql.exec.UDF;
import java.util.ArrayList;
import java.util.Arrays;

public class IpFromDomainExtract extends UDF {
    public ArrayList<String> evaluate(String domain) {
        try {
            if (domain == null || "".equals(domain)) {
                return new ArrayList<String>();
            }
            return new ArrayList<String>(Arrays.asList(UrlDomainDigHelper.getARecord(domain).split(",")));
        } catch (Exception e) {
            return new ArrayList<String>();
        }
    }
}

import java.net.UnknownHostException;
import org.xbill.DNS.ARecord;
import org.xbill.DNS.ExtendedResolver;
import org.xbill.DNS.Lookup;
import org.xbill.DNS.Record;
import org.xbill.DNS.TextParseException;
import org.xbill.DNS.Type;

/**
 * use dnsjava jar package
 */
class UrlDomainDigHelper {
    private static ExtendedResolver resolver;

    public static String getARecord(String d) throws TextParseException, UnknownHostException {
        if (resolver == null) {
            synchronized (UrlDomainDigHelper.class) {
                if (resolver == null) {
                    ExtendedResolver tmpresolver = new ExtendedResolver();
                    tmpresolver.setTimeout(5);
                    resolver = tmpresolver;
                }
            }
        }
        Lookup lookup = new Lookup(d, Type.A);
        if (resolver != null)
            lookup.setResolver(resolver);
        Record[] records = lookup.run();
        StringBuilder sb = new StringBuilder();
        if (records != null) {
            for (int i = 0; i < records.length; i++) {
                ARecord mx = (ARecord) records[i];
                sb.append(mx.getAddress().getHostAddress()).append(",");
            }
        }
        if (sb.length() > 0)
            sb.setLength(sb.length() - 1);
        return sb.toString();
    }
}
When I run the HiveQL script, the job gets stuck at map = 12%, as shown below:
2016-01-06 16:14:06,701 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 105.67 sec
2016-01-06 16:15:07,172 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 106.7 sec
2016-01-06 16:16:07,317 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 107.87 sec
2016-01-06 16:17:07,501 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 108.84 sec
2016-01-06 16:18:07,680 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 109.71 sec
2016-01-06 16:19:07,870 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 110.37 sec
2016-01-06 16:20:08,014 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 111.5 sec
2016-01-06 16:21:08,234 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 112.76 sec
2016-01-06 16:22:08,494 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 113.78 sec
2016-01-06 16:23:08,789 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 114.97 sec
2016-01-06 16:24:09,191 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 115.84 sec
2016-01-06 16:25:09,537 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 116.79 sec
2016-01-06 16:26:09,779 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 117.69 sec
2016-01-06 16:27:10,106 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 118.91 sec
2016-01-06 16:28:10,213 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 119.94 sec
2016-01-06 16:29:10,826 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 120.76 sec
2016-01-06 16:30:11,158 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 122.2 sec
2016-01-06 16:31:11,433 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 123.26 sec
2016-01-06 16:32:11,564 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 124.28 sec
2016-01-06 16:33:12,093 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 124.9 sec
2016-01-06 16:34:12,319 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 125.82 sec
2016-01-06 16:35:12,556 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 126.79 sec
2016-01-06 16:36:12,978 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 127.57 sec
When I use jstack to inspect the process of the MapReduce task, I get the following:
"main" prio=10 tid=0x000000000122e000 nid=0x6908 in Object.wait() [0x00007fc884fa7000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000eab6b780> (a org.xbill.DNS.ExtendedResolver$Resolution)
    at java.lang.Object.wait(Object.java:502)
    at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:111)
    - locked <0x00000000eab6b780> (a org.xbill.DNS.ExtendedResolver$Resolution)
    at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
    at org.xbill.DNS.Lookup.lookup(Lookup.java:477)
    at org.xbill.DNS.Lookup.resolve(Lookup.java:529)
    at org.xbill.DNS.Lookup.run(Lookup.java:546)
I think I have hit a deadlock. I am already using the latest dnsjava jar (version 2.1.7), so how can I solve this problem? Thanks a lot.