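I have a large Java application that I'm trying to run on a Fargate cluster in AWS. The image runs successfully in Docker on my local machine. When I run it in Fargate, it starts successfully but eventually hits the following error, after which the application hangs: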
! java.net.UnknownHostException: 690bd678bcf4: 690bd678bcf4: Name or service not known
! at java.net.InetAddress.getLocalHost(InetAddress.java:1505) ~[na:1.8.0_151]
! at tracelink.misc.SingletonTokenDBO$.<init>(SingletonTokenDBO.scala:34) ~[habari.jar:8.4-QUARTZ-SNAPSHOT]
! at tracelink.misc.SingletonTokenDBO$.<clinit>(SingletonTokenDBO.scala) ~[habari.jar:8.4-QUARTZ-SNAPSHOT]
!... 10 common frames omitted
Caused by: ! java.net.UnknownHostException: 690bd678bcf4: Name or service not known
! at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[na:1.8.0_151]
! at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) ~[na:1.8.0_151]
! at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) ~[na:1.8.0_151]
! at java.net.InetAddress.getLocalHost(InetAddress.java:1500) ~[na:1.8.0_151]
!... 12 common frames omitted
The offending line of Scala code is:
private val machineName = InetAddress.getLocalHost().getHostName()
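Some initial research suggested that the error is related to the contents of the /etc/hosts file in the container. So I wrote a small test program that exhibits the same behavior as my real application and also dumps the contents of /etc/hosts to standard output: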
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class NetworkTest {
    public static void main(String[] args) throws InterruptedException, IOException {
        // Dump the network state every 10 seconds so it shows up in the container logs.
        while (true) {
            networkDump();
            Thread.sleep(10000);
        }
    }

    // Prints the contents of /etc/hosts, then the resolved hostname.
    private static void networkDump() throws IOException {
        System.out.println("/etc/hosts:");
        System.out.println("");
        // try-with-resources closes the reader on every iteration
        try (BufferedReader reader = new BufferedReader(new FileReader("/etc/hosts"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        System.out.println("");
        dumpHostname();
    }

    // Prints the local hostname, or the exception message if it cannot be resolved.
    private static void dumpHostname() {
        try {
            String hostname = InetAddress.getLocalHost().getHostName();
            System.out.printf("Hostname: %s\n\n", hostname);
        } catch (UnknownHostException e) {
            System.out.println(e.getMessage());
        }
    }
}
Dockerfile:
FROM openjdk:8
WORKDIR /site
ADD . /site
CMD ["java", "NetworkTest"]
The output I get in AWS looks like this:
/etc/hosts:
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
3a5a4271a6e3: 3a5a4271a6e3: Name or service not known
Compare that with the output from running it in Docker on my local machine:
> docker run networktest
/etc/hosts:
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.4 82691e2fb948
Hostname: 82691e2fb948
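The local version, which does not throw the exception, has an entry for the hostname in /etc/hosts, whereas the hosts file in AWS has none. I tried adding an /etc/rc.local file to manually append the hostname to the end of the localhost line, and I also tried adding a RUN command to the Dockerfile to do the same thing. Neither had any effect.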
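As a stopgap on the code side, the lookup could be wrapped so that an unresolvable hostname does not abort initialization. The following is only a minimal sketch, not the application's actual code: the class name SafeHostname and the method resolve are hypothetical, and it assumes that the HOSTNAME environment variable is set inside the container and that falling back to "localhost" is acceptable for whatever the hostname is used for. It does not fix the missing /etc/hosts entry itself.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of the original application.
class SafeHostname {
    // Returns the local hostname. If it cannot be resolved, falls back to the
    // HOSTNAME environment variable (assumed to be set in the container),
    // and finally to the literal "localhost".
    static String resolve() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            String env = System.getenv("HOSTNAME");
            return (env != null && !env.isEmpty()) ? env : "localhost";
        }
    }

    public static void main(String[] args) {
        System.out.println("Hostname: " + resolve());
    }
}
```

I would still prefer a fix at the container or task level, since other libraries may call InetAddress.getLocalHost() on their own.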
Does anyone know whether there is a way to configure the image or the ECS task definition so that the hostname is set up correctly in AWS?