
AIX 64-bit, 7 GB of memory.

$ uname -a
AIX server3 1 7 00036073D600

$ java -version
java version "1.6.0"
Java(TM) SE Runtime Environment (build pap6460_26sr1-20111114_01(SR1))
IBM J9 VM (build 2.6, JRE 1.6.0 AIX ppc64-64 20111113_94967 (JIT enabled, AOT enabled)
J9VM - R26_Java626_SR1_20111113_1649_B94967
JIT  - r11_20111028_21230
GC   - R26_Java626_SR1_20111113_1649_B94967
J9CL - 20111113_94967)
JCL  - 20111112_01

$ time /opt/IBM/WebSphere/AppServer/java/bin/java
...
real    0m40.62s
user    0m0.43s
sys     0m0.04s

Running the java command with no application at all takes 40 seconds. This JRE (64-bit) is part of WebSphere V8, and it takes 20 minutes to start the application server.

On the same AIX machine there is another JRE (32-bit), and it works fine.

$ /usr/java6/bin/java -version
java version "1.6.0"
Java(TM) SE Runtime Environment (build pap3260sr9fp2-20110627_03(SR9 FP2))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc-32 jvmap3260sr9-20110624_85526 (JIT enabled, AOT enabled)
J9VM - 20110624_085526
JIT  - r9_20101028_17488ifx17
GC   - 20101027_AA)
JCL  - 20110530_01

$ time /usr/java6/bin/java
real    0m0.70s
user    0m0.64s
sys     0m0.05s

I found something using truss. When java starts up, it blocks in thread_tsleep() for a long time. Why? How can I fix it?

kopen("/etc/irs.conf", O_RDONLY)                Err#2  ENOENT
_thread_self()                                  = 26738737
getdomainname(0x09001000A00E44F8, 1024)         = 0
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
getdomainname(0x09001000A00E44F8, 1024)         = 0
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
kopen("/etc/hesiod.conf", O_RDONLY)             Err#2  ENOENT
_thread_self()                                  = 26738737
getdomainname(0x09001000A00E44F8, 1024)         = 0
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
getdomainname(0x09001000A00E44F8, 1024)         = 0
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
getdomainname(0x09001000A00E44F8, 1024)         = 0
_thread_self()                                  = 26738737
_thread_self()                                  = 26738737
socket(2, 2, 0)                                 = 4
getsockopt(4, 65535, 4104, 0x000001001012A934, 0x000001001012A930) = 0
connext(4, 0x00000100103F37B8, 16)              = 0
_esend(4, 0x000001001012B860, 40, 0, 0x0000000000000000) = 40
_poll(0x000001001012AA00, 1, 5000)              = 1
_enrecvfrom(4, 0x000001001012CBB0, 1024, 0, 0x000001001012B1C0, 0x000001001012A9E8, 0x0000000000000000) = 56
close(4)                                        = 0
socket(2, 2, 0)                                 = 4
getsockopt(4, 65535, 4104, 0x000001001012A934, 0x000001001012A930) = 0
connext(4, 0x00000100103F37B8, 16)              = 0
_esend(4, 0x000001001012B860, 40, 0, 0x0000000000000000) = 40
_poll(0x000001001012AA00, 1, 5000)              = 1
_enrecvfrom(4, 0x000001001012CBB0, 1024, 0, 0x000001001012B1C0, 0x000001001012A9E8, 0x0000000000000000) = 94
_esend(4, 0x000001001012B860, 25, 0, 0x0000000000000000) = 25
_poll(0x000001001012AA00, 1, 5000)              = 1
_enrecvfrom(4, 0x000001001012CBB0, 1024, 0, 0x000001001012B1C0, 0x000001001012A9E8, 0x0000000000000000) = 25
close(4)                                        = 0
socket(2, 2, 0)                                 = 4
_esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F37C8, 16, 0x0000000000000000) = 25
thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
_poll(0x000001001012AA00, 1, 5000)              = 0
close(4)                                        = 0
socket(2, 2, 0)                                 = 4
_esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F37C8, 16, 0x0000000000000000) = 25
thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
_poll(0x000001001012AA00, 1, 5000)              = 0
close(4)                                        = 0
socket(2, 2, 0)                                 = 4
_esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F37C8, 16, 0x0000000000000000) = 25
thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
_poll(0x000001001012AA00, 1, 10000)             = 0
close(4)                                        = 0
socket(2, 2, 0)                                 = 4
_esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F37C8, 16, 0x0000000000000000) = 25
thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
_poll(0x000001001012AA00, 1, 20000)             = 0
close(4)                                        = 0
getdomainname(0x000001001012CD10, 256)          = 0
kopen("/etc/hosts", O_RDONLY)                   = 4
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kfcntl(4, F_SETFD, 0x0000000000000001)          = 0
kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(0x0000000000000004, 0x0000010010416538, 0x0000000000001000) = 0x00000000000007B0
     0x00000000: " #   = = = = = = = = = ="..
kread(0x0000000000000004, 0x0000010010416538, 0x0000000000001000) = 0x0000000000000000
     0x00000000: " #   = = = = = = = = = ="..
close(4)                                        = 0
__libc_sbrk(0x0000000000020020)                 = 0x0000010010421C20

With timestamps:

1.8662:        socket(2, 2, 0)                  = 4
1.8666:        _esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F38A8, 16, 0x0000000000000000) = 25
3.8671:        thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
1.8669:        _poll(0x000001001012AA00, 1, 5000) = 0
6.8705:        close(4)                         = 0
6.8710:        socket(2, 2, 0)                  = 4
6.8715:        _esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F38A8, 16, 0x0000000000000000) = 25
8.8723:        thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
6.8720:        _poll(0x000001001012AA00, 1, 5000) = 0
11.8726:        close(4)                        = 0
11.8729:        socket(2, 2, 0)                 = 4
11.8732:        _esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F38A8, 16, 0x0000000000000000) = 25
13.8738:        thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
11.8736:        _poll(0x000001001012AA00, 1, 10000) = 0
21.8741:        close(4)                        = 0
21.8744:        socket(2, 2, 0)                 = 4
21.8748:        _esendto(4, 0x000001001012B860, 25, 0, 0x00000100103F38A8, 16, 0x0000000000000000) = 25
23.8754:        thread_tsleep(0, 0x09001000A030F400, 0x0000000000000000, 0x0000000000000000) (sleeping...)
21.8752:        _poll(0x000001001012AA00, 1, 20000) = 0
41.8756:        close(4)                        = 0
41.8760:        getdomainname(0x000001001012CD10, 256) = 0
41.8763:        kopen("/etc/hosts", O_RDONLY)   = 4
41.8767:        kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
41.8770:        kfcntl(4, F_SETFD, 0x0000000000000001) = 0
41.8773:        kioctl(4, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
kread(0x0000000000000004, 0x0000010010416618, 0x0000000000001000) = 0x00000000000007B0
     0x00000000: " #   = = = = = = = = = ="..
kread(0x0000000000000004, 0x0000010010416618, 0x0000000000001000) = 0x0000000000000000
     0x00000000: " #   = = = = = = = = = ="..
41.8782:        close(4)                        = 0
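
For what it's worth, the timed-out polls in the trace seem to account for the whole delay: every _poll that returns 0 waited its full timeout, and those timeouts are 5000, 5000, 10000 and 20000 ms. Below is a small sketch that adds them up from truss output. The class name, method name and regular expression are my own and only illustrate the arithmetic; the sample lines are shortened copies of the trace above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PollTimeoutSum {
    // Matches truss lines such as: _poll(0x000001001012AA00, 1, 5000) = 0
    // group(1) is the timeout in milliseconds, group(2) is the return value.
    private static final Pattern POLL =
        Pattern.compile("_poll\\([^,]+,\\s*\\d+,\\s*(\\d+)\\)\\s*=\\s*(\\d+)");

    /** Sums the timeout (ms) of every _poll call that returned 0, i.e. timed out. */
    static long timedOutMillis(String trussOutput) {
        long total = 0;
        Matcher m = POLL.matcher(trussOutput);
        while (m.find()) {
            if ("0".equals(m.group(2))) {
                total += Long.parseLong(m.group(1));
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String sample =
              "_poll(0x000001001012AA00, 1, 5000) = 1\n"   // answered, not counted
            + "_poll(0x000001001012AA00, 1, 5000) = 0\n"
            + "_poll(0x000001001012AA00, 1, 5000) = 0\n"
            + "_poll(0x000001001012AA00, 1, 10000) = 0\n"
            + "_poll(0x000001001012AA00, 1, 20000) = 0\n";
        // 5000 + 5000 + 10000 + 20000 = 40000
        System.out.println(timedOutMillis(sample) + " ms spent in timed-out polls");
    }
}
```

That is 40 seconds, which lines up with the 0m40.62s reported by time and with the gaps between the timestamps (1.87, 6.87, 11.87, 21.87, 41.88).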

tp

Quit in sem_wait at 0x9000000002632d8 ($t3)
0x9000000002632d8 (sem_wait+0x98) e8410028          ld   r2,0x28(r1)

 thread  state-k     wchan            state-u    k-tid mode held scope function
>$t1     run                          blocked  31391751   u   no   sys  _event_sleep
_event_sleep(??, ??, ??, ??, ??, ??) at 0x9000000008365c4
_event_wait(??, ??) at 0x900000000837064
_cond_wait_local(??, ??, ??) at 0x90000000084521c
_cond_wait(??, ??, ??) at 0x900000000845808
pthread_join(??, ??) at 0x90000000082d2b0
unnamed block in ContinueInNewThread(continuation = 0x6c6f636b6c002e66, stack_size = 7310577395057127012, args = 0x5f6f72002e666574), line 2286 in "java_md.c"
ContinueInNewThread(continuation = 0x6c6f636b6c002e66, stack_size = 7310577395057127012, args = 0x5f6f72002e666574), line 2286 in "java_md.c"
unnamed block in main(argc = 0, argv = 0x000001001000c178), line 532 in "java.c"
main(argc = 0, argv = 0x000001001000c178), line 532 in "java.c"

 thread  state-k     wchan            state-u    k-tid mode held scope function
>$t2     run                          running  54329549   u   no   sys  __fd_poll
__fd_poll(??, ??, ??) at 0x90000000012c0d4
res_send.poll(??, ??, ??) at 0x9000000001021ac
res_nsend(0x100103f37a8, 0x1001012b860, 0x1900000019, 0x1001012cbb0, 0x40000000400) at 0x9000000001011d4
res_nquery(??, ??, ??, ??, ??, ??) at 0x90000000012b5d8
res_nquerydomain(??, ??, ??, ??, ??, ??, ??) at 0x90000000012ab14
res_nsearch(??, ??, ??, ??, ??, ??) at 0x90000000012af08
res_search(??, ??, ??, ??, ??) at 0x900000000107b50
dns_ho.ho_byname2(??, ??, ??) at 0x90000000013a384
gen_ho.ho_byname2(??, ??, ??) at 0x900000000163888
gethostbyname2(??, ??) at 0x9000000001060ec
getaddrinfo2(??, ??, ??, ??, ??) at 0x900000000102ba0
getaddrinfo(??, ??, ??, ??) at 0x900000000104d34
j9sock_getaddrinfo() at 0x9000000053caad8
populateRASNetData() at 0x900000005347fc0
VMInitStages() at 0x9000000053086b0
runJ9VMDllMain() at 0x90000000530bb9c
pool.pool_do() at 0x9000000052f29c8
runInitializationStage() at 0x90000000530b890
protectedInitializeJavaVM() at 0x900000005306478
j9sig_protect() at 0x9000000053a2b9c
initializeJavaVM() at 0x90000000530585c
jniinv.JNI_CreateJavaVM() at 0x90000000530ea14
jvm.JNI_CreateJavaVM() at 0x900000000ccc510
redirector.JNI_CreateJavaVM() at 0x900000000cb5574
InitializeJVM(pvm = (nil), penv = (nil), ifn = (nil)), line 1801 in "java.c"
main(argc = 0, argv = (nil)), line 810 in "java.c"

 thread  state-k     wchan            state-u    k-tid mode held scope function
>$t3     run                          running  52887625   k   no   sys  sem_wait
sem_wait(??) at 0x9000000002632d8
asynchSignalReporter() at 0x9000000053a3dcc
thread_wrapper() at 0x900000000d00ad0

!!!! ThreadDump Completed- detached from debugger !!!!

Update

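I found this thread and tested its program on my AIX machine. When a hostname is used, the DNS lookup is delayed by about 40 seconds. With an IP address it is fine.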

Java can't resolve DNS address from AIX: UnknownHostException

By the way, hostname prints server3, and when I ping that name it shows: PING server3.cf1.xx.xxxx.com ..... When I run the program with server3.cf1.xx.xxxx.com, there is no delay.
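
A test of this kind can be sketched roughly as follows. This is not the program from the linked thread, only a minimal stand-in that times InetAddress.getByName for each argument; the class and method names are made up for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ResolveTimer {
    /** Returns the elapsed milliseconds for one lookup, or -1 if the lookup fails. */
    static long timeLookupMillis(String host) {
        long start = System.nanoTime();
        try {
            // For a literal IP address this only parses the text;
            // for a hostname it goes through the system resolver.
            InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            return -1;
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) {
        // With no arguments, fall back to a literal address that needs no DNS.
        String[] targets = args.length > 0 ? args : new String[] {"127.0.0.1"};
        for (String target : targets) {
            System.out.println(target + ": " + timeLookupMillis(target) + " ms");
        }
    }
}
```

Running it as java ResolveTimer server3 server3.cf1.xx.xxxx.com should show whether the short name and the fully qualified name behave differently.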

Update 2:

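This is an IPv6 problem. If I add -Djava.net.preferIPv4Stack=true, the Java application runs fine. But the java command itself still uses IPv6 for its DNS query. So the java command has a 40 s delay, while the Java application has no delay on DNS lookups.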

Can I set the default protocol to IPv4 in AIX itself, rather than through a java argument?


3 Answers


I'm not 100% sure about this, but it is at least worth a try:

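On AIX, the 32-bit Java VM uses IPv4 as its default IP protocol, whereas the 64-bit VM uses IPv6. If IPv6 is enabled in the OS but not configured correctly, the 64-bit VM can appear to hang at various points when it is actually waiting for network timeouts. That seems to match your analysis with truss.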

You can try starting the 64-bit VM with IPv4 as the preferred network stack by adding the following option to the java command:

-Djava.net.preferIPv4Stack=true
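
To see what the option changes, a rough check like the one below prints the property value and the address family of every address the resolver returns for a host. The class and method names are hypothetical and exist only for this sketch.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;

class StackPreference {
    /** Labels an address by its IP family. */
    static String family(InetAddress addr) {
        if (addr instanceof Inet4Address) return "IPv4";
        if (addr instanceof Inet6Address) return "IPv6";
        return "unknown";
    }

    public static void main(String[] args) throws Exception {
        // Prints "null" when the option was not passed on the command line.
        System.out.println("java.net.preferIPv4Stack = "
            + System.getProperty("java.net.preferIPv4Stack"));
        String host = args.length > 0 ? args[0] : "localhost";
        for (InetAddress addr : InetAddress.getAllByName(host)) {
            System.out.println(family(addr) + "  " + addr.getHostAddress());
        }
    }
}
```

Run it once as java StackPreference server3 and once as java -Djava.net.preferIPv4Stack=true StackPreference server3, then compare the two outputs.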

answered 2012-04-23T14:08:04.673

Got it.

To fix this, disable IPv6 lookups by adding the following line to the /etc/netsvc.conf file:

hosts=local4,bind4

Reference: Clients on AIX generate excessive IPv6 DNS lookups for new connections
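
If you want to check a machine's setting from Java, a rough sketch is below. It assumes the hosts line has the simple form shown above (a comma-separated list of resolver names) and treats the line as IPv4-only when every name ends in 4. The class and method names are invented for illustration, and other netsvc.conf syntax is not handled.

```java
class NetsvcCheck {
    /**
     * Returns true if the text has a "hosts" line whose resolver names all
     * end in "4" (for example local4, bind4). Assumes the simple
     * "hosts=name,name" form; returns false if no hosts line is found.
     */
    static boolean hostsIsIpv4Only(String confText) {
        for (String rawLine : confText.split("\n")) {
            String line = rawLine.trim();
            if (line.startsWith("#")) continue;          // skip comment lines
            int eq = line.indexOf('=');
            if (eq < 0) continue;
            if (!line.substring(0, eq).trim().equals("hosts")) continue;
            for (String resolver : line.substring(eq + 1).split(",")) {
                if (!resolver.trim().endsWith("4")) return false;
            }
            return true;                                  // first hosts line decides
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hostsIsIpv4Only("hosts=local4,bind4")); // prints true
        System.out.println(hostsIsIpv4Only("hosts=local,bind"));   // prints false
    }
}
```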

answered 2012-04-24T02:03:55.710

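It is hard to tell from your truss output what is taking so long (maybe see whether you can add timestamps to its output; I think truss can do that), but there is a lot of getdomainname activity, which is always a prime suspect when investigating strange startup delays. Check whether an invalid hostname is assigned to this box, and make sure the primary hostname resolves to 127.0.0.1 in /etc/hosts.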

answered 2012-04-23T14:05:55.400