I have a Maven build that I run inside a Docker container using the official Maven image from Docker Hub. The .m2 directory is mounted on an NFS share.

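This works in one environment, but in another, identical environment it always hangs after writing a lock file. It never finishes the download; it just hangs there forever. Since Maven's debug output gives me no details after the hang, I decided to watch the .m2 directory to see what was happening.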

ubuntu@kubernetes-dev-nfs-pv:/nfs-shares/jenkins/.m2$ inotifywait -m -r .
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
./ CREATE,ISDIR repository
./ OPEN,ISDIR repository
./ CLOSE_NOWRITE,CLOSE,ISDIR repository
./repository/ CREATE,ISDIR org
./repository/ OPEN,ISDIR org
./repository/ CLOSE_NOWRITE,CLOSE,ISDIR org
./repository/org/ CREATE,ISDIR springframework
./repository/org/ OPEN,ISDIR springframework
./repository/org/ CLOSE_NOWRITE,CLOSE,ISDIR springframework
./repository/org/springframework/ CREATE,ISDIR boot
./repository/org/springframework/ OPEN,ISDIR boot
./repository/org/springframework/ CLOSE_NOWRITE,CLOSE,ISDIR boot
./repository/org/springframework/boot/ CREATE,ISDIR spring-boot-starter-parent
./repository/org/springframework/boot/ OPEN,ISDIR spring-boot-starter-parent
./repository/org/springframework/boot/ CLOSE_NOWRITE,CLOSE,ISDIR spring-boot-starter-parent
./repository/org/springframework/boot/spring-boot-starter-parent/ CREATE,ISDIR 1.3.7.RELEASE
./repository/org/springframework/boot/spring-boot-starter-parent/ OPEN,ISDIR 1.3.7.RELEASE
./repository/org/springframework/boot/spring-boot-starter-parent/ CLOSE_NOWRITE,CLOSE,ISDIR 1.3.7.RELEASE
./repository/org/springframework/boot/spring-boot-starter-parent/1.3.7.RELEASE/ CREATE spring-boot-starter-parent-1.3.7.RELEASE.pom.part.lock

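Maven appears to be working: it creates a number of directories and even the lock file, but then it hangs. How can I get Maven to finish, or find additional information that would help me resolve this?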

By the way, if I use ephemeral storage inside the container, it downloads the packages as expected.

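UPDATE: One of the comments suggested taking a thread dump. Below you can see that I attached to the running container. I confirmed that the container can modify files in the .m2 directory, and then I used jstack to get a thread dump of the process.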

ec2-user@murano-necrhj0ld3vkx4-kube-3-gm5dmsfnftxn:~$ sudo docker ps
CONTAINER ID        IMAGE                                                     COMMAND                  CREATED             STATUS              PORTS               NAMES
c7d1f4c91559        maven:alpine                                              "cat"                    About an hour ago   Up About an hour                        agitated_cori
ec2-user@murano-necrhj0ld3vkx4-kube-3-gm5dmsfnftxn:~$ sudo docker exec -ti c7d1f4c91559 /bin/bash
bash-4.3$ ps
PID   USER     TIME   COMMAND
    1 1000       0:00 cat
    6 1000       0:00 sh -c echo $$ > '/var/jenkins_home/workspace/api-product@tmp/durable-ca9825bd/pid'; jsc=durable-04ba6b757bca34373f180bd01ef64ca1; JENKINS_SERVER_COOKIE=$jsc '/var/jenkins_home/workspace/api-product@tmp/durable-ca
   12 1000       0:00 {script.sh} /bin/sh -xe /var/jenkins_home/workspace/api-product@tmp/durable-ca9825bd/script.sh
   13 1000       0:07 /usr/lib/jvm/java-1.8-openjdk/bin/java -classpath /usr/share/maven/boot/plexus-classworlds-2.5.2.jar -Dclassworlds.conf=/usr/share/maven/bin/m2.conf -Dmaven.home=/usr/share/maven -Dmaven.multiModuleProjectDirecto
 1584 1000       0:00 /bin/bash
 1589 1000       0:00 ps
bash-4.3$ cat /var/jenkins_home/workspace/api-product@tmp/durable-ca9825bd/script.sh
#!/bin/sh -xe
mvn -Dmaven.repo.local="$PWD"/../../.m2/repository clean compile
bash-4.3$ ls -la /var/jenkins_home/.m2/
total 16
drwxr-xr-x    3 1000     1000          4096 May  2 21:14 .
drwxrwxr-x   23 1000     1000          4096 May  3 11:55 ..
-rw-r--r--    1 1000     1000             6 May  2 21:14 file.txt
drwxr-xr-x    3 1000     1000          4096 May  2 20:50 repository
bash-4.3$ cat /var/jenkins_home/.m2/file.txt
hello
bash-4.3$ vi /var/jenkins_home/.m2/file.txt
bash-4.3$ cat /var/jenkins_home/.m2/file.txt
hello
another

bash-4.3$ jstack 13
2017-05-03 13:04:37
Full thread dump OpenJDK 64-Bit Server VM (25.121-b13 mixed mode):

"Attach Listener" #11 daemon prio=9 os_prio=0 tid=0x00007fc4a4956800 nid=0x6a7 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007fc4a4343000 nid=0x2c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007fc4a4311800 nid=0x2b waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fc4a4302000 nid=0x2a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fc4a42ff000 nid=0x29 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fc4a42fc800 nid=0x28 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fc4a42d5000 nid=0x27 in Object.wait() [0x00007fc48ba4b000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000dab108d8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
    - locked <0x00000000dab108d8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fc4a42ca800 nid=0x26 in Object.wait() [0x00007fc48bb4c000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000dab18178> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
    - locked <0x00000000dab18178> (a java.lang.ref.Reference$Lock)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00007fc4a4179800 nid=0x20 runnable [0x00007fc4a3426000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileDispatcherImpl.lock0(Native Method)
    at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:90)
    at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1115)
    at org.eclipse.aether.connector.basic.PartialFile$LockFile.tryLock(PartialFile.java:135)
    at org.eclipse.aether.connector.basic.PartialFile$LockFile.lock(PartialFile.java:80)
    at org.eclipse.aether.connector.basic.PartialFile$LockFile.<init>(PartialFile.java:67)
    at org.eclipse.aether.connector.basic.PartialFile$Factory.newInstance(PartialFile.java:219)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask(BasicRepositoryConnector.java:441)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run(BasicRepositoryConnector.java:359)
    at org.eclipse.aether.util.concurrency.RunnableErrorForwarder$1.run(RunnableErrorForwarder.java:76)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector$DirectExecutor.execute(BasicRepositoryConnector.java:590)
    at org.eclipse.aether.connector.basic.BasicRepositoryConnector.get(BasicRepositoryConnector.java:258)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.performDownloads(DefaultArtifactResolver.java:529)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve(DefaultArtifactResolver.java:430)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts(DefaultArtifactResolver.java:255)
    at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifact(DefaultArtifactResolver.java:232)
    at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveArtifact(DefaultRepositorySystem.java:303)
    at org.apache.maven.project.ProjectModelResolver.resolveModel(ProjectModelResolver.java:193)
    at org.apache.maven.project.ProjectModelResolver.resolveModel(ProjectModelResolver.java:243)
    at org.apache.maven.model.building.DefaultModelBuilder.readParentExternally(DefaultModelBuilder.java:1051)
    at org.apache.maven.model.building.DefaultModelBuilder.readParent(DefaultModelBuilder.java:829)
    at org.apache.maven.model.building.DefaultModelBuilder.build(DefaultModelBuilder.java:331)
    at org.apache.maven.project.DefaultProjectBuilder.build(DefaultProjectBuilder.java:429)
    at org.apache.maven.project.DefaultProjectBuilder.build(DefaultProjectBuilder.java:398)
    at org.apache.maven.project.DefaultProjectBuilder.build(DefaultProjectBuilder.java:361)
    at org.apache.maven.graph.DefaultGraphBuilder.collectProjects(DefaultGraphBuilder.java:400)
    at org.apache.maven.graph.DefaultGraphBuilder.getProjectsForMavenReactor(DefaultGraphBuilder.java:391)
    at org.apache.maven.graph.DefaultGraphBuilder.build(DefaultGraphBuilder.java:78)
    at org.apache.maven.DefaultMaven.buildGraph(DefaultMaven.java:511)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:221)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:194)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:107)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:993)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:345)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:191)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)

"VM Thread" os_prio=0 tid=0x00007fc4a42c0000 nid=0x25 runnable

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fc4a4190800 nid=0x21 runnable

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fc4a4192000 nid=0x22 runnable

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fc4a4194000 nid=0x23 runnable

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fc4a4195800 nid=0x24 runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007fc4a4382800 nid=0x2d waiting on condition

JNI global references: 235

bash-4.3$

Inside the container, I just confirmed that I can access the POM, and the debug output shows that it simply hangs on the download.

https://gist.github.com/dwatrous/34e1edc1db5e4756d4b33c83a9c2ccd0

2 Answers

It is most likely related to NFS and file-locking bugs and/or semantics.

Others have reported similar problems with FileChannel#tryLock over NFS; see, for example, JDK-8156026 and JDK-8065927.

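The contract of the tryLock method says it does not block, so any blocking that occurs is due to the native system call not returning when it should. Maven could try to work around such bugs somehow, but I think any attempt to do so would be very hacky and could introduce more bugs than it avoids.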
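To check whether the hang can be reproduced outside Maven, a small probe can call tryLock on a file in the NFS-backed directory from a worker thread and give up after a timeout. This is only a sketch: the class name LockProbe, the default file name, and the 10-second timeout are made up for illustration, and it locks the whole file, which may not match exactly what Aether's PartialFile does internally.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Maven or the JDK.
public class LockProbe {

    /** Returns "LOCKED", "HELD_ELSEWHERE", or "TIMED_OUT". */
    static String probe(Path file, long timeoutSeconds) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "lock-probe");
            t.setDaemon(true); // a stuck native call must not keep the JVM alive
            return t;
        });
        Future<String> result = pool.submit(() -> {
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                // Per its contract, tryLock should return immediately.
                FileLock lock = channel.tryLock();
                if (lock == null) {
                    return "HELD_ELSEWHERE"; // another process holds the lock
                }
                lock.release();
                return "LOCKED";
            }
        });
        try {
            return result.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            // The call did not return in time, which matches the reported hang.
            return "TIMED_OUT";
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Pass a path on the NFS mount; the default is only a placeholder.
        Path file = Paths.get(args.length > 0 ? args[0] : "lock-probe.lock");
        System.out.println(probe(file, 10));
    }
}
```

Running it once against a path on the NFS share and once against local storage should show whether the problem lies in locking itself rather than in Maven; a TIMED_OUT result on NFS only would point at the lock manager or the network path to it.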

You could try different versions of Java from different distributions, including Oracle and OpenJDK...

answered 2017-05-03T14:27:21.677

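In our case the problem was definitely related to NetApp ONTAP NFS, and it was caused by port 4045, used by nlockmgr, not being open in the firewall. Writing files worked fine; locking a file before writing, as Maven does for its cache, did not.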

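It was hard to debug because there is no timeout: the process just hangs until the port is opened. I'm really curious why it doesn't time out when the port is closed.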
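A quick way to test for this kind of firewall problem from inside the container is a TCP connect with an explicit timeout. The sketch below assumes the lock manager listens on TCP port 4045 as in our NetApp setup; the class name PortCheck and the host name nfs-server.example.com are placeholders, and on other NFS servers the nlockmgr port may differ (it can be looked up with rpcinfo -p against the server).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for checking TCP reachability of the lock manager port.
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, filtered (timed out), or unresolvable host.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real NFS server and port as arguments.
        String host = args.length > 0 ? args[0] : "nfs-server.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4045;
        System.out.println(host + ":" + port + " reachable: "
                + isReachable(host, port, 3000));
    }
}
```

This only checks TCP; if the mount uses UDP for locking, a successful or failed connect here does not settle the question on its own.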

What Network File System (NFS) TCP and NFS UDP ports are used on the storage system?

answered 2020-02-07T13:51:09.530