I'm new to hadoop, and I've spent the last few hours googling this issue without finding anything helpful. My problem is that HDFS says a file is still open for write, even though the process that was writing to it is long dead. This makes it impossible to read from the file.
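For what it's worth, the writer was nothing fancier than a standard HDFS client that got killed before it could close its output stream; the pattern was roughly this (a simplified sketch, not the real code; the file name and record source are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LogWriter {
    public static void main(String[] args) throws Exception {
        // fs.default.name points the client at the cluster, e.g. hdfs://hadoop
        FileSystem fs = FileSystem.get(new Configuration());

        // create() opens the file for write; the client holds a lease on the
        // path until close() is called
        FSDataOutputStream out = fs.create(
                new Path("/logs/raw/directory_containing_file/some_log_file"));

        while (true) {
            out.write(nextRecord());
            // the process was killed somewhere in this loop, so the close()
            // that would finalize the last block and release the lease never ran
        }
        // out.close();  -- never reached
    }

    // stand-in for wherever the real records actually came from
    private static byte[] nextRecord() {
        return "one log line\n".getBytes();
    }
}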
I ran fsck on the directory and it reports everything as healthy. However, when I run "hadoop fsck -fs hdfs://hadoop /logs/raw/directory_containing_file -openforwrite", I get:
Status: CORRUPT
Total size: 222506775716 B
Total dirs: 0
Total files: 630
Total blocks (validated): 3642 (avg. block size 61094666 B)
********************************
CORRUPT FILES: 1
MISSING BLOCKS: 1
MISSING SIZE: 30366208 B
********************************
Minimally replicated blocks: 3641 (99.97254 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.9991763
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 23
Number of racks: 1
Running the fsck command again on the file reported as open for write, I get:
.Status: HEALTHY
Total size: 793208051 B
Total dirs: 0
Total files: 1
Total blocks (validated): 12 (avg. block size 66100670 B)
Minimally replicated blocks: 12 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 23
Number of racks: 1
Does anyone know what is going on and how I can fix it?