caching - Centralized cache fails in hadoop-2.3

Question

I want to use the centralized cache in hadoop-2.3.

Here are my steps. (10 nodes, 6 GB of memory per node)

1. My file (45 MB) to be cached

[hadoop@Master ~]$ hadoop fs -ls /input/pics/bundle
Found 1 items
-rw-r--r--   1 hadoop supergroup   47185920 2014-03-09 19:10 /input/pics/bundle/bundle.chq

2. Create a cache pool

[hadoop@Master ~]$ hdfs cacheadmin -addPool myPool -owner hadoop -group supergroup 
Successfully added cache pool myPool.
[hadoop@Master ~]$ hdfs cacheadmin -listPools -stats  
Found 1 result.
NAME    OWNER   GROUP       MODE            LIMIT  MAXTTL  BYTES_NEEDED  BYTES_CACHED  BYTES_OVERLIMIT  FILES_NEEDED  FILES_CACHED
myPool  hadoop  supergroup  rwxr-xr-x   unlimited   never             0             0                0             0             0

3. Add a cache directive

[hadoop@Master ~]$ hdfs cacheadmin -addDirective -path /input/pics/bundle/bundle.chq -pool myPool -force -replication 3 
Added cache directive 2

4. List the directives

[hadoop@Master ~]$ hdfs cacheadmin -listDirectives -stats -path /input/pics/bundle/bundle.chq -pool myPool
Found 1 entry
ID POOL     REPL EXPIRY  PATH                            BYTES_NEEDED  BYTES_CACHED  FILES_NEEDED  FILES_CACHED
2 myPool      3 never   /input/pics/bundle/bundle.chq      141557760             0             1             0

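BYTES_NEEDED is correct, but BYTES_CACHED is zero. It seems the size has been calculated, but the cache operation that puts the file into memory has not completed. So how can I get my file cached into memory? Thanks a lot.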
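As a sanity check on the numbers above, the reported BYTES_NEEDED matches the file size from step 1 multiplied by the `-replication` value from step 3. A minimal sketch of that arithmetic (the function name `bytes_needed` is only illustrative, not part of any Hadoop tool):

```shell
#!/usr/bin/env bash
# Illustrative helper: expected BYTES_NEEDED for a single cached file,
# assuming it equals file size (bytes) times the directive's replication.
bytes_needed() {
  local file_size=$1
  local replication=$2
  echo $(( file_size * replication ))
}

# File size from `hadoop fs -ls` (47185920 bytes), replication 3.
bytes_needed 47185920 3   # prints 141557760
```

So the directive itself looks consistent; only the actual caching on the DataNodes is missing.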

score 0 · Accepted Answer

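We fixed a number of bugs that were present in Hadoop 2.3. I recommend using at least Hadoop 2.4 for HDFS caching.

To go into more detail, I would need to see the log messages.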

score 0 · Accepted Answer

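Including the output of hdfs dfsadmin -report would also be useful. Also make sure you have followed the setup instructions here (namely, increasing the ulimit and setting dfs.datanode.max.locked.memory):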

http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html
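The two settings mentioned above interact: `dfs.datanode.max.locked.memory` is given in bytes and defaults to 0, which disables caching, while `ulimit -l` in bash reports the locked-memory limit in KiB (or `unlimited`). A rough sketch of a consistency check follows; the function name `locked_memory_ok` is hypothetical, and the commented usage line assumes it is run on a DataNode host as the user that runs the DataNode:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) if a locked-memory limit, in the
# format printed by `ulimit -l` (KiB or "unlimited"), is large enough for a
# dfs.datanode.max.locked.memory value given in bytes.
locked_memory_ok() {
  local limit_kib=$1
  local wanted_bytes=$2
  # A value of 0 (the default) means the DataNode caches nothing.
  if [ "$wanted_bytes" -le 0 ]; then
    return 1
  fi
  if [ "$limit_kib" = "unlimited" ]; then
    return 0
  fi
  [ $(( limit_kib * 1024 )) -ge "$wanted_bytes" ]
}

# Possible usage on a DataNode (not executed here):
# locked_memory_ok "$(ulimit -l)" \
#   "$(hdfs getconf -confKey dfs.datanode.max.locked.memory)" \
#   && echo "locked memory settings look consistent"
```

If the check fails because the configured value is 0, that alone would explain BYTES_CACHED staying at zero; this is a guess from the symptoms, not something confirmed by logs.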
