-2

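Can anyone help me understand Hadoop disaster recovery?

Should I use distcp to copy data from one cluster to another as a backup? Or can I use copyToLocal to copy my data to my local machine?

Does anyone know?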

4

1 Answer

3

A disaster recovery plan (DRP) goes beyond just the technology, and the requirements can greatly affect the solution.

For instance, if you can't afford to lose any data, you'd want an active/active setup that sends data to two Hadoop clusters simultaneously. At the other end of the spectrum, Hadoop's replication (the default is 3 copies, but you can change that) and rack awareness can give you a copy on a secondary rack. Keep in mind that this protects against the loss of a node or a rack, not the loss of the whole cluster. In between, you can use tools like distcp, which you mention, to copy data from cluster to cluster.
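As a rough illustration of the distcp option, here is a minimal Python sketch that assembles a `hadoop distcp` command line for a scheduled cluster-to-cluster backup. The NameNode hostnames, port, and paths are made-up placeholders, and the helper `build_distcp_command` is hypothetical; only the `-update` and `-p` flags come from distcp itself. Treat it as a starting point under those assumptions, not a complete backup procedure.

```python
import shlex
from typing import List


def build_distcp_command(src: str, dst: str,
                         update: bool = True,
                         preserve: bool = True) -> List[str]:
    """Assemble the argument list for a cluster-to-cluster distcp copy.

    Hypothetical helper for illustration; src and dst are full HDFS URIs.
    """
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ on the target
        cmd.append("-update")
    if preserve:
        # -p preserves file status (e.g. permissions, replication, block size)
        cmd.append("-p")
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    # Placeholder NameNode addresses and paths (assumptions, not real hosts)
    cmd = build_distcp_command(
        "hdfs://primary-nn:8020/data",
        "hdfs://backup-nn:8020/data",
    )
    # Print the command; to actually run it on a machine with a Hadoop
    # client installed, pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Running something like this from cron or a workflow scheduler gives you a periodic copy on the second cluster. By contrast, `copyToLocal` pulls data through a single machine's local disk, so it is only practical for small data sets.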

Additionally, you might want to follow the Falcon project, a new initiative for Hadoop data life-cycle management.

answered 2013-04-04T07:54:20.157