hadoop - why hadoop capacity scheduler uses 200% of Capacity

Question

I ran into this problem on our cluster, so I went back to my own PC to run some simple experiments and try to figure it out. I set up Hadoop in pseudo-distributed mode, used the default capacity-scheduler.xml, and configured mapred-site.xml as follows:

<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>io.sort.mb</name>
    <value>5</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx10m</value>
  </property>
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
  </property>
  <property>
    <name>mapred.queue.names</name>
    <value>default</value>
  </property>
  <property>
    <name>mapred.cluster.map.memory.mb</name>
    <value>100</value>
  </property>
  <property>
    <name>mapred.cluster.max.map.memory.mb</name>
    <value>200</value>
  </property>
</configuration>

The web UI looks like this:

Queue Name  default      
Scheduling Information
Queue configuration
Capacity Percentage: 100.0%
User Limit: 100%
Priority Supported: NO
-------------
Map tasks
Capacity: 2 slots
Used capacity: 2 (100.0% of Capacity)
Running tasks: 1
Active users:
User 'luo': 2 (100.0% of used capacity)
-------------
Reduce tasks
Capacity: 2 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
-------------
Job info
Number of Waiting Jobs: 0
Number of users who have submitted jobs: 1

It actually worked fine when I submitted a streaming job with a single map task, which occupies 2 slots, and no reduce task. The streaming command is rather simple:

~/hadoop/hadoop-0.20.2/bin/hadoop jar Streaming_blat.jar -D mapred.job.map.memory.mb=199 -D mapred.job.name='memory alloc' -D mapred.map.tasks=1 -input file://pwd/input/ -mapper ' /home/luo/hadoop/hadoop-0.20.2/bin/a.out' -output file://pwd/output/ -reducer NONE

a.out is just a C program that writes its pid and ppid to a specified file.

The problem appeared when I set mapred.map.tasks=3. The web UI then showed:

Map tasks
Capacity: 2 slots
Used capacity: 4 (200.0% of Capacity)
Running tasks: 2
Active users:
User 'luo': 4 (100.0% of used capacity)

which means the job already exceeds the limit of map slots I set in mapred-site.xml. As a result, the following message was printed again and again:

Killing one of the least progress tasks - attempt_201210121915_0012_m_000000_0, as the cumulative memory usage of all the tasks on the TaskTracker exceeds virtual memory limit 207618048.

What I want is for the scheduler to hold the pending map tasks until slots become available, without exceeding the capacity. What have I done wrong? Could anyone suggest a solution? Thanks a lot.

score 1 · Accepted Answer

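OK, I'll answer this myself. After digging into it, I found that all 4 of the following properties must be set in mapred-site.xml; otherwise the scheduler does not perform memory checking (I had set only two of them).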

mapred.cluster.map.memory.mb
mapred.cluster.reduce.memory.mb
mapred.cluster.max.map.memory.mb
mapred.cluster.max.reduce.memory.mb
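
As a rough sketch, the relevant part of mapred-site.xml could look like the following once all four properties are present. The two map values are the ones from the question; the two reduce values are assumptions chosen only to mirror the map values and are not taken from the original setup.

```xml
<configuration>
  <!-- Other properties from the question are omitted here. -->

  <!-- Memory size of one map slot (value from the question). -->
  <property>
    <name>mapred.cluster.map.memory.mb</name>
    <value>100</value>
  </property>
  <!-- Memory size of one reduce slot (assumed value, mirrors the map setting). -->
  <property>
    <name>mapred.cluster.reduce.memory.mb</name>
    <value>100</value>
  </property>
  <!-- Upper limit on memory a single map task may request (value from the question). -->
  <property>
    <name>mapred.cluster.max.map.memory.mb</name>
    <value>200</value>
  </property>
  <!-- Upper limit on memory a single reduce task may request (assumed value). -->
  <property>
    <name>mapred.cluster.max.reduce.memory.mb</name>
    <value>200</value>
  </property>
</configuration>
```

With these in place, the scheduler should take the job's mapred.job.map.memory.mb=199 request into account when assigning tasks, which is the memory checking the answer refers to. The JobTracker and TaskTracker will most likely need a restart to pick up the change.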
