
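I am using the HDInsight .NET Hadoop API to submit a MapReduce job from an ASP.NET application.

using Microsoft.Hadoop.MapReduce;

var hadoop = Hadoop.Connect();

var result = hadoop.MapReduceJob.ExecuteJob();

// Also tried this, but got the same exception

//var result = hadoop.MapReduceJob.ExecuteJob(config);

The ExecuteJob() call fails and throws an exception at runtime. Has anyone managed to run this call successfully? Is it possible to customize the Map() function by adding more input parameters or objects (beyond those Microsoft provides in the MapperBase class)? Can the logic in the Mapper and Reducer methods access a cache or a database?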


1 Answer


An example of submitting a MapReduce job with the HDInsight .NET SDK is posted here:

http://www.windowsazure.com/en-us/manage/services/hdinsight/submit-hadoop-jobs-programmatically/#mapreduce-sdk

// Define the MapReduce job
MapReduceJobCreateParameters mrJobDefinition = new MapReduceJobCreateParameters()
{
    JarFile = "wasb:///example/jars/hadoop-examples.jar",
    ClassName = "wordcount"
};

mrJobDefinition.Arguments.Add("wasb:///example/data/gutenberg/davinci.txt");
mrJobDefinition.Arguments.Add("wasb:///example/data/WordCountOutput");

// Get the certificate object from the certificate store, using its friendly name to identify it.
// certFriendlyName, subscriptionID, and clusterName are strings you supply for your own subscription and cluster.
X509Store store = new X509Store();
store.Open(OpenFlags.ReadOnly);
X509Certificate2 cert = store.Certificates.Cast<X509Certificate2>().First(item => item.FriendlyName == certFriendlyName);
JobSubmissionCertificateCredential creds = new JobSubmissionCertificateCredential(new Guid(subscriptionID), cert, clusterName);

// Create a hadoop client to connect to HDInsight
var jobClient = JobSubmissionClientFactory.Connect(creds);

// Run the MapReduce job
JobCreationResults mrJobResults = jobClient.CreateMapReduceJob(mrJobDefinition);

// Wait for the job to complete
WaitForJobCompletion(mrJobResults, jobClient);
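
WaitForJobCompletion is not part of the SDK; it is a helper you have to write yourself. A minimal sketch follows, assuming the Microsoft.Hadoop.Client types from the HDInsight .NET SDK (IJobSubmissionClient.GetJob, JobDetails.StatusCode, JobStatusCode) behave as in the linked article. The class name Program and the 10-second polling interval are illustrative choices, not requirements.

```csharp
using System;
using System.Threading;
using Microsoft.Hadoop.Client; // from the HDInsight .NET SDK NuGet package (assumed to be installed)

class Program
{
    // Polls the cluster until the job reports Completed or Failed.
    // Assumption: GetJob returns a fresh JobDetails snapshot on every call.
    private static void WaitForJobCompletion(JobCreationResults jobResults, IJobSubmissionClient client)
    {
        JobDetails jobInProgress = client.GetJob(jobResults.JobId);
        while (jobInProgress.StatusCode != JobStatusCode.Completed &&
               jobInProgress.StatusCode != JobStatusCode.Failed)
        {
            // Illustrative interval; tune it to the expected job duration.
            Thread.Sleep(TimeSpan.FromSeconds(10));
            jobInProgress = client.GetJob(jobInProgress.JobId);
        }
    }
}
```

Put the method in the same class as the submission code so that the unqualified call above resolves. The loop returns on failure as well as on success, so check the job's final status afterwards if you need to tell the two apart.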
Answered 2013-10-24T12:25:51.080