I'm hoping someone can help me. I have a question about writing to a file from multiple threads/Tasks. See my code sample below.

AddFile returns an array of longs holding three values: the blob number, the offset inside the blob, and the size of the data written to the blob.

public long[] AddFile(byte[] data){
    long[] values = new long[3];

    values[0] = WorkingIndex = getBlobIndex(data); //blobNumber
    values[1] = blobFS[WorkingIndex].Position; //Offset
    values[2] = length = data.Length; //size

    //blobFS[WorkingIndex] is a FileStream
    blobFS[WorkingIndex].Write(data, 0, data.Length);

    return values;
}

So let's say I call AddFile inside a foreach loop like the one below.

List<Task> tasks = new List<Task>(System.Environment.ProcessorCount);
foreach(var file in Directory.GetFiles(@"C:\Documents")){
    var task = Task.Factory.StartNew(() => {
        byte[] data = File.ReadAllBytes(file);
        long[] info = blob.AddFile(data);
        return info;
    });
    task.ContinueWith(t => { /* do some stuff */ });
    tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
return result;

I can imagine that this will fail: files will overwrite each other inside the blob, because one task can start writing file2 before the Write call for file1 has finished. The offset read from Position can also be stale by the time the write happens.

So what is the best way to solve this problem? Maybe using the asynchronous write functions?

Your help would be appreciated! Kind regards, Martijn

1 Answer

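My advice is not to run these tasks in parallel. Disk IO is very likely to be the bottleneck of any file-based operation, so running them in parallel only means each thread ends up blocked waiting for the disk. In the end you may well find that the parallel code runs considerably slower than the same code run serially.

Is there a particular reason you need these to run in parallel? Could you handle the disk writes serially and only run the ContinueWith() work on separate threads? That would also have the benefit of eliminating the problem you posted.

Edit: an example re-implementation using a simple loop: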
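To make the "serial writes, parallel follow-up" idea concrete, here is a minimal sketch. It assumes a single blob backed by one Stream; `SerialBlobWriter`, `AddAll` and `Process` are hypothetical names made up for illustration, not part of the question's code, and the blob number is simply fixed at 0.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class SerialBlobWriter
{
    // Hypothetical helper: appends every file in sourceDir to one blob stream,
    // one file at a time, and hands only the follow-up work to the thread pool.
    public static List<long[]> AddAll(string sourceDir, Stream blob)
    {
        var results = new List<long[]>();
        var followUps = new List<Task>();

        foreach (string file in Directory.GetFiles(sourceDir))
        {
            byte[] data = File.ReadAllBytes(file);

            // The offset is read and the data written on the calling thread,
            // so nothing else can move Position between these two statements.
            long offset = blob.Position;
            blob.Write(data, 0, data.Length);

            // { blobNumber, offset, size }; blob number is fixed at 0 in this sketch.
            long[] info = { 0, offset, data.Length };
            results.Add(info);

            // Stand-in for the "do some stuff" continuation from the question.
            followUps.Add(Task.Factory.StartNew(() => Process(info)));
        }

        Task.WaitAll(followUps.ToArray());
        return results;
    }

    // Placeholder for whatever per-file processing is actually needed.
    private static void Process(long[] info) { }
}
```

Only the follow-up work overlaps here; every read of Position and every Write happens in order on one thread, so the recorded offsets always match what was written.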

foreach(var file in Directory.GetFiles(@"C:\Documents")){
    byte[] data = File.ReadAllBytes(file); // this happens on the main thread

    // processing of each file is handled in multiple threads in parallel to disk IO
    var task = Task.Factory.StartNew(() => {
        long[] info = blob.AddFile(data);
        return info;
    });
    task.ContinueWith(t => { /* do some stuff */ });
    tasks.Add(task);
}
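Note that the loop above still calls AddFile from inside the tasks, so two calls can overlap. If AddFile has to stay callable from several tasks, one option is to guard the "read Position, then Write" pair with a lock. The following is a rough sketch under that assumption, not the asker's actual class: `LockedBlob` and `_gate` are hypothetical names, and a single stream stands in for the blobFS array.

```csharp
using System;
using System.IO;

public sealed class LockedBlob
{
    private readonly Stream _stream;
    private readonly object _gate = new object();

    public LockedBlob(Stream stream)
    {
        _stream = stream;
    }

    // Returns { blobNumber, offset, size }, like the question's AddFile.
    public long[] AddFile(byte[] data)
    {
        // Only one caller at a time may read Position and write, so the
        // recorded offset always matches where the bytes actually land.
        lock (_gate)
        {
            long offset = _stream.Position;
            _stream.Write(data, 0, data.Length);
            return new long[] { 0, offset, data.Length }; // blob number fixed at 0 here
        }
    }
}
```

With several blobs, one lock object per stream would probably be enough, provided choosing the blob and writing to it happen under the same lock. The writes are still effectively serial, which fits the point above that the disk is the bottleneck anyway.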
answered 2012-07-05T08:27:14.160