
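A few days ago I was trying to run a fast search over my disk to do a few things with the results: check attributes and extensions, make changes inside files, and so on.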

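The idea was to keep it almost free of limits and locks, to avoid "latency" on large files, directories containing a lot of files, and so on. I know this is far from "best practice", since I am not using things like "MaxDegreeOfParallelism" and I pull from the queues in a "while(true)" loop.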

Nevertheless, the code runs quite fast, since we have the architecture to support it.

I moved the code into a dummy console project, in case anyone wants to check what is going on.

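using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

class Program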
{
    static ConcurrentQueue<String> dirToCheck;
    static ConcurrentQueue<String> fileToCheck;
    static int fileCount; // number of files handed to CheckFileAsync so far

    static void Main(string[] args)
    {
        Initialize();

        Task.Factory.StartNew(() => ScanDirectories(), TaskCreationOptions.LongRunning);
        Task.Factory.StartNew(() => ScanFiles(), TaskCreationOptions.LongRunning);

        Console.ReadLine();
    }

    static void Initialize()
    {
        //Instantiate caches
        dirToCheck = new ConcurrentQueue<string>();
        fileToCheck = new ConcurrentQueue<string>();

        //Enqueue the directories to scan here
        //Avoid enqueuing nested/sub directories, otherwise they will be scanned at least twice
        dirToCheck.Enqueue(@"C:\");

        //Initialize counters
        fileCount = 0;
    }

    static void ScanDirectories()
    {
        String dirToScan = null;

        while (true)
        {
            if (dirToCheck.TryDequeue(out dirToScan))
            {
                ExtractDirectories(dirToScan);
                ExtractFiles(dirToScan);
            }

            //Just a visual tracker to give some idea of what's going on and where the load is
            //(columns: directories queued, files queued, files processed)
            Console.WriteLine(dirToCheck.Count + "\t\t" + fileToCheck.Count + "\t\t" + fileCount);
        }
    }

    static void ScanFiles()
    {
        while (true)
        {
            String fileToScan = null;
            if (fileToCheck.TryDequeue(out fileToScan))
            {
                CheckFileAsync(fileToScan);
            }
        }
    }

    private static Task ExtractDirectories(string dirToScan)
    {
        Task worker = Task.Factory.StartNew(() =>
        {
            try
            {
                Parallel.ForEach<String>(Directory.EnumerateDirectories(dirToScan), (dirPath) =>
                {
                    dirToCheck.Enqueue(dirPath);
                });

            }
            catch (UnauthorizedAccessException) { }
        }, TaskCreationOptions.AttachedToParent);

        return worker;
    }

    private static Task ExtractFiles(string dirToScan)
    {
        Task worker = Task.Factory.StartNew(() =>
        {
            try
            {
                Parallel.ForEach<String>(Directory.EnumerateFiles(dirToScan), (filePath) =>
                {
                    fileToCheck.Enqueue(filePath);
                });
            }
            catch (UnauthorizedAccessException) { }
        }, TaskCreationOptions.AttachedToParent);

        return worker;
    }

    static Task CheckFileAsync(String filePath)
    {
        Task worker = Task.Factory.StartNew(() =>
        {
            //Add statements that work with the file here
            Interlocked.Increment(ref fileCount);


            //WARNING: if the file's full name is too long, this code may not be executed or may simply crash.
            //I only added a simple check, because I found 2 or 3 different error messages between the framework and the MSDN documentation:
            //"Full paths must not exceed 260 characters to maintain compatibility with Windows operating systems. For more information about this restriction, see the entry Long Paths in .NET in the BCL Team blog"
            if (filePath.Length > 260)
                return;
            FileInfo fi = new FileInfo(filePath);

            //Add statement here to use FileInfo

        }, TaskCreationOptions.AttachedToParent);

        return worker;
    }
}

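Question: how can I detect that ScanDirectories has finished? Once it has, I can manage to enqueue an empty string (or anything else) into the file queue to make that loop exit. I know that if I use "AttachedToParent" I get a completion state on the parent task, and could then do something like "ContinueWith(() => { /* some code to signal the end */ })". But the parent task is still the one doing the pulling, and it is stuck in a kind of infinite loop in which every child statement starts a new task.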
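
To make the "enqueue an empty string" idea concrete, here is a minimal sketch of what the file-side loop might look like if it stopped on such a marker. The names `SentinelSketch`, `EndOfWork` and `Drain` are made up for illustration and are not part of the code above; it also assumes that an empty string can never be a real file path.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

static class SentinelSketch
{
    // Hypothetical marker meaning "no more files will be enqueued".
    public const string EndOfWork = "";

    // Pulls items until the marker shows up and returns
    // how many real items were seen before it.
    public static int Drain(ConcurrentQueue<string> queue)
    {
        int seen = 0;
        var spin = new SpinWait();

        while (true)
        {
            string item;
            if (!queue.TryDequeue(out item))
            {
                // Queue is momentarily empty: back off a little
                // instead of spinning flat out on one core.
                spin.SpinOnce();
                continue;
            }

            if (item == EndOfWork)
                return seen;   // the producer signalled the end

            seen++;            // a real file path would be processed here
        }
    }
}
```

This only moves the problem: something still has to decide when it is safe to enqueue `EndOfWork`, which is exactly the part I am missing.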

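On the other hand, I can't simply test the "Count" of each queue: both the file list and the directory list may be empty at a moment when another task is still about to call "EnumerateDirectories()".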

I am trying to find some kind of "reactive" solution and to avoid an "if()" inside the loop, since it is just a simple while(true){} with an async call.

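PS: I know I could use TPL Dataflow. I am not using it because I am stuck on .NET 4.0. In any case, I am still curious how this would be done in .NET 4.5 without Dataflow, since the TPL itself gained few improvements there.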


1 Answer


Instead of ConcurrentQueue<T>, you can use BlockingCollection<T>.

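BlockingCollection<T> is designed specifically for producer/consumer scenarios like this one, and provides a CompleteAdding method so that the producer can notify the consumers that it has finished adding work.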
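
A rough sketch of how that could look, under some assumptions: `ScanSketch`, `CountFiles` and the `pendingDirs` counter are names invented for this example, not taken from the question. Because the directory scanner feeds its own queue, it still needs a rule for when to call CompleteAdding; the sketch assumes a counter of directories that are queued or currently being scanned, and completes the directory queue when that counter drops to zero.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

static class ScanSketch
{
    // Hypothetical helper: walks `root`, hands every file path to a consumer,
    // and returns the number of files seen once both sides have finished.
    public static int CountFiles(string root)
    {
        var dirs = new BlockingCollection<string>();
        var files = new BlockingCollection<string>();
        int pendingDirs = 1;   // directories queued or being scanned (the root counts)
        int fileCount = 0;

        dirs.Add(root);

        // Producer: scans directories until none are pending.
        Task producer = Task.Factory.StartNew(() =>
        {
            foreach (string dir in dirs.GetConsumingEnumerable())
            {
                try
                {
                    foreach (string sub in Directory.EnumerateDirectories(dir))
                    {
                        // Count the subdirectory before it becomes visible in the queue.
                        Interlocked.Increment(ref pendingDirs);
                        dirs.Add(sub);
                    }
                    foreach (string file in Directory.EnumerateFiles(dir))
                        files.Add(file);
                }
                catch (UnauthorizedAccessException) { }
                catch (IOException) { }

                // This directory is done. If it was the last pending one,
                // no new directory can appear any more.
                if (Interlocked.Decrement(ref pendingDirs) == 0)
                    dirs.CompleteAdding();
            }

            // The loop above only ends after CompleteAdding, so the file side can be closed too.
            files.CompleteAdding();
        }, TaskCreationOptions.LongRunning);

        // Consumer: blocks while the queue is empty and exits once it is completed and drained.
        Task consumer = Task.Factory.StartNew(() =>
        {
            foreach (string file in files.GetConsumingEnumerable())
                fileCount++;   // work on the file here
        }, TaskCreationOptions.LongRunning);

        Task.WaitAll(producer, consumer);
        return fileCount;
    }
}
```

GetConsumingEnumerable blocks instead of spinning, so neither loop needs a while(true) or a Count check. The sketch runs a single producer; the counter goes through Interlocked so that the same rule should still hold if several producer tasks consume `dirs`.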

answered 2012-10-16T23:43:51.883