25

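I have a List collection that I want to iterate over in a multi-threaded application. I need to protect it every time I iterate over it, since it can change and I don't want "collection was modified" exceptions when I do a foreach.

What is the correct way to do this?

  1. Use a lock every time I access or loop over it. I'm rather terrified of deadlocks. Maybe I'm just paranoid about using locks and shouldn't be. What do I need to know to avoid deadlocks if I go this route? Is locking fairly efficient?

  2. Use List<>.ToArray() to copy to an array each time I do a foreach. This causes a performance hit but is easy to do. I'm worried about memory thrashing as well as the time it takes to copy. It just seems excessive. Is it thread-safe to use ToArray?

  3. Don't use foreach and use for loops instead. Wouldn't I need to do a length check every time I did this to make sure the list didn't shrink? That seems annoying.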

4

5 Answers

48

There's little reason to be afraid of deadlocks; they are easy to detect. Your program stops running, which is a dead giveaway. What you really should be terrified of is threading races, the kind of bug you get when you don't lock when you should. Those are very hard to diagnose.

  1. Using lock is fine, just make sure you use the exact same locking object in any code that touches that list. Like the code that adds or removes items from that list. If that code runs on the same thread that iterates the list then you don't need a lock. Generally, the only chance for deadlock here is if you have code that relies on the thread state, like Thread.Join(), while it is also holding that locking object. Which ought to be rare.

  2. Yes, iterating a copy of the list is always thread-safe, as long as you use a lock around the ToArray() method. Note that you still need the lock, no structural improvement. The advantage is that you'll hold the lock for a short amount of time, improving concurrency in your program. The disadvantages are its O(n) storage requirements, only having a safe list but not protecting the elements in the list and the tricky problem of always having a stale view of the list content. Especially the last problem is subtle and hard to analyze. If you cannot reason out the side-effects then you probably shouldn't consider this.

  3. Do make sure to treat the ability of foreach to detect a race as a gift, not a problem. Yes, an explicit for(;;) loop is not going to throw the exception; it is just going to malfunction, for example by iterating the same item twice or skipping an item completely. You could avoid having to re-check the number of items by iterating it backwards. As long as the other thread(s) are only calling Add() and not Remove(), that would behave similarly to ToArray(): you'll get the stale view. Not that this will work in practice, because indexing the list is not thread-safe either. List<> will reallocate its internal array if necessary. This just won't work and will malfunction in unpredictable ways.
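Options 1 and 2 above can be sketched roughly as follows. This is a minimal illustration rather than reference code: the GuardedList<T> class and its member names are hypothetical, and the only point is that every reader and writer goes through the same private lock object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper; the class and member names are illustrative only.
public class GuardedList<T>
{
    // The single lock object shared by every piece of code that touches the list.
    private readonly object _sync = new object();
    private readonly List<T> _items = new List<T>();

    public void Add(T item)
    {
        lock (_sync) { _items.Add(item); }
    }

    public bool Remove(T item)
    {
        lock (_sync) { return _items.Remove(item); }
    }

    // Option 1: iterate while holding the lock. Other threads block for the whole loop.
    public void ForEachLocked(Action<T> action)
    {
        lock (_sync)
        {
            foreach (T item in _items) action(item);
        }
    }

    // Option 2: copy under the lock, then iterate the copy outside it.
    // The lock is held only for the O(n) copy; the copy can be stale when it is read.
    public T[] Snapshot()
    {
        lock (_sync) { return _items.ToArray(); }
    }
}
```

Note that lock is re-entrant on the same thread, so a callback passed to ForEachLocked that calls Add or Remove would still modify the list mid-iteration and trigger the "collection was modified" exception.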

There are two points of view here. You can be terrified and follow common wisdom, you'll get a program that works but might not be optimal. That's wise and keeps the boss happy. Or you can experiment and find out for yourself how skewing the rules gets you in trouble. Which will make you happy, you'll be a much better programmer. But your productivity is going to suffer. I don't know what your schedule looks like.

answered 2010-06-27T22:28:45.037
11

If your List data is mostly read-only, you can allow multiple threads to safely access it simultaneously by using a ReaderWriterLockSlim.
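As a rough sketch of that idea (the ReadMostlyList<T> wrapper and its member names are made up for illustration), readers take the read lock and can run concurrently, while writers take the exclusive write lock:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper; names are illustrative only.
public class ReadMostlyList<T>
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly List<T> _items = new List<T>();

    // Writers are exclusive: no readers or other writers run at the same time.
    public void Add(T item)
    {
        _lock.EnterWriteLock();
        try { _items.Add(item); }
        finally { _lock.ExitWriteLock(); }
    }

    // Any number of threads may hold the read lock at once.
    public void ForEach(Action<T> action)
    {
        _lock.EnterReadLock();
        try
        {
            foreach (T item in _items) action(item);
        }
        finally { _lock.ExitReadLock(); }
    }
}
```

With the default (non-recursive) policy, a callback that calls Add from inside ForEach throws a LockRecursionException instead of silently corrupting the iteration.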

You can find an implementation of a Thread-Safe dictionary here to get you started.

I also wanted to mention that if you are using .NET 4.0, the BlockingCollection class implements this functionality automatically. I wish I had known about this a few months ago!

answered 2010-06-27T21:46:32.487
5

You could also consider using an immutable data structure - treat your list like a value type.

If it's possible, using immutable objects can be an excellent choice for multi-threaded programming because they remove all the clunky locking semantics. Essentially, any operation that would change the state of the object creates an entirely new object instead.

e.g. I whipped up the following to demonstrate the idea. I'll apologize that it's by no means reference code, and it started to get a bit long.

using System.Collections.Generic;

public class ImmutableWidgetList : IEnumerable<Widget>
{
    private List<Widget> _widgets;  // we never modify the list

    // creates an empty list
    public ImmutableWidgetList()
    {
        _widgets = new List<Widget>();
    }

    // creates a list from an enumerator
    public ImmutableWidgetList(IEnumerable<Widget> widgetList)
    {
        _widgets = new List<Widget>(widgetList);
    }

    // add a single item; returns a new list and leaves this one unchanged
    public ImmutableWidgetList Add(Widget widget)
    {
        List<Widget> newList = new List<Widget>(_widgets);
        newList.Add(widget);

        ImmutableWidgetList result = new ImmutableWidgetList();
        result._widgets = newList;
        return result;
    }

    // add a range of items.
    public ImmutableWidgetList AddRange(IEnumerable<Widget> widgets)
    {
        List<Widget> newList = new List<Widget>(_widgets);
        newList.AddRange(widgets);

        ImmutableWidgetList result = new ImmutableWidgetList();
        result._widgets = newList;
        return result;
    }

    // implement IEnumerable<Widget>
    IEnumerator<Widget> IEnumerable<Widget>.GetEnumerator()
    {
        return _widgets.GetEnumerator();
    }


    // implement IEnumerable
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return _widgets.GetEnumerator();
    }
}
  • I included IEnumerable<T> to allow for a foreach implementation.
  • You mentioned you worried about the space/time performance of creating new lists, so perhaps this won't work for you.
  • You might also want to implement IList<T>.
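One way such an immutable list might be shared between threads is to keep a single reference to the current version and swap it atomically on each change. The sketch below is an assumption about usage, not part of the answer's code: the CopyOnWriteList<T> class is hypothetical and uses a plain List<T> that is never mutated once published.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical holder; names are illustrative only.
public class CopyOnWriteList<T>
{
    // The published list is never modified; every change replaces the reference.
    private List<T> _current = new List<T>();

    // Readers grab the current version and can foreach over it without a lock.
    public IReadOnlyList<T> Current => Volatile.Read(ref _current);

    public void Add(T item)
    {
        while (true)
        {
            List<T> seen = Volatile.Read(ref _current);
            var next = new List<T>(seen) { item };   // O(n) copy plus the new item
            // Publish only if no other writer replaced the list in the meantime.
            if (Interlocked.CompareExchange(ref _current, next, seen) == seen) return;
        }
    }
}
```

A reader that captured Current before an Add keeps iterating the old version, which is the same stale-view trade-off as copying with ToArray().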
answered 2010-06-29T02:15:26.610
4

In general, collections are not thread-safe, for performance reasons, with the exception of Hashtable. You have to use IsSynchronized and SyncRoot to make them thread-safe. See here and here

Example from msdn

ICollection myCollection = someCollection;
lock(myCollection.SyncRoot)
{
    foreach (object item in myCollection)
    {
        // Insert your code here.
    }
}
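For a generic List<T>, SyncRoot is only reachable through a cast to the non-generic ICollection interface. A minimal sketch, assuming every writer locks on the same SyncRoot (the SyncRootExample class and its method names are hypothetical):

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper; names are illustrative only.
public static class SyncRootExample
{
    public static int SumLocked(List<int> numbers)
    {
        // List<T> implements ICollection explicitly, hence the cast.
        object syncRoot = ((ICollection)numbers).SyncRoot;
        int sum = 0;
        lock (syncRoot)
        {
            foreach (int n in numbers) sum += n;
        }
        return sum;
    }

    // Writers must lock on the same object, or the lock above protects nothing.
    public static void AddLocked(List<int> numbers, int value)
    {
        lock (((ICollection)numbers).SyncRoot) { numbers.Add(value); }
    }
}
```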

Edit: If you are using .NET 4.0, you can use the concurrent collections.
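As a small illustration of the concurrent collections, the sketch below enumerates a ConcurrentQueue<T> while another thread keeps adding to it. The ConcurrentExample class is hypothetical, and a queue is only one possible choice; it does not offer indexing or arbitrary removal like List<T>.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical demo; names are illustrative only.
public static class ConcurrentExample
{
    public static int CountWhileAdding()
    {
        var queue = new ConcurrentQueue<int>();
        for (int i = 0; i < 100; i++) queue.Enqueue(i);

        // A second thread keeps adding while this thread enumerates.
        Task writer = Task.Run(() =>
        {
            for (int i = 100; i < 200; i++) queue.Enqueue(i);
        });

        int seen = 0;
        // No lock needed: the enumerator works on a moment-in-time snapshot
        // and does not throw "collection was modified".
        foreach (int item in queue) seen++;

        writer.Wait();
        return seen;
    }
}
```

The number of items seen depends on timing, so the view is still potentially stale, just as with the ToArray() copy.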

answered 2010-06-27T21:45:05.563
3

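Use lock() unless you have some other reason to make copies. Deadlocks can only occur if you request multiple locks in a different order, for example:

Thread 1: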

lock(A) {
  // .. stuff
  // Next lock request can potentially deadlock with 2
  lock(B) {
    // ... more stuff
  }
}

Thread 2:

lock(B) {
  // Different stuff
  // next lock request can potentially deadlock with 1
  lock(A) {
    // More crap
  }
}

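Here Thread 1 and Thread 2 have the potential to deadlock, since Thread 1 might be holding A while Thread 2 holds B, and neither can continue until the other releases its lock.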

If you must take multiple locks, always do it in the same order. If you're only taking one lock, then you won't cause a deadlock ... unless you hold a lock while waiting for user input, but that's not technically a deadlock and leads to another point: never hold a lock for any longer than you absolutely must.
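The fix for the two threads above can be sketched as follows: both code paths take A first and B second, so neither can hold one lock while waiting on a thread that holds the other. The LockOrderExample class and its members are hypothetical.

```csharp
using System.Threading;

// Hypothetical demo; names are illustrative only.
public static class LockOrderExample
{
    private static readonly object A = new object();
    private static readonly object B = new object();
    private static int _counter;

    // Both methods acquire A, then B. The order is the same everywhere.
    private static void WorkOne()
    {
        lock (A) { lock (B) { _counter += 1; } }
    }

    private static void WorkTwo()
    {
        lock (A) { lock (B) { _counter += 10; } }
    }

    public static int Run()
    {
        _counter = 0;
        var t1 = new Thread(WorkOne);
        var t2 = new Thread(WorkTwo);
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return _counter;
    }
}
```

Run() joins the worker threads without holding A or B itself, which avoids the Thread.Join()-while-locked situation mentioned in the first answer.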

answered 2010-06-27T21:07:55.387