
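I know plenty about the different ways of parsing text for information. What kind of performance can be expected when parsing integers, for example? I am wondering whether anyone knows of any good stats on this. I am looking for real numbers from someone who has actually tested it.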

Which of these offers the best performance in which situations?

// Option 1: parse directly and let it crash, because bad input is extremely rare (0.0001%)
Parse(...)

// Option 2: check the value before parsing
if (SomethingIsValid)
    Parse(...)

// Option 3: use TryParse
TryParse(...)

// Option 4: wrap Parse in try-catch
try
{
    Parse(...)
}
catch
{
    // Catch any thrown exceptions
}

3 Answers

62

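Always use T.TryParse(string str, out T value). Throwing exceptions is expensive and should be avoided if you can handle the situation a priori. Using a try-catch block to "save" on performance (because your invalid data rate is low) is an abuse of exception handling, and it comes at the expense of maintainability and good coding practice. Follow sound software engineering practices: write your test cases, run your application, THEN benchmark and optimize.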

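"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." - Donald Knuth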

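Therefore you simply assign, as arbitrarily as with carbon credits, that the performance of try-catch is worse and that the performance of TryParse is better. Only after we have run our application and determined that string parsing is causing some kind of slowdown would we even consider using anything other than TryParse.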
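As a minimal sketch of the TryParse pattern recommended above: the class name `TryParseExample` and the helper `ParseOrDefault` are hypothetical names introduced only for illustration; the only framework call relied on is `Int32.TryParse(string, out int)`.

```csharp
using System;

public static class TryParseExample
{
    // Hypothetical helper: returns the parsed value, or the fallback
    // when the input is not a valid Int32. No exception is thrown for bad input.
    public static int ParseOrDefault(string input, int fallback)
    {
        int value;
        return Int32.TryParse(input, out value) ? value : fallback;
    }

    public static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));  // prints 42
        Console.WriteLine(ParseOrDefault("X42", -1)); // prints -1
    }
}
```

The bad-input path is an ordinary branch here, so the calling code reads the same whether invalid data is rare or common.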

(Edit: since it appears the questioner wanted timing data along with the good advice, here is the requested timing data.)

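Times for various failure rates on 10,000 inputs from the user (for the unbelievers). The Slowdown column appears to be the Try-Catch time divided by the TryParse time, minus 1: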

Failure Rate      Try-Catch          TryParse        Slowdown
  0%           00:00:00.0131758   00:00:00.0120421      0.1
 10%           00:00:00.1540251   00:00:00.0087699     16.6
 20%           00:00:00.2833266   00:00:00.0105229     25.9
 30%           00:00:00.4462866   00:00:00.0091487     47.8
 40%           00:00:00.6951060   00:00:00.0108980     62.8
 50%           00:00:00.7567745   00:00:00.0087065     85.9
 60%           00:00:00.7090449   00:00:00.0083365     84.1
 70%           00:00:00.8179365   00:00:00.0088809     91.1
 80%           00:00:00.9468898   00:00:00.0088562    105.9
 90%           00:00:01.0411393   00:00:00.0081040    127.5
100%           00:00:01.1488157   00:00:00.0078877    144.6


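using System;
using System.Diagnostics;

/// <summary>Times count parses using Int32.Parse inside a try-catch block.</summary>
/// <param name="errorRate">Rate of errors in user input</param>
/// <param name="seed">Seed for the random input generator</param>
/// <param name="count">Number of inputs to parse</param>
/// <returns>Total time taken</returns>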
public static TimeSpan TimeTryCatch(double errorRate, int seed, int count)
{
    Stopwatch stopwatch = new Stopwatch();
    Random random = new Random(seed);
    string bad_prefix = @"X";

    stopwatch.Start();
    for(int ii = 0; ii < count; ++ii)
    {
        string input = random.Next().ToString();
        if (random.NextDouble() < errorRate)
        {
            input = bad_prefix + input;
        }

        int value = 0;
        try
        {
            value = Int32.Parse(input);
        }
        catch(FormatException)
        {
            value = -1; // we would do something here with a logger perhaps
        }
    }
    stopwatch.Stop();

    return stopwatch.Elapsed;
}

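/// <summary>Times count parses using Int32.TryParse.</summary>
/// <param name="errorRate">Rate of errors in user input</param>
/// <param name="seed">Seed for the random input generator</param>
/// <param name="count">Number of inputs to parse</param>
/// <returns>Total time taken</returns>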
public static TimeSpan TimeTryParse(double errorRate, int seed, int count)
{
    Stopwatch stopwatch = new Stopwatch();
    Random random = new Random(seed);
    string bad_prefix = @"X";

    stopwatch.Start();
    for(int ii = 0; ii < count; ++ii)
    {
        string input = random.Next().ToString();
        if (random.NextDouble() < errorRate)
        {
            input = bad_prefix + input;
        }

        int value = 0;
        if (!Int32.TryParse(input, out value))
        {
            value = -1; // we would do something here with a logger perhaps
        }
    }
    stopwatch.Stop();

    return stopwatch.Elapsed;
}

public static void TimeStringParse()
{
    double errorRate = 0.1; // 10% of the time our users mess up
    int count = 10000; // 10000 entries by a user

    TimeSpan trycatch = TimeTryCatch(errorRate, 1, count);
    TimeSpan tryparse = TimeTryParse(errorRate, 1, count);

    Console.WriteLine("trycatch: {0}", trycatch);
    Console.WriteLine("tryparse: {0}", tryparse);
}
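The driver above runs only the 10% case. A table like the one shown could be produced by sweeping the failure rate, roughly as in the self-contained sketch below. This is an assumption about how the sweep might look, not the code that produced the posted numbers: the names `ParseTimingSweep`, `Time`, `WithTryCatch`, and `WithTryParse` are hypothetical, and routing the parse through a `Func<string, int>` delegate adds a small, equal overhead to both columns.

```csharp
using System;
using System.Diagnostics;

public static class ParseTimingSweep
{
    // Times `count` parses of generated input; roughly `errorRate` of the inputs are invalid.
    public static TimeSpan Time(Func<string, int> parse, double errorRate, int seed, int count)
    {
        Random random = new Random(seed);
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int ii = 0; ii < count; ++ii)
        {
            string input = random.Next().ToString();
            if (random.NextDouble() < errorRate)
            {
                input = "X" + input; // same bad prefix as the original benchmark
            }
            parse(input);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Int32.Parse wrapped in try-catch; returns -1 on malformed input.
    public static int WithTryCatch(string input)
    {
        try { return Int32.Parse(input); }
        catch (FormatException) { return -1; }
    }

    // Int32.TryParse; returns -1 on malformed input.
    public static int WithTryParse(string input)
    {
        int value;
        return Int32.TryParse(input, out value) ? value : -1;
    }

    public static void Main()
    {
        const int count = 10000;
        Console.WriteLine("Failure Rate  Try-Catch         TryParse          Slowdown");
        for (int percent = 0; percent <= 100; percent += 10)
        {
            double errorRate = percent / 100.0;
            TimeSpan tryCatch = Time(WithTryCatch, errorRate, 1, count);
            TimeSpan tryParse = Time(WithTryParse, errorRate, 1, count);
            // Slowdown as (Try-Catch time / TryParse time) - 1.
            double slowdown = tryCatch.TotalSeconds / tryParse.TotalSeconds - 1.0;
            Console.WriteLine("{0,3}%          {1}  {2}  {3,8:F1}", percent, tryCatch, tryParse, slowdown);
        }
    }
}
```

Absolute times depend on the machine, the runtime, and whether a debugger is attached, and the first row also absorbs JIT warm-up, so only the trend across rows is meaningful.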
answered 2008-09-29T18:56:09.240
6

Try-Catch will always be the slower one. TryParse will be faster.

The IF check and TryParse are the same.

answered 2008-09-29T18:56:37.943
-3
Option 1: Will throw an exception on bad data.
Option 2: SomethingIsValid() could be quite expensive - particularly if you are pre-checking a string for Integer parsability.
Option 3: I like this.  You need to check the returned bool afterwards, but that's pretty cheap.
Option 4 is definitely the worst.
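To illustrate the concern about Option 2, here is a small sketch in which a pre-check passes but the value still cannot be parsed as an Int32. `LooksLikeInteger` is a hypothetical stand-in for the question's `SomethingIsValid`, assumed here to be a simple all-digits test; a real pre-check might differ.

```csharp
using System;
using System.Linq;

public static class PreCheckExample
{
    // Hypothetical stand-in for SomethingIsValid: non-empty and all ASCII digits.
    public static bool LooksLikeInteger(string input)
    {
        return !string.IsNullOrEmpty(input) && input.All(c => c >= '0' && c <= '9');
    }

    public static void Main()
    {
        string input = "99999999999"; // all digits, but larger than Int32.MaxValue

        Console.WriteLine(LooksLikeInteger(input)); // prints True

        int value;
        Console.WriteLine(Int32.TryParse(input, out value)); // prints False
    }
}
```

A pre-check that rules out every failure, including overflow, ends up repeating most of the work the parser does anyway, whereas TryParse does that work once.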

Exception handling is relatively expensive, so avoid it if you can.

In particular, bad input is to be expected rather than exceptional, so you shouldn't use exceptions for this situation.

(Although, before TryParse existed, it may have been the best option.)

answered 2008-09-29T18:59:28.563