c - BENCH_INNER: lmbench 3.0 source code macro query

Question

I am reading the paper "MHz: Anatomy of a Microbenchmark" by the creators of lmbench, and browsing the code alongside it.

The paper is available for download: MHz: Anatomy of a Microbenchmark
The lmbench-3.0 source code was written by Carl Staelin and Larry McVoy

I have a question about the BENCH_INNER() macro:

#define BENCH_INNER(loop_body, enough) {                \
    static iter_t   __iterations = 1;               \
    int     __enough = get_enough(enough);          \
    iter_t      __n;                        \
    double      __result = 0.;                  \
                                    \
    while(__result < 0.95 * __enough) {             \
        start(0);                       \
        for (__n = __iterations; __n > 0; __n--) {      \
            loop_body;                  \
        }                           \
        __result = stop(0,0);                   \
        if (__result < 0.99 * __enough              \
            || __result > 1.2 * __enough) {         \
            if (__result > 150.) {              \
                double  tmp = __iterations / __result;  \
                tmp *= 1.1 * __enough;          \
                __iterations = (iter_t)(tmp + 1);   \
            } else {                    \
                if (__iterations > (iter_t)1<<27) { \
                    __result = 0.;          \
                    break;              \
                }                   \
                __iterations <<= 3;         \
            }                       \
        }                           \
    } /* while */                           \
    save_n((uint64)__iterations); settime((uint64)__result);    \
}

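As I understand it, BENCH_INNER automatically computes the best number of iterations for the chosen time interval ("enough"). The loop keeps running until repeated execution of the code fragment "loop_body" takes at least 95% of the chosen interval, which can range from 5 milliseconds to 1 second.
For simplicity, let "enough" be 10000 microseconds.
We start with __iterations = 1.
Suppose that at some point we reach a stage where __result > 1.2 * "enough", i.e. __result > 12000 microseconds.
Since __result > 150 microseconds, we go ahead and rescale __iterations so that the next __result would be roughly 1.1 * "enough".
But before __result is measured again, we leave the loop, because the previous __result is already > 0.95 * "enough".
We then save __result together with the modified __iterations (so the saved __result is not the time that was measured with the saved __iterations).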

Shouldn't the code measure __result again in this case? Am I missing something basic?

score 6 · Accepted Answer

Yes, there is a problem here: __result must be reset to zero after __iterations is adjusted, so that the loop measures again.

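I can also see another possible problem in your code: the result is compared against 0.99*enough in one place and against 0.95*enough in another, which is most likely a typo. I suggest rewriting this macro so that the "satisfied" condition is stated explicitly and the logic is simplified by checking the good case first. Something like this: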

#define SEARCH_EXIT_CASE(__result, __enough) ((__result) > 0.95 * (__enough) && (__result) < 1.2 * (__enough))

#define BENCH_INNER(loop_body, enough) {                \
    static iter_t   __iterations = 1;               \
    int     __enough = get_enough(enough);          \
    iter_t      __n;                        \
    double      __result = 0.;                  \
                                    \
    while(!SEARCH_EXIT_CASE(__result, __enough)) {             \
        start(0);                       \
        for (__n = __iterations; __n > 0; __n--) {      \
            loop_body;                  \
        }                           \
        __result = stop(0,0);                   \
        /* good result */ \
        if (SEARCH_EXIT_CASE(__result, __enough)) {         \
            break; \
        } \
        /* failure cases */ \
        if (__result > 150.) {              \
            double  tmp = __iterations / __result;  \
            tmp *= 1.1 * __enough;          \
            __iterations = (iter_t)(tmp + 1);   \
        } else { \
            if (__iterations > (iter_t)1<<27) { \
                __result = 0.;          \
                break;              \
            }                   \
            __iterations <<= 3;         \
        } \
        __result = 0.;          \
    } /* while */                           \
    save_n((uint64)__iterations); settime((uint64)__result);    \
}

In addition, I suggest giving the other magic constants 1<<27, 1.1, 3 and 150.0 meaningful names, such as MAX_ITER, CORRECTION_RATE, INCREASE_RATE, RESULT_OVERFLOW, etc.
