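I noticed that my 3D engine runs very slowly on AMD hardware. After some investigation, the slow code boiled down to creating an FBO with several attachments and writing to any nonzero attachment. In all tests I compared AMD performance against the same AMD GPU writing to the unaffected GL_COLOR_ATTACHMENT0, and against Nvidia hardware whose performance difference from my AMD device is well known.

Writing fragments to a nonzero attachment is 2-3 times slower than expected.

This code is equivalent to how I create the framebuffer and measure performance in my test applications: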

    // Create a framebuffer
    static const auto attachmentCount = 6;
    GLuint fb, att[attachmentCount];
    glGenTextures(attachmentCount, att);
    glGenFramebuffers(1, &fb);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fb);

    for (auto i = 0; i < attachmentCount; ++i) {
        glBindTexture(GL_TEXTURE_2D, att[i]);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i, GL_TEXTURE_2D, att[i], 0);
    }
    GLenum dbs[] = {
        GL_NONE,
        GL_COLOR_ATTACHMENT1,
        GL_NONE,
        GL_NONE,
        GL_NONE,
        GL_NONE};
    glDrawBuffers(attachmentCount, dbs);


    // Main loop
    while (shouldWork) {
        glClear(GL_COLOR_BUFFER_BIT);
        for (int i = 0; i < 100; ++i) glDrawArrays(GL_TRIANGLES, 0, 6);
        glfwSwapBuffers(window);
        glfwPollEvents();
        showFps();
    }

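What is wrong here?

A fully reproducible minimal test can be found here. I tried many other write patterns and OpenGL states, and described some of them in the AMD Community.

I guess the problem is in AMD's OpenGL driver, but if it is not, or if you have faced the same problem and found a workaround (a vendor extension?), please share.

UPD: moving the question details here.

I prepared a minimal test pack in which the application creates an FBO with six RGBA UNSIGNED_BYTE attachments and renders 100 fullscreen rectangles to it per frame. There are four executables with four write patterns: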

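  1. Shader output 0 is written to attachment 0. Only output 0 is routed to the framebuffer with glDrawBuffers. All other outputs are set to GL_NONE.

  2. Same as 1, but with output 1 and attachment 1.

  3. Output 0 is written to attachment 0, but all six shader outputs are routed to attachments 0..5 respectively, and all draw buffers except 0 are masked out with glColorMaski.

  4. Same as 3, but for attachment 1.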
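
Patterns 1 and 2 differ only in which slot of the array passed to glDrawBuffers is not GL_NONE. A minimal sketch of building that array follows. The helper name `makeDrawBuffers` and the two constants are assumptions made for illustration; the constants mirror the standard values of GL_NONE and GL_COLOR_ATTACHMENT0 so that the snippet compiles without OpenGL headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-ins for the OpenGL enum values (normally taken from the GL headers).
constexpr unsigned kNone = 0x0000;              // GL_NONE
constexpr unsigned kColorAttachment0 = 0x8CE0;  // GL_COLOR_ATTACHMENT0

// Hypothetical helper: route shader output `target` to color attachment
// `target` and disable every other output.
std::vector<unsigned> makeDrawBuffers(std::size_t count, std::size_t target)
{
    assert(target < count);
    std::vector<unsigned> buffers(count, kNone);
    buffers[target] = kColorAttachment0 + static_cast<unsigned>(target);
    return buffers;
}
```

In a real program the result would be handed to glDrawBuffers together with its size. With `target = 0` it corresponds to pattern 1, and with `target = 1` to pattern 2.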

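I ran all tests on two machines with nearly identical CPUs and the following GPUs:

AMD Radeon RX550, driver version 19.30.01.16

Nvidia Geforce GTX 650 Ti, which is roughly 2x less powerful than the RX550

And got these results: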

Geforce GTX 650 Ti:
attachment0: 195 FPS
attachment1: 195 FPS
attachment0 masked: 195 FPS
attachment1 masked: 235 FPS
Radeon RX550:
attachment0: 350 FPS
attachment1: 185 FPS
attachment0 masked: 330 FPS
attachment1 masked: 175 FPS

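Pre-built test executables are attached to the post, or can be downloaded from Google drive.

Test sources (with an MSVS-friendly cmake build system) are available on Github.

All four programs show a black window and a console with an FPS counter.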
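
The FPS counter in the tests averages over a fixed number of frames. A minimal sketch of that calculation, assuming a hypothetical helper named `averageFps` (the full listings further down do the same arithmetic inline with `framesMax = 50`):

```cpp
#include <chrono>

// Hypothetical helper: average frames per second over a measured interval.
// Returns 0 when no time has elapsed, to avoid dividing by zero.
double averageFps(unsigned frames, std::chrono::steady_clock::duration elapsed)
{
    const double seconds = std::chrono::duration<double>(elapsed).count();
    return seconds > 0.0 ? frames / seconds : 0.0;
}
```

Averaging over a batch of frames rather than timing each frame individually smooths out jitter from buffer swaps and event polling.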

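We see that when writing to a nonzero attachment, the AMD GPU is much slower than both the less powerful Nvidia GPU and itself writing to attachment 0. Masking draw buffer output with glColorMaski also costs a few FPS.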
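
To put numbers on this, the posted frame rates can be converted into per-frame times and a slowdown factor. The sketch below uses two hypothetical helpers, `frameTimeMs` and `slowdown`. Applied to the RX550 results above, 350 FPS versus 185 FPS gives a factor of roughly 1.9, while the GTX 650 Ti's 195 FPS versus 195 FPS gives exactly 1.

```cpp
// Per-frame time in milliseconds for a given frame rate.
constexpr double frameTimeMs(double fps) { return 1000.0 / fps; }

// How many times slower the attachment1 test runs than the attachment0 test.
constexpr double slowdown(double fpsAttachment0, double fpsAttachment1)
{
    return fpsAttachment0 / fpsAttachment1;
}
```

Comparing frame times rather than FPS makes the cost additive: on the RX550 the switch to attachment 1 adds about 2.5 ms to every frame.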

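I also tried using renderbuffers instead of textures, using other image formats (the format used in the tests is the most compatible one), and rendering to a framebuffer of twice the size. The results were the same.

Explicitly turning off the scissor, stencil, and depth tests does not help.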

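If I reduce the number of attachments, or reduce framebuffer coverage by multiplying the vertex coordinates by a value less than 1, test performance increases proportionally, and eventually the RX550 outperforms the GTX 650 Ti.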
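
Multiplying the vertex coordinates by a factor s shrinks the quad in both dimensions, so the covered area (and the number of fragments written) falls with s squared. A rough sketch of the per-frame fragment count, using hypothetical helper names and the 800x600 framebuffer and 100 draws per frame from the tests:

```cpp
// Fraction of the framebuffer covered by a fullscreen quad whose vertex
// coordinates are multiplied by `scale` (0 < scale <= 1).
constexpr double coverage(double scale) { return scale * scale; }

// Approximate number of fragments written per frame for `draws` quads.
constexpr double fragmentsPerFrame(int width, int height, int draws, double scale)
{
    return static_cast<double>(width) * height * draws * coverage(scale);
}
```

At full coverage this is 48 million fragments per frame; with the coordinates scaled by 0.5 it drops to 12 million, which is consistent with a fill-rate-bound workload speeding up as coverage shrinks.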

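glClear calls are affected as well, and their performance under various conditions is consistent with the observations above.

My teammate launched the tests on a Radeon HD 3000, both on native Linux and under Wine. Both runs exposed the same huge difference between the attachment0 and attachment1 tests. I can't tell exactly what his driver version is, but it is the one provided by the Ubuntu 19.04 repos.

Another teammate ran the tests on a Radeon RX590 and got the same 2x difference.

Finally, let me copy-paste two almost identical test examples here. This one runs fast: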

#include <iostream>
#include <cassert>
#include <string>
#include <sstream>
#include <chrono>
#include <cstdint>

#include "GL/glew.h"
#include "GLFW/glfw3.h"
#include <vector>

static std::string getErrorDescr(const GLenum errCode)
{
    // English descriptions are from
    // https://www.opengl.org/sdk/docs/man/docbook4/xhtml/glGetError.xml
    switch (errCode) {
        case GL_NO_ERROR: return "No error has been recorded. THIS message is the error itself.";
        case GL_INVALID_ENUM: return "An unacceptable value is specified for an enumerated argument.";
        case GL_INVALID_VALUE: return "A numeric argument is out of range.";
        case GL_INVALID_OPERATION: return "The specified operation is not allowed in the current state.";
        case GL_INVALID_FRAMEBUFFER_OPERATION: return "The framebuffer object is not complete.";
        case GL_OUT_OF_MEMORY: return "There is not enough memory left to execute the command.";
        case GL_STACK_UNDERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to underflow.";
        case GL_STACK_OVERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to overflow.";
        default:;
    }
    return "No description available.";
}

static std::string getErrorMessage()
{
    const GLenum error = glGetError();
    if (GL_NO_ERROR == error) return "";

    std::stringstream ss;
    ss << "OpenGL error: " << static_cast<int>(error) << std::endl;
    ss << "Error string: ";
    ss << getErrorDescr(error);
    ss << std::endl;
    return ss.str();
}

[[maybe_unused]] static bool error()
{
    const auto message = getErrorMessage();
    if (message.length() == 0) return false;
    std::cerr << message;
    return true;
}

static bool compileShader(const GLuint shader, const std::string& source)
{
    unsigned int linesCount = 0;
    for (const auto c: source) linesCount += static_cast<unsigned int>(c == '\n');
    const char** sourceLines = new const char*[linesCount];
    int* lengths = new int[linesCount];

    int idx = 0;
    const char* lineStart = source.data();
    int lineLength = 1;
    const auto len = source.length();
    for (unsigned int i = 0; i < len; ++i) {
        if (source[i] == '\n') {
            sourceLines[idx] = lineStart;
            lengths[idx] = lineLength;
            lineLength = 1;
            lineStart = source.data() + i + 1;
            ++idx;
        }
        else ++lineLength;
    }

    glShaderSource(shader, linesCount, sourceLines, lengths);
    glCompileShader(shader);
    GLint logLength;
    glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &logLength);
    if (logLength > 0) {
        auto* const log = new GLchar[logLength + 1];
        glGetShaderInfoLog(shader, logLength, nullptr, log);
        std::cout << "Log: " << std::endl;
        std::cout << log;
        delete[] log;
    }

    GLint compileStatus;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &compileStatus);
    delete[] sourceLines;
    delete[] lengths;
    return bool(compileStatus);
}

static GLuint createProgram(const std::string& vertSource, const std::string& fragSource)
{
    const auto vs = glCreateShader(GL_VERTEX_SHADER);
    if (vs == 0) {
        std::cerr << "Error: vertex shader is 0." << std::endl;
        return 2;
    }
    const auto fs = glCreateShader(GL_FRAGMENT_SHADER);
    if (fs == 0) {
        std::cerr << "Error: fragment shader is 0." << std::endl;
        return 2;
    }

    // Compile shaders
    if (!compileShader(vs, vertSource)) {
        std::cerr << "Error: could not compile vertex shader." << std::endl;
        return 5;
    }
    if (!compileShader(fs, fragSource)) {
        std::cerr << "Error: could not compile fragment shader." << std::endl;
        return 5;
    }

    // Link program
    const auto program = glCreateProgram();
    if (program == 0) {
        std::cerr << "Error: program is 0." << std::endl;
        return 2;
    }
    glAttachShader(program, vs);
    glAttachShader(program, fs);
    glLinkProgram(program);

    // Get log
    GLint logLength = 0;
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logLength);

    if (logLength > 0) {
        auto* const log = new GLchar[logLength + 1];
        glGetProgramInfoLog(program, logLength, nullptr, log);
        std::cout << "Log: " << std::endl;
        std::cout << log;
        delete[] log;
    }
    GLint linkStatus = 0;
    glGetProgramiv(program, GL_LINK_STATUS, &linkStatus);
    if (!linkStatus) {
        std::cerr << "Error: could not link." << std::endl;
        return 2;
    }
    glDeleteShader(vs);
    glDeleteShader(fs);
    return program;
}

static const std::string vertSource = R"(
#version 330
layout(location = 0) in vec2 v;
void main()
{
    gl_Position = vec4(v, 0.0, 1.0);
}
)";

static const std::string fragSource = R"(
#version 330
layout(location = 0) out vec4 outColor0;
void main()
{
    outColor0 = vec4(0.5, 0.5, 0.5, 1.0);
}
)";

int main()
{
    // Init
    if (!glfwInit()) {
        std::cerr << "Error: glfw init failed." << std::endl;
        return 3;
    }

    static const int width = 800;
    static const int height= 600;
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    GLFWwindow* window = nullptr;
    window = glfwCreateWindow(width, height, "Shader test", nullptr, nullptr);
    if (window == nullptr) {
        std::cerr << "Error: window is null." << std::endl;
        glfwTerminate();
        return 1;
    }
    glfwMakeContextCurrent(window);

    if (glewInit() != GLEW_OK) {
        std::cerr << "Error: glew not OK." << std::endl;
        glfwTerminate();
        return 2;
    }

    // Shader program
    const auto shaderProgram = createProgram(vertSource, fragSource);
    glUseProgram(shaderProgram);

    // Vertex buffer
    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);

    GLuint buffer;
    glGenBuffers(1, &buffer);
    glBindBuffer(GL_ARRAY_BUFFER, buffer);
    float bufferData[] = {
        -1.0f, -1.0f,
        1.0f, -1.0f,
        1.0f, 1.0f,
        -1.0f, -1.0f,
        1.0f, 1.0f,
        -1.0f, 1.0f
    };
    glBufferData(GL_ARRAY_BUFFER, std::size(bufferData) * sizeof(float), bufferData, GL_STATIC_DRAW);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (GLvoid*)(0));

    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);

    // Framebuffer
    GLuint fb, att[6];
    glGenTextures(6, att);
    glGenFramebuffers(1, &fb);

    glBindTexture(GL_TEXTURE_2D, att[0]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[1]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[2]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[3]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[4]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[5]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);

    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fb);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, att[0], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, att[1], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, att[2], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT3, GL_TEXTURE_2D, att[3], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT4, GL_TEXTURE_2D, att[4], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT5, GL_TEXTURE_2D, att[5], 0);

    GLenum dbs[] = {
        GL_COLOR_ATTACHMENT0,
        GL_NONE,
        GL_NONE,
        GL_NONE,
        GL_NONE,
        GL_NONE};
    glDrawBuffers(6, dbs);

    if (GL_FRAMEBUFFER_COMPLETE != glCheckFramebufferStatus(GL_DRAW_FRAMEBUFFER)) {
        std::cerr << "Error: framebuffer is incomplete." << std::endl;
        return 1;
    }
    if (error()) {
        std::cerr << "OpenGL error occurred." << std::endl;
        return 2;
    }

    // Fpsmeter
    static const uint32_t framesMax = 50;
    uint32_t framesCount = 0;
    auto start = std::chrono::steady_clock::now();

    // Main loop
    while (!glfwWindowShouldClose(window)) {
        if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS) glfwSetWindowShouldClose(window, GLFW_TRUE);

        glClear(GL_COLOR_BUFFER_BIT);
        for (int i = 0; i < 100; ++i) glDrawArrays(GL_TRIANGLES, 0, 6);
        glfwSwapBuffers(window);
        glfwPollEvents();

        if (++framesCount == framesMax) {
            framesCount = 0;
            const auto now = std::chrono::steady_clock::now();
            const auto duration = now - start;
            start = now;
            const float secsPerFrame = (std::chrono::duration_cast<std::chrono::microseconds>(duration).count() / 1000000.0f) / framesMax;
            std::cout << "FPS: " << 1.0f / secsPerFrame << std::endl;
        }
    }

    // Shutdown
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindVertexArray(0);
    glUseProgram(0);
    glDeleteProgram(shaderProgram);
    glDeleteBuffers(1, &buffer);
    glDeleteVertexArrays(1, &vao);
    glDeleteFramebuffers(1, &fb);
    glDeleteTextures(6, att);
    glfwMakeContextCurrent(nullptr);
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}

This one runs comparably fast on Nvidia and Intel GPUs, but 2-3 times slower than the first example on AMD GPUs:

#include <iostream>
#include <cassert>
#include <string>
#include <sstream>
#include <chrono>
#include <cstdint>

#include "GL/glew.h"
#include "GLFW/glfw3.h"
#include <vector>

static std::string getErrorDescr(const GLenum errCode)
{
    // English descriptions are from
    // https://www.opengl.org/sdk/docs/man/docbook4/xhtml/glGetError.xml
    switch (errCode) {
        case GL_NO_ERROR: return "No error has been recorded. THIS message is the error itself.";
        case GL_INVALID_ENUM: return "An unacceptable value is specified for an enumerated argument.";
        case GL_INVALID_VALUE: return "A numeric argument is out of range.";
        case GL_INVALID_OPERATION: return "The specified operation is not allowed in the current state.";
        case GL_INVALID_FRAMEBUFFER_OPERATION: return "The framebuffer object is not complete.";
        case GL_OUT_OF_MEMORY: return "There is not enough memory left to execute the command.";
        case GL_STACK_UNDERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to underflow.";
        case GL_STACK_OVERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to overflow.";
        default:;
    }
    return "No description available.";
}

static std::string getErrorMessage()
{
    const GLenum error = glGetError();
    if (GL_NO_ERROR == error) return "";

    std::stringstream ss;
    ss << "OpenGL error: " << static_cast<int>(error) << std::endl;
    ss << "Error string: ";
    ss << getErrorDescr(error);
    ss << std::endl;
    return ss.str();
}

[[maybe_unused]] static bool error()
{
    const auto message = getErrorMessage();
    if (message.length() == 0) return false;
    std::cerr << message;
    return true;
}

static bool compileShader(const GLuint shader, const std::string& source)
{
    unsigned int linesCount = 0;
    for (const auto c: source) linesCount += static_cast<unsigned int>(c == '\n');
    const char** sourceLines = new const char*[linesCount];
    int* lengths = new int[linesCount];

    int idx = 0;
    const char* lineStart = source.data();
    int lineLength = 1;
    const auto len = source.length();
    for (unsigned int i = 0; i < len; ++i) {
        if (source[i] == '\n') {
            sourceLines[idx] = lineStart;
            lengths[idx] = lineLength;
            lineLength = 1;
            lineStart = source.data() + i + 1;
            ++idx;
        }
        else ++lineLength;
    }

    glShaderSource(shader, linesCount, sourceLines, lengths);
    glCompileShader(shader);
    GLint logLength;
    glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &logLength);
    if (logLength > 0) {
        auto* const log = new GLchar[logLength + 1];
        glGetShaderInfoLog(shader, logLength, nullptr, log);
        std::cout << "Log: " << std::endl;
        std::cout << log;
        delete[] log;
    }

    GLint compileStatus;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &compileStatus);
    delete[] sourceLines;
    delete[] lengths;
    return bool(compileStatus);
}

static GLuint createProgram(const std::string& vertSource, const std::string& fragSource)
{
    const auto vs = glCreateShader(GL_VERTEX_SHADER);
    if (vs == 0) {
        std::cerr << "Error: vertex shader is 0." << std::endl;
        return 2;
    }
    const auto fs = glCreateShader(GL_FRAGMENT_SHADER);
    if (fs == 0) {
        std::cerr << "Error: fragment shader is 0." << std::endl;
        return 2;
    }

    // Compile shaders
    if (!compileShader(vs, vertSource)) {
        std::cerr << "Error: could not compile vertex shader." << std::endl;
        return 5;
    }
    if (!compileShader(fs, fragSource)) {
        std::cerr << "Error: could not compile fragment shader." << std::endl;
        return 5;
    }

    // Link program
    const auto program = glCreateProgram();
    if (program == 0) {
        std::cerr << "Error: program is 0." << std::endl;
        return 2;
    }
    glAttachShader(program, vs);
    glAttachShader(program, fs);
    glLinkProgram(program);

    // Get log
    GLint logLength = 0;
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logLength);

    if (logLength > 0) {
        auto* const log = new GLchar[logLength + 1];
        glGetProgramInfoLog(program, logLength, nullptr, log);
        std::cout << "Log: " << std::endl;
        std::cout << log;
        delete[] log;
    }
    GLint linkStatus = 0;
    glGetProgramiv(program, GL_LINK_STATUS, &linkStatus);
    if (!linkStatus) {
        std::cerr << "Error: could not link." << std::endl;
        return 2;
    }
    glDeleteShader(vs);
    glDeleteShader(fs);
    return program;
}

static const std::string vertSource = R"(
#version 330
layout(location = 0) in vec2 v;
void main()
{
    gl_Position = vec4(v, 0.0, 1.0);
}
)";

static const std::string fragSource = R"(
#version 330
layout(location = 1) out vec4 outColor1;
void main()
{
    outColor1 = vec4(0.5, 0.5, 0.5, 1.0);
}
)";

int main()
{
    // Init
    if (!glfwInit()) {
        std::cerr << "Error: glfw init failed." << std::endl;
        return 3;
    }

    static const int width = 800;
    static const int height= 600;
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    GLFWwindow* window = nullptr;
    window = glfwCreateWindow(width, height, "Shader test", nullptr, nullptr);
    if (window == nullptr) {
        std::cerr << "Error: window is null." << std::endl;
        glfwTerminate();
        return 1;
    }
    glfwMakeContextCurrent(window);

    if (glewInit() != GLEW_OK) {
        std::cerr << "Error: glew not OK." << std::endl;
        glfwTerminate();
        return 2;
    }

    // Shader program
    const auto shaderProgram = createProgram(vertSource, fragSource);
    glUseProgram(shaderProgram);

    // Vertex buffer
    GLuint vao;
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);

    GLuint buffer;
    glGenBuffers(1, &buffer);
    glBindBuffer(GL_ARRAY_BUFFER, buffer);
    float bufferData[] = {
        -1.0f, -1.0f,
        1.0f, -1.0f,
        1.0f, 1.0f,
        -1.0f, -1.0f,
        1.0f, 1.0f,
        -1.0f, 1.0f
    };
    glBufferData(GL_ARRAY_BUFFER, std::size(bufferData) * sizeof(float), bufferData, GL_STATIC_DRAW);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (GLvoid*)(0));

    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);

    // Framebuffer
    GLuint fb, att[6];
    glGenTextures(6, att);
    glGenFramebuffers(1, &fb);

    glBindTexture(GL_TEXTURE_2D, att[0]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[1]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[2]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[3]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[4]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glBindTexture(GL_TEXTURE_2D, att[5]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);

    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fb);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, att[0], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, att[1], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, att[2], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT3, GL_TEXTURE_2D, att[3], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT4, GL_TEXTURE_2D, att[4], 0);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT5, GL_TEXTURE_2D, att[5], 0);

    GLenum dbs[] = {
        GL_NONE,
        GL_COLOR_ATTACHMENT1,
        GL_NONE,
        GL_NONE,
        GL_NONE,
        GL_NONE};
    glDrawBuffers(6, dbs);

    if (GL_FRAMEBUFFER_COMPLETE != glCheckFramebufferStatus(GL_DRAW_FRAMEBUFFER)) {
        std::cerr << "Error: framebuffer is incomplete." << std::endl;
        return 1;
    }
    if (error()) {
        std::cerr << "OpenGL error occurred." << std::endl;
        return 2;
    }

    // Fpsmeter
    static const uint32_t framesMax = 50;
    uint32_t framesCount = 0;
    auto start = std::chrono::steady_clock::now();

    // Main loop
    while (!glfwWindowShouldClose(window)) {
        if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS) glfwSetWindowShouldClose(window, GLFW_TRUE);

        glClear(GL_COLOR_BUFFER_BIT);
        for (int i = 0; i < 100; ++i) glDrawArrays(GL_TRIANGLES, 0, 6);
        glfwSwapBuffers(window);
        glfwPollEvents();

        if (++framesCount == framesMax) {
            framesCount = 0;
            const auto now = std::chrono::steady_clock::now();
            const auto duration = now - start;
            start = now;
            const float secsPerFrame = (std::chrono::duration_cast<std::chrono::microseconds>(duration).count() / 1000000.0f) / framesMax;
            std::cout << "FPS: " << 1.0f / secsPerFrame << std::endl;
        }
    }

    // Shutdown
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindVertexArray(0);
    glUseProgram(0);
    glDeleteProgram(shaderProgram);
    glDeleteBuffers(1, &buffer);
    glDeleteVertexArrays(1, &vao);
    glDeleteFramebuffers(1, &fb);
    glDeleteTextures(6, att);
    glfwMakeContextCurrent(nullptr);
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}

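The only difference between these examples is the color attachment used: the fragment shader output location and the non-GL_NONE entry passed to glDrawBuffers.

I intentionally wrote two almost identical copy-pasted programs to avoid possible side effects of deleting and re-creating the framebuffer.

UPD2: I also tried an OpenGL 4.6 debug context in my test examples on both Nvidia and AMD. There were no performance warnings.

UPD3: RX470 results: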

attachment0: 775 FPS
attachment1: 396 FPS

UPD4: I built the attachment0 and attachment1 tests for WebGL with emscripten and ran them on the Radeon RX550. The full source code is in the question's Github repo, and the build command lines are

emcc --std=c++17 -O3 -s WASM=1 -s USE_GLFW=3 -s USE_WEBGL2=1 ./FillRate_attachment0_webgl.cpp -o attachment0.html
emcc --std=c++17 -O3 -s WASM=1 -s USE_GLFW=3 -s USE_WEBGL2=1 ./FillRate_attachment1_webgl.cpp -o attachment1.html

Both test programs issue a single draw call: glDrawArraysInstanced(GL_TRIANGLES, 0, 6, 1000);

First test: Firefox in its default configuration, i.e. ANGLE with the DirectX backend.

Unmasked Vendor:    Google Inc.
Unmasked Renderer:  ANGLE (Radeon RX550/550 Series Direct3D11 vs_5_0 ps_5_0)

attachment0: 38 FPS
attachment1: 38 FPS

Second test: Firefox with ANGLE disabled (about:config -> webgl.disable-angle = true), using native OpenGL:

Unmasked Vendor:    ATI Technologies Inc.
Unmasked Renderer:  Radeon RX550/550 Series

attachment0: 38 FPS
attachment1: 19 FPS

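We see that DirectX is not affected by the issue, and that the OpenGL issue is reproducible in WebGL. This is the expected result, because gamers and developers complain only about OpenGL performance.

P.S. My issue may be the root cause of this and this performance drop.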

1 Answer

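The issue has been fixed by AMD as of (at least) the December 2019 driver. The fix has been confirmed with the test programs above and with our game engine's frame rate. See also the thread.

Dear AMD OpenGL driver team, thank you very much!

Answered on 2019-12-29T11:14:27.533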