c++ - readdir on AWS EFS does not return all files in a directory

Question

After writing many files to a series of folders on EFS (10k or so), readdir stops returning all of the files in each directory.

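I have a C++ application that generates a large number of files as part of its processing, and each file gets a symlink. Afterwards, I need to get a list of the files in a folder and then select a subset to rename. When I run the function that gets the file list, it does not return all of the files that actually exist. This code runs fine on my local machine, but on an AWS server with a mounted EFS drive it stops working after a while.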

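To troubleshoot this, I made my code write only one file at a time. I also set it up to call getFiles() and count how many files are in the folder after writing each batch of files (about 17 files). Once the folder reaches ~950 files, getFiles() starts listing ~910 files and the number no longer increases. The files being written vary but are fairly small (2 bytes to 300K), and about 200 files are written per second. A symlink is also created for each file.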

For reading and writing files I use POSIX open(), write(), read() and close(). I have verified that I do close every file after reading or writing it.
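The troubleshooting procedure described above can be sketched as a small standalone probe, which may make it easier to reproduce the behaviour outside the application. This is only a minimal sketch under stated assumptions: `writeSmallFile`, `countEntries` and `runProbe` are hypothetical names, not part of the original FileUtil class, and the file naming scheme is invented for illustration. The count deliberately ignores `d_type`, so it reports every entry the directory stream returns.

```cpp
#include <cassert>
#include <cerrno>
#include <dirent.h>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: writes `data` to `path` with POSIX calls only.
bool writeSmallFile(const std::string& path, const std::string& data) {
    int fd = open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;
    size_t written = 0;
    while (written < data.size()) {
        ssize_t n = write(fd, data.data() + written, data.size() - written);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            close(fd);
            return false;
        }
        written += static_cast<size_t>(n);
    }
    return close(fd) == 0;
}

// Counts every entry except "." and "..", whatever its type.
// Returns -1 if the directory cannot be opened or readdir() fails.
long countEntries(const std::string& dirPath) {
    DIR* dir = opendir(dirPath.c_str());
    if (dir == nullptr) return -1;
    long count = 0;
    for (;;) {
        errno = 0;  // readdir() returns NULL on end-of-stream and on error
        struct dirent* entry = readdir(dir);
        if (entry == nullptr) break;
        std::string name(entry->d_name);
        if (name != "." && name != "..") ++count;
    }
    bool failed = (errno != 0);
    closedir(dir);
    return failed ? -1 : count;
}

// Writes `batches` batches of `batchSize` files (plus one symlink per file)
// and re-counts the directory after each batch. Returns the number of
// batches whose count did not match, or -1 on an I/O error.
int runProbe(const std::string& dirPath, int batches, int batchSize) {
    assert(batches > 0 && batchSize > 0);
    if (mkdir(dirPath.c_str(), 0755) != 0 && errno != EEXIST) return -1;
    long expected = countEntries(dirPath);
    if (expected < 0) return -1;
    int mismatches = 0;
    for (int b = 0; b < batches; ++b) {
        for (int i = 0; i < batchSize; ++i) {
            std::string name = "f_" + std::to_string(b) + "_" + std::to_string(i);
            std::string file = dirPath + "/" + name;
            if (!writeSmallFile(file, "xx")) return -1;
            if (symlink(name.c_str(), (file + ".lnk").c_str()) != 0) return -1;
            expected += 2;  // one file plus one symlink
        }
        if (countEntries(dirPath) != expected) ++mismatches;
    }
    return mismatches;
}
```

Calling `runProbe` with a fresh directory on the EFS mount and on a local disk, for example with 60 batches of 17 files, should show whether the count diverges in the same ~950 entry range without any of the application's threads involved.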

I am trying to figure out: 1. Why is readdir not working, or why is it not listing all the files? 2. What is different about EFS that could cause the problem?

These are the functions I use to get the list of files in a folder:

DIR * FileUtil::getDirStream(std::string path) {
    if (!folderExists(path)) {
        return NULL;
    }

    DIR * dir = opendir(path.c_str());
    bool success = dir != NULL;

    // Retry with a linearly increasing delay, up to 7 more attempts.
    int count = 0;
    while (!success) {
        int fileRetryDelay = BlazingConfig::getInstance()->getFileRetryDelay();
        const int sleep_milliseconds = (count + 1) * fileRetryDelay;
        std::this_thread::sleep_for(std::chrono::milliseconds(sleep_milliseconds));

        std::cout << "Was unable to get Dir stream for " << path << std::endl;
        dir = opendir(path.c_str());
        success = dir != NULL;

        count++;
        if (count > 6) {
            break;
        }
    }

    // success is a bool, so it is tested directly; comparing it with -1 is never true.
    if (!success) {
        std::cout << "Can't get Dir stream for " << path << ". Error was: " << errno << std::endl;
    }
    return dir;
}

int FileUtil::getDirEntry(DIR * dirp, struct dirent * & prevDirEntry, struct dirent * & dirEntry) {
    if (dirp == NULL) {
        return -1;
    }

    // readdir_r returns 0 on success; dirEntry == NULL with a return code of 0 means end of stream.
    int returnCode = readdir_r(dirp, prevDirEntry, &dirEntry);
    bool success = (dirEntry == NULL && returnCode == 0) || dirEntry != NULL;

    // Retry with a linearly increasing delay, up to 7 more attempts.
    int count = 0;
    while (!success) {
        int fileRetryDelay = BlazingConfig::getInstance()->getFileRetryDelay();
        const int sleep_milliseconds = (count + 1) * fileRetryDelay;
        std::this_thread::sleep_for(std::chrono::milliseconds(sleep_milliseconds));

        std::cout << "Was unable to get dirent with readdir" << std::endl;

        returnCode = readdir_r(dirp, prevDirEntry, &dirEntry);
        success = (dirEntry == NULL && returnCode == 0) || dirEntry != NULL;

        count++;
        if (count > 6) {
            break;
        }
    }

    // success is a bool, so it is tested directly; comparing it with -1 is never true.
    // readdir_r reports the error number through its return value, not through errno.
    if (!success) {
        std::cout << "Can't get dirent with readdir. Error was: " << returnCode << std::endl;
    }
    return returnCode;
}

std::vector<std::string> FileUtil::getFiles(std::string baseFolder) {
    DIR *dir = getDirStream(baseFolder);
    std::vector<std::string> subFolders;
    if (dir != NULL) {
        struct dirent *prevDirEntry = NULL;
        struct dirent *dirEntry = NULL;
        // Buffer for readdir_r, sized for the longest name the file system allows.
        int len_entry = offsetof(struct dirent, d_name) + fpathconf(dirfd(dir), _PC_NAME_MAX) + 1;
        prevDirEntry = (struct dirent *)malloc(len_entry);

        int returnCode = getDirEntry(dir, prevDirEntry, dirEntry);

        while (dirEntry != NULL) {
            // Only entries reported as regular files or symlinks are kept.
            if (dirEntry->d_type == DT_REG || dirEntry->d_type == DT_LNK) {
                std::string name(dirEntry->d_name);
                subFolders.push_back(name);
            }
            returnCode = getDirEntry(dir, prevDirEntry, dirEntry);
        }

        free(prevDirEntry);
        closedir(dir);
    } else {
        std::cout << "Could not open directory err num is" << errno << std::endl;
        /* could not open directory */
        perror("");
    }

    return subFolders;
}

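The functions are written this way to be as robust as possible, since there can be many threads performing file operations and I want to be able to retry when anything fails. Unfortunately, when getFiles() returns the wrong result, it gives me no indication of any failure.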
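The retry pattern repeated in getDirStream() and getDirEntry() could be factored into a single helper that also tells the caller whether every attempt failed. The following is a minimal sketch, assuming a hypothetical `retryWithBackoff` function that is not part of the original code. It keeps the same linear backoff, where the delay is the attempt number times a base delay.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: calls `op` until it returns true or `maxAttempts`
// calls have been made. Sleeps baseDelay * attemptNumber between calls.
// Returns false when every attempt failed, so the caller can report it.
bool retryWithBackoff(const std::function<bool()>& op,
                      int maxAttempts,
                      std::chrono::milliseconds baseDelay) {
    assert(maxAttempts > 0);
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (op()) {
            return true;
        }
        // No point in sleeping after the final failed attempt.
        if (attempt + 1 < maxAttempts) {
            std::this_thread::sleep_for(baseDelay * (attempt + 1));
        }
    }
    return false;
}
```

With a helper like this, a lambda wrapping opendir() could be passed as `op`, and the boolean result propagated up through getFiles() so that an incomplete listing caused by an error is distinguishable from a complete one.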

Note: when I use readdir instead of readdir_r, I still get the same problem.
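One thing that may be worth ruling out: getFiles() filters on `d_type`, and the readdir man page notes that not every file system fills that field in, in which case it is `DT_UNKNOWN` and the entry would be silently skipped. Whether this is what happens on EFS is an assumption, not something confirmed here. The sketch below uses a hypothetical `listFilesAndLinks` function (not part of the original FileUtil class) that falls back to `fstatat()` for such entries and separates end-of-stream from a readdir() error by checking errno.

```cpp
#include <cerrno>
#include <dirent.h>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <vector>

// Hypothetical helper: appends the names of regular files and symlinks in
// `path` to `out`. Returns false if the directory cannot be opened or if
// readdir() reports an error; errno is left set in that case.
bool listFilesAndLinks(const std::string& path, std::vector<std::string>& out) {
    DIR* dir = opendir(path.c_str());
    if (dir == nullptr) {
        return false;
    }

    bool ok = true;
    for (;;) {
        errno = 0;  // readdir() returns NULL on end-of-stream and on error
        struct dirent* entry = readdir(dir);
        if (entry == nullptr) {
            ok = (errno == 0);
            break;
        }

        unsigned char type = entry->d_type;
        if (type == DT_UNKNOWN) {
            // The file system did not report a type, so ask for it
            // explicitly without following symlinks.
            struct stat st;
            if (fstatat(dirfd(dir), entry->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0) {
                continue;  // entry disappeared between readdir() and stat
            }
            if (S_ISREG(st.st_mode)) {
                type = DT_REG;
            } else if (S_ISLNK(st.st_mode)) {
                type = DT_LNK;
            }
        }

        if (type == DT_REG || type == DT_LNK) {
            out.push_back(entry->d_name);
        }
    }

    int savedErrno = errno;  // closedir() may overwrite errno
    closedir(dir);
    errno = savedErrno;
    return ok;
}
```

Comparing the size of this listing with what getFiles() returns for the same directory would show whether the missing ~40 entries are being dropped by the type filter or are absent from the directory stream itself.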
