
My requirement is to write a never-ending stream of incoming variable-sized binary messages to the file system. Messages averaging 2 KB in size arrive at a rate of 1000 messages/second, so over one hour the total comes to 3600 * 1000 * 2 KB ≈ 6.8 GB. The main purposes of the messages are: 1. archiving for audit, 2. providing a search interface.

My questions are:

  1. Is there any open-source software that solves this problem?
  2. What kind of errors can occur if the process writes in multiples of the block size and crashes in the middle of writing a block?
  3. What kind of errors can occur if the application has written a block but the file system has not yet flushed the data to disk?
  4. Can the inode get corrupted under any circumstances?
  5. Is there a file-size limit in Linux?
  6. Is there an ideal file size? What are the pros and cons of large files (GBs) versus medium files (MBs)?
  7. Anything else to watch out for?
  8. My preference is C++, but I can switch to C if required.

4 Answers


Once write or writev returns (i.e. the OS has accepted it), the operating system is responsible for writing the data to disk. It's not your problem any more, and it happens irrespective of whether your process crashes. Note that you have no control over the exact amount of data accepted or actually written at a time, nor over whether it happens in multiples of filesystem blocks or any particular size at all. You send a request to write, the OS tells you how much it actually accepted, and it will write that to disk at its own discretion.
Probably this will happen in multiples of the block size, because it makes sense for the OS to do that, but this is not guaranteed in any way (on many systems, Linux included, reading and writing are implemented via, or tightly coupled with, file mapping).
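
As a rough sketch of that contract - assuming a descriptor opened with O_APPEND and a hypothetical append_message helper - handling short writes could look like this:

    // Sketch: appending one message with write(2), retrying after short
    // writes and EINTR. The fd is assumed to be opened with O_APPEND;
    // "msg"/"len" are hypothetical names.
    #include <cerrno>
    #include <cstddef>
    #include <unistd.h>

    bool append_message(int fd, const char* msg, std::size_t len) {
        while (len > 0) {
            ssize_t n = ::write(fd, msg, len);   // may accept fewer bytes than asked
            if (n < 0) {
                if (errno == EINTR) continue;    // interrupted before anything was written
                return false;                    // real error (ENOSPC, EIO, ...)
            }
            msg += n;                            // advance past what the kernel accepted
            len -= static_cast<std::size_t>(n);
        }
        return true;                             // kernel has accepted all bytes and will
                                                 // write them to disk at its own pace
    }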

The same "don't have to care" guarantee holds for file mapping (with the theoretical exception that a crashing application could in principle still write into a still mapped area, but once you've unmapped an area, that cannot happen even theoretically). Unless you pull the plug (or the kernel crashes), data will be written, and consistently.
Data will only ever be written in multiples of filesystem blocks, because memory pages are multiples of device blocks, and file mapping does not know anything else, it just works that way.
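
A minimal sketch of appending through a mapping, assuming the descriptor was opened O_RDWR and that the caller tracks the current file size (all names here are hypothetical):

    // Sketch: grow the file with ftruncate(), map the (page-aligned) tail,
    // copy the message in, and unmap. After munmap the kernel owns the
    // dirty pages and writes them out on its own schedule.
    #include <cstddef>
    #include <cstring>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <unistd.h>

    bool append_via_mmap(int fd, off_t file_size, const char* msg, std::size_t len) {
        const long page = sysconf(_SC_PAGESIZE);
        off_t map_off = (file_size / page) * page;                    // page-aligned start
        std::size_t map_len = static_cast<std::size_t>(file_size - map_off) + len;

        if (ftruncate(fd, file_size + static_cast<off_t>(len)) != 0)
            return false;                                             // grow the file first

        void* p = mmap(nullptr, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, map_off);
        if (p == MAP_FAILED)
            return false;

        std::memcpy(static_cast<char*>(p) + (file_size - map_off), msg, len);
        return munmap(p, map_len) == 0;
    }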

You can get some degree of control over what's on the disk with fdatasync (neglecting any possible unbuffered on-disk write cache). When that function returns, whatever was in the buffers before has been sent to the disk.
However, that still doesn't prevent your process from crashing in another thread in the meantime, and it doesn't prevent someone from pulling the plug. fdatasync is preferable over fsync since it doesn't touch anything near the inode, meaning it's faster and safer (you may lose the last data written in a subsequent crash, since the file length has not been updated yet, but you should never destroy/corrupt the whole file).
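
A small sketch of bounding the loss window with fdatasync, assuming a hypothetical per-N-messages policy (the interval is a tuning knob, not something prescribed here):

    // Sketch: call fdatasync() every "sync_every" messages to bound how much
    // recently written data a power failure could cost.
    #include <unistd.h>

    void maybe_sync(int fd, unsigned long messages_written, unsigned long sync_every = 1000) {
        if (messages_written % sync_every == 0) {
            // Blocks until data previously accepted by write()/writev() has been
            // handed to the device; does not force an inode (metadata) update.
            fdatasync(fd);
        }
    }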

C library functions (fwrite) do their own buffering and give you control over the amount of data you write, but having "written" data only means it is stored in a buffer owned by the C library (in your process). If the process dies, the data is gone. You have no control over how, or whether, the data ever hits the disk. (N.B.: you do have some control insofar as you can fflush, which immediately passes the contents of the buffers to the underlying write function, most likely writev, before returning. With that, you're back at the first paragraph.)
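
A sketch of that layering with stdio, assuming hypothetical names; fwrite only fills the library's buffer, fflush hands it to the kernel:

    // Sketch: buffered stdio output. fwrite() copies into a buffer inside the
    // process; fflush() pushes the buffered bytes to the kernel via the
    // underlying write call, after which the OS guarantees above apply.
    #include <cstddef>
    #include <cstdio>

    bool buffered_append(std::FILE* fp, const char* msg, std::size_t len) {
        if (std::fwrite(msg, 1, len, fp) != len)   // data now lives in a stdio buffer
            return false;
        return std::fflush(fp) == 0;               // hand it to the kernel
    }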

Asynchronous IO (kernel aio) will bypass kernel buffers and usually pull the data directly from your process. If your process dies, your data is gone. Glibc aio uses threads that block on write, so the same as in the first paragraph applies.
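
A sketch of a single glibc (POSIX) aio request, assuming the caller keeps the buffer alive until completion and tracks the offset itself (names are hypothetical):

    // Sketch: one POSIX aio write. Glibc services this with a background
    // thread that calls write(), so the buffer must stay valid until the
    // request completes. Link with -lrt on older glibc.
    #include <aio.h>
    #include <cerrno>
    #include <sys/types.h>

    ssize_t aio_append(int fd, const char* msg, std::size_t len, off_t offset) {
        aiocb cb{};
        cb.aio_fildes = fd;
        cb.aio_buf    = const_cast<char*>(msg);
        cb.aio_nbytes = len;
        cb.aio_offset = offset;                     // explicit offset for this request
        if (aio_write(&cb) != 0)
            return -1;
        const aiocb* list[1] = { &cb };
        while (aio_error(&cb) == EINPROGRESS)
            aio_suspend(list, 1, nullptr);          // wait for completion
        return aio_return(&cb);                     // bytes written, or -1 on error
    }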

What happens if you pull the plug or hit the "off" switch at any time? Nobody knows.
Usually some data will be lost; an operating system can give many guarantees, but it can't do magic. In theory you might have a system with battery-backed RAM, or one with a huge, battery-backed dedicated disk cache, but nobody can tell. In any case, plan for losing data.
That said, what has once been written should not normally get corrupted if you keep appending to a file (though really anything can happen, and "should not" does not mean a lot).

All in all, using either write in append mode or file mapping should be good enough; they're as good as you can get anyway. Other than under sudden power loss, they're reliable and efficient.
If power failure is an issue, a UPS will give better guarantees than any software solution can provide.

As for file sizes, I don't see any reason to artificially limit them (assuming a reasonably new filesystem). Usual file-size limits for "standard" Linux filesystems (if there is any such thing) are in the terabyte range.
Either way, if you feel uneasy with the idea that corrupting one file for whatever reason could destroy 30 days' worth of data, start a new file once every day. It doesn't cost extra.
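
A sketch of daily rotation, assuming a hypothetical naming scheme of one file per UTC day:

    // Sketch: open (or create) today's archive file in append mode, so a
    // single corrupted file can cost at most one day of data. The directory
    // and name pattern are hypothetical.
    #include <ctime>
    #include <fcntl.h>
    #include <string>

    int open_todays_file(const std::string& dir) {
        std::time_t now = std::time(nullptr);
        std::tm tm{};
        gmtime_r(&now, &tm);                        // one file per UTC day
        char name[32];
        std::strftime(name, sizeof name, "%Y-%m-%d.bin", &tm);
        std::string path = dir + "/" + name;
        // O_APPEND: every write() goes atomically to the current end of file.
        return ::open(path.c_str(), O_WRONLY | O_CREAT | O_APPEND, 0644);
    }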

answered 2012-04-18T09:30:44.743

The problem here is that the scenario is not described precisely, so some of the answers are guesses:

  1. Yes - it is called "g++". ;-)
  2. Many different ones. IMHO, avoid them by writing good and plentiful test cases for your program.
  3. Depending on your system and program, writing "only" into a memory buffer is the normal way of handling it. There should be no problem.
  4. That depends on the failure scenario and the file system used. (There are also file systems without inodes.)
  5. Every file system has its own file-size limit. The correct answer (probably useless to you) is: yes.
  6. No - it depends heavily on your application and your environment (hard disks, backup system, IO system, ...).
  7. More information is needed to answer this.
  8. Not a problem.

Hope this helps as a first step. Once you have decided which direction to go, add information to the question - the more requirements you give, the better the answers will be.

answered 2012-04-18T06:27:54.987

You have an interesting problem at hand. I am not an expert in these areas, but I have enough knowledge to comment on them.

If you haven't already, you can read this for a general overview of the various Linux file systems, their pros and cons, limits, etc.: Comparison of file systems in Linux

1) I have come across auto-rotating log-file libraries in Python/Perl, and they are available in C/C++ as well. 2/3/4) Journaling file systems protect against file-system corruption to a large extent. They also support journaling of data, but that is not used much.

Check this for more information on journaling.

answered 2012-04-18T06:53:35.470

You should use SQLite. It solves everything you need, including speed, if you use the database correctly.
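
A sketch of what "using it correctly" might involve - WAL mode, a prepared insert statement, and batching - assuming a hypothetical messages table:

    // Sketch: archive messages into SQLite using WAL mode, a prepared
    // statement and batched transactions. Table/column names are hypothetical.
    #include <cstddef>
    #include <sqlite3.h>

    bool archive_message(sqlite3* db, sqlite3_stmt* insert_stmt,
                         const void* msg, std::size_t len) {
        sqlite3_bind_blob(insert_stmt, 1, msg, static_cast<int>(len), SQLITE_TRANSIENT);
        bool ok = (sqlite3_step(insert_stmt) == SQLITE_DONE);
        sqlite3_reset(insert_stmt);                // reuse the statement for the next message
        return ok;
    }

    // One-time setup could look like:
    //   sqlite3_open("archive.db", &db);
    //   sqlite3_exec(db, "PRAGMA journal_mode=WAL;"
    //                    "CREATE TABLE IF NOT EXISTS messages(id INTEGER PRIMARY KEY,"
    //                    " body BLOB NOT NULL);", nullptr, nullptr, nullptr);
    //   sqlite3_prepare_v2(db, "INSERT INTO messages(body) VALUES(?)", -1, &insert_stmt, nullptr);
    // and wrapping every few thousand inserts in BEGIN/COMMIT helps throughput.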

answered 2016-10-23T20:55:38.420