I have a problem with regex: I want it to treat multiple buffers (well, in this case, one) as one string.

Let's say I download a file and want to search it for a specific string, say "foobar". I don't know what the file size will be, and I don't want to allocate a huge buffer of a couple of megabytes for HTML code.

So the idea is that I have a small buffer, let's say 64 bytes. Suppose we write a chunk of the file into it and the char array looks like this:

.............foobar.............

Everything seems fine. But what if the array looked like this:

.............................foo

And after the next chunk is written, it becomes:

bar.............................

The problem is self-explanatory: regex will not find the string when the two chunks are checked separately. We could allocate a buffer big enough to hold the whole page at once, but that's a huge waste.

So, I have an idea: split buffers. Let's say that after the first write we have these buffers:

 ............................foo
 ------------------------------- // this one is empty

Then, after the second write, we get this:

 ............................foo
 bar............................

Now, if there were a regex function that would treat these buffers as one, that would be great. I could simply keep alternating the buffers and pull out the strings I want without allocating a lot of RAM.

Is there a C++ regex library that would do that? Any ideas?

2 Answers

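std::regex_match (available since C++11, see the header <regex>) has an interface that takes a pair of iterators delimiting the "string" to search. You could create your own iterator class that simply iterates over the collection of buffers in sequence.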
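
A minimal sketch of what such an iterator might look like, assuming exactly two buffers held as std::string objects. The names TwoBufferIterator and search_across are hypothetical and not part of any library. The sketch calls std::regex_search rather than std::regex_match: both accept a pair of bidirectional iterators, but regex_match only succeeds if the whole sequence matches, while regex_search finds a match anywhere in it.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical bidirectional iterator that walks two buffers as if they
// were one contiguous character sequence (first, then second).
class TwoBufferIterator {
public:
    using iterator_category = std::bidirectional_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = const char*;
    using reference = const char&;

    TwoBufferIterator() = default;
    TwoBufferIterator(const std::string* first, const std::string* second,
                      std::size_t pos)
        : first_(first), second_(second), pos_(pos) {}

    // Positions below first_->size() live in the first buffer,
    // everything after that lives in the second one.
    reference operator*() const {
        return pos_ < first_->size() ? (*first_)[pos_]
                                     : (*second_)[pos_ - first_->size()];
    }
    pointer operator->() const { return &**this; }

    TwoBufferIterator& operator++() { ++pos_; return *this; }
    TwoBufferIterator operator++(int) { auto old = *this; ++pos_; return old; }
    TwoBufferIterator& operator--() { --pos_; return *this; }
    TwoBufferIterator operator--(int) { auto old = *this; --pos_; return old; }

    friend bool operator==(const TwoBufferIterator& a,
                           const TwoBufferIterator& b) {
        return a.first_ == b.first_ && a.second_ == b.second_ &&
               a.pos_ == b.pos_;
    }
    friend bool operator!=(const TwoBufferIterator& a,
                           const TwoBufferIterator& b) {
        return !(a == b);
    }

private:
    const std::string* first_ = nullptr;
    const std::string* second_ = nullptr;
    std::size_t pos_ = 0;
};

// Returns true if `pattern` matches anywhere in the logical
// concatenation first + second, without copying either buffer.
bool search_across(const std::string& first, const std::string& second,
                   const std::regex& pattern) {
    TwoBufferIterator begin(&first, &second, 0);
    TwoBufferIterator end(&first, &second, first.size() + second.size());
    return std::regex_search(begin, end, pattern);
}
```

With the buffers from the question, a call such as search_across(".....foo", "bar.....", std::regex("foobar")) would see the two halves as one sequence. Extending the idea to more than two buffers would need an index into a list of buffers instead of the single split point used here.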

Answered 2013-05-18T10:10:04.860

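After each alternation of the buffers, you can concatenate the two buffers into a temporary buffer and then search that with the regex, i.e. once for each pair of reads.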
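
A rough sketch of this approach, assuming chunks arrive as std::string objects. The class name PairwiseSearcher and its feed method are hypothetical. Each call searches the previous chunk joined with the current one, so the temporary buffer is never longer than two chunks.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: remembers the previous chunk and searches the
// concatenation of (previous chunk + current chunk) on every call.
class PairwiseSearcher {
public:
    explicit PairwiseSearcher(const std::string& pattern)
        : pattern_(pattern) {}

    // Feed the next chunk. Returns true if the pattern occurs somewhere
    // in the previous chunk followed by this chunk.
    bool feed(const std::string& chunk) {
        std::string window = previous_ + chunk;  // temporary joined buffer
        previous_ = chunk;                       // keep for the next call
        return std::regex_search(window, pattern_);
    }

private:
    std::regex pattern_;
    std::string previous_;
};
```

Two limitations of this sketch are worth keeping in mind: a match that spans more than two chunks is missed, and a match that lies entirely inside one chunk is reported on two consecutive calls, because that chunk appears in two windows. For a simple "is it there" check the second point is harmless, but it matters if matches are being counted.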

Answered 2013-05-18T10:16:57.077