c++ - Split a string into multiple strings on multiple delimiters without removing them?

Question

I use the Boost framework, so it might help here, but I haven't found the function I need.

For an ordinary quick split, I can use:

string str = ...;
vector<string> strs;
boost::split(strs, str, boost::is_any_of("mM"));

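But it removes the m and M characters.

I also can't simply use a regular expression, because it searches the string for the longest match of the defined pattern.

PS There are many similar questions, but they only describe this implementation in other programming languages.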

score 3 · Accepted Answer

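Untested, but instead of using vector<string>, you could try a vector<boost::iterator_range<std::string::iterator>>, so that you get a pair of iterators into the main string for each token. Then iterate from (start of the range - 1 [as long as the start of the range is not begin() of the main string]) to the end of the range.

Edit: here is an example: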

#include <iostream>
#include <iterator>
#include <string>
#include <vector>

#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
#include <boost/range/iterator_range.hpp>

int main(void)
{
  std::string str = "FooMBarMSFM";

  std::vector<boost::iterator_range<std::string::iterator>> tokens;

  boost::split(tokens, str, boost::is_any_of("mM"));

  for(auto r : tokens)
  {
    std::string b(r.begin(), r.end());
    std::cout << b << std::endl;
    if (r.begin() != str.begin())
    {
      std::string bm(std::prev(r.begin()), r.end());
      std::cout << "With token: [" << bm << "]" << std::endl;
    }
  }
}
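The same idea, starting each piece at the delimiter that precedes it, can also be sketched without Boost. This is only a minimal sketch, not the code from the answer above: `split_keep_delims` is a hypothetical name, and it relies on nothing but `std::string::find_first_of`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` before each character found in `delims`,
// keeping the delimiter at the front of the piece it starts.
std::vector<std::string> split_keep_delims(const std::string& text,
                                           const std::string& delims)
{
  std::vector<std::string> pieces;
  std::size_t start = 0;
  while (start < text.size())
  {
    // Search from start + 1 so that a delimiter sitting at `start`
    // stays inside the current piece.
    std::size_t next = text.find_first_of(delims, start + 1);
    if (next == std::string::npos)
      next = text.size();
    pieces.push_back(text.substr(start, next - start));
    start = next;
  }
  return pieces;
}
```

Called as split_keep_delims("FooMBarMSFM", "mM"), it returns the pieces Foo, MBar, MSF and M. An empty input gives an empty vector.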

score 1 · Accepted Answer

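What you need goes beyond the usual notion of split. If you want to keep the 'm' or 'M', you can write a special split using strstr, strchr, strtok, or a find function. You can change some of the code to produce a flexible split function. Here is an example: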

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void split(char *src, const char *separator, char **dest, int *num)
{
    char *pNext;
    int count = 0;

    if (src == NULL || strlen(src) == 0) return;
    if (separator == NULL || strlen(separator) == 0) return; 

    pNext = strtok(src,separator);

    while(pNext != NULL)
    {
        *dest++ = pNext;
        ++count;
        pNext = strtok(NULL,separator);
    }

    *num = count;
}

Also, you could try boost::regex.
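As a rough illustration of the regex route, here is a sketch that uses `std::regex` from the standard library in place of boost::regex (a substitution made for this example, not something the answer specifies). The helper name `regex_split_keep` and the pattern are assumptions: each match is either a delimiter followed by everything up to the next delimiter, or a leading run that contains no delimiter, so a match can never run past the next m or M.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every match of the pattern as one piece.
// "[mM][^mM]*" is a delimiter plus the text up to the next delimiter;
// "[^mM]+" covers text at the start that has no delimiter in front of it.
std::vector<std::string> regex_split_keep(const std::string& text)
{
  static const std::regex piece("[mM][^mM]*|[^mM]+");
  std::vector<std::string> pieces;
  for (std::sregex_iterator it(text.begin(), text.end(), piece), end;
       it != end; ++it)
  {
    pieces.push_back(it->str());
  }
  return pieces;
}
```

For "FooMBarMSFM" this returns Foo, MBar, MSF and M. The delimiter set is hard-coded in the pattern here; building the character class from user input would need escaping.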

score 0 · Accepted Answer

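My current solution is as follows (but it isn't general and looks too complicated).

I chose a character that cannot appear in this string. In my case, it is "|".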

string str = ...;
vector<string> strs;
boost::split(strs, str, boost::is_any_of("m"));
str = boost::join(strs, "|m");
boost::split(strs, str, boost::is_any_of("M"));
str = boost::join(strs, "|M");
if (boost::iequals(str.substr(0, 1), "|")) {
    str = str.substr(1);
}
boost::split(strs, str, boost::is_any_of("|"));

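I add "|" before every m/M character, except at the first position of the string. Then I split the string into substrings on "|", which removes this extra character.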
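The marker trick described above can be generalized to any delimiter set in a single pass. The following is a sketch under assumptions: it uses only the standard library instead of boost::split and boost::join, `split_with_marker` is a hypothetical name, and the caller must still guarantee that `marker` never occurs in the input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: puts `marker` in front of every delimiter (except one
// at position 0), then splits on the marker so only the marker is dropped.
std::vector<std::string> split_with_marker(const std::string& text,
                                           const std::string& delims,
                                           char marker)
{
  // Step 1: copy the text, inserting the marker before each delimiter.
  std::string marked;
  for (std::size_t i = 0; i < text.size(); ++i)
  {
    if (i > 0 && delims.find(text[i]) != std::string::npos)
      marked += marker;
    marked += text[i];
  }

  // Step 2: split on the marker.
  std::vector<std::string> pieces;
  std::size_t start = 0;
  while (true)
  {
    std::size_t pos = marked.find(marker, start);
    if (pos == std::string::npos)
    {
      pieces.push_back(marked.substr(start));
      break;
    }
    pieces.push_back(marked.substr(start, pos - start));
    start = pos + 1;
  }
  return pieces;
}
```

With "FooMBarMSFM", delimiters "mM" and marker '|', the intermediate string is Foo|MBar|MSF|M and the result is Foo, MBar, MSF and M. An empty input yields a single empty piece.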
