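Suppose I have this:
"foo bar 1 and foo bar 2"
How can I split it into:
foo bar 1
foo bar 2
?
I tried strtok() and strsep(), but neither worked. They don't recognize "and" as a delimiter; they treat "a", "n", and "d" as separate delimiter characters.
Is there a function that can help with this, or do I have to split on spaces and do some string manipulation?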
You can use strstr() to find the first "and" and "tokenize" the string yourself: skip forward by the length of the separator, then search again.
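
To make the suggestion concrete, here is a minimal sketch of that loop. The name next_token and its cursor parameter are hypothetical, chosen for illustration only. Like strtok(), it writes '\0' into the input, so it assumes the caller passes a writable buffer rather than a string literal.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: like strtok(), but `sep` is a whole substring.
   Overwrites the start of each separator with '\0' and advances *cursor.
   Returns NULL once the input is exhausted. */
char *next_token(char **cursor, const char *sep)
{
    char *token = *cursor;
    if (token == NULL)
        return NULL;

    size_t seplen = strlen(sep);
    /* An empty separator is treated as "no separator found". */
    char *hit = (seplen > 0) ? strstr(token, sep) : NULL;
    if (hit == NULL) {
        *cursor = NULL;           /* last token: nothing left after it */
    } else {
        *hit = '\0';              /* terminate the current token */
        *cursor = hit + seplen;   /* skip past the whole separator */
    }
    return token;
}
```

Called repeatedly on a writable copy such as char text[] = "foo bar 1 and foo bar 2" with the separator " and ", this should yield foo bar 1, then foo bar 2, then NULL.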
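The main problem with splitting strings in C is that it inevitably involves some dynamic memory management, which the standard library avoids where it can. That is why hardly any standard C function allocates memory for you; that job is left to malloc/calloc/realloc.
But doing it yourself is not too hard. Let me walk you through it.
We need to return several strings, and the simplest way to do that is to return an array of pointers to strings, terminated by a NULL entry. Apart from the final NULL, every element of the array points to a dynamically allocated string.
First we need a few helper functions for dealing with such arrays. The simplest one counts the strings (the elements before the final NULL):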
/* Return length of a NULL-delimited array of strings. */
size_t str_array_len(char **array)
{
    size_t len;
    for (len = 0; array[len] != NULL; ++len)
        continue;
    return len;
}
Another simple one is the function that frees the array:
/* Free a dynamic array of dynamic strings. */
void str_array_free(char **array)
{
    if (array == NULL)
        return;
    for (size_t i = 0; array[i] != NULL; ++i)
        free(array[i]);
    free(array);
}
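More complicated is the function that appends a copy of a string to the array. It has to handle a few special cases, such as the array not existing yet (the whole array is NULL). It also has to accept strings that are not '\0'-terminated, so that the actual splitting function can easily append just part of the input string.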
/* Append an item to a dynamically allocated array of strings. On failure,
   return NULL, in which case the original array is intact. The item
   string is dynamically copied. If the array is NULL, allocate a new
   array. Otherwise, extend the array. Make sure the array is always
   NULL-terminated. Input string might not be '\0'-terminated. */
char **str_array_append(char **array, size_t nitems, const char *item,
                        size_t itemlen)
{
    /* Make a dynamic copy of the item. */
    char *copy;
    if (item == NULL)
        copy = NULL;
    else {
        copy = malloc(itemlen + 1);
        if (copy == NULL)
            return NULL;
        memcpy(copy, item, itemlen);
        copy[itemlen] = '\0';
    }
    /* Extend array with one element. Except extend it by two elements,
       in case it did not yet exist. This might mean it is a teeny bit
       too big, but we don't care. */
    array = realloc(array, (nitems + 2) * sizeof(array[0]));
    if (array == NULL) {
        free(copy);
        return NULL;
    }
    /* Add copy of item to array, and return it. */
    array[nitems] = copy;
    array[nitems+1] = NULL;
    return array;
}
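That was a mouthful. For really good style, it would be better to move the making of the dynamic copy of the input item into a function of its own, but I will leave that as an exercise for the reader.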
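
For readers who want a starting point for that exercise, one possible shape for the copy helper is sketched below. The name str_copy_n is hypothetical and not part of the answer's code. On POSIX systems strndup() is similar, though it stops at an embedded '\0' while this sketch always copies exactly len bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated, '\0'-terminated copy of
   the first `len` bytes of `s`, or NULL if allocation fails. The caller
   frees the result. `s` itself need not be '\0'-terminated. */
char *str_copy_n(const char *s, size_t len)
{
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

With a helper like this, the append function would only need to call it when item is not NULL and then deal with the array itself.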
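Finally, we have the actual splitting function. It, too, has to handle some special cases:
If a separator sits right at the beginning or end of the input string, or right next to another separator, I chose to add an empty string to the result. If you need something else, you will have to adjust the code.
Apart from the special cases and some error handling, the splitting is now fairly straightforward.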
/* Split a string into substrings. Return dynamic array of dynamically
   allocated substrings, or NULL if there was an error. An empty
   separator is treated as an error. Caller is expected to free the
   memory, for example with str_array_free. */
char **str_split(const char *input, const char *sep)
{
    size_t nitems = 0;
    char **array = NULL;
    const char *start = input;
    const char *next;
    size_t seplen = strlen(sep);
    const char *item;
    size_t itemlen;

    /* An empty separator would match at every position without ever
       advancing, so reject it instead of looping forever. */
    if (seplen == 0)
        return NULL;

    for (;;) {
        next = strstr(start, sep);
        if (next == NULL) {
            /* Add the remaining string (or empty string, if input ends
               with separator). */
            char **new = str_array_append(array, nitems, start, strlen(start));
            if (new == NULL) {
                str_array_free(array);
                return NULL;
            }
            array = new;
            ++nitems;
            break;
        } else if (next == input) {
            /* Input starts with separator. */
            item = "";
            itemlen = 0;
        } else {
            item = start;
            itemlen = (size_t)(next - item);
        }
        char **new = str_array_append(array, nitems, item, itemlen);
        if (new == NULL) {
            str_array_free(array);
            return NULL;
        }
        array = new;
        ++nitems;
        start = next + seplen;
    }
    if (nitems == 0) {
        /* Input does not contain separator at all. */
        assert(array == NULL);
        array = str_array_append(array, nitems, input, strlen(input));
    }
    return array;
}
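
If the empty strings produced for leading, trailing, or adjacent separators are not wanted, one way to adjust the behavior without touching the splitter is to filter the result afterwards. The sketch below assumes the NULL-terminated array layout described earlier; the name str_array_drop_empty is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: remove empty strings from a NULL-terminated array of
   malloc'ed strings, in place. Frees the removed strings and returns the
   number of strings kept. */
size_t str_array_drop_empty(char **array)
{
    size_t kept = 0;
    if (array == NULL)
        return 0;
    for (size_t i = 0; array[i] != NULL; ++i) {
        if (array[i][0] == '\0')
            free(array[i]);           /* empty item: discard it */
        else
            array[kept++] = array[i]; /* keep, compacting toward the front */
    }
    array[kept] = NULL;               /* restore the terminator */
    return kept;
}
```

The array keeps its original allocation, so it can still be released with the same free routine as before.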
Here is the complete program. It also includes a main function that runs a few test cases.
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Append an item to a dynamically allocated array of strings. On failure,
   return NULL, in which case the original array is intact. The item
   string is dynamically copied. If the array is NULL, allocate a new
   array. Otherwise, extend the array. Make sure the array is always
   NULL-terminated. Input string might not be '\0'-terminated. */
char **str_array_append(char **array, size_t nitems, const char *item,
                        size_t itemlen)
{
    /* Make a dynamic copy of the item. */
    char *copy;
    if (item == NULL)
        copy = NULL;
    else {
        copy = malloc(itemlen + 1);
        if (copy == NULL)
            return NULL;
        memcpy(copy, item, itemlen);
        copy[itemlen] = '\0';
    }
    /* Extend array with one element. Except extend it by two elements,
       in case it did not yet exist. This might mean it is a teeny bit
       too big, but we don't care. */
    array = realloc(array, (nitems + 2) * sizeof(array[0]));
    if (array == NULL) {
        free(copy);
        return NULL;
    }
    /* Add copy of item to array, and return it. */
    array[nitems] = copy;
    array[nitems+1] = NULL;
    return array;
}
/* Free a dynamic array of dynamic strings. */
void str_array_free(char **array)
{
    if (array == NULL)
        return;
    for (size_t i = 0; array[i] != NULL; ++i)
        free(array[i]);
    free(array);
}
/* Split a string into substrings. Return dynamic array of dynamically
   allocated substrings, or NULL if there was an error. An empty
   separator is treated as an error. Caller is expected to free the
   memory, for example with str_array_free. */
char **str_split(const char *input, const char *sep)
{
    size_t nitems = 0;
    char **array = NULL;
    const char *start = input;
    const char *next;
    size_t seplen = strlen(sep);
    const char *item;
    size_t itemlen;

    /* An empty separator would match at every position without ever
       advancing, so reject it instead of looping forever. */
    if (seplen == 0)
        return NULL;

    for (;;) {
        next = strstr(start, sep);
        if (next == NULL) {
            /* Add the remaining string (or empty string, if input ends
               with separator). */
            char **new = str_array_append(array, nitems, start, strlen(start));
            if (new == NULL) {
                str_array_free(array);
                return NULL;
            }
            array = new;
            ++nitems;
            break;
        } else if (next == input) {
            /* Input starts with separator. */
            item = "";
            itemlen = 0;
        } else {
            item = start;
            itemlen = (size_t)(next - item);
        }
        char **new = str_array_append(array, nitems, item, itemlen);
        if (new == NULL) {
            str_array_free(array);
            return NULL;
        }
        array = new;
        ++nitems;
        start = next + seplen;
    }
    if (nitems == 0) {
        /* Input does not contain separator at all. */
        assert(array == NULL);
        array = str_array_append(array, nitems, input, strlen(input));
    }
    return array;
}
/* Return length of a NULL-delimited array of strings. */
size_t str_array_len(char **array)
{
    size_t len;
    for (len = 0; array[len] != NULL; ++len)
        continue;
    return len;
}
#define MAX_OUTPUT 20
int main(void)
{
    struct {
        const char *input;
        const char *sep;
        char *output[MAX_OUTPUT];
    } tab[] = {
        /* Input is empty string. Output should be a list with an empty
           string. */
        {
            "",
            "and",
            {
                "",
                NULL,
            },
        },
        /* Input is exactly the separator. Output should be two empty
           strings. */
        {
            "and",
            "and",
            {
                "",
                "",
                NULL,
            },
        },
        /* Input is non-empty, but does not have separator. Output should
           be the same string. */
        {
            "foo",
            "and",
            {
                "foo",
                NULL,
            },
        },
        /* Input is non-empty, and does have separator. */
        {
            "foo bar 1 and foo bar 2",
            " and ",
            {
                "foo bar 1",
                "foo bar 2",
                NULL,
            },
        },
    };
    const int tab_len = sizeof(tab) / sizeof(tab[0]);
    bool errors;
    errors = false;
    for (int i = 0; i < tab_len; ++i) {
        printf("test %d\n", i);
        char **output = str_split(tab[i].input, tab[i].sep);
        if (output == NULL) {
            fprintf(stderr, "output is NULL\n");
            errors = true;
            break;
        }
        size_t num_output = str_array_len(output);
        printf("num_output %lu\n", (unsigned long) num_output);
        size_t num_correct = str_array_len(tab[i].output);
        if (num_output != num_correct) {
            fprintf(stderr, "wrong number of outputs (%lu, not %lu)\n",
                    (unsigned long) num_output, (unsigned long) num_correct);
            errors = true;
        } else {
            for (size_t j = 0; j < num_output; ++j) {
                if (strcmp(tab[i].output[j], output[j]) != 0) {
                    fprintf(stderr, "output[%lu] is '%s' not '%s'\n",
                            (unsigned long) j, output[j], tab[i].output[j]);
                    errors = true;
                    break;
                }
            }
        }
        str_array_free(output);
        printf("\n");
    }
    if (errors)
        return EXIT_FAILURE;
    return 0;
}
Here's a nice short example I just wrote, showing how to use strstr to split a string on a given string:
#include <string.h>
#include <stdio.h>
void split(const char *phrase, const char *delimiter)
{
    const char *loc = strstr(phrase, delimiter);
    if (loc == NULL)
    {
        printf("Could not find delimiter\n");
    }
    else
    {
        char buf[256]; /* malloc would be much more robust here */
        size_t length = strlen(delimiter);
        size_t prefix = (size_t)(loc - phrase);
        if (prefix >= sizeof buf)
            prefix = sizeof buf - 1; /* truncate rather than overflow */
        memcpy(buf, phrase, prefix);
        buf[prefix] = '\0'; /* the copy must be terminated explicitly */
        printf("Before delimiter: '%s'\n", buf);
        printf("After delimiter: '%s'\n", loc + length);
    }
}
int main(void)
{
    split("foo bar 1 and foo bar 2", "and");
    printf("-----\n");
    split("foo bar 1 and foo bar 2", "quux");
    return 0;
}
Output:
Before delimiter: 'foo bar 1 '
After delimiter: ' foo bar 2'
-----
Could not find delimiter
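Of course, I haven't tested it thoroughly, and it relies on a fixed-size buffer, which is the classic source of overflow problems when string lengths are not checked; but it is at least a demonstrable example.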
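
The comment in the code notes that malloc would be more robust than a fixed buffer. A minimal sketch of that variant follows. The name split_once and its return codes are assumptions made for illustration, not part of the answer above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `phrase` at the first occurrence of `delimiter`.
   On success returns 1 and stores two malloc'ed strings in *before and
   *after (caller frees both). Returns 0 if the delimiter is absent and -1
   if allocation fails; the outputs are untouched in both cases. */
int split_once(const char *phrase, const char *delimiter,
               char **before, char **after)
{
    const char *loc = strstr(phrase, delimiter);
    if (loc == NULL)
        return 0;

    size_t before_len = (size_t)(loc - phrase);
    const char *rest = loc + strlen(delimiter);
    size_t after_len = strlen(rest);

    char *b = malloc(before_len + 1);
    char *a = malloc(after_len + 1);
    if (b == NULL || a == NULL) {
        free(b);                      /* free(NULL) is a harmless no-op */
        free(a);
        return -1;
    }
    memcpy(b, phrase, before_len);
    b[before_len] = '\0';
    memcpy(a, rest, after_len + 1);   /* includes the terminating '\0' */

    *before = b;
    *after = a;
    return 1;
}
```

Because both halves are sized from the input, there is no length limit to overrun. The cost is that the caller now owns two allocations and has to free them.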
If you know what kind of delimiter you have, such as a comma or a semicolon, you can try the following:
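#include <stdio.h>
int main(void)
{
    int i = 0, temp = 0, temp1 = 0, temp2 = 0;
    char buff[12] = "123;456;789";
    /* First field: digits up to the first ';'. */
    for (i = 0; buff[i] != ';'; i++)
    {
        temp = temp * 10 + (buff[i] - '0');
    }
    /* Second field: step over the ';' and read up to the next one. */
    for (i++; buff[i] != ';'; i++)
    {
        temp1 = temp1 * 10 + (buff[i] - '0');
    }
    /* Third field: step over the ';' and read to the end of the string. */
    for (i++; buff[i] != '\0'; i++)
    {
        temp2 = temp2 * 10 + (buff[i] - '0');
    }
    printf("temp=%d temp1=%d temp2=%d\n", temp, temp1, temp2);
    return 0;
}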
Output:
temp=123 temp1=456 temp2=789
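
The same idea can be written so that it does not depend on there being exactly three fields. The sketch below uses the standard strtol() to do the digit conversion; the name parse_ints is hypothetical. Note that strtol() also accepts leading whitespace and a sign, which the hand-written loops above do not.

```c
#include <stdlib.h>

/* Hypothetical helper: parse up to `max` decimal integers separated by the
   single character `delim` from `s` into `out`. Returns the number of
   values stored; parsing stops at the first field that is not a number. */
size_t parse_ints(const char *s, char delim, long *out, size_t max)
{
    size_t count = 0;
    while (count < max) {
        char *end;
        long value = strtol(s, &end, 10);
        if (end == s)
            break;                /* no digits where a field was expected */
        out[count++] = value;
        if (*end == '\0' || *end != delim)
            break;                /* end of string or unexpected character */
        s = end + 1;              /* step over the delimiter */
    }
    return count;
}
```

For the input "123;456;789" with ';' as the delimiter, this should store 123, 456 and 789 and return 3.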