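I know I can use substr() to get the first n characters of a string. However, I want to remove the last few characters. Is it valid in C to use -2 or -3 as the end position, as I would in Python?
5 Answers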
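You can simply put a null terminator at the position where you want the string to end, like so:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[] = "I am a string";
    size_t len = strlen(s);
    s[len-3] = '\0';    /* terminate the string three characters early */
    printf("%s\n",s);
    return 0;
}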
This will give you:
"I am a str"
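C is not like Python; string indexing is not "smart". Taken literally, str[-3] means "the character three bytes before the start of the string"; accessing that memory is undefined behavior.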
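To make the contrast with Python concrete, here is a minimal sketch of a helper that emulates negative indexing by translating it into an offset from strlen(). The name char_at and its convention of returning '\0' for an out-of-range index are assumptions made for illustration, not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the character at index i, where a negative
 * i counts back from the end of the string (-1 is the last character).
 * Returns '\0' when the index falls outside the string. */
char char_at(const char *s, long i)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (i < 0)
    {
        /* Distance from the end; written this way so LONG_MIN cannot overflow. */
        size_t back = (size_t)(-(i + 1)) + 1;
        if (back > len)
            return '\0';
        return s[len - back];
    }
    if ((size_t)i >= len)
        return '\0';
    return s[i];
}
```

With this sketch, char_at("string", -3) yields 'i', which is what "string"[-3] gives in Python.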
If you want the last few characters of a string as another string, just get a pointer to the first character you want:
char *endstr = str + (strlen(str) - 3); // get last 3 characters of the string
If you want to delete the last few characters, just set the kth-from-the-end character to the null character (\0):
str[strlen(str)-3] = 0; // delete last three characters
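Note that strlen(str)-3 wraps around to a huge unsigned value when the string has fewer than three characters. A minimal sketch of a bounds-checked version follows; the name chop_tail and the choice to produce an empty string for short inputs are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes the last n characters of s in place.
 * If s has fewer than n characters, it becomes the empty string.
 * s must be a writable array, not a string literal. */
void chop_tail(char *s, size_t n)
{
    assert(s != NULL);
    size_t len = strlen(s);
    /* Clamp so that len - n can never wrap around. */
    s[(n < len) ? len - n : 0] = '\0';
}
```

Called as chop_tail(s, 3) on a char array holding "I am a string", it leaves "I am a str" in the array.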
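Here is a possible implementation of a substr() function, including test code. Note that the test code does not push the boundaries: it never uses a buffer shorter than the requested string or a buffer of length zero.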
#include <string.h>
extern void substr(char *buffer, size_t buflen, char const *source, int len);
/*
** Given substr(buffer, sizeof(buffer), "string", len), then the output
** in buffer for different values of len is:
** For positive values of len:
** 0 ""
** 1 "s"
** 2 "st"
** ...
** 6 "string"
** 7 "string"
** ...
** For negative values of len:
** -1 "g"
** -2 "ng"
** ...
** -6 "string"
** -7 "string"
** ...
** Subject to buffer being long enough.
** If buffer is too short, the empty string is set (unless buflen is 0,
** in which case, everything is left untouched).
*/
void substr(char *buffer, size_t buflen, char const *source, int len)
{
    size_t srclen = strlen(source);
    size_t nbytes = 0;
    size_t offset = 0;
    size_t sublen;

    if (buflen == 0)    /* Can't write anything anywhere */
        return;
    if (len > 0)
    {
        sublen = len;
        nbytes = (sublen > srclen) ? srclen : sublen;
        offset = 0;
    }
    else if (len < 0)
    {
        sublen = -len;
        nbytes = (sublen > srclen) ? srclen : sublen;
        offset = srclen - nbytes;
    }
    if (nbytes >= buflen)
        nbytes = 0;
    if (nbytes > 0)
        memmove(buffer, source + offset, nbytes);
    buffer[nbytes] = '\0';
}
#ifdef TEST

#include <stdio.h>

struct test_case
{
    const char *source;
    int         length;
    const char *result;
};

static struct test_case tests[] =
{
    {   "string",  0, ""          },
    {   "string", +1, "s"         },
    {   "string", +2, "st"        },
    {   "string", +3, "str"       },
    {   "string", +4, "stri"      },
    {   "string", +5, "strin"     },
    {   "string", +6, "string"    },
    {   "string", +7, "string"    },
    {   "string", -1, "g"         },
    {   "string", -2, "ng"        },
    {   "string", -3, "ing"       },
    {   "string", -4, "ring"      },
    {   "string", -5, "tring"     },
    {   "string", -6, "string"    },
    {   "string", -7, "string"    },
};
enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };

int main(void)
{
    int pass = 0;
    int fail = 0;

    for (int i = 0; i < NUM_TESTS; i++)
    {
        char buffer[20];
        substr(buffer, sizeof(buffer), tests[i].source, tests[i].length);
        if (strcmp(buffer, tests[i].result) == 0)
        {
            printf("== PASS == %2d: substr(buffer, %zu, \"%s\", %d) = \"%s\"\n",
                   i, sizeof(buffer), tests[i].source, tests[i].length, buffer);
            pass++;
        }
        else
        {
            printf("!! FAIL !! %2d: substr(buffer, %zu, \"%s\", %d) wanted \"%s\" actual \"%s\"\n",
                   i, sizeof(buffer), tests[i].source, tests[i].length, tests[i].result, buffer);
            fail++;
        }
    }
    if (fail == 0)
    {
        printf("== PASS == %d tests passed\n", NUM_TESTS);
        return(0);
    }
    else
    {
        printf("!! FAIL !! %d tests out of %d failed\n", fail, NUM_TESTS);
        return(1);
    }
}

#endif /* TEST */
The function declaration should be in an appropriate header. The variable sublen helps the code compile cleanly under:
gcc -O3 -g -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
-Wold-style-definition -Werror -DTEST substr.c -o substr
Test output:
== PASS == 0: substr(buffer, 20, "string", 0) = ""
== PASS == 1: substr(buffer, 20, "string", 1) = "s"
== PASS == 2: substr(buffer, 20, "string", 2) = "st"
== PASS == 3: substr(buffer, 20, "string", 3) = "str"
== PASS == 4: substr(buffer, 20, "string", 4) = "stri"
== PASS == 5: substr(buffer, 20, "string", 5) = "strin"
== PASS == 6: substr(buffer, 20, "string", 6) = "string"
== PASS == 7: substr(buffer, 20, "string", 7) = "string"
== PASS == 8: substr(buffer, 20, "string", -1) = "g"
== PASS == 9: substr(buffer, 20, "string", -2) = "ng"
== PASS == 10: substr(buffer, 20, "string", -3) = "ing"
== PASS == 11: substr(buffer, 20, "string", -4) = "ring"
== PASS == 12: substr(buffer, 20, "string", -5) = "tring"
== PASS == 13: substr(buffer, 20, "string", -6) = "string"
== PASS == 14: substr(buffer, 20, "string", -7) = "string"
== PASS == 15 tests passed
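The tests above always use a 20-byte buffer, so the "buffer too short" and "buflen is 0" paths described in the comment are never exercised. The sketch below is one way to cover them. It repeats the substr() logic so that the block compiles on its own; check_substr_boundaries is a hypothetical name introduced here, not part of the original test code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same logic as the substr() above, repeated so this block is self-contained. */
void substr(char *buffer, size_t buflen, char const *source, int len)
{
    assert(source != NULL);
    size_t srclen = strlen(source);
    size_t nbytes = 0;
    size_t offset = 0;
    size_t sublen;

    if (buflen == 0)
        return;
    if (len > 0)
    {
        sublen = len;
        nbytes = (sublen > srclen) ? srclen : sublen;
    }
    else if (len < 0)
    {
        sublen = -len;
        nbytes = (sublen > srclen) ? srclen : sublen;
        offset = srclen - nbytes;
    }
    if (nbytes >= buflen)
        nbytes = 0;
    if (nbytes > 0)
        memmove(buffer, source + offset, nbytes);
    buffer[nbytes] = '\0';
}

/* Hypothetical extra checks: returns the number of failed boundary cases. */
int check_substr_boundaries(void)
{
    int fail = 0;
    char small[4] = "XXX";
    char untouched[4] = "XXX";

    substr(small, sizeof(small), "string", 3);      /* 3 chars + '\0' just fits */
    fail += (strcmp(small, "str") != 0);
    substr(small, sizeof(small), "string", 4);      /* too long: set to "" */
    fail += (strcmp(small, "") != 0);
    substr(small, sizeof(small), "string", -4);     /* too long from the end */
    fail += (strcmp(small, "") != 0);
    substr(untouched, 0, "string", 3);              /* buflen 0: nothing written */
    fail += (strcmp(untouched, "XXX") != 0);
    return fail;
}
```

A return value of 0 from check_substr_boundaries() means the short-buffer and zero-length cases behave as the header comment describes.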
Why doesn't this work:
memcpy(new_string, old_string, strlen(old_string) - 3; &new_string[strlen(old_string) - 3] = '\0'
assuming new_string and old_string are both char pointers and strlen(old_string) > 3?
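Assuming you remove the &, insert the missing ) and ;, the pointers point to valid non-overlapping locations, and the length condition is satisfied, then it should be fine for copying all but the last 3 characters of the old string into the new string, as you can check by embedding it in some test code. It does not attempt to copy the last three characters of the old string, which is what the question seems to be mainly asking about.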
#include <string.h>
#include <stdio.h>

int main(void)
{
    char new_string[32] = "XXXXXXXXXXXXXXXX";
    char old_string[] = "string";

    memcpy(new_string, old_string, strlen(old_string) - 3);
    new_string[strlen(old_string) - 3] = '\0';
    printf("<<%s>> <<%s>>\n", old_string, new_string);
    return(0);
}
Output:
<<string>> <<str>>
However, beware of a sneaky coincidence: I chose a sample old string that is 6 characters long, so 'length - 3' is also 3. To get the last N characters, you need code more like:
#include <assert.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    int N = 3;
    char new_string[32] = "XXXXXXXXXXXXXXXX";
    char old_string[] = "dandelion";
    int sublen = strlen(old_string) - N;

    assert(sublen > 0);
    memcpy(new_string, old_string + sublen, N);
    new_string[N] = '\0';
    printf("<<%s>> <<%s>>\n", old_string, new_string);
    return(0);
}
Output:
<<dandelion>> <<ion>>
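The program above asserts that the source is longer than N and trusts that the destination is big enough. As a sketch of how the same memcpy approach might be packaged with those checks built in, here is a hypothetical helper; copy_last_n and its return convention are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the last n characters of src into dst, whose
 * capacity (including the terminator) is dstsize. If src is shorter than n,
 * the whole string is copied. Returns 0 on success, -1 if dst is too small. */
int copy_last_n(char *dst, size_t dstsize, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (n > len)
        n = len;                        /* clamp instead of asserting */
    if (n >= dstsize)
        return -1;                      /* no room for n chars plus '\0' */
    memcpy(dst, src + (len - n), n);
    dst[n] = '\0';
    return 0;
}
```

For example, copy_last_n(new_string, sizeof(new_string), "dandelion", 3) stores "ion" in new_string and returns 0.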
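Note that writing small programs like this is good practice and educational. Writing lots of code is one way to get better at writing code.
The only trap to be aware of is that if you are testing 'undefined behaviour', you only get the response of a single compiler, and other compilers may generate code that behaves differently. This code does not invoke undefined behaviour, so it is fine. Identifying undefined behaviour is tricky, so you can partly ignore this comment, but make sure you compile with the most stringent warning options you can stand on your compiler; they help identify undefined behaviour.
I have a set of sample programs that I keep (under source control) in a directory called vignettes; they are small cameos of programs that illustrate a technique I can refer to if I think I might need it again in the future. They are complete; they work. (They are more complex than these particular examples, but I have been programming in C longer than you have.) But they are toys: useful toys.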
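No, you have to use strlen() like this to get the last characters (assuming a substr(start, length) with a 0-based start index):
substr(strlen(str)-3,3);
Remember that strings are 0-based, so this gives you the last 3.
So the general technique is
substr(strlen(str)-n,n);
(of course, the string must be at least n characters long)
If you want everything except the last 3, use this:
substr(0,strlen(str)-3);
or in general
substr(0,strlen(str)-n);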
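Since C has no standard substr(), the start/length form used in this answer has to be written by hand. The sketch below is one possible shape for it, assuming a 0-based start index and a caller-supplied destination buffer; substr_range and its parameters are hypothetical names, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies up to len characters of src, starting at the
 * 0-based index start, into dst (capacity dstsize, including the '\0').
 * Returns 0 on success, -1 if start is past the end or dst is too small. */
int substr_range(char *dst, size_t dstsize, const char *src,
                 size_t start, size_t len)
{
    assert(dst != NULL && src != NULL);
    size_t srclen = strlen(src);
    if (start > srclen)
        return -1;
    if (len > srclen - start)
        len = srclen - start;           /* clamp to what is available */
    if (len >= dstsize)
        return -1;
    memcpy(dst, src + start, len);
    dst[len] = '\0';
    return 0;
}
```

With this sketch, the last n characters are substr_range(dst, size, str, strlen(str) - n, n) and everything except the last n is substr_range(dst, size, str, 0, strlen(str) - n). Both calls require strlen(str) >= n, because the subtraction is unsigned.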
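I note that substr is not a standard C function, so you cannot rely on it in C. To get the substring that drops the last few characters, you can use memcpy(new_string, old_string, strlen(old_string) - 3) and then store '\0' at new_string[strlen(old_string) - 3], since memcpy does not add a terminator.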