1

I am translating a small Java library for use in an Objective-C application that I am writing.

char[] chars = sentence.toCharArray();
int i = 0;
while (i < chars.length) { ... }

where sentence is an NSString. I would like to translate the Java code above to Objective-C. This is what I have so far:

sentence = [sentence stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]; // trims sentence off white space    
const char *chars = [sentence UTF8String];

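What should I do about the while condition above? I am not sure how to check the length of the string once it has been converted to a character array.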

4

3 Answers

7

Your Objective-C string already stores its length; just retrieve it:

sentence = [sentence stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]; // trims sentence off white space    
NSUInteger length= sentence.length;
const char *chars = [sentence UTF8String];

But I would like to remind you that even if you do not know the length, you can use C's strlen function:

sentence = [sentence stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]; // trims sentence off white space    
const char *chars = [sentence UTF8String];
size_t length= strlen(chars);
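To connect this back to the `while (i < chars.length)` loop in the question, here is a minimal sketch in plain C (which also compiles as Objective-C). `count_spaces` is a hypothetical helper, not part of any library. It measures the length once and then walks the bytes by index. Keep in mind that the length is counted in bytes of the UTF-8 buffer, which equals the number of characters only for ASCII text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts ' ' bytes with an index-based loop,
   mirroring the Java `while (i < chars.length)` pattern. */
size_t count_spaces(const char *chars)
{
    assert(chars != NULL);
    size_t length = strlen(chars); /* computed once, outside the loop */
    size_t spaces = 0;
    size_t i = 0;
    while (i < length) {
        if (chars[i] == ' ') {
            spaces++;
        }
        i++;
    }
    return spaces;
}
```

From Objective-C this could be called as `count_spaces([sentence UTF8String])`.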
answered 2013-06-28T10:50:57.753
4

Even though there is already an accepted answer, I want to warn against using strlen(), even if it might cause no problem in this particular case. There are differences between NSString and C strings.

A. -length (NSString) and strlen() have different semantics:

NSString is not(!) \0-terminated, but length-based. It can store \0 characters. It is very easy to get a different length if there is a \0 character in the string instance:

NSString *sentence = @"Amin\0Negm";
NSLog(@"length %lu", (unsigned long)[sentence length]); // 9
const char *chars = [sentence cStringUsingEncoding:NSUTF8StringEncoding];
size_t length= strlen(chars);
NSLog(@"strlen %ld", (long)length); // 4

length 9
strlen 4

However, -UTF8String and the -cStringUsingEncoding: used above (both NSString methods) copy out the whole string stored in the string instance. (I think this is misleading in the case of -cStringUsingEncoding:, because standard string functions like strlen() always treat the first \0 as the end of the string.)
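The difference between a length-based buffer and a \0-terminated string can be reproduced in plain C. The sketch below is only an illustration; `bytes_before_first_nul` is a hypothetical helper that reports where a terminator-based function such as strlen() would stop inside a buffer whose real size is known.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes in front of the first \0 within a
   buffer of `size` bytes. Returns `size` if the buffer contains no \0. */
size_t bytes_before_first_nul(const char *buffer, size_t size)
{
    assert(buffer != NULL);
    const char *nul = memchr(buffer, '\0', size);
    return nul != NULL ? (size_t)(nul - buffer) : size;
}
```

For `const char sample[] = "Amin\0Negm";` the array holds 9 payload bytes (`sizeof sample - 1`), but both `strlen(sample)` and `bytes_before_first_nul(sample, 9)` give 4.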

B. In UTF-8 a character can occupy multiple bytes. A char in C is one byte. (Byte here does not mean 8 bits, but the smallest addressable unit.)

NSString *sentence = @"Αmin Negm";
NSLog(@"length %lu", (unsigned long)[sentence length]);
const char *chars = [sentence UTF8String];
size_t length= strlen(chars);
NSLog(@"strlen %ld", (long)length);

length 9
strlen 10

What happened here? The "Α" in Αmin is not the Latin capital letter A but the Greek capital letter Alpha. In UTF-8 it takes two bytes, so plain C's strlen() counts it as two chars!

NSLog(@"%x-%x %x-%x", 'A', 'm', (unsigned char)*chars, (unsigned char)*(chars+1) );

41-6d ce-91

The first two numbers are the codes for 'A' and 'm'; the second two are the UTF-8 encoding of the Greek capital letter Alpha (CE 91).
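If you need the number of characters rather than the number of bytes while staying in C, one common approach is to skip UTF-8 continuation bytes (those of the form 10xxxxxx). The sketch below assumes the input is valid UTF-8; `utf8_code_points` is a hypothetical name. It counts Unicode code points, which can still differ from NSString's -length for characters outside the Basic Multilingual Plane, because -length counts UTF-16 units.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts code points in a valid, \0-terminated
   UTF-8 string by ignoring continuation bytes (10xxxxxx). */
size_t utf8_code_points(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++; /* lead byte or plain ASCII byte */
        }
    }
    return count;
}
```

With the bytes of "Αmin Negm" written as `"\xCE\x91min Negm"`, strlen() reports 10 while `utf8_code_points` reports 9.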

I do not think it is a good idea to simply switch from NSString to char * without a good reason and a complete understanding of the problems. If you do not expect such characters, use NSASCIIStringEncoding. If you do expect them, check your code again and again … or read section C.

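C. C supports wide characters. This is similar to Mac OS's unichar, but the type is wchar_t (note that unichar is a 16-bit UTF-16 unit, while wchar_t is typically 32 bits wide on Apple platforms). There are string functions for wchar_t in wchar.h.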

NSString *sentence = @"Αmin Negm";
NSLog(@"length %lu", (unsigned long)[sentence length]);
wchar_t wchars[128]; // fixed size: at most 127 characters plus the terminator
wchar_t *wchar = wchars;
NSUInteger count = MIN([sentence length], (NSUInteger)127); // never overrun the buffer
for (NSUInteger index = 0; index < count; index++)
{
   *wchar++ = [sentence characterAtIndex:index];
}
*wchar = L'\0';
NSLog(@"widestrlen %zu", wcslen(wchars));

length 9
widestrlen 9
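The "take care of the size" remark can be made explicit with a bounded copy. The following is a sketch in plain C under the assumption that the 16-bit units (what -characterAtIndex: returns) are already available in an array; `copy_units_to_wide` is a hypothetical helper. It copies the units as they are and does not combine surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: copies up to `count` 16-bit units into `out`,
   writing at most `capacity` wchar_t values including the terminator.
   Returns the number of units actually copied. */
size_t copy_units_to_wide(const uint16_t *units, size_t count,
                          wchar_t *out, size_t capacity)
{
    assert(units != NULL && out != NULL && capacity > 0);
    size_t n = count < capacity - 1 ? count : capacity - 1;
    for (size_t i = 0; i < n; i++) {
        out[i] = (wchar_t)units[i];
    }
    out[n] = L'\0';
    return n;
}
```

When the destination is too small, the result is truncated instead of overflowing, and wcslen() on the destination matches the returned count.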

D. Obviously you want to iterate through the string. The common pattern in pure C is not to compare an index against the length, and definitely not to call strlen() in every loop iteration, because that is expensive. (C strings are not length-based, so the whole string has to be scanned again on every call.) You simply increment the pointer to the next char:

char letter;
while ( (letter = *chars++) ) {…}

or

do
{
   // *chars is the current char; note that the body also runs once for the terminating \0
} while (*chars++);
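As a complete instance of the pointer-walking pattern, here is a small sketch in plain C. `count_uppercase` is a hypothetical helper chosen only for illustration; it looks at ASCII letters, so the bytes of multibyte UTF-8 characters are simply skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts ASCII uppercase letters by advancing the
   pointer until the terminating \0, without an index or strlen(). */
size_t count_uppercase(const char *chars)
{
    assert(chars != NULL);
    size_t count = 0;
    char letter;
    while ((letter = *chars++) != '\0') {
        if (letter >= 'A' && letter <= 'Z') {
            count++;
        }
    }
    return count;
}
```

The string is scanned exactly once, and an empty string is handled without a special case.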
answered 2013-06-28T15:01:35.170
-1
int length = sizeof(chars) / sizeof(char);

might work, but in the best case it returns the same thing as sentence.length and in the worst case 0, because I cannot remember the whole pointer/size thing right now
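The uncertainty about sizeof can be checked directly. When `chars` is a `const char *`, as in the question, sizeof yields the size of the pointer itself (commonly 4 or 8 bytes), not the length of the string it points to. The sketch below uses two hypothetical helpers to show this in plain C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: what sizeof reports for a pointer argument.
   The result does not depend on the string contents. */
size_t size_of_pointer(const char *chars)
{
    assert(chars != NULL);
    return sizeof(chars); /* size of the pointer, not of the text */
}

/* Hypothetical helper: returns 1 only if the pointer size happens to
   equal the string length, which would be pure coincidence. */
int sizeof_matches_length(const char *chars)
{
    assert(chars != NULL);
    return sizeof(chars) == strlen(chars);
}
```

Dividing by `sizeof(char)` changes nothing, since `sizeof(char)` is 1 by definition. The sizeof trick only works on a real array that is still in scope, never on a pointer.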

answered 2013-06-28T10:45:53.773