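I'm trying to read a large file on iOS with NSInputStream and split it into lines at newline characters (I don't want to use componentsSeparatedByCharactersInSet, because it uses too much memory).
However, not all lines seem to be UTF-8 encoded (they can look like plain ASCII, with the same bytes), so I often get this warning: Incorrect NSStringEncoding value 0x0000 detected. Assuming NSASCIIStringEncoding. Will stop this compatiblity mapping behavior in the near future.
My question is: is there a way to suppress this warning, for example by setting a compiler flag?
Additionally: is it safe to append/concatenate two buffer reads? Reading from the byte stream, converting each buffer to a string, and then appending the strings could leave the resulting string corrupted.
The example method below demonstrates that the byte-to-string conversion discards both the first and the second half of a UTF-8 character, because each half on its own is invalid.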
    - (void)NSInputStreamTest {
        uint8_t testString[] = {0xd0, 0x91}; // @"Б"

        // Test 1: Read max 1 byte at a time of UTF-8 string
        uint8_t buf1[1], buf2[1];
        NSString *s1, *s2, *s3;
        NSInteger c1, c2;
        NSInputStream *inStream = [[NSInputStream alloc] initWithData:[[NSData alloc] initWithBytes:testString length:2]];
        [inStream open];
        c1 = [inStream read:buf1 maxLength:1];
        // A lone lead byte (0xd0) is not valid UTF-8, so s1 is nil
        s1 = [[NSString alloc] initWithBytes:buf1 length:1 encoding:NSUTF8StringEncoding];
        NSLog(@"Test 1: Read %ld byte(s): %@", (long)c1, s1);
        c2 = [inStream read:buf2 maxLength:1];
        // A lone continuation byte (0x91) is not valid UTF-8 either, so s2 is nil
        s2 = [[NSString alloc] initWithBytes:buf2 length:1 encoding:NSUTF8StringEncoding];
        NSLog(@"Test 1: Read %ld byte(s): %@", (long)c2, s2);
        s3 = [s1 stringByAppendingString:s2];
        NSLog(@"Test 1: Concatenated: %@", s3);
        [inStream close];

        // Test 2: Read max 2 bytes at a time of UTF-8 string
        uint8_t buf4[2];
        NSString *s4;
        NSInteger c4;
        NSInputStream *inStream2 = [[NSInputStream alloc] initWithData:[[NSData alloc] initWithBytes:testString length:2]];
        [inStream2 open];
        c4 = [inStream2 read:buf4 maxLength:2];
        s4 = [[NSString alloc] initWithBytes:buf4 length:2 encoding:NSUTF8StringEncoding];
        NSLog(@"Test 2: Read %ld byte(s): %@", (long)c4, s4);
        [inStream2 close];
    }
Output:
2013-02-10 21:16:23.412 Test[11144:c07] Test 1: Read 1 byte(s): (null)
2013-02-10 21:16:23.413 Test[11144:c07] Test 1: Read 1 byte(s): (null)
2013-02-10 21:16:23.413 Test[11144:c07] Test 1: Concatenated: (null)
2013-02-10 21:16:23.413 Test[11144:c07] Test 2: Read 2 byte(s): Б
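One way to keep a read boundary from splitting a character might be to decode only the part of each buffer that ends on a complete UTF-8 sequence and carry the leftover bytes into the next read. The following is a minimal sketch of that idea in plain C (so it can be pasted into an Objective-C file). The function name utf8_complete_prefix_length is hypothetical and not part of any Apple API, and the sketch assumes the input is otherwise well-formed UTF-8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an Apple API).
 * Returns how many leading bytes of buf can be decoded without cutting a
 * UTF-8 character in half. Bytes beyond the returned length belong to an
 * incomplete trailing character and should be prepended to the next read.
 * Assumption: apart from a possibly truncated last character, the input
 * is well-formed UTF-8. Malformed input is returned unchanged (len) so
 * that the decoder can report the error. */
size_t utf8_complete_prefix_length(const uint8_t *buf, size_t len)
{
    size_t i = len;
    size_t steps = 0;

    /* Walk backwards over at most 4 bytes to find the last lead byte. */
    while (i > 0 && steps < 4) {
        i--;
        steps++;
        uint8_t b = buf[i];

        if ((b & 0xC0) == 0x80) {
            continue; /* continuation byte (10xxxxxx), keep looking */
        }

        /* b is ASCII or a lead byte: work out the sequence length. */
        size_t need;
        if (b < 0x80) {
            need = 1;
        } else if ((b & 0xE0) == 0xC0) {
            need = 2;
        } else if ((b & 0xF0) == 0xE0) {
            need = 3;
        } else if ((b & 0xF8) == 0xF0) {
            need = 4;
        } else {
            return len; /* invalid lead byte: hold nothing back */
        }

        size_t have = len - i;
        /* Complete (or over-long, hence malformed): decode everything.
         * Incomplete: stop just before the lead byte. */
        return (have >= need) ? len : i;
    }

    /* Empty buffer, or only continuation bytes at the end. */
    return len;
}
```

After each read:maxLength: call, the returned length could be passed to initWithBytes:length:encoding:, and the remaining bytes moved to the front of the buffer before the next read. For splitting into lines, note that the newline byte 0x0A never occurs inside a multi-byte UTF-8 sequence, so searching the raw bytes for 0x0A and decoding one whole line at a time should avoid the problem as well.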