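This is one of those tasks where I wonder why there isn't a basic C99 function out there somewhere called namesafe( char *in ) that does the job. That would be nice in a perfect world, and totally useless for names written in UTF-8 Hebrew or Greek. However, assuming I am stuck with seven-bit ASCII, I tried the following: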
/* We may even go so far as to ensure that any apostrophe, or hyphen
 * or period may only appear as single entities and not as two or three
 * in a row. This does not, however, protect us from total nonsense
 * such as D'Archy Mc'Clo.u. d.
 *
 * Walk the first_name string and throw away everything that is
 * not in the range A-Z or a-z, with the exception of the space char,
 * which we keep. Also a single hyphen is allowed.
 *
 * This all seems smart but we are not protected from stupidity such
 * as a name with spaces and dashes or hyphens intermixed with letters,
 * or from the artist formerly known as 'Prince'.
 */
char buffer[256];
j = 0;
for ( k=0; k<strlen(first_name); k++ ) {
    /* Accept anything in the a - z letters */
    if ( ( first_name[k] >= 'a' ) && ( first_name[k] <= 'z' ) )
        buffer[j++] = first_name[k];
    /* Accept anything in the A - Z letters */
    if ( ( first_name[k] >= 'A' ) && ( first_name[k] <= 'Z' ) )
        buffer[j++] = first_name[k];
    /* reduce double dashes or hyphens to a single hyphen */
    while (( first_name[k] == '-' ) && ( first_name[k+1] == '-' ))
        k++;
    if ( first_name[k] == '-' ) /* do I need this ? */
        buffer[j++] = first_name[k];
    /* reduce double spaces to a single space */
    while (( first_name[k] == ' ' ) && ( first_name[k+1] == ' ' ))
        k++;
    if ( first_name[k] == ' ' ) /* do I also need this ? */
        buffer[j++] = first_name[k];
}
/* we may still yet have terminating spaces or hyphens on buffer */
while ( ( j > 1 ) && (( buffer[j-1] == ' ' ) || ( buffer[j-1] == '-' )) )
    j--;
buffer[j] = '\0';
/* Accept this new cleaner First Name */
strcpy ( first_name, buffer );
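This seems to work fine as long as the input name buffer is no longer than 255 characters to begin with. However, on the first pass, I wonder how I get rid of leading spaces and noise, such as a mix of dashes and hyphens and possibly apostrophes?
So the question is ... how do I make this better, and also, do I need those lines where I ask if ( first_name[k] == '-' ), and the same for the space? I have just walked along the buffer looking for duplicates and should have landed on a hyphen or a single space. Right?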
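For comparison, here is a rough sketch of what a single-pass, in-place version of that imagined namesafe() might look like. Everything in it is an assumption for illustration: the helper is_ascii_letter(), the size_t return value, and the policy that a mixed run of spaces and hyphens collapses to whichever separator came first. The idea is to hold back a separator until a letter follows it, so leading and trailing noise is never written and no temporary 256-byte buffer is needed.

```c
#include <stddef.h>

/* Hypothetical helper: true only for 7-bit ASCII letters. */
static int is_ascii_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/*
 * Hypothetical namesafe(): clean the string in place.
 *  - ASCII letters are kept.
 *  - A run of spaces and/or hyphens between two letters becomes one
 *    separator (the first character of the run).
 *  - Leading and trailing separators and all other characters
 *    (apostrophes, digits, periods, ...) are dropped.
 * Returns the length of the cleaned string.
 */
size_t namesafe(char *in)
{
    size_t r;             /* read position */
    size_t w = 0;         /* write position, never ahead of r */
    char pending = '\0';  /* separator waiting for a following letter */

    for (r = 0; in[r] != '\0'; r++) {
        char c = in[r];

        if (is_ascii_letter(c)) {
            /* Emit the held-back separator only between two letters. */
            if (pending != '\0' && w > 0)
                in[w++] = pending;
            pending = '\0';
            in[w++] = c;
        } else if ((c == ' ' || c == '-') && pending == '\0') {
            pending = c;  /* remember the first separator of a run */
        }
        /* Anything else is silently discarded. */
    }

    in[w] = '\0';
    return w;
}
```

With this sketch, "  --Jean  Luc- " would become "Jean Luc" and "Anne--Marie" would become "Anne-Marie". As for the two marked lines in the loop above: each while only steps k past the duplicates and stops on the last character of the run, so the following if is what actually copies that one remaining hyphen or space into the buffer.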