1

In one of my C applications I am using the following functions from ctype.h:

isalpha(), isspace(), ispunct(), tolower().

After profiling, I see some bottlenecks in the calls to these functions. (Basically, my app does character/string processing on an input text file, so these functions are called extensively inside critical loops.) I want to optimize them for speed and write my own implementation if it helps.

Where can I find such implementations, or the logic to implement them?

4

5 Answers

4

You could implement them as macros or inline functions:

#define IS_ALPHA(x) (((x) >= 'a' && (x) <= 'z') || ((x) >= 'A' && (x) <= 'Z'))
#define IS_SPACE(x) ((x) == ' ' || (x) == '\t')
... etc.

Note however that the original isalpha, isspace, ispunct, etc. depend on the current locale and may yield different results depending on the language.
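One caveat with the macro form: IS_ALPHA evaluates its argument several times, so an expression with side effects such as IS_ALPHA(*p++) misbehaves. Below is a minimal sketch of the inline-function alternative, assuming ASCII input and a C99 compiler. The names ascii_isalpha, ascii_isspace and count_alpha are made up for illustration.

```c
#include <stddef.h>

/* ASCII-only, locale-independent letter check (hypothetical name). */
static inline int ascii_isalpha(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* ASCII-only check for the same two characters as IS_SPACE. */
static inline int ascii_isspace(int c)
{
    return c == ' ' || c == '\t';
}

/* Counts the letters in a NUL-terminated string. Passing *p++ is safe
   here because a function evaluates its argument exactly once. */
size_t count_alpha(const char *p)
{
    size_t n = 0;
    while (*p != '\0') {
        if (ascii_isalpha((unsigned char)*p++))
            n++;
    }
    return n;
}
```

Unlike the standard functions, these ignore the locale entirely, which is the trade-off described above.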

answered 2011-07-18T18:40:55.157
2

It seems odd to me that such functions would be your bottleneck; most likely they take the locale into account, and that makes them "slower". If you can disregard the locale, you can implement them as easily as this (just an idea written on the fly):

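#include <stdbool.h>
#include <string.h>

// Prefixed names are used because isalpha() etc. are reserved by the C library.
bool my_isalpha(int c)
{
   return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

bool my_isspace(int c)
{
   return c == ' ' || c == '\t'; // || whatever other char you consider space
}

bool my_ispunct(int c)
{
   static const char *punct = ".;!?...";
   // strchr() also matches the terminating '\0', so exclude it explicitly
   return c != '\0' && strchr(punct, c) != NULL;
}

int my_tolower(int c)
{
   // adding 'a' - 'A' maps 'A'..'Z' onto 'a'..'z'; anything else is unchanged
   return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}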

Then make them inline functions.

answered 2011-07-18T18:47:12.423
2

You can make fast implementations of these functions by using a lookup table of 256 elements. For isalpha(), the i-th element is 1 if the character whose ASCII value is i is a letter, and 0 otherwise. Then isalpha() is just a table lookup.
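A minimal sketch of the table idea, assuming ASCII-only classification. The names alpha_table, lower_table, init_tables, table_isalpha and table_tolower are hypothetical, and the same approach is extended here to tolower() by storing the mapped character instead of a flag.

```c
#include <limits.h>

/* One entry per possible unsigned char value (hypothetical names). */
static unsigned char alpha_table[UCHAR_MAX + 1];
static unsigned char lower_table[UCHAR_MAX + 1];

/* Fill both tables once, before the hot loops run.
   Assumption: only ASCII letters count as alphabetic. */
void init_tables(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++) {
        int upper = (i >= 'A' && i <= 'Z');
        int lower = (i >= 'a' && i <= 'z');
        alpha_table[i] = (unsigned char)(upper || lower);
        lower_table[i] = (unsigned char)(upper ? i - 'A' + 'a' : i);
    }
}

/* The unsigned char parameter keeps the index inside the table even
   when the caller's plain char is signed. */
static inline int table_isalpha(unsigned char c) { return alpha_table[c]; }
static inline int table_tolower(unsigned char c) { return lower_table[c]; }
```

Call init_tables() once at startup; after that each check is a single array read with no branches.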

You can save some space and encode all of these functions in one table by devoting one bit of each entry to the result of one function. Then each function simply looks up the entry for the character passed in and masks the entry to extract the bit it needs.
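The single-table variant could be sketched as follows. This version fills the table from the standard ctype.h functions once at startup, so it keeps whatever locale is active at that moment; that is one possible design choice, not the only one. The names CT_ALPHA, CT_SPACE, CT_PUNCT, class_table, init_class_table and the FAST_* macros are hypothetical.

```c
#include <ctype.h>
#include <limits.h>

/* One bit per property (hypothetical names). */
enum { CT_ALPHA = 1 << 0, CT_SPACE = 1 << 1, CT_PUNCT = 1 << 2 };

static unsigned char class_table[UCHAR_MAX + 1];

/* Build the table once; the results reflect the locale in effect
   at the time of the call. */
void init_class_table(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++) {
        unsigned char flags = 0;
        if (isalpha(i)) flags |= CT_ALPHA;
        if (isspace(i)) flags |= CT_SPACE;
        if (ispunct(i)) flags |= CT_PUNCT;
        class_table[i] = flags;
    }
}

/* Each macro uses its argument once, so side effects are safe.
   Like the standard functions, they return nonzero (not necessarily 1). */
#define FAST_ISALPHA(c) (class_table[(unsigned char)(c)] & CT_ALPHA)
#define FAST_ISSPACE(c) (class_table[(unsigned char)(c)] & CT_SPACE)
#define FAST_ISPUNCT(c) (class_table[(unsigned char)(c)] & CT_PUNCT)
```

With three properties the whole table fits in 256 bytes, and there is room for five more flags per entry.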

Dave

answered 2011-07-18T19:15:56.083
0

In general, the people who write library code are very good software engineers, and those functions have been tuned to the nth degree. Unless you can remove some of the cases that those functions have to account for, you will have trouble matching their performance.

answered 2011-07-18T18:40:21.643
0

Take a look at the ctype.h header: your compiler's library probably already provides a way to have these functions inlined or implemented as macros (if inline isn't supported for whatever reason). (By the way, which compiler and target platform are you using?)

If these things are already inlined/macros then you might want to post some details about how you're using the functions. Maybe there's a way to shortcut calling some of these functions (for example, if isspace() is true, you don't need to call isalpha() or ispunct() since they must not be true).
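As a rough sketch of that kind of shortcut, the three tests can be chained so that each character stops at its first match. The enum char_class and the function classify are hypothetical names, and the order of the tests is an assumption about which class is most frequent in the input.

```c
#include <ctype.h>

/* Hypothetical result type for a single classification pass. */
enum char_class { CC_SPACE, CC_ALPHA, CC_PUNCT, CC_OTHER };

/* Later tests are skipped as soon as an earlier one succeeds.
   Putting the most common class first saves the most calls. */
enum char_class classify(unsigned char c)
{
    if (isspace(c)) return CC_SPACE;
    if (isalpha(c)) return CC_ALPHA;
    if (ispunct(c)) return CC_PUNCT;
    return CC_OTHER;
}
```

The unsigned char parameter also sidesteps the undefined behavior of passing a negative char value to the ctype.h functions.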

answered 2011-07-18T18:57:49.480