
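I have source code with Cyrillic characters in comments and strings. MSVC allows Cyrillic characters in identifiers. How can I find all Cyrillic characters while ignoring all comments and strings? I would like to do this without gcc or scripts, ideally with a simple regex search. Finding a comment with /*.*?*/ is not hard, but how do I find something that is not inside a comment and is not in the ASCII character set?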


1 Answer


Let's assume that all comments behave like '//' comments, including the ones written as '/* comment */', in the sense that once a comment starts, no more code follows it on the same line. Try piping your source file through this:

perl -lne 'print $1 if m{^([^/]+)(?:/[/*])?}'

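That will get you everything but the comments, under the assumption above. Be aware that [^/]+ stops at the first '/' of any kind, so a line containing a division operator is also cut there, and string literals are not skipped.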

The remaining problem depends on the character set. If the file is Windows-1251, every Cyrillic letter is a single byte outside the ASCII range, so you can look for patterns like this: '[^\x00-\x7f]+'
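
If a short script turns out to be acceptable after all, both steps can be folded into one regular expression: match comments and literals first so they are consumed, and capture non-ASCII runs only in the last alternative. The following is a minimal sketch in Python rather than Perl, not a full C++ lexer. The names TOKEN and find_non_ascii_in_code are made up for illustration, and it assumes the file has already been decoded to text (for example with encoding="cp1251") and contains no raw string literals or backslash-continued '//' comments.

```python
import re

# Alternation order matters: comments and literals are consumed first,
# so only non-ASCII runs outside them reach the named group "hit".
TOKEN = re.compile(
    r'''
      //[^\n]*                  # line comment
    | /\*.*?\*/                 # block comment (may span lines)
    | "(?:\\.|[^"\\\n])*"       # string literal with escapes
    | '(?:\\.|[^'\\\n])*'       # character literal with escapes
    | (?P<hit>[^\x00-\x7f]+)    # run of non-ASCII characters in code
    ''',
    re.DOTALL | re.VERBOSE,
)


def find_non_ascii_in_code(source):
    """Return (line_number, text) for each non-ASCII run outside comments and literals."""
    hits = []
    for m in TOKEN.finditer(source):
        if m.group("hit"):
            # Line number = newlines before the match start, plus one.
            line = source.count("\n", 0, m.start()) + 1
            hits.append((line, m.group("hit")))
    return hits


if __name__ == "__main__":
    sample = (
        'int счёт = 0; // счётчик\n'
        'const char *s = "привет"; /* тест */ int x = a / b;\n'
        'int имя2 = счёт;\n'
    )
    for line, text in find_non_ascii_in_code(sample):
        print(line, text)
```

The same alternation can be tried in an editor whose regex search supports groups, but there the comments and literals are matched as well; only matches where the last group is non-empty are of interest.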

answered 2012-11-21T23:20:19.450