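As the comments on the answers point out, all of the other answers seem to have flaws, so I'll post my own code, which seems to cover all of the flaws mentioned in those comments.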
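I believe that getSourceRange() treats the statement as a sequence of tokens, not a sequence of characters. This means that if we have a clang::Stmt that corresponds to FOO + BAR, then the token FOO is at character 1, the token + is at character 5, and the token BAR is at character 7. getSourceRange() therefore returns a SourceRange that essentially means "this code starts with the token at 1 and ends with the token at 7". So we must use clang::Lexer::getLocForEndOfToken(stmt.getSourceRange().getEnd()) to get the actual character position of the end of the BAR token, and pass that as the end location to clang::Lexer::getSourceText. If we don't, clang::Lexer::getSourceText will return "FOO + " rather than the "FOO + BAR" we probably want.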
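I don't think my implementation has the problem @Steven Lu mentioned in the comments, because this code uses the clang::Lexer::getSourceText function, which, according to Clang's source documentation, is designed specifically for getting the source text of a range.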
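As a rough illustration of the token-versus-character point made earlier, here is a small standalone model. It does not use Clang at all: TokenRange, end_of_token, naive_text, and full_text are made-up names, tokens are assumed to be separated by single spaces, and offsets are 0-based (so FOO, + and BAR sit at 0, 4 and 6 rather than 1, 5 and 7).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for a token-based range (NOT a Clang type): both fields are
// offsets of the FIRST character of a token.
struct TokenRange {
    std::size_t begin;       // start of the first token
    std::size_t last_token;  // start of the last token
};

// Reads the half-open character range [begin, end) from the buffer.
std::string char_range_text(const std::string& src, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= src.size());
    return src.substr(begin, end - begin);
}

// Hypothetical helper: returns the offset just past the token starting at
// token_start. In this model a token ends at a space or at the end of the buffer.
std::size_t end_of_token(const std::string& src, std::size_t token_start) {
    assert(token_start <= src.size());
    std::size_t pos = token_start;
    while (pos < src.size() && src[pos] != ' ') {
        ++pos;
    }
    return pos;
}

// Treats the token range as if it were a character range: the last token is lost.
std::string naive_text(const std::string& src, TokenRange r) {
    return char_range_text(src, r.begin, r.last_token);
}

// Moves the end past the last token before reading the text.
std::string full_text(const std::string& src, TokenRange r) {
    return char_range_text(src, r.begin, end_of_token(src, r.last_token));
}
```

With the buffer "FOO + BAR" and TokenRange{0, 6}, naive_text stops right before BAR, while full_text covers the whole expression. The real code has to ask the lexer where the last token ends, because real tokens are not simply delimited by spaces.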
This implementation also takes @Ramin Halavati's remark into account; I've tested it on some code, and it did return the macro-expanded string.
Here's my implementation:
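#include <string>

#include "clang/Basic/LangOptions.h"
#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Lex/Lexer.h"

// Forward declaration: get_source_text() calls get_source_text_raw(), which is defined further down.
std::string get_source_text_raw(clang::SourceRange range, const clang::SourceManager& sm);

/**
 * Gets the portion of the code that corresponds to given SourceRange, including the
 * last token. Returns expanded macros.
 *
 * @see get_source_text_raw()
 */
std::string get_source_text(clang::SourceRange range, const clang::SourceManager& sm) {
    clang::LangOptions lo;

    // NOTE: sm.getSpellingLoc() used in case the range corresponds to a macro/preprocessed source.
    auto start_loc = sm.getSpellingLoc(range.getBegin());
    auto last_token_loc = sm.getSpellingLoc(range.getEnd());
    // Move from the first character of the last token to the position just past that token.
    auto end_loc = clang::Lexer::getLocForEndOfToken(last_token_loc, 0, sm, lo);
    auto printable_range = clang::SourceRange{start_loc, end_loc};
    return get_source_text_raw(printable_range, sm);
}
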
/**
 * Gets the portion of the code that corresponds to given SourceRange exactly as
 * the range is given.
 *
 * @warning The end location of the SourceRange returned by some Clang functions
 * (such as clang::Expr::getSourceRange) might actually point to the first character
 * (the "location") of the last token of the expression, rather than the character
 * past-the-end of the expression like clang::Lexer::getSourceText expects.
 * get_source_text_raw() does not take this into account. Use get_source_text()
 * instead if you want to get the source text including the last token.
 *
 * @warning This function does not obtain the source of a macro/preprocessor expansion.
 * Use get_source_text() for that.
 */
std::string get_source_text_raw(clang::SourceRange range, const clang::SourceManager& sm) {
    // getSourceText() returns an llvm::StringRef; .str() copies it into a std::string
    // explicitly, since newer LLVM releases no longer convert implicitly.
    return clang::Lexer::getSourceText(clang::CharSourceRange::getCharRange(range), sm,
                                       clang::LangOptions())
        .str();
}