
I am trying to extract LaTeX code from files, but I do not want the comments (comments start with a %). A comment runs to the end of the line, but I do not want to remove a literal % (one preceded by \, as in \%). How would I go about that? Ideally, given this:

   Lamport and has become the dominant method for using \TeX; few
   people write in plain \TeX{} anymore. The current version is
   \LaTeXe. % this is a comment

   % This is a comment; it will not be shown in the final output.
   % The following shows a little of the typesetting power of LaTeX:
   \begin{align}
    E &= mc^2                              \\
    m &= \frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}
   \end{align}
   this is a \% literal symbol.

I would get:

   Lamport and has become the dominant method for using \TeX; few
   people write in plain \TeX{} anymore. The current version is
   \LaTeXe.


   \begin{align}
    E &= mc^2                              \\
    m &= \frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}
   \end{align}
   this is a \% literal symbol.

Is there a way to do that with Python?

EDIT: working solution, thanks to all of you. Substitute matches of this pattern with the empty string:

   r'(?<!\\)%.*'

2 Answers


You can do a regex replace of (?<!\\)%.* with the empty string, but this is brittle: for example, the % in \verb!%! probably isn't the start of a comment.
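As a minimal sketch of that replacement in Python (the names `COMMENT_RE` and `strip_comments` are made up for illustration): the negative lookbehind skips any % that directly follows a backslash, and since `.` does not match a newline by default, each match stops at the end of its line.

```python
import re

# An unescaped % and everything after it, up to the end of the line.
# (?<!\\) rejects a % that is directly preceded by a backslash.
COMMENT_RE = re.compile(r'(?<!\\)%.*')

def strip_comments(tex):
    """Remove LaTeX comments; a literal \\% is left untouched."""
    return COMMENT_RE.sub('', tex)

sample = r"""\LaTeXe. % this is a comment
% a full-line comment
this is a \% literal symbol."""
print(strip_comments(sample))
```

Whitespace in front of a removed comment is kept, so the first line ends with a trailing space. This sketch still has the weaknesses mentioned above: it strips a % inside \verb or verbatim, and it misreads \\% (an escaped backslash followed by a real comment).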

answered 2013-05-29T08:35:07.953

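You can take inspiration from this answer on tex.stackexchange.com. The idea is:

  1. Replace every % between \begin{verbatim} and \end{verbatim}, and inside \verb|...|, with another symbol that does not occur in the text.
  2. Remove the comments with the regular expression (?<!\\)%.*
  3. Change the placeholder back to the previously protected % symbols.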
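The three steps above might be sketched in Python as follows. This is an illustration under assumptions, not the code from the linked answer: `PLACEHOLDER`, `PROTECTED_RE`, and `strip_comments_protected` are hypothetical names, the NUL character is assumed never to appear in the input, and only the verbatim environment and \verb (with any non-letter delimiter) are treated as protected.

```python
import re

# Hypothetical stand-in for a protected %; assumed absent from the input.
PLACEHOLDER = "\x00"

# A verbatim environment (may span lines), or \verb / \verb* with any
# non-letter delimiter, which must close on the same line.
PROTECTED_RE = re.compile(
    r'\\begin\{verbatim\}.*?\\end\{verbatim\}'
    r'|\\verb\*?([^\sA-Za-z*])[^\n]*?\1',
    re.DOTALL,
)
COMMENT_RE = re.compile(r'(?<!\\)%.*')

def strip_comments_protected(tex):
    # Step 1: hide every % that sits inside a protected span.
    shielded = PROTECTED_RE.sub(
        lambda m: m.group(0).replace('%', PLACEHOLDER), tex)
    # Step 2: remove the comments that are still visible.
    stripped = COMMENT_RE.sub('', shielded)
    # Step 3: restore the protected % symbols.
    return stripped.replace(PLACEHOLDER, '%')

print(strip_comments_protected("a \\verb!%! b % c"))
```

One known gap in this sketch: a \verb or \begin{verbatim} that itself appears inside a comment is still treated as protected, so unusual inputs can defeat it.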

Note that in LaTeX, the following

abc%comment
def

should be interpreted as

abcdef
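To mimic that joining behaviour, the comment pattern can also consume the newline and the leading whitespace of the next line. The sketch below is one possible way to do it (`JOINING_COMMENT_RE` and `strip_comments_tex_style` are hypothetical names). It deliberately leaves the newline alone when the next line is blank, so that a paragraph break after a comment survives. Its output differs from the line-by-line result requested in the question, which keeps every line ending.

```python
import re

# An unescaped % up to the end of the line, plus the newline and the next
# line's leading blanks, but only when that next line has content.
JOINING_COMMENT_RE = re.compile(r'(?<!\\)%.*(?:\n[ \t]*(?=\S))?')

def strip_comments_tex_style(tex):
    """Remove comments and join lines the way a trailing % does in TeX."""
    return JOINING_COMMENT_RE.sub('', tex)

print(strip_comments_tex_style("abc%comment\ndef"))  # abcdef
```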
answered 2013-05-29T08:44:09.300