I am trying to extract LaTeX code from files, but I do not want the comments (comments start with a % and run to the end of the line). However, I do not want to remove a literal % (one preceded by a \, as in \%). How would I go about that? Ideally, given this:
Lamport and has become the dominant method for using \TeX; few
people write in plain \TeX{} anymore. The current version is
\LaTeXe. % this is a comment
% This is a comment; it will not be shown in the final output.
% The following shows a little of the typesetting power of LaTeX:
\begin{align}
E &= mc^2 \\
m &= \frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}
\end{align}
this is a \% literal symbol.
I would get:
Lamport and has become the dominant method for using \TeX; few
people write in plain \TeX{} anymore. The current version is
\LaTeXe.
\begin{align}
E &= mc^2 \\
m &= \frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}
\end{align}
this is a \% literal symbol.
Is there a way to do that with Python?
EDIT: the working solution, thanks to all of you:
r'(.*)(?<!\\)%.*'
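For anyone landing here later, this is a minimal sketch of how a negative-lookbehind pattern of this kind might be applied line by line. The function name strip_latex_comments and the choice to drop comment-only lines entirely are my own assumptions, chosen to match the desired output above. The sketch uses a simplified pattern without the leading capture group, so the comment starts at the first unescaped % on each line.

```python
import re

# An unescaped % (one not preceded by a backslash) through the end of the line.
COMMENT = re.compile(r'(?<!\\)%.*')

def strip_latex_comments(text):
    """Remove LaTeX comments; hypothetical helper, not a full LaTeX parser."""
    kept = []
    for line in text.splitlines():
        stripped = COMMENT.sub('', line)
        if stripped != line:
            # A comment was removed: trim the whitespace left in front of it
            # and drop the line if nothing else remains.
            stripped = stripped.rstrip()
            if not stripped:
                continue
        kept.append(stripped)
    return '\n'.join(kept)
```

Note that this is only an approximation: a % that follows an escaped backslash (\\%) really does start a comment but is treated as literal here, and a % inside a verbatim environment is stripped even though it should be kept.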