I'm trying to convert Evernote Markup Language (ENML) to Markdown using Pandoc. ENML is mostly a subset of XHTML with a few additional elements. The element I'm trying to convert is a special <en-todo checked="true"/>. Here's a sample ENML document with two en-todo items:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE en-note SYSTEM "xml/enml2.dtd">
<en-note style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">
<div><en-todo checked="true"/>This is a thing<br/></div>
<div><en-todo checked="false"/>This is another thing<br/></div>
</en-note>
I'm trying to convert it to the following Markdown:
[X] This is a thing
[ ] This is another thing
My current approach is to write a JSON filter and run it between two Pandoc invocations:
pandoc --parse-raw -f html -t json test.enml | \
./my-filter | pandoc -f json -t markdown
I'm not sure how to properly parse the RawInline blocks:
[
  {
    "Para": [
      {"RawInline": ["html", "<en-todo checked=\"true\">"]},
      {"RawInline": ["html", "</en-todo>"]},
      {"Str": "This"},
      "Space",
      {"Str": "is"},
      "Space",
      {"Str": "a"},
      "Space",
      {"Str": "thing"},
      "LineBreak"
    ]
  },
  {"RawBlock": ["html", "</div>"]},
  {"RawBlock": ["html", "<div>"]},
  {
    "Para": [
      {"RawInline": ["html", "<en-todo checked=\"false\">"]},
      {"RawInline": ["html", "</en-todo>"]},
      {"Str": "This"},
      "Space",
      {"Str": "is"},
      "Space",
      {"Str": "another"},
      "Space",
      {"Str": "thing"},
      "LineBreak"
    ]
  }
]
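For reference, here is a rough sketch of the kind of filter I have in mind, written in plain Python without any Pandoc helper library. It assumes the JSON has exactly the shape shown above (elements as single-key objects such as {"RawInline": [format, text]}); other Pandoc versions serialize the AST differently. The names EN_TODO_OPEN, convert_inline, walk and main are made up for illustration. It also does nothing about the Markdown writer possibly escaping the square brackets.

```python
import json
import re
import sys

# Matches the opening tag as it appears in the RawInline payloads above.
# The checked attribute is treated as optional (assumption: absent means unchecked).
EN_TODO_OPEN = re.compile(r'<en-todo(?:\s+checked="(true|false)")?\s*/?>')


def convert_inline(node):
    """Return replacement inlines for an en-todo RawInline, or None to keep the node."""
    if isinstance(node, dict) and "RawInline" in node:
        fmt, text = node["RawInline"]
        if fmt == "html":
            if text == "</en-todo>":
                return []  # drop the closing tag entirely
            match = EN_TODO_OPEN.fullmatch(text)
            if match:
                box = "[X]" if match.group(1) == "true" else "[ ]"
                return [{"Str": box}, "Space"]
    return None


def walk(node):
    """Recursively copy the tree, splicing replacements into their parent lists."""
    if isinstance(node, list):
        out = []
        for child in node:
            replacement = convert_inline(child)
            if replacement is None:
                out.append(walk(child))
            else:
                out.extend(replacement)
        return out
    if isinstance(node, dict):
        return {key: walk(value) for key, value in node.items()}
    return node  # strings such as "Space" or "LineBreak" pass through unchanged


def main():
    """Read the Pandoc JSON from stdin and write the transformed JSON to stdout."""
    json.dump(walk(json.load(sys.stdin)), sys.stdout)
```

To use this as ./my-filter in the pipeline above, the script would need to call main() at the bottom and be made executable. The RawBlock <div> and </div> entries are left untouched here; they would need separate handling.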