For Crowdflower I need to define CML, and because it is very similar to XML I would like to use LXML for this task. Crowdflower defines CML tags such as:
<cml:textarea label="Sample text area:" />
or <cml:checkbox label="A single checkbox with a default value" />
(taken from the Crowdflower website: [1])
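Furthermore, these CML files usually do not use a root element. When I try to create a CML element with lxml, an XMLSyntaxError is raised. I would like to use the approach from "Is there a switch to ignore undefined namespace prefixes in LXML?", but the CML I create has no root element.
How can I make LXML ignore unknown namespace prefixes?
Code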
from lxml import etree
txt = "An example label"
txt_elm = etree.fromstring('<cml:text label="%s" />' % txt)
Error message
txt_elm = etree.fromstring('<cml:text label="%s" />' % txt)
File "lxml.etree.pyx", line 2994, in lxml.etree.fromstring (src/lxml/lxml.etree.c:63296)
File "parser.pxi", line 1617, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:93649)
File "parser.pxi", line 1495, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:92453)
File "parser.pxi", line 1011, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:89097)
File "parser.pxi", line 577, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:84804)
File "parser.pxi", line 676, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:85904)
File "parser.pxi", line 616, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:85228)
lxml.etree.XMLSyntaxError: Namespace prefix cml on text is not defined, line 1, column 16
Example 1
<h2>Image 1 :</h2>
<img src="{{urlmedia1}}" width="450" height="300">
<h2>Image 2 :</h2>
<img src="{{urlmedia2}}" width="450" height="300">
<cml:radios label="Which one of these two images conveys the most {{axis}} ?" validates="required">
<cml:radio label="Image 1"/>
<cml:radio label="Image 2"/>
</cml:radios>
Example 2
<p>Focus on the emotion you feel when watching these shots. Which one conveys <b>{{axis}}</b> the most?</p>
<p>While you have the right to watch the shots as many times as you want, you should focus on your first impression.</p>
<hr/>
<p>video 1: <em>{{miscdata1}}</em></p>
<object style="height: 300px; width: 490px">
<param name="movie" value="{{urlmedia1}}" />
<param name="allowFullScreen" value="true" />
<param name="allowScriptAccess" value="always" />
<embed src="{{urlmedia1}}" type="application/x-shockwave-flash" allowfullscreen="false" allowScriptAccess="always" width="490" height="300"/>
</object>
<hr/>
<p>video 2: <em>{{miscdata2}}</em></p>
<object style="height: 300px; width: 490px">
<param name="movie" value="{{urlmedia2}}" />
<param name="allowFullScreen" value="true" />
<param name="allowScriptAccess" value="always" />
<embed src="{{urlmedia2}}" type="application/x-shockwave-flash" allowfullscreen="false" allowScriptAccess="always" width="490" height="300"/>
</object>
<hr/>
<cml:radios label="Which one conveys {{axis}} the most?" validates="required">
<cml:radio label="Shot 1" />
<cml:radio label="Shot 2" />
</cml:radios>
Both examples are taken from: [2]
Note: My goal is to build a CML structure, not to parse one.
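To make the goal concrete, here is a minimal sketch of the kind of output I am after, built with a workaround rather than a real "ignore prefixes" switch. It assumes a placeholder namespace URI (`http://example.com/cml`, not something Crowdflower defines) so that lxml can bind the `cml` prefix, and it strips the resulting `xmlns:cml` declaration from the serialized text afterwards. The helper names `cml` and `to_cml` are made up for illustration.

```python
from lxml import etree

# Assumption: placeholder URI, used only so lxml can bind the "cml" prefix.
CML_NS = "http://example.com/cml"
NSMAP = {"cml": CML_NS}


def cml(tag, parent=None, **attrib):
    """Create a <cml:tag> element, optionally as a child of `parent` (hypothetical helper)."""
    qname = "{%s}%s" % (CML_NS, tag)
    if parent is None:
        return etree.Element(qname, nsmap=NSMAP, **attrib)
    # A child inherits the prefix mapping from its parent.
    return etree.SubElement(parent, qname, **attrib)


def to_cml(*elements):
    """Serialize top-level elements without a root and without the xmlns declaration."""
    declaration = ' xmlns:cml="%s"' % CML_NS
    chunks = [
        etree.tostring(el, encoding="unicode", pretty_print=True)
        for el in elements
    ]
    # Blunt string replacement: removes the namespace declaration lxml adds.
    return "".join(chunks).replace(declaration, "")


radios = cml("radios", label="Which one conveys {{axis}} the most?", validates="required")
cml("radio", parent=radios, label="Shot 1")
cml("radio", parent=radios, label="Shot 2")
print(to_cml(radios))
```

The string replacement is a crude post-processing step, not an lxml feature, so it only works as long as the placeholder URI never appears elsewhere in the content. The plain HTML parts of the examples (such as the unclosed `<img>` tags) would still have to be concatenated as ordinary strings, since they are not well-formed XML.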