I am looking for an AppleScript routine or subroutine that finds this HTML tag string:
<td width="487">
in this HTML code:
<h1><span id="profile-name-94461" >Jan Schlatter</span></h1>
</span>
<table width="100%" border="0" cellspacing="0" cellpadding="0" id="profile-table">
<tr>
<th width="163" scope="col">Introduction</th>
<td width="487">Education :
<br />Management and support on responsibilities in finances and accounting.</td>
</tr>
<tr>
<th>Role</th>
<td>
<p>Portfolio Management</p><p>Senior Management</p> </td>
</tr>
<tr>
<th>Organisation Type</th>
<td>
<p>Family Office</p> </td>
</tr>
<tr>
<th>Email</th>
<td><a href="mailto:jan.schlatter@bohnetschlatter.ch" title="jan.schlatter@bohnetschlatter.ch" >jan.schlatter@bohnetschlatter.ch</a></td>
</tr>
<tr>
<th>Website</th>
<td><a href="http://bohnetschlatter.ch" target="_new" title="http://bohnetschlatter.ch" >http://bohnetschlatter.ch</a></td>
</tr>
<tr>
<th>Phone</th>
<td>+41 41 727 61 61</td>
</tr>
<tr>
<th>Fax</th>
<td>+41 41 727 61 62</td>
</tr>
<tr>
<th>Mailing Address</th>
<td>Gartenstrasse 2<br>Postfach 42</td>
</tr>
<tr>
<th>City</th>
<td>Zurich</td>
</tr>
<tr>
<th>State</th>
<td></td>
</tr>
<tr>
<th>Country</th>
<td>Switzerland</td>
</tr>
<tr>
<th class="lastrow" >Zip/ Postal Code</th>
<td class="lastrow" >6301</td>
</tr>
</table>
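Because this HTML tag is not present in every HTML file I want to process, I would like the routine to return a boolean that I can use in an if/then/else statement, so that the action is carried out only when the returned value is true.
The AppleScript I started with is: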
set introTag to "<td width=" & quote & "487" & quote & ">"

-- Returns true when introTag occurs somewhere in theText.
-- introTag is passed in because a variable set at the top level is not visible inside a handler.
on hasIntroTag(theText, introTag)
	set AppleScript's text item delimiters to introTag
	set a to text items of theText
	set AppleScript's text item delimiters to ""
	-- Splitting on the tag yields more than one item only if the tag was found.
	return (count of a) > 1
end hasIntroTag

-- extractText and extractBetween are defined elsewhere in my script.
if hasIntroTag(extractText, introTag) then
	set extractText_INTRODUCTION to extractBetween(extractText, introTag, "</td>")
else
	-- Tag not present: do not do the action.
end if
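I would rather not use a shell script, because I have almost no idea how to edit one. It seems to me that text item delimiters would be the best solution, although any answer is welcome. Thanks.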
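To make the desired behaviour concrete, here is a minimal, self-contained sketch of the check-then-extract flow described above. It assumes AppleScript's built-in `contains` operator is acceptable for the boolean test and uses text item delimiters only for the extraction. The handler name `containsTag` and the sample `html` string are hypothetical stand-ins, not part of the real script.

```applescript
-- Hypothetical handler: returns true when theTag occurs anywhere in theText.
on containsTag(theText, theTag)
	return theText contains theTag
end containsTag

-- Hypothetical sample input standing in for the contents of one HTML file.
set introTag to "<td width=" & quote & "487" & quote & ">"
set html to "<th>Introduction</th>" & introTag & "Education :</td>"

if containsTag(html, introTag) then
	-- The tag is present, so take the text between it and the next </td>.
	set AppleScript's text item delimiters to introTag
	set afterTag to text item 2 of html
	set AppleScript's text item delimiters to "</td>"
	set introText to text item 1 of afterTag
	set AppleScript's text item delimiters to ""
else
	-- The tag is absent: skip the extraction.
	set introText to ""
end if

return introText
```

Note that `contains` ignores case by default; wrapping the comparison in a `considering case` block would make the match case-sensitive if that matters for the files being processed.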