3

This question is a follow-up to "Plain text URL to HTML code (Automator/AppleScript)".

Suppose I have a plain text file /Users/myname/Desktop/URLlist.txt:

title 1
http://a.b/c

title 2
http://d.e/f

...

I want to (1) convert every URL (http://...) into HTML code, and (2) add

&nbsp;<br />

to every empty line, so that the example above becomes:

title 1
<a href="http://a.b/c">http://a.b/c</a>
&nbsp;<br />
title 2
<a href="http://d.e/f">http://d.e/f</a>
&nbsp;<br />
...

I came up with the following AppleScript:

set inFile to "/Users/myname/Desktop/URLlist.txt"
set middleFile to "/Users/myname/Desktop/URLlist2.txt"
set outFile to "/Users/myname/Desktop/URLlist3.txt"

do shell script "sed 's/\\(http[^ ]*\\)/<a href=\"\\1\">\\1<\\/a>/g' " & quoted form of inFile & " >" & quoted form of middleFile
do shell script "sed 's/^$/\\&nbsp;<br \\/>/g' " & quoted form of middleFile & " >" & quoted form of outFile

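It works, but it is redundant (and silly?). Can anyone make it more succinct? Is it possible to involve only one text file instead of three (i.e. the original content of /Users/myname/Desktop/URLlist.txt is overwritten with the final result)?

Thank you very much in advance.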

4

3 Answers

2

Try:

set inFile to "/Users/myname/Desktop/URLlist.txt"

set myData to (do shell script "sed '
/\\(http[^ ]*\\)/ a\\
&nbsp;<br />
' " & quoted form of inFile & " | sed 's/\\(http[^ ]*\\)/<a href=\"\\1\">\\1<\\/a>/g' ")

do shell script "echo " & quoted form of myData & " > " & quoted form of inFile

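This lets you use the myData variable later in your script. If this is not part of a larger script and you are only modifying the file, use the -i option as jackjr300 suggests. Also note that this script looks for the original URL pattern and appends the new line after it, rather than simply looking for empty lines.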
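For anyone who wants to try the combined substitutions outside AppleScript, here is a minimal shell sketch. It assumes a throwaway sample file created with mktemp in place of URLlist.txt, and the names in_file and tmp_file are mine, not from the answer. It writes to a temporary file and copies the result back instead of capturing the output in a variable. One thing that may be worth checking in the AppleScript version: as far as I know, do shell script converts the line endings of its result to carriage returns unless you add "without altering line endings", so a file written back with echo may end up with different line endings than the original.

```shell
#!/bin/sh
# Hypothetical sample file standing in for URLlist.txt
in_file=$(mktemp)
printf 'title 1\nhttp://a.b/c\n\ntitle 2\nhttp://d.e/f\n' > "$in_file"

# Run both substitutions in a single sed call, writing to a temp file first
tmp_file=$(mktemp)
sed -e 's|\(http[^ ]*\)|<a href="\1">\1</a>|g' \
    -e 's|^$|\&nbsp;<br />|' "$in_file" > "$tmp_file"

# Copy the result over the original (keeps the original file's permissions)
cat "$tmp_file" > "$in_file"
rm -f "$tmp_file"

cat "$in_file"
```

Using | as the sed delimiter avoids having to escape the slashes in </a> and <br />.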

EDIT:

set inFile to "/Users/myname/Desktop/URLlist.txt"
set myData to (do shell script "sed 's/\\(http[^ ]*\\)/<a href=\"\\1\">\\1<\\/a>/g; s/^$/\\&nbsp;<br \\/>/g' " & quoted form of inFile)
do shell script "echo " & quoted form of myData & " > " & quoted form of inFile
answered 2012-12-08T16:17:07.167
2

Use the -i '' option to edit the file in place.

set inFile to "/Users/myname/Desktop/URLlist.txt"

do shell script "sed -i '' 's:^$:\\&nbsp;<br />:; s:\\(http[^ ]*\\):<a href=\"\\1\">\\1</a>:g' " & quoted form of inFile

If you want to keep a copy of the original file, specify an extension for the backup, e.g. sed -i ' copy'
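A small sketch of the backup variant, in case it helps. It assumes a scratch directory from mktemp and a ".bak" suffix (my choice, not from the answer). The suffix is attached directly to -i, a form that, to my knowledge, both the BSD sed shipped with macOS and GNU sed accept.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample file standing in for URLlist.txt
work_dir=$(mktemp -d)
in_file="$work_dir/URLlist.txt"
printf 'title 1\nhttp://a.b/c\n\n' > "$in_file"

# Edit in place; the untouched original is saved as URLlist.txt.bak
sed -i.bak \
    -e 's|^$|\&nbsp;<br />|' \
    -e 's|\(http[^ ]*\)|<a href="\1">\1</a>|g' "$in_file"

ls "$work_dir"
cat "$in_file"
```

After the run, the directory should contain both the edited URLlist.txt and the original content in URLlist.txt.bak.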

- Update:

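The DOCTYPE is a required preamble. DOCTYPEs are required for legacy reasons. When omitted, browsers tend to use a different rendering mode that is incompatible with some specifications. Including the DOCTYPE in a document ensures that the browser makes a best-effort attempt at following the relevant specifications.

The HTML lang attribute can be used to declare the language of a web page or of a portion of a web page. This is meant to assist search engines and browsers. According to the W3C recommendation, you should declare the primary language of each web page with the lang attribute inside the <html> tag.

The <meta> tag provides metadata about the HTML document. <meta> tags always go inside the <head> element.
http-equiv: provides an HTTP header for the information/value of the content attribute.
content: the value associated with the http-equiv or name attribute.
charset: to display an HTML page correctly, the browser must know which character set to use.

In this script I use "utf-8" as the encoding; change it to match the encoding of your original file.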

set inFile to "/Users/myname/Desktop/URLlist.html" -- text file with a ".html" extension
set nL to linefeed
set prepandHTML to "<!DOCTYPE html>\\" & nL & "<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en-US\" lang=\"en-US\">\\" & nL & tab & "<head><meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\" />\\" & nL & "</head>\\" & nL

do shell script "sed -i '' 's:^$:\\&nbsp;<br />:; s:\\(http[^ ]*\\):<a href=\"\\1\">\\1</a>:g; 1s~^~" & prepandHTML & "~' " & quoted form of inFile
do shell script "echo '</html>' >> " & quoted form of inFile -- append the closing HTML tag
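The wrapping step can also be sketched directly in the shell without sed, which may be easier to read. This is only an illustration under assumptions: the sample file comes from mktemp and already contains converted lines, and the names in_file and tmp_file are mine. The header lines are the same ones used in the script above.

```shell
#!/bin/sh
# Hypothetical sample file that already went through the URL conversion
in_file=$(mktemp)
printf 'title 1\n<a href="http://a.b/c">http://a.b/c</a>\n&nbsp;<br />\n' > "$in_file"

tmp_file=$(mktemp)
{
  # Header lines, one printf argument per output line
  printf '%s\n' \
    '<!DOCTYPE html>' \
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">' \
    '<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />' \
    '</head>'
  cat "$in_file"            # the converted body
  printf '%s\n' '</html>'   # closing tag
} > "$tmp_file"

# Copy the wrapped result over the original file
cat "$tmp_file" > "$in_file"
rm -f "$tmp_file"

cat "$in_file"
```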
answered 2012-12-08T16:25:39.460
1

I don't understand sed commands very well (they make my brain hurt), so here is an AppleScript way to do this task. I hope it helps.

set f to (path to desktop as text) & "URLlist.txt"

set emptyLine to "&nbsp;<br />"
set htmlLine1 to "<a href=\""
set htmlLine2 to "\">"
set htmlLine3 to "</a>"

-- read the file into a list
set fileList to paragraphs of (read file f)

-- modify the file as required into a new list
set newList to {}
repeat with i from 1 to count of fileList
    set thisItem to item i of fileList
    if thisItem is "" then
        set end of newList to emptyLine
    else if thisItem starts with "http" then
        set end of newList to htmlLine1 & thisItem & htmlLine2 & thisItem & htmlLine3
    else
        set end of newList to thisItem
    end if
end repeat

-- make the new list into a string
set text item delimiters to return
set newFile to newList as text
set text item delimiters to ""

-- write the new string back to the file overwriting its contents
set openFile to open for access file f with write permission
write newFile to openFile starting at 0 as text
close access openFile
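For comparison, the same line-by-line logic (empty line, line starting with "http", anything else) can be sketched as a plain shell loop. This is not the author's method, just a rough equivalent under assumptions: the sample file is created with mktemp, and the names in_file and tmp_file are mine.

```shell
#!/bin/sh
# Hypothetical sample file standing in for URLlist.txt
in_file=$(mktemp)
printf 'title 1\nhttp://a.b/c\n\ntitle 2\nhttp://d.e/f\n' > "$in_file"
tmp_file=$(mktemp)

# Read each line; the "|| [ -n ... ]" part also handles a last line without a newline
while IFS= read -r line || [ -n "$line" ]; do
  case "$line" in
    '')    printf '%s\n' '&nbsp;<br />' ;;                   # empty line
    http*) printf '<a href="%s">%s</a>\n' "$line" "$line" ;; # URL line
    *)     printf '%s\n' "$line" ;;                          # anything else, unchanged
  esac
done < "$in_file" > "$tmp_file"

# Write the new content back over the original file
cat "$tmp_file" > "$in_file"
rm -f "$tmp_file"

cat "$in_file"
```

Like the AppleScript version, this only converts lines that start with "http", whereas the sed answers also match URLs in the middle of a line.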

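EDIT: If you run into encoding problems, these two handlers will handle the reading and writing correctly. Just insert them into the code and adjust the read and write lines to use the handlers. Good luck.

NOTE: When opening the file with TextEdit, use the File menu and explicitly open it as UTF-8.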

on writeTo_UTF8(targetFile, theText, appendText)
    try
        set targetFile to targetFile as text
        set openFile to open for access file targetFile with write permission
        if appendText is false then
            set eof of openFile to 0
            write «data rdatEFBBBF» to openFile starting at eof -- UTF-8 BOM
        else
            tell application "Finder" to set fileExists to exists file targetFile
            if fileExists is false then
                set eof of openFile to 0
                write «data rdatEFBBBF» to openFile starting at eof -- UTF-8 BOM
            end if
        end if
        write theText as «class utf8» to openFile starting at eof
        close access openFile
        return true
    on error theError
        try
            close access file targetFile
        end try
        return theError
    end try
end writeTo_UTF8

on readFrom_UTF8(targetFile)
    try
        set targetFile to targetFile as text
        targetFile as alias -- if file doesn't exist then you get an error
        set openFile to open for access file targetFile
        set theText to read openFile as «class utf8»
        close access openFile
        return theText
    on error
        try
            close access file targetFile
        end try
        return false
    end try
end readFrom_UTF8
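The write handler puts a UTF-8 byte order mark (the bytes EF BB BF) at the start of a new file. If you want to check from the shell whether a given file has that marker, something like the following sketch might do. The function name has_utf8_bom and the sample files are hypothetical, and it assumes head -c is available, as it is on macOS and Linux.

```shell
#!/bin/sh
# Succeeds if the first three bytes of the file are EF BB BF (the UTF-8 BOM)
has_utf8_bom() {
  [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Hypothetical sample files: one with a BOM, one without
with_bom=$(mktemp)
without_bom=$(mktemp)
printf '\357\273\277title 1\n' > "$with_bom"
printf 'title 1\n' > "$without_bom"

for f in "$with_bom" "$without_bom"; do
  if has_utf8_bom "$f"; then
    echo "BOM present"
  else
    echo "no BOM"
  fi
done
```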
answered 2012-12-08T18:01:33.817