I have a .CSV containing the following data:

"http://iis.se/write-content/?submitted","The intro","<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus dictum lectus eget enim condimentum, eget bibendum libero porta. Suspendisse vestibulum libero nisl, quis tempus nisl semper in. Ut mi nisl, vehicula quis tristique ut, molestie et est. Donec auctor, ante eu venenatis aliquam, felis nisi pretium turpis, ut mattis dui orci et sem. Duis vitae accumsan velit. Sed tristique lacus nisl, vehicula congue turpis ultrices sed. In hac habitasse platea dictumst. Sed dictum scelerisque nibh non venenatis. In viverra eros non arcu pellentesque, nec pulvinar turpis placerat.</p> <p>Proin suscipit metus vitae nisi dignissim ullamcorper. Nullam eleifend tempor ligula, sit amet semper metus.</p><p>Proin bibendum bibendum suscipit. Cras pretium lectus sit amet urna interdum, in ultricies eros scelerisque. Pellentesque id condimentum libero. Aenean placerat orci a dictum pharetra. Pellentesque sagittis egestas gravida. Pellentesque suscipit mauris neque, quis auctor lacus blandit et. Curabitur a quam a velit condimentum tristique. Morbi volutpat pulvinar viverra. Duis cursus lectus ac sem dictum, eu tempor risus blandit. In accumsan arcu at lorem mattis lacinia. Vestibulum vitae mollis sem, nec commodo nunc. Donec vel ultricies nunc. Nam at sapien nec libero aliquam pharetra vitae eget leo.</p><p>Read more here <a href=""http://www.google.com"">here</a></p>","Thank you!"
"http://website.com/add/?submitted","The, nice, Second","<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus dictum lectus eget enim condimentum, eget bibendum libero porta. Suspendisse vestibulum libero nisl, quis tempus nisl semper in. Ut mi nisl, vehicula quis tristique ut, molestie et est. <a href=""http://www.altavista.com"">Donec auctor</a>, ante eu venenatis aliquam, felis nisi pretium turpis, ut mattis dui orci et sem. Duis vitae accumsan velit. Sed tristique lacus nisl, vehicula congue turpis ultrices sed. In hac habitasse platea dictumst. Sed dictum scelerisque nibh non venenatis. In viverra eros non arcu pellentesque, nec pulvinar turpis placerat.</p> <p>Proin suscipit metus vitae nisi dignissim ullamcorper. Nullam eleifend tempor ligula, sit amet semper metus.</p><p>Proin bibendum bibendum suscipit. Cras pretium lectus sit amet urna interdum, in ultricies eros scelerisque. Pellentesque id condimentum libero. Aenean placerat orci a dictum pharetra. Pellentesque sagittis egestas gravida. Pellentesque suscipit mauris neque, quis auctor lacus blandit et. Curabitur a quam a velit condimentum tristique. Morbi volutpat pulvinar viverra. Duis cursus lectus ac sem dictum, eu tempor risus blandit. In accumsan arcu at lorem mattis lacinia. Vestibulum vitae mollis sem, nec commodo nunc. Donec vel ultricies nunc. Nam at sapien nec libero aliquam pharetra vitae eget leo.</p>","Thank you!, even more!!!"

In short:

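  • In COL1, I want to remove everything after the top-level domain
  • In COL2, I want each comma or space to become a hyphen, but hyphens must never be doubled
  • COL1 and COL2 should be merged into (output) COL1
  • In COL3, everything except the domain contained in the <a> </a> should be removed
  • COL4 must not be touched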

So in this case, I want the output to become:

"http://iis.se/the-intro","http://www.google.com","Thank you!"
"http://website.com/the-nice-second","http://www.altavista.com","Thank you!, even more!!!"

Is this possible, or is it very advanced?

I'm thinking of some RegEx replacements in Notepad++, recorded as a macro.

1 Answer

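First, you should really start trying things yourself, even if the attempts fail. It shows what you have tried, and others can point out what you did wrong so that you can correct it in the future.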


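You can use this series of replacements, run in order with Replace All (F means Find, R means Replace; the replacement in the second step is empty):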

F: (http://[^/]+/)[^"]+","([^"]+")
R: $1$2

F: "<[^"]+"
R:

F: ">[^<]+</
R: ,"

F: ,?\s(?=[^"]+",)
R: -
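
For anyone who wants to check the four steps outside Notepad++, here is a minimal Python sketch that applies the same patterns in the same order. It is not part of the Notepad++ workflow: the function name `apply_steps` and the shortened sample row are made up for illustration, and `$1$2` is written as `\1\2` because Python's `re` module uses backslash group references.

```python
import re

# The four find/replace pairs from the answer, in order.
# Notepad++'s "$1$2" becomes r"\1\2" in Python.
STEPS = [
    (r'(http://[^/]+/)[^"]+","([^"]+")', r'\1\2'),
    (r'"<[^"]+"', ''),
    (r'">[^<]+</', ',"'),
    (r',?\s(?=[^"]+",)', '-'),
]


def apply_steps(text):
    """Run every replacement over the whole text, like Replace All."""
    for pattern, replacement in STEPS:
        text = re.sub(pattern, replacement, text)
    return text


# Shortened stand-in for the first row (the Lorem ipsum body is trimmed).
sample = ('"http://iis.se/write-content/?submitted","The intro",'
          '"<p>Lorem ipsum.</p><p>Read more here '
          '<a href=""http://www.google.com"">here</a></p>","Thank you!"')

print(apply_steps(sample))
```

Note that the first pattern matches twice per row: once to merge COL1 and COL2, and once from the link URL up to the start of COL4, which clears away the trailing `a></p>","` so that the third step can close the field cleanly.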

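It may be possible to do this in fewer find/replace steps; I haven't explored all the possibilities. Note that these regular expressions do not change the letter case of the replaced text, so your actual final result will be: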

"http://iis.se/The-intro","http://www.google.com","Thank you!"
"http://website.com/The-nice-Second","http://www.altavista.com","Thank you!, even more!!!"

Compare that with your desired result:

"http://iis.se/the-intro","http://www.google.com","Thank you!"
"http://website.com/the-nice-second","http://www.altavista.com","Thank you!, even more!!!"

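To convert them to lowercase, if the links all have the same character length, you may be able to select that column and lowercase it (in Notepad++, hold Alt while selecting text to select vertically, then press Ctrl+U to convert the selection to lowercase).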
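
If the links are not all the same length, the column selection will not line up. One alternative, outside Notepad++, is a short script. The sketch below assumes the file has already been through the replacements above and is ordinary quoted CSV; the function name `lowercase_first_column` is hypothetical. It lowercases the entire first field, host name included.

```python
import csv
import io


def lowercase_first_column(text):
    """Lowercase the first field of every CSV row and re-quote all fields."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        row[0] = row[0].lower()
        writer.writerow(row)
    return out.getvalue()


cleaned = (
    '"http://iis.se/The-intro","http://www.google.com","Thank you!"\n'
    '"http://website.com/The-nice-Second","http://www.altavista.com",'
    '"Thank you!, even more!!!"\n'
)
print(lowercase_first_column(cleaned), end="")
```

Using the `csv` module rather than splitting on commas keeps the comma inside the last field of the second row intact.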

answered 2013-07-10T15:02:40.733