perl - s3cmd content listing - file names only - Perl one-liner?

Question

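I am currently using s3cmd ls s3://location/ > file.txt to get a listing of my S3 bucket's contents and save it to a text file. However, this returns the date, file size, path, and file name for each entry.

For example:

2011-10-18 08:52      6148   s3://location//picture_1.jpg

I only need the file names from the S3 bucket, so in the example above I only need picture_1.jpg.
Any suggestions?

Could this be done with a Perl one-liner after the initial export?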

score 5 · Accepted Answer

Use awk to print the fourth field (the s3:// path):

s3cmd ls s3://location/ | awk '{ print $4 }' > file.txt

If you have file names containing spaces, try:

s3cmd ls s3://location/ | awk '{ s = ""; for (i = 4; i <= NF; i++) s = s $i " "; print s }' > file.txt
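Note that the fourth field is the whole s3:// URL, so the commands above still print the bucket path in front of the name. A minimal sketch of one way to get only the file name, assuming every line ends with the object's path and that no wanted name is empty: split on / instead of whitespace and print the last field. The helper name s3_basenames is hypothetical and only used for illustration.

```shell
# Hypothetical helper: reads `s3cmd ls` output on stdin and prints bare file names.
s3_basenames() {
    # With "/" as the field separator, the last field is everything after the
    # final slash. Lines that end in a slash have an empty last field and are skipped.
    awk -F/ '$NF != "" { print $NF }'
}

# Sample line from the question, fed through the helper.
printf '%s\n' '2011-10-18 08:52      6148   s3://location//picture_1.jpg' | s3_basenames
```

In practice this would sit at the end of the pipeline, for example s3cmd ls s3://location/ | s3_basenames > file.txt. Because the split is on slashes rather than whitespace, spaces inside a file name are preserved without the loop.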

score 2 · Accepted Answer

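File::Listing does not support this format, because the designers of this listing format were foolish enough not to simply reuse an existing one. Let's parse it by hand.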

use URI;
my @ls = (
    "2011-10-18 08:52 6148 s3://location//picture_1.jpg\n",
    "2011-10-18 08:52 6148 s3://location//picture_2.jpg\n",
    "2011-10-18 08:52 6148 s3://location//picture_3.jpg\n",
);

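for my $line (@ls) {
    chomp $line;
    # Take the last whitespace-separated field (the URL), then its last path segment.
    my $basename = (URI->new((split q( ), $line)[-1])->path_segments)[-1];
    print "$basename\n";
}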

__END__
picture_1.jpg
picture_2.jpg
picture_3.jpg

As a one-liner:

perl -mURI -lne 'print ((URI->new((split q( ), $_)[-1])->path_segments)[-1])' < input

score 0 · Accepted Answer

I'm sure a dedicated module is the safer choice, but if the data is reliable, you can get by with a one-liner.

Assuming the input is:

2011-10-18 08:52 6148 s3://location//picture_1.jpg
2011-10-18 08:52 6148 s3://location//picture_2.jpg
2011-10-18 08:52 6148 s3://location//picture_3.jpg
...

One-liner:

perl -lnwe 'print for m#(?<=//)([^/]+)$#'

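-l chomps the input and adds a newline to the end of each print statement
-n wraps the script in a while(<>) loop
(?<=//) is a lookbehind assertion that finds a double slash
... followed by non-slash characters up to the end of the line
The for loop ensures that nothing is printed for lines that do not match.

The benefit of the -n option is that this one-liner can be used either in a pipeline or on a file: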

command | perl -lnwe '...'
perl -lnwe '...' filename
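
For comparison, the same match-or-skip behaviour can be sketched without Perl. This is a swapped-in sed variant, not the one-liner above: sed has no lookbehind, so the double slash is matched and discarded by the substitution, and -n together with the p flag prints only the lines where the pattern matched. The function name last_segment is hypothetical.

```shell
# Hypothetical helper: prints the text after a trailing "//" on each input line,
# and prints nothing for lines that do not match.
last_segment() {
    # .*//            everything up to a double slash
    # \([^/][^/]*\)$  one or more non-slash characters up to end of line (captured)
    # -n plus the p flag: output only lines where the substitution succeeded
    sed -n 's#.*//\([^/][^/]*\)$#\1#p'
}

# Two matching lines and one non-matching line; only the two names are printed.
printf '%s\n' \
    '2011-10-18 08:52 6148 s3://location//picture_1.jpg' \
    'not a listing line' \
    '2011-10-18 08:52 6148 s3://location//picture_2.jpg' | last_segment
```

As with the Perl version, this relies on the name being preceded directly by a double slash, which holds for the sample data but is an assumption about the listing in general.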
