0

How can I use bash or Python on a Mac to extract the usernames from a text file (sample text below) into a MySQL database?

124dave87 10 months ago

:) ...Thank you for making this video.

Reply  ·

kateDVKH 1 year ago
@karluchii19 i'm still trying to figure out who you are?!?

Thanks for replying.
Reply  ·

shotwioke 3 months ago
hey how is everything going with your health-i hope/pray things are going good for you.God bless
Reply  ·   in reply to MrNickkaye (Show the comment)

For example, for the text file above, the script would output the following:

124dave87    
kateDVKH    
shotwioke

5 Answers

1

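If I understand correctly, you are looking for the string that starts at the beginning of a line and ends at the first space character. Is that right?

If so, the quickest/easiest way is probably: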

egrep -o "^[^ ]*"

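Edit (based on your comment below)

Could you expand a bit on what you are looking for here? What is the actual goal? It might help us frame our answers...

That said, if you just want a list of unique usernames, you could try: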

egrep -o "^[^ ]*" | sort | uniq

If your schema allows it, you could also add a unique constraint to the database table.
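
To illustrate the unique-constraint idea, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a database server; the table name users, the column name username, and the helper store_usernames are assumptions made for illustration, not part of the question. With MySQL the comparable statement is INSERT IGNORE, issued through whichever MySQL client library you use.

```python
import sqlite3

def store_usernames(conn, usernames):
    """Insert usernames, letting the UNIQUE constraint drop duplicates."""
    # Hypothetical schema: one table with a single unique column.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (username TEXT NOT NULL UNIQUE)"
    )
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.executemany(
        "INSERT OR IGNORE INTO users (username) VALUES (?)",
        [(name,) for name in usernames],
    )
    conn.commit()
    rows = conn.execute("SELECT username FROM users ORDER BY username")
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")  # in-memory stand-in database
stored = store_usernames(
    conn, ["124dave87", "kateDVKH", "124dave87", "shotwioke"]
)
print(stored)
```

The duplicate 124dave87 is silently skipped, so the table ends up with one row per username even if the extraction step emits repeats.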

answered 2013-04-09T19:27:42.227
1

You can use regular expressions in Python. For example:

import re

test="""124dave87 10 months ago :) ...Thank you for making this video. Reply ·

kateDVKH 1 year ago @karluchii19 i'm still trying to figure out who you are?!? Reply ·

shotwioke 3 months ago hey how is everything going with your health-i hope/pray things are going good for you.God bless Reply · in reply to MrNickkaye (Show the comment)
"""

for line in test.split('\n'):
    words = re.findall(r'\w+', line)
    if words:
        # words[0] is the username; replace print with the MySQL insert
        print(words[0])
answered 2013-04-09T19:48:39.673
1
grep -E "[0-9]+ (month|year|day|week)s? ago" a.txt| grep -Eo "^[a-zA-Z0-9]+"

I'm sure this could be done in a single step with awk or sed.

answered 2013-04-09T19:50:52.620
1

awk is probably relatively easy to understand:

awk '$0 ~ " [0-9]+ (month|year|day|week)s? ago" {print $1}'

If a line contains the pattern, print its first word. Pipe the output to sort | uniq to get the unique usernames.
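
Since the question also allows Python, the same filter-then-deduplicate logic can be sketched as follows. This is only an illustration: the function name unique_usernames is made up, and the sample text adds a repeated 124dave87 line (not in the question) to show the deduplication.

```python
import re

# Same header pattern as the awk one-liner: " <number> <unit>(s) ago".
HEADER = re.compile(r" [0-9]+ (month|year|day|week)s? ago")

def unique_usernames(text):
    """Return the sorted, de-duplicated first words of header lines."""
    names = set()
    for line in text.splitlines():
        if HEADER.search(line):
            # Equivalent of awk's $1: the first whitespace-separated field.
            names.add(line.split()[0])
    return sorted(names)

sample = """124dave87 10 months ago
:) ...Thank you for making this video.
Reply  ·
kateDVKH 1 year ago
@karluchii19 i'm still trying to figure out who you are?!?
124dave87 2 days ago
Reply  ·
"""
print(unique_usernames(sample))
```

Using a set plays the role of sort | uniq, and the comment bodies and "Reply" lines are skipped because they do not contain the "... ago" pattern.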

answered 2013-04-09T19:55:54.687
1

grep with a lookahead can give you what you want:

 grep -Po '^(\w+)(?=\s\d+\s\w+\sago$)' file
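
Note that -P (Perl-compatible regular expressions) is a GNU grep option and may not be available in the grep that ships with macOS. As a hedged alternative, Python's re module supports the same lookahead syntax. The sketch below uses a shortened copy of the sample text, and it drops the capturing group so that findall returns the whole match.

```python
import re

# Username at line start, followed (without being consumed) by
# "<number> <word> ago" running to the end of the line.
PATTERN = re.compile(r"^\w+(?=\s\d+\s\w+\sago$)", re.MULTILINE)

sample = (
    "124dave87 10 months ago\n"
    ":) ...Thank you for making this video.\n"
    "kateDVKH 1 year ago\n"
    "shotwioke 3 months ago\n"
    "hey how is everything going with your health\n"
)

usernames = PATTERN.findall(sample)
print(usernames)
```

re.MULTILINE makes ^ and $ anchor at each line rather than only at the ends of the whole string. As with the grep version, a header line with trailing whitespace after "ago" would not match.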
answered 2013-04-09T20:00:04.253