I want to extract usernames from a tweet, where a username may be:
- followed by non-alphanumeric characters.
- not preceded by whitespace.
For instance, from this:
"RT@user1: This is a retweet that mentions @user2."
I would like to get a vector like
[1] @user1 @user2
(with or without the "@")
This is my current script:
text <- "RT@user1: This is a retweet that mentions @user2."
tokens <- unlist(strsplit(text, " "))
mentions.mask <- grepl("@\\w+", tokens)
mentions <- tokens[mentions.mask]
print(mentions)
[1] "RT@user1:" "@user2."
How can I extract just the usernames, without the surrounding characters?
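One possible direction, as a minimal sketch rather than a definitive answer: grepl() only reports which tokens contain a match, so the whole token comes back, including the "RT" prefix and the trailing punctuation. Pulling out the matched substrings themselves with base R's gregexpr() and regmatches() avoids the tokenising step. This assumes a username consists only of the characters matched by \w (letters, digits and underscore); the variable name usernames is introduced here purely for illustration.

```r
text <- "RT@user1: This is a retweet that mentions @user2."

# gregexpr() locates every match of the pattern in the string,
# and regmatches() returns the matched substrings (one list element per input string).
mentions <- regmatches(text, gregexpr("@\\w+", text))[[1]]
print(mentions)
# [1] "@user1" "@user2"

# Strip the leading "@" if only the bare usernames are wanted.
usernames <- sub("^@", "", mentions)
print(usernames)
# [1] "user1" "user2"
```

Because the pattern is applied to the whole string, it does not matter whether the "@" is preceded by a space, and anything after the username that is not matched by \w is left out.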