I have a data frame sp that contains several species names. Because they come from different databases, the same name is written in different ways.
For example, a single species can appear both as Urtica dioica L. and as Urtica dioica.
To correct this, I use the following code, which extracts only the first two words from a row:
paste(strsplit(sp[i,"sp"]," ")[[1]][1],strsplit(sp[i,"sp"]," ")[[1]][2],sep=" ")
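To make the expected result concrete, here is a minimal sketch of that expression applied to a made-up two-row data frame. The sample values and the helper names words and first_two are only for illustration and are not part of my real data or code:

```r
# Hypothetical sample data; the real sp has many more rows.
sp <- data.frame(sp = c("Urtica dioica", "Urtica dioica L."),
                 stringsAsFactors = FALSE)

i <- 2  # row holding the name with the author abbreviation

# Split the name on single spaces, then keep only the first two pieces.
words <- strsplit(sp[i, "sp"], " ")[[1]]
first_two <- paste(words[1], words[2], sep = " ")

print(first_two)  # "Urtica dioica"
```

So both spellings should end up as the same two-word name.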
For now, this code sits inside a for loop, which works but takes ages to finish:
for (i in seq_along(sp$sp)) {
  sp[i, "sp2"] = paste(strsplit(sp[i, "sp"], " ")[[1]][1],
                       strsplit(sp[i, "sp"], " ")[[1]][2],
                       sep = " ")
}
Is there a way to improve this basic code using vectorized operations or an apply function?