I'm not a [computational] linguist, so please excuse my very basic questions on this topic.
According to Wikipedia, lemmatisation is defined as:
Lemmatisation (or lemmatization) in linguistics is the process of grouping together the different inflected forms of a word so they can be analysed as a single item.
Now my question is: is the lemmatised form of any member of the set {am, is, are} supposed to be "be"? If not, why not?
Second question: how do I get that in R or Python? I've tried methods like the ones in this link, but none of them gives "be" for "are". I would guess that, at least for the purpose of classifying text documents, it makes sense for this to be true.
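For reference, here is a minimal sketch of the kind of thing I tried in Python, using NLTK's WordNetLemmatizer (the word list is just an illustrative example):

```python
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

# With no part-of-speech argument the lemmatizer defaults to treating
# the word as a noun, and the inflected verb forms come back unchanged.
for word in ["am", "is", "are"]:
    print(word, "->", lemmatizer.lemmatize(word))
# Output: "am -> am", "is -> is", "are -> are" -- never "be".
```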
I also couldn't get "be" from any of the demos given here.
What am I doing/assuming wrong?