I am putting together a made-up bag-of-words example from three documents (to demonstrate how tf-idf works given a document-term frequency matrix), and I want to transform my BoW matrix into a tf-idf matrix. I don't actually have text data, just the numbers I made up for the example. How can I use them to produce tf-idf output? I am getting the error message "'numpy.ndarray' object has no attribute 'lower'" on the last line, and I assume that is because fit_transform is expecting text data. Is it possible to specify or override this somehow?
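import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
bow = np.array([[15, 0, 5, 0, 20], [20, 30, 0, 25, 0], [15, 10, 10, 20, 15]])  # rows = documents, columns = terms
vectorizer = TfidfVectorizer()
vectorizer.fit_transform(bow)  # fails here: 'numpy.ndarray' object has no attribute 'lower'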
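
One possible way around this, sketched here rather than offered as the definitive fix: scikit-learn also provides TfidfTransformer, which takes a matrix of counts instead of raw text. The snippet below assumes the rows of bow are documents and the columns are terms, and it keeps the transformer's default settings (smoothed idf and L2 row normalization), so the numbers may differ from a hand-computed textbook tf-idf.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer

# Made-up document-term counts: 3 documents (rows) x 5 terms (columns).
bow = np.array([[15, 0, 5, 0, 20], [20, 30, 0, 25, 0], [15, 10, 10, 20, 15]])

# TfidfTransformer works on counts directly, so no text preprocessing is attempted.
transformer = TfidfTransformer()

# fit_transform returns a sparse matrix; convert to a dense array for display.
tfidf = transformer.fit_transform(bow).toarray()
print(np.round(tfidf, 3))
```

With the defaults, each row of tfidf is scaled to unit length, and terms with a zero count stay at zero. If unnormalized weights are wanted for the demonstration, passing norm=None to TfidfTransformer should turn the row scaling off.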