Does anybody have any experience with using StanfordCoreNLP through rJava in R? I’ve been struggling to get it to work for two days now, and I think I’ve exhausted Google and previous questions on StackOverflow.
Essentially I’m trying to use the StanfordNLP libraries from within R. I have zero Java experience, but I do have experience with other languages, so I understand the basics of classes, objects and so on.
From what I can see, the demo .java file that comes with the libraries seems to show that to use the classes from within Java, you’d import the libraries and then create a new object, along the lines of:
import java.util.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.util.*;
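public class demo {
    public static void main(String[] args) {
        // Create the pipeline object with its default settings
        StanfordCoreNLP pipeline = new StanfordCoreNLP();
    }
}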
From within R, I’ve tried calling some standard Java functions, and this works fine, which makes me think the issue is in the way I’m trying to access the Stanford libraries.
I extracted the Stanford ZIP to h:\stanfordcore, so the .jar files are all in the root of this directory. As well as the various other files contained in the zip, it contains the main .jar files:
- joda-time.jar
- stanford-corenlp-1.3.4.jar
- stanford-corenlp-1.3.4-javadoc.jar
- stanford-corenlp-1.3.4-models.jar
- joda-time-2.1-sources.jar
- jollyday-0.4.7-sources.jar
- stanford-corenlp-1.3.4-sources.jar
- xom.jar
- jollyday.jar
If I try to access the NLP tools from the command line, it works fine.
From within R, I initialized the JVM and set the classpath variable:
.jinit(classpath = " h:/stanfordcore", parameters = getOption("java.parameters"),silent = FALSE, force.init = TRUE)
After this, printing the classpath from R shows that the directory containing the required .jar files has been added. It gives this output in R:
[1] "H:\RProject-2.15.1\library\rJava\java" "h:\ stanfordcore"
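To double-check what those entries actually point at, something like the standalone Java sketch below could be run from the command line with each entry passed as an argument. It is only a rough diagnostic: the class name ClasspathCheck and its helper methods are names I made up, and it rests on my understanding that a plain directory on a Java classpath exposes loose .class files and resources but not the .jar files sitting inside it. It prints each entry in brackets so that a stray space in the path would be visible.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of rJava or the Stanford libraries.
public class ClasspathCheck {

    // Classify one classpath entry as a jar, a directory, another file, or missing.
    static String describe(String entry) {
        File f = new File(entry);
        if (!f.exists()) {
            return "missing";
        }
        if (f.isDirectory()) {
            return "directory";
        }
        return f.getName().endsWith(".jar") ? "jar" : "other file";
    }

    // List the .jar files that sit directly inside a directory entry.
    static List<String> jarsIn(String dir) {
        List<String> jars = new ArrayList<>();
        File[] files = new File(dir).listFiles();
        if (files == null) {
            return jars; // not a directory, or not readable
        }
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith(".jar")) {
                jars.add(f.getPath());
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // Each argument is one classpath entry, e.g. the paths printed by R.
        for (String entry : args) {
            System.out.println(describe(entry) + ": [" + entry + "]");
            for (String jar : jarsIn(entry)) {
                System.out.println("  jar inside this directory: " + jar);
            }
        }
    }
}
```

If an entry is reported as a directory with jars listed underneath it, that would suggest the jars themselves are not on the classpath, only the folder that holds them.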
However, when I try to create a new object (not sure if this is the right Java terminology) I get an error.
I’ve tried creating the object in dozens of different ways (basically shooting in the dark though), but the most promising (simply because it seems to actually find the class) is:
pipeline <- .jnew(class="edu/stanford/nlp/pipeline/StanfordCoreNLP",check=TRUE,silent=FALSE)
I know this finds the class, because if I change the class parameter to something not listed in the API, I get a cannot find class error.
As it stands, however, I get the error:
Error in .jnew(class = "edu/stanford/nlp/pipeline/StanfordCoreNLP", check = TRUE, : java.lang.NoClassDefFoundError: Could not initialize class edu.stanford.nlp.pipeline.StanfordCoreNLP
My Googling indicates that this might be something to do with not finding a required .jar file, but I’m completely stuck. Am I missing something obvious?
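From what I’ve read, "Could not initialize class" tends to appear when a class’s static initialization has already failed once, with the first failure carrying the underlying cause. A small Java sketch like the one below, run outside R with the same classpath, might surface that cause. The class name LoadCheck and the method tryLoad are my own invention, and I’m assuming that Class.forName on the fully qualified name triggers the same initialization that .jnew does.

```java
// Hypothetical diagnostic, not part of rJava or the Stanford libraries.
public class LoadCheck {

    // Try to load and initialize a class by name, returning a short report.
    // On failure the report includes every exception in the cause chain.
    static String tryLoad(String className) {
        try {
            Class.forName(className); // loads and runs static initializers
            return "loaded: " + className;
        } catch (Throwable t) {
            StringBuilder report = new StringBuilder("failed: " + className);
            for (Throwable c = t; c != null; c = c.getCause()) {
                report.append("\n  caused by ").append(c);
            }
            return report.toString();
        }
    }

    public static void main(String[] args) {
        // Default to the class from the error message; another name can be passed in.
        String name = args.length > 0
                ? args[0]
                : "edu.stanford.nlp.pipeline.StanfordCoreNLP";
        System.out.println(tryLoad(name));
    }
}
```

If the report names a different class in its cause chain, that would presumably be the dependency that isn’t being found.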
If anyone can point me even a little in the right direction, I’d be incredibly grateful.
Thanks in advance!