I'm using Apache Tika for extracting metadata from documents. I'm mostly interested in setting up a basic dublin core, like Author, Title, Date, etc. I'm not interested in the content of the documents at all. Currently I'm simply doing the usual thing:
FileInputStream fis = new FileInputStream( uploadedFileLocation );
// Tika parsing
Metadata metadata = new Metadata();
ContentHandler handler = new BodyContentHandler();
AutoDetectParser parser = new AutoDetectParser();
parser.parse(fis, handler, metadata);
Is there some way to tell Tika to not parse the content? I'm hoping that this will speed things up as well as save memory.