I have a program that copies a Word file (doc/docx) as follows:
A source file, which is either doc or docx, is first copied to a temporary raw file, losing its extension in the process. The contents of this temporary raw file must then be copied to a file with the correct extension (doc/docx). Since nothing is known about the original file at that point, the extension of the source Word document has to be derived from its contents.
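// try-with-resources closes both streams even if the copy fails
try ( InputStream in = new FileInputStream ( src );
      OutputStream out = new FileOutputStream ( dst ) ) {
    byte [] buf = new byte [1024];
    int len;
    // read returns -1 at end of stream
    while ( ( len = in.read ( buf ) ) != -1 ) {
        out.write ( buf, 0, len );
    }
}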
The destination dst is a raw file without any extension (say, 'sample-file'), which I can't change. The source src may be a 'doc' or a 'docx' type.
Now, as output, I need to copy the contents of dst to a Word document in the same format as src (the correct format is important here; otherwise the document is rendered useless). Since dst doesn't have any extension, I cannot determine the file format just by looking at the name. Is there a way to retrieve the file extension from the file contents? Presumably a Word document contains some metadata that carries this information.
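One possible direction, sketched here under stated assumptions rather than as a definitive solution: the two formats start with different signature bytes. A legacy binary doc is an OLE2 compound file beginning with D0 CF 11 E0 A1 B1 1A E1, while a docx is a ZIP package beginning with "PK" followed by 03 04. The class name WordFormatSniffer and the method guessExtension are hypothetical names introduced for illustration. The check only separates the two containers; it relies on the assumption, taken from the description above, that the file is already known to be a Word document, since other Office formats and plain ZIP archives share the same signatures.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: guesses "doc" or "docx" from the leading bytes of a file.
final class WordFormatSniffer {
    // OLE2 compound file signature, used by legacy binary .doc files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature; a .docx is a ZIP package.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns "doc", "docx", or null when neither signature matches.
    static String guessExtension(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return "doc";
        if (startsWith(header, ZIP_MAGIC)) return "docx";
        return null;
    }

    // Reads only the first few bytes of the file (readNBytes needs Java 11+).
    static String guessExtension(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return guessExtension(in.readNBytes(OLE2_MAGIC.length));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

With this in place, something like WordFormatSniffer.guessExtension(Path.of("sample-file")) could be used to choose the extension before copying dst to its final name. A stricter check would open the ZIP and look for a word/document.xml entry, which this sketch does not attempt.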