Details
Bug
Status: Resolved
Minor
Resolution: Fixed
1.14
None
Description
When parsing an encrypted Word document, an org.apache.poi.EncryptedDocumentException is thrown at WordExtractor.java#151. Tika catches this too far up the stack and incorrectly wraps it in a plain TikaException instead of an org.apache.tika.exception.EncryptedDocumentException.
The fix would be to catch and wrap the exception correctly, for example:
try {
    document = new HWPFDocument(root);
} catch (org.apache.poi.EncryptedDocumentException e) {
    throw new EncryptedDocumentException(e);
} catch (OldWordFileFormatException e) {
    parseWord6(root, xhtml);
    return;
}
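To illustrate why the catch order matters, here is a minimal, dependency-free sketch of the same wrap-and-rethrow pattern. It does not use POI or Tika: PoiEncryptedException, ParseException, EncryptedParseException, openDocument, parse and classify are all hypothetical stand-ins invented for this example, and the real classes may differ in detail. The point is only that catching the specific exception before any generic handler lets callers tell "encrypted" apart from other parse failures.

```java
public class EncryptedWrapSketch {

    // Hypothetical stand-in for the library-level "document is encrypted" exception.
    static class PoiEncryptedException extends IllegalStateException {
        PoiEncryptedException(String msg) { super(msg); }
    }

    // Hypothetical stand-in for the generic parser exception (TikaException in the report).
    static class ParseException extends Exception {
        ParseException(String msg, Throwable cause) { super(msg, cause); }
    }

    // Hypothetical stand-in for the parser-level "encrypted" exception,
    // a subtype of the generic one so existing callers keep working.
    static class EncryptedParseException extends ParseException {
        EncryptedParseException(Throwable cause) { super("document is encrypted", cause); }
    }

    // Hypothetical stand-in for opening the document; fails if it is encrypted.
    static void openDocument(boolean encrypted) {
        if (encrypted) {
            throw new PoiEncryptedException("cannot process encrypted file");
        }
    }

    // Catch the specific exception close to where it is thrown and wrap it in
    // the specific parser exception; only then fall back to the generic wrapper.
    static String parse(boolean encrypted) throws ParseException {
        try {
            openDocument(encrypted);
            return "parsed";
        } catch (PoiEncryptedException e) {
            throw new EncryptedParseException(e);
        } catch (RuntimeException e) {
            throw new ParseException("unexpected failure", e);
        }
    }

    // A caller that can now distinguish the encrypted case from other failures.
    static String classify(boolean encrypted) {
        try {
            return parse(encrypted);
        } catch (EncryptedParseException e) {
            return "encrypted";
        } catch (ParseException e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(false)); // prints "parsed"
        System.out.println(classify(true));  // prints "encrypted"
    }
}
```

If the first catch clause in parse were removed, the encrypted case would fall through to the generic handler and classify(true) would return "failed", which is the behaviour described in this report.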