[TIKA-1318] Use of Deprecated Word6Extractor.getParagraphText() Method - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 1.5
Fix Version/s: 1.17, 2.0.0-BETA, 2.1.0
Component/s: parser
Labels:
- deprecation

Description

org.apache.tika.parser.microsoft.WordExtractor.parseWord6() calls the deprecated Word6Extractor.getParagraphText() method, which is supposed to return a String[] with one element per paragraph in the text. Its replacement, getText(), returns the text as a whole and leaves the separation of paragraphs, cells, etc. to the implementation. At this point, I'm not sure how the POI WordExtractor separates them.
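
A minimal sketch of one way parseWord6() could recover per-paragraph strings from getText(), assuming the returned text separates paragraphs with line breaks. That assumption is unverified, since the separator is implementation specific, and it would need checking against POI before use. ParagraphSplitter and splitParagraphs are hypothetical names, not part of Tika or POI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tika or POI.
final class ParagraphSplitter {

    private ParagraphSplitter() {
    }

    // Splits extracted text into paragraphs.
    // ASSUMPTION: paragraphs are separated by \r\n, \r, or \n.
    // Blank segments are dropped; paragraph content is left unchanged.
    static List<String> splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        if (text == null) {
            return paragraphs;
        }
        for (String candidate : text.split("\\r\\n|\\r|\\n")) {
            if (!candidate.trim().isEmpty()) {
                paragraphs.add(candidate);
            }
        }
        return paragraphs;
    }
}
```

If the separator turns out to be something else, only the split pattern would need to change.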

People

Assignee: Unassigned
Reporter: Tyler Bui-Palsulich
Votes: 0
Watchers: 4

Dates

Created: 03/Jun/14 16:02
Updated: 17/Aug/21 13:29