[STANBOL-1403] Add PLAIN linking mode to the FST linking engine - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.0.0, 0.12.1
Component/s: Enhancement Engines
Labels: None

Description

The Lucene FST linking engine uses a linking process similar to that of the entity linking engine. This means that NLP processing results are used to determine "Linkable" and "Matchable" tokens in the text. "Linkable" tokens are then used to initiate vocabulary lookups, and both "Linkable" and "Matchable" tokens are used to check whether the labels of entities actually match the text.

This issue will introduce a new linking mode in which the FST linking engine tries to link every single word in the text. Instead of using NLP processing results, this mode simply uses the Solr Analyzer of the configured field.
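The difference between the two modes can be sketched as follows. This is only an illustration, not the engine's code: the `Token` class, the `LINKABLE_POS` rule, the toy `vocabulary`, and all function names are hypothetical, and a simple whitespace split with lowercasing stands in for the Solr Analyzer of the configured field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    pos: str  # coarse part-of-speech tag supplied by NLP processing


# Assumption for this sketch: only nouns and proper nouns are "Linkable".
LINKABLE_POS = {"NOUN", "PROPN"}

# Toy vocabulary mapping lower-cased labels to entity ids (hypothetical data).
vocabulary = {
    "swimming": "activity:swimming",
    "salzburg": "place:salzburg",
}


def lookup_tokens_nlp(tokens):
    """NLP-based mode: only "Linkable" tokens initiate vocabulary lookups."""
    return [t.text for t in tokens if t.pos in LINKABLE_POS]


def lookup_tokens_plain(text):
    """PLAIN mode: every word emitted by the analyzer initiates a lookup.

    Whitespace splitting, punctuation stripping and lowercasing stand in
    for the Solr Analyzer here.
    """
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def link(candidates):
    """Return the candidates that have a matching label in the vocabulary."""
    return {c: vocabulary[c.lower()] for c in candidates if c.lower() in vocabulary}


text = "Anna enjoys swimming in Salzburg."
tokens = [
    Token("Anna", "PROPN"),
    Token("enjoys", "VERB"),
    Token("swimming", "VERB"),
    Token("in", "ADP"),
    Token("Salzburg", "PROPN"),
]

nlp_links = link(lookup_tokens_nlp(tokens))
plain_links = link(lookup_tokens_plain(text))
```

In this sketch the NLP-based selection never looks up "swimming" because it is tagged as a verb, while the PLAIN selection looks up every word and therefore also finds the activity.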

The PLAIN mode is intended for the following cases:

where no NLP support is available
for vocabularies whose entities appear in the text as tokens other than nouns (e.g. a vocabulary that contains activities)

The PLAIN mode will not work in cases where users have used ProperNoun mode with big vocabularies.

People

Assignee: Rupert Westenthaler

Reporter: Rupert Westenthaler

Votes: 0

Watchers: 1

Dates

Created: 10/Nov/14 14:09

Updated: 12/Nov/14 07:55

Resolved: 12/Nov/14 07:55