Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
New, Patch Available
Description
Currently there is no way to preserve delimiters in a tokenizer, because the basic tokenizers such as CharTokenizer discard them.
This change makes the basic tokenizer more customizable, so that each delimiter can be kept as part of the token it ends,
e.g. "mac_book_pro" -> [mac_, book_, pro]
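
To make the requested behavior concrete, here is a minimal standalone sketch in plain Java. It is not the attached patch and does not use the Lucene API; the class name DelimiterPreservingSplit and the method tokenize are hypothetical, and attaching each delimiter to the token that precedes it is an assumption taken from the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Lucene and not the proposed patch.
public class DelimiterPreservingSplit {

    // Splits text at each delimiter and keeps the delimiter at the end of
    // the token it terminates (assumed behavior, based on the issue's example).
    static List<String> tokenize(String text, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter) {
                // Include the delimiter itself by ending the slice at i + 1.
                tokens.add(text.substring(start, i + 1));
                start = i + 1;
            }
        }
        // Emit the trailing token, if any characters remain after the last delimiter.
        if (start < text.length()) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("mac_book_pro", '_')); // prints [mac_, book_, pro]
    }
}
```

In this sketch, consecutive delimiters each yield a token consisting of the delimiter alone ("a__b" gives [a_, _, b]); whether the real tokenizer should behave that way is left open by this issue.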