Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 1.11.1
- None
- None
Description
The ReplaceText processor strips backslashes if the Replacement Strategy is Regex Replace and the Replacement Value contains both an Expression Language expression and a backreference.
I was able to reduce the problem to the following example:
- Search Value: (.)
- Replacement Value: ${'$1'}
- Input: \a
- Expected output: \a
- Actual output: a -> the backslash has been removed
I did some investigation, and the cause seems to be the use of Matcher.appendReplacement at [1], which according to the documentation [2] treats backslashes (and dollar signs) specially: "Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string."
The following example is very similar to what happens at [1] when running the example above.
Matcher matcher = Pattern.compile("(.)").matcher("A");
StringBuilder sb = new StringBuilder();
matcher.find();
matcher.appendReplacement(sb, "\\a");
The result in the StringBuilder is a, not \a as I initially expected. If the backslash in \a is escaped (i.e. \\a, written as "\\\\a" in Java source), the result is \a.
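The behavior described above can be reproduced with a small standalone program. This is only a minimal sketch: the class name AppendReplacementDemo and the helper replaceFirstMatch are made up for illustration and are not part of NiFi. It uses the StringBuffer overload documented at [2] and also shows Matcher.quoteReplacement, the standard JDK method for treating a replacement string literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative class, not part of NiFi.
class AppendReplacementDemo {

    // Replaces the first match of (.) in "A" with the given replacement string,
    // using appendReplacement followed by appendTail.
    static String replaceFirstMatch(String replacement) {
        Matcher matcher = Pattern.compile("(.)").matcher("A");
        StringBuffer sb = new StringBuffer();
        if (matcher.find()) {
            matcher.appendReplacement(sb, replacement);
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Replacement string is \a: the backslash is consumed as an escape character.
        System.out.println(replaceFirstMatch("\\a"));                           // prints a
        // Replacement string is \\a: the escaped backslash survives as one backslash.
        System.out.println(replaceFirstMatch("\\\\a"));                         // prints \a
        // quoteReplacement escapes backslashes and dollar signs in the replacement.
        System.out.println(replaceFirstMatch(Matcher.quoteReplacement("\\a"))); // prints \a
    }
}
```

Note that Matcher.quoteReplacement also escapes dollar signs, so it would disable backreferences such as $1 and cannot simply be applied to the whole Replacement Value.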
This issue is related to NIFI-5813, and it seems the normalizeReplacementString method [3] works incorrectly: in NIFI-5813 it escapes things it shouldn't, while here it doesn't escape what it should.
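One possible direction, sketched here under assumptions, is to double the backslashes in any text that should be taken literally while leaving the backreference untouched. The class BackslashEscapeSketch and its methods are hypothetical; this is not the actual normalizeReplacementString implementation and not a proposed patch, only an illustration of the escaping that appendReplacement expects.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not NiFi code.
class BackslashEscapeSketch {

    // Doubles every backslash so that appendReplacement emits it literally.
    // Dollar signs are left alone, so backreferences such as $1 keep working.
    static String escapeBackslashes(String literalText) {
        return literalText.replace("\\", "\\\\");
    }

    // Plain appendReplacement/appendTail loop over all matches.
    static String replaceAllMatches(String regex, String input, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(sb, replacement);
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String prefix = "C:\\dir\\"; // the text C:\dir\ (assumed to come from evaluated data)

        // Unescaped: backslashes are consumed and \$ turns the backreference into literal text.
        System.out.println(replaceAllMatches("(.+)", "file", prefix + "$1"));                    // prints C:dir$1

        // Escaped: backslashes are preserved and $1 still expands to the captured group.
        System.out.println(replaceAllMatches("(.+)", "file", escapeBackslashes(prefix) + "$1")); // prints C:\dir\file
    }
}
```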
[1] https://github.com/apache/nifi/blob/rel/nifi-1.11.1/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceText.java#L555
[2] https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,java.lang.String)
[3] https://github.com/apache/nifi/blob/rel/nifi-1.11.1/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceText.java#L681