Xerces-C++ / XERCESC-2063

A 4-byte UTF-8 character incorrectly fails the maxLength facet.

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0, 3.0.1, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.1.3, 3.1.4
    • None
    • None
    • Environment: Windows (Affects all OS)

    Description

      A 4-byte UTF-8 character incorrectly fails the maxLength facet.
      The data is F0 9D 90 80, a 4-byte UTF-8 sequence that represents a single character (U+1D400).
      Running sax2count.exe fails with:

        Error at file input.xml, line 4, char 17
        Message: value '??' has length '2' which exceeds maxLength facet value '1'

      This looks like a limitation, but I could not find it documented anywhere in the bug list.
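
      A plausible explanation for the reported length of 2 (an assumption, not confirmed in this report) is that the length is counted in UTF-16 code units rather than in characters. U+1D400 lies outside the Basic Multilingual Plane, so it needs a surrogate pair in UTF-16. The sketch below is standalone C++ and does not use the Xerces-C++ API. The names `Lengths` and `measureUtf8` are hypothetical, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of Xerces-C++): the length of one
// piece of text under two different definitions of "length".
struct Lengths {
    std::size_t codePoints;  // characters in the XML sense
    std::size_t utf16Units;  // 16-bit code units after transcoding to UTF-16
};

// Walks a UTF-8 byte sequence and counts both measures.
// Assumes well-formed UTF-8; no validation is performed.
Lengths measureUtf8(const std::vector<std::uint8_t>& bytes) {
    Lengths result{0, 0};
    for (std::size_t i = 0; i < bytes.size();) {
        const std::uint8_t lead = bytes[i];
        std::size_t seqLen = 1;                      // 0xxxxxxx (ASCII)
        if      ((lead & 0xE0) == 0xC0) seqLen = 2;  // 110xxxxx
        else if ((lead & 0xF0) == 0xE0) seqLen = 3;  // 1110xxxx
        else if ((lead & 0xF8) == 0xF0) seqLen = 4;  // 11110xxx

        result.codePoints += 1;
        // Only 4-byte sequences encode code points above U+FFFF,
        // and those need a surrogate pair (two units) in UTF-16.
        result.utf16Units += (seqLen == 4) ? 2 : 1;
        i += seqLen;
    }
    return result;
}
```

      Calling `measureUtf8({0xF0, 0x9D, 0x90, 0x80})` yields one code point and two UTF-16 code units, which matches the `length '2'` in the error message.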

      *Input XML*

      <?xml version="1.1" encoding="UTF-8"?>
      <Root xmlns="http://www.example.org/Test" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.example.org/Test
      Input.xsd">
      <Data>𝐀</Data>
      </Root>

      *Schema*

      <?xml version="1.0" encoding="UTF-8"?>
      <schema targetNamespace="http://www.example.org/Test" elementFormDefault="qualified" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.example.org/Test">
        <element name="Root">
          <complexType>
            <sequence>
              <element name="Data">
                <simpleType>
                  <restriction base="string">
                    <maxLength value="1"/>
                  </restriction>
                </simpleType>
              </element>
            </sequence>
          </complexType>
        </element>
      </schema>
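
      XML Schema Part 2 measures the length of string-derived types in characters, so a value made of one supplementary character should satisfy `maxLength value="1"`. As a minimal sketch of a check that counts characters rather than code units, the code below treats a surrogate pair in a UTF-16 string as one character. It is standalone C++, not the Xerces-C++ implementation or a proposed patch, and `countCodePoints` and `satisfiesMaxLength` are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-16 string by
// treating each high/low surrogate pair as a single character.
std::size_t countCodePoints(const std::u16string& text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t unit = text[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        const bool nextIsLow = i + 1 < text.size() &&
                               text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF;
        if (isHigh && nextIsLow) {
            ++i;  // skip the low half of the pair
        }
        ++count;  // unpaired surrogates are counted as one unit each
    }
    return count;
}

// Hypothetical facet check: compares the character count, not the
// number of UTF-16 code units, against the maxLength value.
bool satisfiesMaxLength(const std::u16string& value, std::size_t maxLength) {
    return countCodePoints(value) <= maxLength;
}
```

      With this definition, `satisfiesMaxLength(u"\U0001D400", 1)` returns true even though the string holds two code units.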

          People

            Assignee: Unassigned
            Reporter: Greg Iwinski (giwinski)
            Votes: 0
            Watchers: 2