Parquet / PARQUET-2028

The example in delta-encoding seems incorrect

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: parquet-format
    • Labels: None

    Description

      In the delta-encoding example, which encodes [1, 2, 3, 4, 5], we state that:

      The final encoded data is:
      
      header: 8 (block size), 1 (miniblock count), 5 (value count), 1 (first value)
      
      block: 1 (minimum delta), 0 (bitwidth), (no data needed for bitwidth 0)
      

      I believe that the correct result should be

      header: [8, 1, 5, 2]
      block: [2, 0]

      I.e., first_value and min_delta should both be 2, not 1.
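
      To make the proposed values concrete, here is a minimal Python sketch that derives them from the input. It assumes a 64-bit zig-zag mapping and only handles an input that fits in a single block; the names zigzag and delta_example are illustrative and not taken from any Parquet implementation.

```python
def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one (0, -1, 1, -2 -> 0, 1, 2, 3)."""
    return ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF


def delta_example(values, block_size=8, miniblocks=1):
    """Return the (header, block) fields for an input that fits in one block."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    min_delta = min(deltas)
    # Deltas are stored relative to min_delta; here they all become 0,
    # so the miniblock needs a bit width of 0 and no packed data.
    adjusted = [d - min_delta for d in deltas]
    bit_width = max(adjusted).bit_length()
    # Assumption: only first_value and min_delta are zig-zag encoded;
    # the three leading header fields are plain unsigned integers.
    header = [block_size, miniblocks, len(values), zigzag(values[0])]
    block = [zigzag(min_delta), bit_width]
    return header, block


header, block = delta_example([1, 2, 3, 4, 5])
print(header)  # [8, 1, 5, 2]
print(block)   # [2, 0]
```

      The block size, miniblock count, and value count are left unchanged, which is why only the last header field and the minimum delta differ from the current example.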

      This is because first_value and min_delta are written as zig-zag ULEB128 integers. The plain ULEB128 encoding of 1 is 1, but AFAIK zig-zag encoding first maps 1 to 2, so the value that ends up in the encoded data is 2 (see e.g. here).
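
      The difference between the two encodings can be checked with a short Python sketch. It assumes 64-bit signed inputs, and the helper names zigzag_encode and uleb128_encode are made up for this illustration.

```python
def zigzag_encode(n: int, bits: int = 64) -> int:
    """Interleave negatives and positives: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)


def uleb128_encode(n: int) -> bytes:
    """Encode a non-negative integer as ULEB128: 7 bits per byte, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, continuation bit clear
            return bytes(out)


print(uleb128_encode(1).hex())                 # 01
print(zigzag_encode(1))                        # 2
print(uleb128_encode(zigzag_encode(1)).hex())  # 02
```

      So a signed field holding 1 is serialized as the single byte 0x02, while an unsigned field holding 1 is serialized as 0x01.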

      Alternatively, we could rephrase "The final encoded data is:" as "The final data prior to zig-zag encoding is:".

          People

            Assignee: Unassigned
            Reporter: Jorge Leitão (jorgecarleitao)