Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.8
    • Component/s: rdbmk
    • Labels: None

    Description

      The current approach is to serialize the Document as JSON and then store it either (a) in full in a VARCHAR column or, (b) if that column isn't wide enough, in a BLOB (optionally gzipped).

      For debugging purposes, the inline VARCHAR always gets populated with the start of the JSON serialization.
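The dual strategy described above can be sketched roughly as follows. This is a minimal illustration, not the actual RDB persistence code: the class `DualStorageSketch`, the `Row` holder, and the `maxInlineBytes` parameter are hypothetical names, and UTF-8 is assumed as the database character set.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of the "dual" strategy; names are illustrative only.
final class DualStorageSketch {

    /** Hypothetical holder for the two column values of one row. */
    static final class Row {
        final String inlineData; // value for the VARCHAR column
        final byte[] blobData;   // value for the BLOB column, null if the JSON fits inline

        Row(String inlineData, byte[] blobData) {
            this.inlineData = inlineData;
            this.blobData = blobData;
        }
    }

    /** Chooses between inline storage and BLOB storage for a serialized document. */
    static Row encode(String json, int maxInlineBytes, boolean gzip) {
        byte[] utf8 = json.getBytes(StandardCharsets.UTF_8);
        if (utf8.length <= maxInlineBytes) {
            // Case (a): the full JSON fits into the VARCHAR column.
            return new Row(json, null);
        }
        // Case (b): full JSON goes to the BLOB; the VARCHAR column still
        // receives the start of the JSON for debugging purposes.
        String prefix = truncateToBytes(json, maxInlineBytes);
        return new Row(prefix, gzip ? gzip(utf8) : utf8);
    }

    /** Longest prefix of s whose UTF-8 encoding needs at most maxBytes bytes. */
    static String truncateToBytes(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            int n = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + n > maxBytes) {
                break; // never cut a multi-byte character in half
            }
            bytes += n;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }

    /** Compresses the given bytes with gzip. */
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

Note that the size check has to be made on encoded bytes rather than on `String.length()`, because the column limit is expressed in bytes.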

      However, with Oracle we are limited to 4000 bytes (which may be far fewer characters, because non-ASCII characters need multiple bytes each), so many document instances will end up using what was initially thought to be the exception case.
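To make the byte-versus-character gap concrete, here is a small sketch that measures the UTF-8 length of a string without allocating a byte array and checks it against the 4000-byte limit mentioned above. The class and method names are hypothetical, and UTF-8 as the column encoding is an assumption.

```java
// Hypothetical helper; assumes the VARCHAR column is encoded as UTF-8.
final class Utf8LengthSketch {

    // The Oracle limit quoted in the description, in bytes.
    static final int ORACLE_VARCHAR_LIMIT_BYTES = 4000;

    /** Number of bytes the UTF-8 encoding of s needs (well-formed input assumed). */
    static int utf8Length(CharSequence s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = Character.codePointAt(s, i);
            bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            i += Character.charCount(cp);
        }
        return bytes;
    }

    /** True if the serialized document fits into the inline column. */
    static boolean fitsInline(CharSequence json) {
        return utf8Length(json) <= ORACLE_VARCHAR_LIMIT_BYTES;
    }
}
```

With three-byte characters such as the euro sign, only 1333 characters fit into 4000 bytes, which illustrates why documents with much non-ASCII content overflow the inline column sooner than their character count suggests.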

      Questions:

      1) Do we stick with JSON or do we attempt a different serialization? A different format might make sense with respect to both length and performance. There might also be some code to borrow from the off-heap serialization code.

      2) Do we get rid of the "dual" strategy, and just always use the BLOB? The indirection might make things more expensive, but then the total column width would drop considerably. How can we do good benchmarks on this?

      (This all assumes that we stick with a model where all code is the same between database types, except for the DDL statements; of course it's also conceivable to add more vendor-specific special cases to the Java code.)

      Attachments

        1. OAK-1941-cmodcount.diff
          12 kB
          Julian Reschke
        2. utf8measure.diff
          11 kB
          Julian Reschke
        3. with-modified-index.diff
          2 kB
          Julian Reschke
        4. with-modified-index.diff
          2 kB
          Julian Reschke

            People

              Assignee: Julian Reschke
              Reporter: Julian Reschke