HBase / HBASE-17756

We should have better introspection of HFiles

    Details

    • Type: Brainstorming
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: HFile
    • Labels:
      None

      Description

      Stack suggested using DataSketches (https://datasketches.github.io) to write additional statistics to HFiles. These statistics could improve our split decisions, help with troubleshooting, and potentially enable other analysis without performing full table scans. The statistics could be stored as part of the HFile itself, but as a first step we could improve visibility into the data by adding some statistics to HFilePrettyPrinter.
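
      To make the idea concrete, here is a minimal sketch of the kind of mergeable summary that could be computed per HFile. It is plain Java and does not use the DataSketches API: it is a simple K-Minimum-Values estimator for the number of distinct row keys, swapped in only for illustration. The class name RowKeyDistinctSketch, the hash function, and the choice of k are all assumptions, not anything proposed in this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.TreeSet;

/** Illustrative K-Minimum-Values sketch (hypothetical; not the DataSketches API). */
class RowKeyDistinctSketch {
    private final int k;
    // The k smallest hashes seen so far, ordered as unsigned 64-bit values.
    private final TreeSet<Long> smallest = new TreeSet<Long>(Long::compareUnsigned);

    RowKeyDistinctSketch(int k) {
        if (k < 2) throw new IllegalArgumentException("k must be >= 2");
        this.k = k;
    }

    /** FNV-1a over the key bytes, then the SplitMix64 finalizer to spread the bits. */
    static long hash(byte[] key) {
        long h = 0xcbf29ce484222325L;
        for (byte b : key) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 30; h *= 0xbf58476d1ce4e5b9L;
        h ^= h >>> 27; h *= 0x94d049bb133111ebL;
        return h ^ (h >>> 31);
    }

    /** Feed one row key, e.g. while a file is being written or scanned. */
    void add(byte[] rowKey) {
        offer(hash(rowKey));
    }

    private void offer(long h) {
        if (smallest.size() < k) {
            smallest.add(h); // duplicates are ignored by the set
            return;
        }
        // Keep h only if it is below the current k-th smallest hash and new.
        if (Long.compareUnsigned(h, smallest.last()) < 0 && smallest.add(h)) {
            smallest.pollLast();
        }
    }

    /** Estimated number of distinct keys; exact while fewer than k were seen. */
    double estimate() {
        if (smallest.size() < k) return smallest.size();
        // Map the k-th smallest hash to a fraction of the 64-bit hash space.
        double fraction = (smallest.last() >>> 11) * 0x1.0p-53;
        return fraction == 0.0 ? k : (k - 1) / fraction;
    }

    /** Merge a sketch built over another file; both must use the same k. */
    void merge(RowKeyDistinctSketch other) {
        if (other.k != k) throw new IllegalArgumentException("k mismatch");
        for (long h : other.smallest) offer(h);
    }

    public static void main(String[] args) {
        RowKeyDistinctSketch sketch = new RowKeyDistinctSketch(1024);
        for (int i = 0; i < 100_000; i++) {
            sketch.add(("row-" + i).getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("estimated distinct row keys: " + sketch.estimate());
    }
}
```

      Because two such summaries merge into the summary of the combined data, per-file sketches could in principle be combined per region without rescanning the files. The DataSketches library provides production-quality sketches with the same mergeability property, which is presumably what makes it attractive here.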

            People

            • Assignee: Unassigned
            • Reporter: Esteban Gutierrez (esteban)
            • Votes: 0
            • Watchers: 10