  1. IMPALA
  2. IMPALA-3103

Improve efficiency of BloomFilter Thrift serialisation

Details

    Description

      A TBloomFilter has a 'directory' structure that is a list of individual buckets (each bucket is about 64k wide). The total size of the directory can be 1MB or much more, which means a large number of buckets and very inefficient deserialisation, because each bucket has to be allocated on the heap separately.

      Instead, the TBloomFilter representation should use one contiguous string (as the real BloomFilter does), so that it can be allocated with a single operation and deserialised with a single copy.
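
      The difference between the two layouts can be sketched in C++ as follows. This is only an illustration under stated assumptions: BucketListFilter, ContiguousFilter and the three helper functions are hypothetical stand-ins invented for this sketch, not Impala's actual Thrift-generated types or BloomFilter code, and the bucket size is left arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the current wire form: one string per bucket,
// so a deserialised message owns one heap-allocated string per bucket.
struct BucketListFilter {
  std::vector<std::string> directory;
};

// Hypothetical stand-in for the proposed wire form: the whole directory
// travels as a single contiguous string.
struct ContiguousFilter {
  std::string directory;
};

// Rebuilds an in-memory directory from the per-bucket form.
// The destination is reserved once, but every bucket is copied separately.
std::vector<uint8_t> DirectoryFromBuckets(const BucketListFilter& in) {
  std::size_t total = 0;
  for (const std::string& bucket : in.directory) total += bucket.size();

  std::vector<uint8_t> out;
  out.reserve(total);
  for (const std::string& bucket : in.directory) {
    out.insert(out.end(), bucket.begin(), bucket.end());
  }
  return out;
}

// Rebuilds an in-memory directory from the contiguous form:
// one allocation and one copy of the whole range.
std::vector<uint8_t> DirectoryFromContiguous(const ContiguousFilter& in) {
  return std::vector<uint8_t>(in.directory.begin(), in.directory.end());
}

// Converts the per-bucket form into the contiguous form by concatenating
// the buckets in order.
ContiguousFilter Flatten(const BucketListFilter& in) {
  std::size_t total = 0;
  for (const std::string& bucket : in.directory) total += bucket.size();

  ContiguousFilter out;
  out.directory.reserve(total);
  for (const std::string& bucket : in.directory) out.directory += bucket;
  assert(out.directory.size() == total);
  return out;
}
```

      Both forms carry the same bytes in the same order, so flattening a BucketListFilter and rebuilding the directory from either form yields an identical buffer. The contiguous form simply avoids the per-bucket string allocations on the receiving side.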

            People

              henryr Henry Robinson
              henryr Henry Robinson
