S2Graph / S2GRAPH-50

Provide new HBase Storage Schema

Details

    • New Feature
    • Status: Done
    • Major
    • Resolution: Done

    Description

      I think we need to offer a choice between `Tall` and `Wide` row schemas for IndexEdge. The fundamental difference between the two is as follows.

      1. Wide.
      Adjacent edges are stored on a single row with wide columns, and a get request fetches them. This is how IndexEdge is currently stored.

      2. Tall.
      Adjacent edges are stored on multiple `consecutive` rows, and a scanner scans through them.
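
The two layouts can be sketched with a toy in-memory model. This is only an illustration: `WideTable`, `TallTable`, and the string keys are hypothetical names chosen here, not S2Graph's or HBase's actual classes or byte encodings.

```python
from bisect import bisect_left, insort


class WideTable:
    """Wide: one row per source vertex, one column qualifier per adjacent edge."""

    def __init__(self):
        self.rows = {}  # row key (source vertex) -> {qualifier: value}

    def put(self, src, edge_key, value):
        self.rows.setdefault(src, {})[edge_key] = value

    def get(self, src):
        # A single get on one row returns every adjacent edge,
        # ordered by qualifier as a sorted store would return them.
        return sorted(self.rows.get(src, {}).items())


class TallTable:
    """Tall: one row per edge; rows for the same source vertex are consecutive."""

    def __init__(self):
        self.keys = []    # sorted list of (source vertex, edge key) row keys
        self.values = {}  # row key -> value

    def put(self, src, edge_key, value):
        key = (src, edge_key)
        if key not in self.values:
            insort(self.keys, key)  # keep row keys in sorted order
        self.values[key] = value

    def scan(self, src):
        # Seek to the first row with this source prefix, then read
        # consecutive rows until the prefix changes.
        i = bisect_left(self.keys, (src, ""))
        while i < len(self.keys) and self.keys[i][0] == src:
            key = self.keys[i]
            yield key[1], self.values[key]
            i += 1
```

Both tables return the same adjacent edges in the same order; what differs is the access pattern, one get on one row versus a scan over a range of rows.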

      Once S2GRAPH-17 is resolved, I think the only thing we have to do is provide `IndexEdgeSerializer` and `IndexEdgeDeserializer` for the Tall row schema on HBase. I think this is a fairly trivial task since we already have all the primitives for it.
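
As a rough sketch of what a Tall row serializer and deserializer pair has to do, the following packs one edge into one row key. The field list and fixed-width layout (`src_id`, `label_id`, `score`, `tgt_id`) are assumptions made for illustration and do not reflect the real `IndexEdgeSerializer` format.

```python
import struct

# Hypothetical layout: source vertex id (8 bytes), label id (4 bytes),
# sort score (8 bytes), target vertex id (8 bytes). Big-endian unsigned
# fields make byte-wise key order match numeric order.
_ROW_KEY = struct.Struct(">QIQQ")
_PREFIX = struct.Struct(">QI")


def serialize_tall(src_id, label_id, score, tgt_id):
    """Encode one edge as one row key."""
    return _ROW_KEY.pack(src_id, label_id, score, tgt_id)


def deserialize_tall(row_key):
    """Decode a row key back into (src_id, label_id, score, tgt_id)."""
    return _ROW_KEY.unpack(row_key)


def scan_prefix(src_id, label_id):
    """Key prefix shared by every edge of one source vertex and label."""
    return _PREFIX.pack(src_id, label_id)
```

Because all edges of one vertex and label share `scan_prefix(...)`, a scanner that starts at that prefix reads them as consecutive rows, ordered by the remaining fields.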

      The hard part would be changing the client interface.

      Currently, queries support `offset` and `limit` for pagination. If we use a scanner, there is no easy way to support `offset`.
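
To make the `offset` problem concrete, here is a minimal sketch over a sorted list of row keys. `scan`, `page_by_offset`, and `page_by_cursor` are hypothetical helpers, and the cursor variant is a commonly used alternative rather than something this issue proposes.

```python
from bisect import bisect_left, bisect_right
from itertools import islice


def scan(keys, src, start_after=None):
    """Yield row keys for `src` in order, optionally resuming after a given key."""
    if start_after is None:
        i = bisect_left(keys, (src, ""))
    else:
        i = bisect_right(keys, start_after)
    while i < len(keys) and keys[i][0] == src:
        yield keys[i]
        i += 1


def page_by_offset(keys, src, offset, limit):
    """Emulate offset/limit: the scanner still reads and discards `offset` rows."""
    return list(islice(scan(keys, src), offset, offset + limit))


def page_by_cursor(keys, src, limit, cursor=None):
    """Alternative: the client passes back the last row key it received."""
    page = list(islice(scan(keys, src, start_after=cursor), limit))
    next_cursor = page[-1] if page else None
    return page, next_cursor
```

With a scanner, skipping to `offset` costs work proportional to `offset` on every request, whereas resuming from a cursor seeks directly to the next row. The cursor approach changes what the client has to send, which is the interface change the description refers to.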

      I think it is worth trying the Tall row schema and benchmarking it against the Wide row schema. I also think this would be very beneficial for others who are interested in implementing other storage backends such as RocksDB or LevelDB (including myself).

      I will follow up with benchmarks on both `Tall` and `Wide` rows; then we can decide which schema should be the default. What do others think?

          People

            steamshon Do Yung Yoon

              Time Tracking

                Original Estimate: 168h
                Remaining Estimate: 168h
                Time Spent: Not Specified