Hive / HIVE-21506

Memory based TxnHandler implementation

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Transactions
    • Labels: None

    Description

      The current TxnHandler implementations use the backend RDBMS to store all Hive lock and transaction data, so multiple TxnHandler instances can run simultaneously and serve requests. The continuous communication and locking on the RDBMS side puts serious load on the backend database and also restricts the achievable throughput.

      If it is possible to have only a single active TxnHandler instance (with the current design, a single HMS instance), then we can provide much better performance by using only Java-based locking. We still have to store the committed write transactions in the RDBMS (or later in some other persistent storage), but other lock and transaction operations could remain memory only.

      The most important drawbacks of this solution are that we definitely lose scalability once a single TxnHandler instance is no longer able to serve the requests (see NameNode), and that we lose fault tolerance in the sense that ongoing transactions have to be terminated when the TxnHandler fails. If these drawbacks are acceptable in certain situations, then we can provide better throughput for the users.
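
      To make the proposal more concrete, here is a minimal sketch of the single-instance idea: locks and open transactions live only in JVM memory, guarded by plain Java synchronization, and only committed write transactions are handed to a persistence callback. Everything in it is an assumption made for illustration: the names InMemoryTxnManager, CommitLog, tryLock and the string resource keys are hypothetical and do not reflect the existing TxnHandler API or any agreed design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch only; not the actual Hive TxnHandler interface. */
class InMemoryTxnManager {

    /** Stand-in for the RDBMS write of a committed write transaction. */
    interface CommitLog {
        void persist(long txnId);
    }

    private final AtomicLong nextTxnId = new AtomicLong(1);
    private final Set<Long> openTxns = new HashSet<>();
    // resource name -> transaction holding the exclusive lock
    private final Map<String, Long> exclusiveLocks = new HashMap<>();
    // resource name -> transactions holding a shared lock
    private final Map<String, Set<Long>> sharedLocks = new HashMap<>();
    private final CommitLog commitLog;

    InMemoryTxnManager(CommitLog commitLog) {
        this.commitLog = commitLog;
    }

    /** Opens a transaction; nothing is written to persistent storage. */
    synchronized long openTxn() {
        long id = nextTxnId.getAndIncrement();
        openTxns.add(id);
        return id;
    }

    /** Non-blocking lock attempt; returns false if the lock conflicts. */
    synchronized boolean tryLock(long txnId, String resource, boolean exclusive) {
        if (!openTxns.contains(txnId)) {
            throw new IllegalStateException("Transaction not open: " + txnId);
        }
        Long owner = exclusiveLocks.get(resource);
        if (owner != null && owner != txnId) {
            return false; // another transaction holds the exclusive lock
        }
        Set<Long> readers = sharedLocks.computeIfAbsent(resource, k -> new HashSet<>());
        if (exclusive) {
            if (readers.stream().anyMatch(r -> r != txnId)) {
                return false; // other readers block an exclusive lock
            }
            exclusiveLocks.put(resource, txnId);
        } else {
            readers.add(txnId);
        }
        return true;
    }

    /** Commits; only transactions that wrote data reach the persistent log. */
    synchronized void commitTxn(long txnId, boolean wroteData) {
        if (!openTxns.remove(txnId)) {
            throw new IllegalStateException("Transaction not open: " + txnId);
        }
        if (wroteData) {
            commitLog.persist(txnId);
        }
        releaseLocks(txnId);
    }

    /** Aborts; purely an in-memory operation. */
    synchronized void abortTxn(long txnId) {
        if (openTxns.remove(txnId)) {
            releaseLocks(txnId);
        }
    }

    private void releaseLocks(long txnId) {
        exclusiveLocks.values().removeIf(owner -> owner == txnId);
        sharedLocks.values().forEach(readers -> readers.remove(txnId));
    }
}
```

      In this sketch a restart of the process loses all open transactions and locks, which matches the fault tolerance drawback described above, and the single synchronized monitor is the simplest possible choice rather than a tuned one.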

            People

              Assignee: Unassigned
              Reporter: Peter Vary (pvary)
              Votes: 0
              Watchers: 7

              Dates

                Created:
                Updated: