Flink / FLINK-25318 Improvement of scheduler and execution for Flink OLAP / FLINK-29317

Add WebSocket in Dispatcher to support OLAP query submission and result push in session cluster


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 1.14.5, 1.15.3

    Description

      Currently, a client submits an OLAP query to a Flink session cluster via the HTTP REST API and pulls the results by polling at a fixed interval. The sink task in the TaskManager creates a socket server for each query; when the JobManager receives a pull request from the client, it requests the query results from that socket server. The process is as follows:
      Job submission path:
      client -> http rest -> JobManager -> Sink Socket Server
      Result acquisition path:
      client <- http rest <- JobManager <- Sink Socket Server

      This leads to three problems:
      1. Submitting jobs through the HTTP REST API incurs some performance loss; for example, temporary files are created for each job.
      2. The client pulls the result data at a fixed interval, which imposes a fixed cost: a larger interval increases query latency, while a smaller interval increases the load on the Dispatcher.
      3. Each sink task initializes its own socket server, which increases query latency and wastes resources.
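
The polling trade-off in problem 2 can be illustrated with a small model. This is only a sketch: the function name `polling_cost` and the 120 ms readiness time are hypothetical values chosen for illustration, not measurements from Flink.

```python
import math
from typing import Tuple


def polling_cost(ready_at_ms: float, interval_ms: float) -> Tuple[float, int]:
    """Model a client that polls at interval_ms, 2*interval_ms, ...

    Returns (extra_wait_ms, poll_requests): how long the finished result
    sits on the server before the next poll picks it up, and how many
    requests the Dispatcher had to serve in total.
    """
    polls = max(1, math.ceil(ready_at_ms / interval_ms))
    first_successful_poll_ms = polls * interval_ms
    return first_successful_poll_ms - ready_at_ms, polls


# Illustrative only: assume the result is ready 120 ms after submission.
for interval in (10, 50, 200):
    extra, polls = polling_cost(ready_at_ms=120, interval_ms=interval)
    print(f"interval={interval}ms extra_wait={extra}ms polls={polls}")
```

In this model a short interval keeps the extra wait small but multiplies the requests hitting the Dispatcher, while a long interval does the opposite. A push-based channel avoids having to pick a point on that curve.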

      For the Flink OLAP scenario, we propose adding the WebSocket protocol to the session cluster to support submitting jobs and returning results. The client creates and manages a connection to the WebSocket server and submits OLAP queries to the session cluster over it. The TaskManagers also create and manage connections to the WebSocket server, and each sink task streams its results to the server. When the JobManager receives results from a sink task, it pushes the result data to the client over the connection between them.
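
The proposed push flow could look roughly like the following in-memory sketch. It uses no real WebSocket library and is not Flink code: `ResultRelay`, `submit`, and `on_sink_batch` are hypothetical names standing in for the server-side component, the client connection, and the sink-task connection.

```python
from typing import Callable, Dict, List

# A client connection is modeled as a callback: (rows, is_last) -> None.
ClientPush = Callable[[List[str], bool], None]


class ResultRelay:
    """Hypothetical stand-in for the WebSocket server in the session cluster."""

    def __init__(self) -> None:
        # query_id -> the open connection of the client that submitted it
        self._clients: Dict[str, ClientPush] = {}

    def submit(self, query_id: str, on_result: ClientPush) -> None:
        """Client side: submit a query over its long-lived connection."""
        self._clients[query_id] = on_result

    def on_sink_batch(self, query_id: str, rows: List[str], is_last: bool) -> None:
        """Sink side: a sink task streams a batch; forward it immediately."""
        push = self._clients[query_id]
        push(rows, is_last)
        if is_last:
            # The query is finished, so release the per-query state.
            del self._clients[query_id]


received = []
relay = ResultRelay()
relay.submit("q-1", lambda rows, is_last: received.append((rows, is_last)))
relay.on_sink_batch("q-1", ["row-a", "row-b"], False)
relay.on_sink_batch("q-1", ["row-c"], True)
print(received)
```

The point of the sketch is that results move as soon as the sink produces them: there is no polling interval, and no per-query socket server on the TaskManager side.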

      We implemented this feature in ByteDance's internal version of Flink. On average, it reduces the latency of each query by about 100 ms, which is a significant optimization for OLAP queries.

          People

            Assignee: Unassigned
            Reporter: Fang Yong (zjureel)