  1. Kudu
  2. KUDU-2485

Enhance Kudu-Spark docs

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.7.1
    • Fix Version/s: None
    • Component/s: documentation, spark
    • Labels: None

    Description

      Users are often confused about the right way to use the Kudu-Spark integration. The most common dangerous mistake is creating multiple Kudu clients, sometimes even one per task. This can easily overwhelm the master, e.g., in a Spark Streaming job with a 2-second batch window and a client per task. We should expand our current minimal Spark docs with better examples and bigger, louder, redder warnings against creating extra Kudu clients. Users should be directed to use the KuduContext exclusively. When a client is needed, the client instance inside the KuduContext should be used.
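
      A rough sketch of the kind of example the docs could include, assuming the kudu-spark artifact is on the classpath and a Kudu cluster is reachable. The master addresses, table name, and object name are hypothetical placeholders. The sketch is not taken from the existing docs; it only illustrates creating a single KuduContext on the driver and reusing the client it already holds.

```scala
import org.apache.kudu.spark.kudu.KuduContext
import org.apache.spark.sql.SparkSession

object SharedKuduContextSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shared-kudu-context").getOrCreate()

    // Hypothetical values: replace with real master addresses and table name.
    val kuduMasters = "kudu-master-1:7051,kudu-master-2:7051"
    val tableName = "example_table"

    // Create ONE KuduContext on the driver and reuse it for the whole job.
    val kuduContext = new KuduContext(kuduMasters, spark.sparkContext)

    if (kuduContext.tableExists(tableName)) {
      // When a raw client is needed, use the one inside the KuduContext
      // instead of building a new KuduClient.
      val table = kuduContext.syncClient.openTable(tableName)
      println(s"Column count: ${table.getSchema.getColumnCount}")

      // Read through the Spark data source rather than a hand-made client.
      val df = spark.read
        .options(Map("kudu.master" -> kuduMasters, "kudu.table" -> tableName))
        .format("org.apache.kudu.spark.kudu")
        .load()

      // Write through the KuduContext; it handles per-executor client reuse.
      kuduContext.upsertRows(df, tableName)
    }

    // Anti-pattern to warn against: building a new KuduClient inside
    // foreachPartition, map, or any other per-task closure.

    spark.stop()
  }
}
```

      The point of the sketch is that every read, write, and metadata call goes through the same KuduContext, so the number of client connections to the master does not grow with the number of tasks or batches.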

          People

            Assignee: Unassigned
            Reporter: William Berkeley (wdberkeley)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: