Uploaded image for project: 'Zeppelin'
  1. Zeppelin
  2. ZEPPELIN-2449

pyspark kafka streaming second run error

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.7.1
    • None
    • pySpark
    • None

    Description

      run after stop(second run) below code, I have bleow message.
      --------------------------------------------------------------------------------------------
      %spark.pyspark
      #from pyspark import SparkContext
      from pyspark.streaming import StreamingContext
      from pyspark.streaming.kafka import KafkaUtils

      ssc = StreamingContext(sc, 3)
      kvs = KafkaUtils.createDirectStream(ssc, ['fw'],

      {"metadata.broker.list": 'queue:9200'}

      )
      lines = kvs.map(lambda x: x[1])
      counts = lines.flatMap(lambda line: line.split(" ")) \
      .map(lambda word: (word, 1)) \
      .reduceByKey(lambda a, b: a+b)
      counts.pprint()

      ssc.start()
      try:
      ssc.awaitTermination()
      except KeyboardInterrupt:
      ssc.stop()
      --------------------------------------------------------------------------------------------
      Traceback (most recent call last):
      File "/tmp/zeppelin_pyspark-5081537614820689398.py", line 325, in <module>
      sc.setJobGroup(jobGroup, "Zeppelin")
      File "/opt/zeppelin-0.7.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/context.py", line 902, in setJobGroup
      self._jsc.setJobGroup(groupId, description, interruptOnCancel)
      AttributeError: 'NoneType' object has no attribute 'setJobGroup'

      Attachments

        Activity

          People

            Unassigned Unassigned
            dalgomtaeng Jeon ChangBae
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: