Kafka / KAFKA-15413

kafka-server-stop fails with COLUMNS environment variable on Ubuntu


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: tools
    • Labels: None
    • Environment:
      kafka: 3.5.1
      Java: openjdk version "20.0.1" 2023-04-18
      OS: Ubuntu 22.04.3 LTS on WSL2/Windows 11

    Description

      The kafka-server-stop script does not work on Ubuntu if the COLUMNS environment variable is set.

      Steps to reproduce:
      kafka/zookeeper.properties

      dataDir=/tmp/kafka-test-20230828-15217-1lop1tk/zookeeper
      clientPort=34461
      maxClientCnxns=0
      admin.enableServer=false
      

      kafka/server.properties

      broker.id=0
      listeners=PLAINTEXT://:46161
      num.network.threads=3
      num.io.threads=8
      socket.send.buffer.bytes=102400
      socket.receive.buffer.bytes=102400
      socket.request.max.bytes=104857600
      log.dirs=/tmp/kafka-test-20230828-15217-1lop1tk/kafka-logs
      num.partitions=1
      num.recovery.threads.per.data.dir=1
      offsets.topic.replication.factor=1
      transaction.state.log.replication.factor=1
      transaction.state.log.min.isr=1
      log.retention.hours=168
      log.retention.check.interval.ms=300000
      zookeeper.connect=localhost:34461
      zookeeper.connection.timeout.ms=18000
      group.initial.rebalance.delay.ms=0
      
      $ zookeeper-server-start kafka/zookeeper.properties >/dev/null 2>&1 &
      [1] 18593
      $ kafka-server-start kafka/server.properties >/dev/null 2>&1 &
      [2] 18982
      $ COLUMNS=10 kafka-server-stop # This is unexpected
      No kafka server to stop
      $ kafka-server-stop
      $ zookeeper-server-stop
      [2]+  Exit 143                kafka-server-start kafka/server.properties
      $ 
      [1]+  Exit 143                zookeeper-server-start kafka/zookeeper.properties 

      In the third command, I set the COLUMNS environment variable. This caused the kafka-server-stop script to fail to find the Kafka process.

      Cause

      The kafka-server-stop script uses ps ax to find the Kafka process.

      OSNAME=$(uname -s)
      if [[ "$OSNAME" == "OS/390" ]]; then
          (snip)
      elif [[ "$OSNAME" == "OS400" ]]; then
          (snip)
      else
          PIDS=$(ps ax | grep ' kafka\.Kafka ' | grep java | grep -v grep | awk '{print $1}')
      fi
      

      On Ubuntu, ps ax truncates its output if the COLUMNS environment variable is set, even when the output goes to a pipe.

      (The source code of the ps command shows that the COLUMNS environment variable takes precedence over the result of isatty.)

      $ ps ax | cat
        19912 pts/0    Sl     0:03 /home/linuxbrew/.linuxbrew/opt/openjdk/libexec/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=true -Xlog:gc*:file=/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../logs/kafkaServer-gc.log:time,tags:filecount=10,filesize=100M -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../logs -Dlog4j.configuration=file:/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../config/log4j.properties -cp /home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../libs/activation-1.1.1.jar:(snip):/home/linuxbrew/.linuxbrew/Cellar/kafka/3.5.1/libexec/bin/../libs/zstd-jni-1.5.5-1.jar kafka.Kafka kafka/server.properties
      $ COLUMNS=10 ps ax | cat
        19912 pts/0    Sl     0:05 /home/linux
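
      The effect of this truncation on the stop script's pattern can be reproduced without running ps at all. The following is a minimal sketch: the sample line, its java path, and the helper names matches_kafka and check_width are placeholders invented for illustration, and cut merely stands in for the truncation that ps performs.

```shell
#!/usr/bin/env bash
# Illustrative ps-style line; the java path and PID are placeholders.
line='  19912 pts/0    Sl     0:03 /path/to/java -Xmx1G kafka.Kafka kafka/server.properties'

# Same pattern the stop script greps for.
matches_kafka() {
    grep -q ' kafka\.Kafka '
}

# Report whether the sample line, cut to the given width, still matches.
check_width() {
    if printf '%s\n' "$line" | cut -c "1-$1" | matches_kafka; then
        echo "width $1: match"
    else
        echo "width $1: no match"
    fi
}

check_width 200   # whole line survives, so the class name is found
check_width 40    # line ends inside the java path, so the class name is lost
```

      Once the line is cut before the main class name, the grep for ' kafka\.Kafka ' has nothing to match, PIDS ends up empty, and the script reports that there is no server to stop.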
      

      I tested this on WSL2 on Windows with openjdk installed via Homebrew, but it should occur in any environment that uses procps-ng.
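
      One possible workaround, offered as a sketch rather than as the project's fix, is to run ps with COLUMNS removed from its environment so that an inherited value cannot shorten the output. The function names kafka_pids_from_ps_output and find_kafka_pids are hypothetical; the filter pipeline is the one quoted from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce ps-style output on stdin to Kafka broker PIDs,
# using the same grep/awk pipeline as the existing stop script.
kafka_pids_from_ps_output() {
    grep ' kafka\.Kafka ' | grep java | grep -v grep | awk '{print $1}'
}

# Hypothetical helper: run ps in a subshell where COLUMNS is unset, so a value
# inherited from the caller cannot truncate the command lines.
find_kafka_pids() {
    (unset COLUMNS; ps ax) | kafka_pids_from_ps_output
}
```

      With this in place, PIDS=$(find_kafka_pids) would behave the same whether or not the caller exported COLUMNS. The subshell keeps the unset local, so the caller's environment is left untouched.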

      Problem

      This caused a CI failure in the Homebrew project. (GitHub/Homebrew/homebrew-core#133887)

      Homebrew passing the COLUMNS environment variable through looks like a bug in itself. However, the server-stop script should not be affected by such an environment variable, so I consider this a bug in Kafka as well.

      Related issues

      This problem, KAFKA-4931, and KAFKA-4110 could all be fixed by introducing a process ID file. However, the three problems have different causes and can be considered separately.
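
      To make the process ID file idea concrete, here is a rough sketch of what it could look like. Everything in it is an assumption for illustration: Kafka's scripts do not currently define a PID file, and the names PIDFILE, KAFKA_PID_FILE, start_with_pidfile, and stop_from_pidfile are made up.

```shell
#!/usr/bin/env bash
# Hypothetical PID file location; not something Kafka's scripts define today.
PIDFILE="${KAFKA_PID_FILE:-/tmp/kafka-server.pid}"

# Start the given command in the background and record its PID.
start_with_pidfile() {
    "$@" &
    echo "$!" > "$PIDFILE"
}

# Send SIGTERM to the recorded process. Returns 1 if there is no PID file
# or the recorded process is no longer running.
stop_from_pidfile() {
    local pid
    if [ ! -r "$PIDFILE" ]; then
        echo "No kafka server to stop" >&2
        return 1
    fi
    pid=$(cat "$PIDFILE")
    if kill -0 "$pid" 2>/dev/null; then
        kill -s TERM "$pid"
        rm -f "$PIDFILE"
    else
        echo "No kafka server to stop" >&2
        rm -f "$PIDFILE"
        return 1
    fi
}
```

      Because the stop side reads the PID from a file instead of parsing ps output, the width of ps output no longer matters. A real implementation would also need to handle stale files whose PID has been reused by an unrelated process.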

          People

            Assignee: Unassigned
            Reporter: Takashi Sakai (tak_sakai)