Uploaded image for project: 'Thrift'
  1. Thrift
  2. THRIFT-4805

Suppress excessive logging of SASL TTransportExceptions in case of END_OF_FILE

VotersWatch issueWatchersLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      This has to do with the fix checked in for THRIFT-3769 and THRIFT-2268, which was to mute the noise fromĀ TSaslTransportExceptions raised from load-balancer health-checks for Thrift services, such as the Hive metastore. Please consider the code in TThreadPoolServer::run():

      TThreadPoolServer.java
            } catch (TException tx) {
              LOGGER.error("Thrift error occurred during processing of message.", tx);
            } catch (Exception x) {
              // We'll usually receive RuntimeException types here
              // Need to unwrap to ascertain real causing exception before we choose to ignore
              Throwable realCause = x.getCause();
              // Ignore err-logging all transport-level/type exceptions
              if ((realCause != null && realCause instanceof TTransportException)
                  || (x instanceof TTransportException)) {
                LOGGER.debug(
                    "Received TTransportException during processing of message. Ignoring.",
                    x);
              } else {
                // Log the exception at error level and continue
                LOGGER.error("Error occurred during processing of message.", x);
              }
            } finally {...}
      

      The logic here is solid for RuntimeExceptions that wrap TTansportExceptions. But it slips up on handling the condition being checked for at line#323, i.e.:

      TThreadPoolServer.java
                  || (x instanceof TTransportException)) {
      

      The catch (Exception x) comes after the catch (TException tx), so it's a guarantee that !(x instanceof TTransportException). When a TTransportException (or TSaslTransportException) is thrown, it will be caught in the first catch block, and logged. This rather defeats the purpose of the fix. The error manifests with the following stack-trace filling up my logs:

      org.apache.thrift.transport.TTransportException
              at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
              at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
              at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
              at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
              at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
              at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
              at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
              at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
              at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
              at org.apache.hive.service.cli.thrift.ThriftCLIMetricsProcessor.process(ThriftCLIMetricsProcessor.java:63)
              at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$2.run(HadoopThriftAuthBridge.java:777)
              at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$2.run(HadoopThriftAuthBridge.java:773)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
              at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:773)
              at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:748)
      

      The fix is simple. I'll upload it shortly, for review.

      Attachments

        1. THRIFT-4805.bonus.patch
          3 kB
          Mithun Radhakrishnan
        2. THRIFT-4805.1.patch
          2 kB
          Mithun Radhakrishnan

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            mithun Mithun Radhakrishnan
            mithun Mithun Radhakrishnan
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 40m
                40m

                Slack

                  Issue deployment