Details
Description
[Client report]
We had the cache crash gracefully twice last night on a segfault. Both
times the callstack produced by trafficserver's signal handler was:
/usr/bin/traffic_server[0x529596]
/lib/libpthread.so.0(+0xef60)[0x2ab09a897f60]
[0x2ab09e7c0a10]
/usr/bin/traffic_server(HttpServerSession::do_io_close(int)+0xa8)[0x567a3c]
/usr/bin/traffic_server(HttpVCTable::cleanup_entry(HttpVCTableEntry*)+0x4c)[0x56aff6]
/usr/bin/traffic_server(HttpVCTable::cleanup_all()+0x64)[0x56b07a]
/usr/bin/traffic_server(HttpSM::kill_this()+0x120)[0x57c226]
/usr/bin/traffic_server(HttpSM::main_handler(int, void*)+0x208)[0x571b28]
/usr/bin/traffic_server(Continuation::handleEvent(int, void*)+0x69)[0x4e4623]
I went through the disassembly, and the instruction it is on in
::do_io_close is loading the value of diags (not dereferencing it), so it
is unlikely that this threw a segfault (unless diags is somehow in
thread-local storage and that is corrupt).
The kernel message claimed that the instruction pointer was 0x4e438e
which in this build is in ProxyMutexPtr::operator ->() on the
instruction that dereferences the object pointer to get the stored mutex
pointer (bingo!), so it would seem that at some point we are
dereferencing a null "safe" pointer.
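
To illustrate the kind of failure suspected here, below is a minimal sketch of a "safe" pointer whose operator->() returns the stored raw pointer without checking it. The names SafePtr, Mutex and try_lock are hypothetical stand-ins, not the actual Traffic Server types, and the sketch only covers the simple case where the stored pointer is null; it does not claim to reproduce the real code path.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a mutex object; not the Traffic Server type.
struct Mutex {
  int lock_count = 0;
  void lock() { ++lock_count; }
};

// Hypothetical simplified "safe" pointer. As with many smart pointers,
// operator-> just hands back the stored raw pointer, so using it on an
// empty holder dereferences null and crashes.
template <typename T>
class SafePtr {
public:
  SafePtr() = default;
  explicit SafePtr(T *p) : ptr_(p) {}

  T *operator->() const { return ptr_; } // no null check here
  T *get() const { return ptr_; }
  explicit operator bool() const { return ptr_ != nullptr; }

private:
  T *ptr_ = nullptr;
};

// Guarded access: report and return false instead of dereferencing an
// empty holder.
bool try_lock(const SafePtr<Mutex> &m) {
  if (!m) {
    std::fputs("null mutex pointer; skipping lock\n", stderr);
    return false;
  }
  m->lock();
  return true;
}
```

Calling try_lock on a default-constructed SafePtr<Mutex> returns false rather than faulting, whereas calling m->lock() directly on it would dereference null. A check of this kind at the call site (or an assertion inside operator->) would at least turn the segfault into a diagnosable error.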