Uploaded image for project: 'Thrift'
  1. Thrift
  2. THRIFT-5682

UB in generated C++ code, stops compiling with C++20

    XMLWordPrintableJSON

Details

    • Patch Available
    • Patch

    Description

      The C++ compiler of Thrift 0.17.0 produces C++ code with undefined behavior, which stops compiling on Clang 15 in C++20 mode.

      For a generated class, both the default constructor and operator== implementations reference certain member functions of the class' members. As an example, the default constructor references (i.e., "uses") the default constructors of its members.

      If a class contains a std::vector<Foo>, and Foo has only been forward declared (which happens often in Thrift-generated code), this creates undefined behavior: The std::vector specification states that as long as Foo is an incomplete type, it is fine to reference std::vector<Foo>, but not any members (such as its default constructor).

      Thus, we must defer our default constructor's implementation (which references the default constructor of std::vector) to a point where Foo is a complete type. That is the case in the .cpp file.

      The same holds for operator==.Example

      Take this very simple Thrift file:

      struct StructA {
        1:required list<StructB> myBs
      }

      struct StructB
      {
        1:required string someString
      }

      If I compile this using thrift --gen cpp:no_skeleton -o out ./file.thrift I get a header file that contains the following:

      class StructA;
      class StructB;

      class StructA : …

      public:
        …
        StructA() noexcept {}
        …
        std::vector<StructB> myBs;
        …
      };

      class StructB : …

      { … }

      ;

       

      In this case, the default constructor for StructA references the default constructor of std::vector<StructB> while StructB is still an incomplete type. This is undefined behavior. It did apparently compile with all big compilers until recently, but with C++20, Clang 15 stops accepting this kind of construct, as you can see here at CompilerExplorer: https://godbolt.org/z/xcc4av6cb
       

      Proposed Solution

      I propose to move the definitions of the default constructor and operator== from the foobar_types.h into the foobar_types.cpp file. That way, when the definition is seen (and therefore the methods of std::vector<Foo> referenced), Foo already is a complete type and everything works out.

      The only alternative Solution (and which only works if Thrift does not allow circular dependencies - does it?) that I can see would be to compute a DAG of dependencies between the structures and then output them in topological order. That would not be a minor change.

      I guess that the reason for the definitions being in the header file is that this allows better inlining of these two methods. Nowadays, most compilers/linkers perform link-time optimization, which allows for inlining to happen at link time, so my feeling is that moving the definitions into their own translation unit inside the CPP file should not cause significant performance loss.

       

      I have created a Pull Request for my proposed solution here: https://github.com/apache/thrift/pull/2755

       

      Attachments

        Issue Links

          Activity

            People

              tinloaf Lukas Barth
              tinloaf Lukas Barth
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 4h 10m
                  4h 10m