Code Analysis and Refactoring with Clang Tools
CoARCT (pronounced like the word "corked") is a small set of tools built on Clang's LibTooling. CoARCT demonstrates some more sustained examples of refactoring and analyzing code with AST Matchers and the clang Refactoring Tool. The goal is not to supply tools, but rather to show how one can create one's own tools.
It includes library code and command line drivers that go beyond some of the (excellent! but short) tutorials that are available. The CoARCT examples are drawn from refactoring legacy codes:
- Reporting which functions use which global variables;
- Replacing global variables with local variables, including threading variables through a call chain;
- Detecting which functions use which fields of a struct: this data can be used to analyze how to break up large structs;
- Finding code associated with a classic C-style linked list;
- Identifying struct fields defined with typedefs, reporting underlying types (apps/TypedefFinder.cc);
- Identifying typedefs;
- Identifying uses of a class template, such as std::vector.
It also demonstrates a few useful things that were not immediately clear from the tutorials and examples I learned from, such as unit testing matchers and callbacks, and building out of the Clang/LLVM tree.
Our hope is that CoARCT will help demystify the Clang AST tools to developers. If the CoARCT tools are directly useful in your work, let us know!
- CMake version 3+ (https://cmake.org/download/)
- Clang and LLVM 11.0 libraries and headers (http://releases.llvm.org/download.html)
- libtinfo
- Boost (currently using 1.61, just needs boost/type_index in one spot)
- Google test (currently using 1.8.0 https://github.com/google/googletest)
The default branch targets Clang 11.0 (older branches: 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.9, 3.8). These branches track changes in the Clang APIs, so be sure to match the branch against the version of Clang you are using.
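To check which branch matches an installed compiler, something like the following sketch may help. The helper names clang_major_minor and has_coarct_branch are hypothetical (they are not part of CoARCT), and the parsing assumes the banner printed by clang++ --version contains the text "clang version X.Y.Z". Vendor builds may number their releases differently from LLVM, so treat the result as a hint.

```shell
#!/bin/sh
# Versions that have a CoARCT branch, copied from the list above.
SUPPORTED_VERSIONS="11.0 9.0 8.0 7.0 6.0 5.0 4.0 3.9 3.8"

# Hypothetical helper: print "major.minor" extracted from a
# "clang++ --version" banner passed as $1.
# Assumes the banner contains "clang version X.Y.Z".
clang_major_minor() {
  printf '%s\n' "$1" |
    sed -n 's/.*clang version \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' |
    head -n 1
}

# Hypothetical helper: succeed if version $1 is in SUPPORTED_VERSIONS.
has_coarct_branch() {
  for v in $SUPPORTED_VERSIONS; do
    [ "$v" = "$1" ] && return 0
  done
  return 1
}

# Usage:
#   ver=$(clang_major_minor "$(clang++ --version)")
#   has_coarct_branch "$ver" || echo "No CoARCT branch for Clang $ver" >&2
```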
- Make sure clang++ is in your path
- Define these environment variables
  CXX: Your clang++ version 11.0.x
  GTEST_DIR: Top level directory of google test installation
  BOOST_DIR: Top level of Boost (#include "boost/type_index.hpp" needs to work)
  TINFO_LIB_DIR: points to where libtinfo.a is installed.
- Clone the repository
- Create a build directory
  /home/CoARCT $ mkdir build-clang-11.0.0
  /home/CoARCT $ cd build-clang-11.0.0
- Run cmake, make
  /home/CoARCT/build-clang-11.0.0 $ cmake ..
  /home/CoARCT/build-clang-11.0.0 $ make
- Run the unit tests
  /home/CoARCT/build-clang-11.0.0 $ ./test/corct-unittests
  ...
  [==========] 63 tests from 16 test cases ran. (438 ms total)
  [ PASSED ] 63 tests.
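The steps above can be wrapped in a small script that refuses to configure until the four environment variables are set. This is only a sketch: missing_vars and configure_and_build are hypothetical helper names, not part of CoARCT, and the script assumes it is run from the top of an existing CoARCT checkout.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each listed variable that is
# unset or empty, one per line.
missing_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Hypothetical helper: configure, build, and run the unit tests.
# Assumes the current directory is the top of a CoARCT clone.
configure_and_build() {
  missing=$(missing_vars CXX GTEST_DIR BOOST_DIR TINFO_LIB_DIR)
  if [ -n "$missing" ]; then
    echo "Set these environment variables first:" $missing >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory is unchanged.
  (
    mkdir -p build-clang-11.0.0 &&
      cd build-clang-11.0.0 &&
      cmake .. &&
      make &&
      ./test/corct-unittests
  )
}

# Usage, from the top of the checkout:
#   configure_and_build
```

Running the build in a subshell keeps the cd from leaking into an interactive session when the functions are sourced rather than executed.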
Tracking a few changes to the LLVM/Clang APIs:
- NamedDecl::isHidden() to !NamedDecl::isUnconditionallyVisible()
- Confusion with llvm::StringRef vs std::string
- Needed include file for llvm::Module
(Still) None!
None! (This seems too easy...)
Renamed getLoc(Start|End) to get(Start|End)Loc, following Clang API change. Version bumps for Docker/Travis.
None! (Hmmm.)
Added logic to match desugared types in template variable matcher. This doesn't affect anything in Clang 5, and it keeps the Clang 6 behavior the same as before (i.e. it still "sees through" type aliases).
Minor tweaks. Hopefully CMake configuration is improved. Also added ability to configure compiler instances in unit tests; this should permit more complex test inputs.
These issues did not arise this time with my standard build on a Mac. But I'll mention them again, in case they come up.
Building CoARCT failed on OSX with pre-built binaries from llvm.org: the function futimens was undefined. Workaround: build Clang and LLVM from source as described at http://clang.llvm.org/get_started.html.
Building CoARCT on Linux failed with errors about no member is_final in namespace std. Diagnosis: That installation of Clang seems to be finding headers with an older GCC (4.8.5). Workarounds:
- Add --gcc-toolchain=/path/to/newer/gcc to the CXXFLAGS environment variable (setting the variable if it is not already defined) when running CMake.
- The above solution did not work on one system. In that case, overriding cxx-isystem was necessary. Pass:
  -cxx-isystem /path/to/newer/gcc/include/c++/<version> -cxx-isystem /path/to/newer/gcc/include/c++/<version>/x86_64-pc-linux-gnu
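For the first workaround, "set or append" can be handled with a small helper so that existing flags are preserved. This is a sketch only: append_flag is a hypothetical helper name, and the GCC path is the same placeholder used in the text above.

```shell
#!/bin/sh
# Hypothetical helper: print $1 with flag $2 appended. A single space
# separates them when $1 is non-empty; otherwise only $2 is printed.
append_flag() {
  if [ -n "$1" ]; then
    printf '%s %s\n' "$1" "$2"
  else
    printf '%s\n' "$2"
  fi
}

# Usage, from the build directory (replace the placeholder path):
#   CXXFLAGS=$(append_flag "${CXXFLAGS:-}" "--gcc-toolchain=/path/to/newer/gcc")
#   export CXXFLAGS
#   cmake ..
```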
Los Alamos National Security, LLC (LANS) owns the copyright to CoARCT, which it identifies internally as LA-CC-17-039. See the LICENSE file for license information.