CMake Compare Part 1: Creation

Abstract: Part 1 of a five-article series about using Understand to analyze how a CMake-based project has changed over time. This article focuses on getting the project to build successfully, both for the current state of the code and for the 4-year-old version of the code.

Understand’s Git comparison project feature lets you analyze historical code without checking out everything in Git again. Then you can visualize changes with several different views in Understand. Unfortunately, it can have issues with CMake projects. Understand gets the build instructions from the compile_commands.json file output by CMake, but that file usually isn’t tracked by Git. So, Understand doesn’t get the correct build instructions for the historical code.

With this known limitation, I’m going to find out exactly how much we can do with this feature and a CMake project.

Picking A CMake Project

The first step is to find a sample project to use. The sample project needs to build with CMake and have a Git history. Ideally, it will also be small enough to analyze quickly and have a short history to review. Luckily, one of the sample projects we already have for Understand meets all four goals: GitAhead.

I’m going to set up my initial Understand project with the current state of the code. The sample project shipped with Understand doesn’t include the Git information. So, the first step is to clone the source tree from GitHub. At the time of this article, the current commit is 711a693 (September 2021). Once I’ve checked it out, CMake should be run to generate the compile_commands.json file so Understand can get the build instructions. Finally, it’s worth actually building the project so automatically generated files get created.

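# Clone GitAhead and check out its submodules
$ git clone https://github.com/gitahead/gitahead.git .
$ git submodule update --init
# OpenSSL is built with its own Configure script and make, not by CMake
$ cd dep/openssl/openssl/
$ ./Configure darwin64-x86_64-cc no-shared
$ make
$ cd ../../../
# Generate compile_commands.json with CMake, then build
$ mkdir -p build/release
$ cd build/release/
$ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=1 -DCMAKE_PREFIX_PATH=/Users/natasha/Qt/5.15.2/clang_64/ ../..
$ ninja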

With all that done, the next step is to create an Understand project using the compile_commands.json file. But, there are still parse errors! The 13 errors come from the test directory. The fastest solution is to ignore the test directory.
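Before importing the file, it can help to see where its entries come from. The sketch below is only an illustration, not part of Understand: it counts compile_commands.json entries by top-level source directory, relying on the standard "directory" and "file" fields of the compilation database format. The function names, the source root, and the sample file names are all hypothetical.

```python
import json
from collections import Counter
from pathlib import PurePosixPath


def load_entries(path):
    """Read a compile_commands.json file into a list of dictionaries."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_by_top_dir(entries, source_root):
    """Count entries by the first directory below source_root."""
    root = PurePosixPath(source_root)
    counts = Counter()
    for entry in entries:
        # "file" may be relative to "directory"; joining with an absolute
        # path simply yields that absolute path.
        path = PurePosixPath(entry["directory"]) / entry["file"]
        try:
            rel = path.relative_to(root)
        except ValueError:
            counts["<outside source tree>"] += 1
            continue
        counts[rel.parts[0] if len(rel.parts) > 1 else "<root>"] += 1
    return counts


# Hypothetical entries; real ones would come from
# load_entries("build/release/compile_commands.json").
sample = [
    {"directory": "/src/gitahead/build/release", "command": "c++ -c ...",
     "file": "/src/gitahead/src/Main.cpp"},
    {"directory": "/src/gitahead/build/release", "command": "c++ -c ...",
     "file": "/src/gitahead/src/ui/Window.cpp"},
    {"directory": "/src/gitahead/build/release", "command": "c++ -c ...",
     "file": "/src/gitahead/test/Example.cpp"},
]
summary = count_by_top_dir(sample, "/src/gitahead")
for top_dir, count in sorted(summary.items()):
    print(top_dir, count)
```

A summary like this shows how much of the database belongs to a directory such as test before deciding to ignore it. Paths containing ".." are not normalized in this sketch.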

An initial CMake Understand Project of GitAhead will have 13 errors from the test directory. The test directory can be ignored.

Creating A Git Comparison Project

From the top-level Compare menu, the “Comparison Projects” option opens a panel to manage comparison projects. I want to add a new project. To see all the issues I might hit, I’ll go back to the oldest/first commit in the GitHub history: 53d6ddc (December 2018).

Creation of a Git comparison project from Understand.

The warning notifies me that I’m using an import file that’s not part of Git. I’ll create the project anyway since I want to know if it will work. After the project analyzes, I can see the analysis errors by hovering over the status. There are 2,150 errors and 97 warnings.

That’s a lot of errors. To view the details, I need to open the comparison project itself. Its file path was chosen when I created the comparison database, so I can open it just like any other Understand project. Then I’ll run Analyze All to see the errors.

An initial analysis of commit 53d6ddc using the compile_commands.json file of the current (711a693) commit.

A lot of the errors seem to be “unable to find file.” The first set is from the hunspell dependency. Looking at GitAhead, that makes sense because hunspell was added as a submodule in November 2020, after commit 53d6ddc (December 2018).

Fixing file not found errors from Git

Strangely, the next set of missing files is from libgit2. Libgit2 is a submodule, and Understand should find the correct versions of files in submodules. Other submodules like CMark and libssh2 worked. Why not libgit2?

Because Understand was launched from the command line, there is a hint. This error message was output many times:

Error -3 Repository::lookupCommit(id) - object not found - no match for id (1294cc47dce48b36b5145ecf7c4bd940f0b6f4f0)

The commit (1294cc4) isn’t the GitAhead commit (53d6ddc), so it is probably a submodule commit. I can verify it’s the libgit2 commit with this command:

$ git ls-tree 53d6ddc65cf5acee7d254ea991363dc4d1851f68 dep/libgit2/libgit2
160000 commit 1294cc47dce48b36b5145ecf7c4bd940f0b6f4f0	dep/libgit2/libgit2
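To make that output easier to read, here is a small sketch that splits a `git ls-tree` line into its parts. Git prints the mode, type, and object ID separated by spaces, then a tab, then the path. Mode 160000 with type "commit" marks a submodule entry (a gitlink). The helper names are my own, not part of Git or Understand.

```python
def parse_ls_tree_line(line):
    """Split one `git ls-tree` output line into (mode, type, object_id, path)."""
    # Metadata and path are separated by a single tab.
    meta, path = line.rstrip("\n").split("\t", 1)
    mode, obj_type, object_id = meta.split()
    return mode, obj_type, object_id, path


def is_submodule(mode, obj_type):
    """A submodule is stored as a gitlink: mode 160000 with type 'commit'."""
    return mode == "160000" and obj_type == "commit"


# The line printed by the command above.
line = "160000 commit 1294cc47dce48b36b5145ecf7c4bd940f0b6f4f0\tdep/libgit2/libgit2"
mode, obj_type, object_id, path = parse_ls_tree_line(line)

if is_submodule(mode, obj_type):
    print(f"{path} is pinned to submodule commit {object_id[:7]}")
```

The object ID here matches the one in the error message, which confirms that the missing object is the libgit2 commit recorded by GitAhead commit 53d6ddc.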

Why isn’t the commit in the repository? Viewing the commit on GitHub provides the answer:

A screenshot showing the libgit2 commit does not belong to any branch, explaining why it is not part of the local repository.

The libgit2 fork was created by rebasing history instead of merging it, so no branch references the commit that was current at that time point. This isn’t an error most users are likely to have. But, I need to fix it to reduce analysis errors.

The advice on GitHub to find a fork with the commit did not work. Checking out GitAhead commit 53d6ddc with GitAhead also did not work. However, using command-line Git did fix the problem:

$ git checkout 53d6ddc65cf5acee7d254ea991363dc4d1851f68
$ git submodule update
Submodule path 'dep/git/git': checked out '4d4165b80d6b91a255e2847583bd4df98b5d54e1'
From https://github.com/stinb/libgit2
 * branch                1294cc47dce48b36b5145ecf7c4bd940f0b6f4f0 -> FETCH_HEAD
Submodule path 'dep/libgit2/libgit2': checked out '1294cc47dce48b36b5145ecf7c4bd940f0b6f4f0'
Submodule path 'dep/libssh2/libssh2': checked out '54bef4c5dad868a9d45fdbfca9729b191c0abab5'
Submodule path 'dep/openssl/openssl': checked out '5707219a6aae8052cb98aa361d115be01b8fd894'

Fixing file not found errors from changed paths

With libgit2 fixed, I now have 1,935 analysis errors. I’ll check for other missing files first, since missing files might be causing other errors. Both the lua and scintilla libraries are missing. Looking at the paths gives a hint:

Analysis errors with lua and scintilla.

At the historical time point, lua is version 5.3.5 and scintilla is version 3.6.0. The current time point has versions 5.4.0 and 3.21.0 respectively. Can I fix these errors without running CMake for the historical time point? Maybe a find and replace on the version numbers in the compile_commands.json file will work?

The Find & Replace Dialog in Understand

It does help! The errors went from 1,935 to 676.
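The same replacement could also be scripted instead of done in the Find & Replace dialog. The sketch below is a minimal illustration under assumptions: it treats the versions as plain substrings such as "lua-5.4.0" (the real directory names in GitAhead may differ), and the function names and sample entry are hypothetical.

```python
import json


def _replace_all(text, replacements):
    """Apply every old -> new substring replacement to text."""
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


def rewrite_paths(entries, replacements):
    """Return a copy of the entries with path substrings replaced."""
    rewritten = []
    for entry in entries:
        new_entry = {}
        for key, value in entry.items():
            if isinstance(value, str):
                value = _replace_all(value, replacements)
            elif isinstance(value, list):
                # The "arguments" form stores the command as a list of strings.
                value = [_replace_all(item, replacements) for item in value]
            new_entry[key] = value
        rewritten.append(new_entry)
    return rewritten


# Assumed directory naming; map current versions back to the historical ones.
replacements = {"lua-5.4.0": "lua-5.3.5", "scintilla-3.21.0": "scintilla-3.6.0"}

# Hypothetical entry in the compile_commands.json format.
entries = [{
    "directory": "/src/build",
    "command": "cc -I/src/dep/lua/lua-5.4.0 -c /src/dep/lua/lua-5.4.0/lapi.c",
    "file": "/src/dep/lua/lua-5.4.0/lapi.c",
}]
result = rewrite_paths(entries, replacements)
print(json.dumps(result, indent=2))
```

Writing the result to a separate file, rather than over the original, keeps the compile_commands.json for the current commit intact.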

Other Analysis Errors

Are there any fast fixes for my remaining 676 errors?

I’m still missing some files, even in libraries that are found. Those files were probably added after December 2018. Conversely, I’m probably also missing files that existed in December 2018 but don’t exist in the present.
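One way to check both guesses is to compare two file lists: the files named in compile_commands.json and the files that exist at the historical commit. The sketch below assumes both lists are already available as paths relative to the repository root. The second list could come from `git ls-tree -r --name-only <commit>`, which does not descend into submodules. The function name, the suffix filter, and the sample paths are hypothetical.

```python
def compare_file_sets(database_files, commit_files, suffixes=(".c", ".cpp")):
    """Compare source files in the database with those in a commit."""
    database = {f for f in database_files if f.endswith(suffixes)}
    commit = {f for f in commit_files if f.endswith(suffixes)}
    return {
        # Listed in compile_commands.json but absent from the commit:
        # probably added after the historical time point.
        "missing_at_commit": sorted(database - commit),
        # Present in the commit but never compiled by the current build:
        # probably removed or renamed since then.
        "not_in_database": sorted(commit - database),
    }


# Hypothetical file lists.
database_files = ["src/a.cpp", "src/new.cpp"]
commit_files = ["src/a.cpp", "src/old.cpp", "README.md"]

report = compare_file_sets(database_files, commit_files)
print(report)
```

The first list explains "unable to find file" errors. The second list is the quieter problem, since those files produce no error at all and are simply left out of the analysis.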

Other errors come from the automatically generated files. These errors are expected since the files are for the present time point and not the historical one. The question is whether these errors matter. In my case, I don’t care about automatically generated files, so I’ll exclude the build directory from the analysis. This reduces the errors to 473.

The remaining errors don’t look like they have fast fixes. At this point, it would probably be faster to check out the historical time point and run CMake to get accurate build instructions.

Analysis with an up-to-date compile_commands.json file

After checking out the commit, the first problem is that cmake and ninja can’t run because OpenSSL is a different version. OpenSSL is built with make rather than by CMake, so it has to be rebuilt separately. The second problem is less fixable. After CMake ran successfully, the ninja command failed. The version of Qt on my computer is now 5.15, which has deprecated some of the functions that GitAhead was using in 2018.

So, I have an accurate compile_commands.json file, but the automatically generated files are not correct. Also, since I can’t build the project, Understand will have parse errors. How many? 66.

They fall into two broad categories. First, QPainterPath problems that are likely related to the Qt version change. Second, a variety of Qt errors in 2 automatically generated files. Since the build directory is excluded, those generated files must have been included directly by files that were parsed.

A breakdown of the 66 analysis errors using a compile_commands.json file generated for the commit.

With 2 (automatically generated) files failing to parse accurately and 13 source files referencing a single missing class, I probably have an accurate enough parse to answer most change-related questions.

With a comparison database, Understand can locate changed entities and show diffs.

Now that we have the starting and ending states figured out, in Part 2 I’ll look at using batch commands to fill in the timeline between the two by automatically creating hundreds of Understand comparison projects.