Users sometimes ask how new diagnostics appear in the PVS-Studio static analyzer. We answer that we draw inspiration from a variety of sources: books, coding standards, our own mistakes, our users' emails, and others. Recently we came up with an interesting idea of a new diagnostic. Today we decided to tell the story of how it happened.
It all started with a review of the COVID-19 CovidSim Model project and an article about an uninitialized variable. The project turned out to be small and written using the modern C++ language standard. This means it can perfectly add to the base of test projects for regression testing of the PVS-Studio analyzer core.
Before supplementing the base, we find it useful to look through warnings to search out patterns of false positives and highlight them to improve the analyzer in future. This is also an additional opportunity to notice that something else is wrong. For example, a message fails to describe an error for a particular code construct.
Luckily, the developer who was assigned to add the project to the test base approached the task thoroughly and decided to look into the MISRA diagnostics section. This wasn't an indispensable step. MISRA diagnostics are generally specific. They can be safely disabled for such projects, as CovidSim.
MISRA C and MISRA C++ diagnostics are intended for developers of embedded systems, and their point is to limit the use of unsafe programming constructs. For example, it is not recommended to use the goto operator (V2502), since it provokes the creation of complex code, where it is easy to make a logical error. Read more about the philosophy of the MISRA coding standard in the article "What Is MISRA and how to Cook It".
As for application software development, it doesn't make sense to enable them. The CovidSim project could do without them. Otherwise, a user will simply drown in a huge number of messages that are of little use in this case. For example, when experimenting with this set of diagnostics, we received more than a million warnings for some medium-sized open projects. Roughly speaking, every third line of code might be faulty in the view of MISRA. No one will scrape through all warnings, much less fix them. The project is either developed immediately taking into account MISRA recommendations, or this coding standard is irrelevant for it.
Anyway, let's get back to the topic. So, while skimming through the MISRA warnings, a colleague caught a glimpse of the V2507 warning issued for this code snippet.
if (radiusSquared > StateT[tn].maxRad2) StateT[tn].maxRad2 = radiusSquared;
{
SusceptibleToLatent(a->pcell);
if (a->listpos < Cells[a->pcell].S)
{
UpdateCell(Cells[a->pcell].susceptible, a->listpos, Cells[a->pcell].S);
a->listpos = Cells[a->pcell].S;
Cells[a->pcell].latent[0] = ai;
}
}
StateT[tn].cumI_keyworker[a->keyworker]++;
The V2507 rule forces us to wrap the bodies of conditional statements in curly braces.
At first, our meticulous colleague thought that the analyzer had failed. After all, there is a block of text in curly braces! Is this a false positive?
Let's take a closer look. The code only seems to be correct, but it is not! The curly braces are not attached to the if statement.
Let's tweak the code for clarity:
if (radiusSquared > StateT[tn].maxRad2)
StateT[tn].maxRad2 = radiusSquared;
{
SusceptibleToLatent(a->pcell);
....
}
Agree, this is a nice bug. It will surely be one of the Top10 C++ bugs we found in 2021.
What follows from this? The MISRA standard approach works! Yes, it forces you to write curly braces everywhere. Yes, it's tedious. Though this is a reasonable price to pay for improving the reliability of embedded applications used in medical devices, automobiles, and other high-responsibility systems.
I'm glad developers who use the MISRA standard are doing fine. However, recommending that everyone use curly braces is a bad idea. With this approach it is very easy to bring the analyzer to the state where it becomes impossible to use it. There will be so many warnings that no one will care about them.
Finally we got to the idea of a new General Analysis diagnostic and the following rule.
The analyzer will issue a warning in case the following conditions are met for the if statement:
- the entire conditional if statement is written in one line and has only a then branch;
- the next statement after if is a compound statement, and it is on different lines with if.
We look forward to getting a decent rule that gives few false positives.
This is how this idea is now described in our task tracker. Perhaps something will be done differently in the implementation process, but it doesn't really matter at this point. The main thing is that a decent diagnostic rule will appear, which will begin to identify a new error pattern. Next, we will extend it to the C# and Java cores of the PVS-Studio analyzer.
We just looked at the unique example of how a new diagnostic rule came up, which we will implement in PVS-Studio. Kudos to the CovidSim project, the MISRA coding standard, and our colleague's observation skills.
Thank you for your attention and follow me into the world of C++ and bugs :). Twitter. Facebook.
Additional links: