0832_foreach_ConditionalAccess/image1.png
Do you like the '?.' operator? Well, who doesn't? Many people like these concise null checks. However, today's article shows that the '?.' operator may be tricky sometimes. That is, it can create an illusion of safety when used in the foreach loop.


Let's start with a small task. Take a look at the following code:


void ForeachTest(IEnumerable<String> collection)
{
  // #1
  foreach (var item in collection.NotNullItems())
    Console.WriteLine(item);

  // #2
  foreach (var item in collection?.NotNullItems())
    Console.WriteLine(item);
}

Suppose the collection is null. Have you got any ideas on how each of the loops will run? Case #2 with ?. seems to be safer. But is that really so? The article's title should have already planted a seed of doubt in your mind.


In any case, we'll try to figure this out below. We'll return to this task at the end of the article when we have more information.


Note. The C# Specification uses the term "expression" to denote the following entity. In this article we use the term "enumerable expression". This may help to avoid confusion when we talk about different expressions.


0832_foreach_ConditionalAccess/image2.png


Why is it dangerous to use the '?.' operator in the foreach loop's enumerable expression?


First, let's recall what the ?. operator is.


It won't take long.


var b = a?.Foo();

So:


  • if a == null,b == null;
  • if a != null, b == a.Foo().

Now let's take a look at the foreach loop.


void Foo1(IEnumerable<String> collection)
{
  foreach (var item in collection)
    Console.WriteLine(item);
}

IL code suggests that you can rewrite the above code fragment in C# without foreach. It would look something like this:


void Foo2(IEnumerable<String> collection)
{
  var enumerator = collection.GetEnumerator();
  try
  {
    while (enumerator.MoveNext())
    {
      var item = enumerator.Current;
      Console.WriteLine(item);
    }
  }
  finally
  {
    if (enumerator != null)
    {
      enumerator.Dispose();
    }
  }
}

Note. In some cases, foreach loop's IL code may become identical to the code for the for loop. However, the problem still persists. I think we'll have another article about the possible optimizations of the foreach loop.


The collection.GetEnumerator() is the key element here. In black and white (although it depends on your color scheme), the code says that when the GetEnumerator method is called, the reference is dereferenced. If this reference is null, we get NullReferenceException.


Now let's take a look at what happens in the foreach loop's enumerable expression with the ?. operator:


static void Foo3(Wrapper wrapper)
{
  foreach (var item in wrapper?.Strings)
    Console.WriteLine(item);
}

We may rewrite this code as follows:


static void Foo4(Wrapper wrapper)
{
  IEnumerable<String> strings;
  if (wrapper == null)
  {
    strings = null;
  }
  else
  {
    strings = wrapper.Strings;
  }

  var enumerator = strings.GetEnumerator();
  try
  {
    while (enumerator.MoveNext())
    {
      var item = enumerator.Current;
      Console.WriteLine(item);
    }
  }
  finally
  {
    if (enumerator != null)
    {
      enumerator.Dispose();
    }
  }
}

As in the previous case, the GetEnumerator (strings.GetEnumerator) call occurs. However, note that the strings value can be null if wrapper is null. Well, that's to be expected with the ?. operator (we discussed it earlier). In this case, when trying to call the string.GetEnumerator() method, we get a NullReferenceException.


That's why the ?. operator in the foreach loop's enumerable expression does not protect against null dereference. It only creates an illusion of safety.


What prompted us to improve the analyzer?


Once my colleague came to me and said — here's the code, we can't find the error. I was surprised. I recall exactly how I offered to work on the case that involved the foreach loop's enumerable expression having the null value. Checked it out. Indeed, the analyzer was not issuing warnings on the code below.


void Test1(IEnumerable<String> collection, 
          Func<String, bool> predicate)
{
  foreach (var item in collection?.Where(predicate))
    Console.WriteLine(item);
}

The same was with this code.


void Test2(IEnumerable<String> collection, 
          Func<String, bool> predicate)
{
  var query = collection?.Where(predicate);
  foreach (var item in query)
    Console.WriteLine(item);
}

However, the analyzer issued a warning on the following code fragment.


void Test3(IEnumerable<String> collection, 
          Func<String, bool> predicate,
          bool flag)
{
  var query = collection != null ? collection.Where(predicate) : null;
  foreach (var item in query)
    Console.WriteLine(item);
}

PVS-Studio warning: V3080 Possible null dereference. Consider inspecting 'query'.


The analyzer would also issue a warning on the following code.


IEnumerable<String> GetPotentialNull(IEnumerable<String> collection,
                                     Func<String, bool> predicate,
                                     bool flag)
{
  return collection != null ? collection.Where(predicate) : null;
}

void Test4(IEnumerable<String> collection, 
          Func<String, bool> predicate,
          bool flag)
{
  foreach (var item in GetPotentialNull(collection, predicate, flag))
    Console.WriteLine(item);
}

PVS-Studio warning: V3080 Possible null dereference of method return value. Consider inspecting: GetPotentialNull(...).


Why did the analyzer issue warnings for Test3 and Test4, but not for Test1 and Test2? The point is that the analyzer sees these cases as different:


  • the analyzer did not issue a warning if a variable received the ?. operator's result;
  • an expression may have the null value. For example, if a variable directly received null or if a method returned null. In this case, the analyzer issued a warning.

This differentiation helps the analyzer to handle each situation thoroughly. So, as a result, the analyzer:


  • issues a more accurate warning;
  • has the ability to handle these cases separately (to raise / lower the warning level, to suppress / not suppress, etc.);
  • has documentation for each case.

What diagnostics we refined


As a result, we improved 2 diagnostic rules: V3105 and V3153.


V3105 now detects suspicious code fragments when a variable contains the result of the ?. operator. Then, the enumerable expression foreach uses this variable.


void Test(IEnumerable<String> collection, 
          Func<String, bool> predicate)
{
  var query = collection?.Where(predicate);
  foreach (var item in query)
    Console.WriteLine(item);
}

PVS-Studio warning: V3105 The 'query' variable was used after it was assigned through null-conditional operator. NullReferenceException is possible.


V3153 now detects cases where the foreach loop's enumerable expression directly uses the ?. operator.


void Test(IEnumerable<String> collection, 
          Func<String, bool> predicate)
{
  foreach (var item in collection?.Where(predicate))
    Console.WriteLine(item);
}

PVS-Studio warning: V3153 Enumerating the result of null-conditional access operator can lead to NullReferenceException. Consider inspecting: collection?.Where(predicate).


The improved analyzer detects more issues


It's a great feeling to see the analyzer works better! As I've already said, we regularly test the analyzer on open-source projects. So, after we improved V3105 and V3153, we managed to find some new triggerings!


Note. This code was up to date when we added the projects to our tests. By now the code could have changed and may not contain these code fragments.


RavenDB


private void HandleInternalReplication(DatabaseRecord newRecord, 
                                       List<IDisposable> instancesToDispose)
{
  var newInternalDestinations =
        newRecord.Topology?.GetDestinations(_server.NodeTag,
                                            Database.Name,
                                            newRecord.DeletionInProgress,
                                            _clusterTopology,
                                            _server.Engine.CurrentState);
  var internalConnections 
        = DatabaseTopology.FindChanges(_internalDestinations, 
                                       newInternalDestinations);

  if (internalConnections.RemovedDestiantions.Count > 0)
  {
    var removed = internalConnections.RemovedDestiantions
                                     .Select(r => new InternalReplication
      {
        NodeTag = _clusterTopology.TryGetNodeTagByUrl(r).NodeTag,
        Url = r,
        Database = Database.Name
      });

    DropOutgoingConnections(removed, instancesToDispose);
  }
  if (internalConnections.AddedDestinations.Count > 0)
  {
    var added = internalConnections.AddedDestinations
                                   .Select(r => new InternalReplication
    {
      NodeTag = _clusterTopology.TryGetNodeTagByUrl(r).NodeTag,
      Url = r,
      Database = Database.Name
    });
    StartOutgoingConnections(added.ToList());
  }
  _internalDestinations.Clear();
  foreach (var item in newInternalDestinations)
  {
    _internalDestinations.Add(item);
  }
}

I intentionally listed the entire code fragment. You will probably agree this issue is not very obvious. And of course, it's easier to find something if you know what you're looking for. ;)


If you simplify the code, the issue becomes more obvious.


private void HandleInternalReplication(DatabaseRecord newRecord, 
                                       List<IDisposable> instancesToDispose)
{
  var newInternalDestinations = newRecord.Topology?.GetDestinations(....);
  ....
  foreach (var item in newInternalDestinations)
    ....
}

The newInternalDestinations variable takes the ?. operator's result. If newRecord.Topology is null, newInternalDestinations will also be null. When the execution flow reaches the foreach loop, the NullReferenceException exception will be thrown.


PVS-Studio warning: V3105 The 'newInternalDestinations' variable was used after it was assigned through null-conditional operator. NullReferenceException is possible. ReplicationLoader.cs 828


What's more interesting, the DatabaseTopology.FindChanges method takes the newInternalDestinations variable as the newDestinations parameter and checks it for null.


internal static 
(HashSet<string> AddedDestinations, HashSet<string> RemovedDestiantions)
FindChanges(IEnumerable<ReplicationNode> oldDestinations, 
            List<ReplicationNode> newDestinations)
{
  ....
  if (newDestinations != null)
  {
    newList.AddRange(newDestinations.Select(s => s.Url));
  }
  ....
}

MSBuild


public void LogTelemetry(string eventName, 
                         IDictionary<string, string> properties)
{
  string message 
           = $"Received telemetry event '{eventName}'{Environment.NewLine}";

  foreach (string key in properties?.Keys)
  {
    message += $"  Property '{key}' = '{properties[key]}'{Environment.NewLine}";
  }
  ....
}

PVS-Studio warning: V3153 Enumerating the result of null-conditional access operator can lead to NullReferenceException. Consider inspecting: properties?.Keys. MockEngine.cs 159


Here the foreach directly contains the ?. operator. Perhaps the developer thought the ?. operator would protect from NullReferenceException. But we know that it's not safer. ;)


Nethermind


This example is similar to the previous one.


public NLogLogger(....)
{
  ....

  foreach (FileTarget target in global::NLog.LogManager
                                            .Configuration
                                           ?.AllTargets
                                            .OfType<FileTarget>())
  {
    ....
  }
  ....
}

PVS-Studio warning: V3153 Enumerating the result of null-conditional access operator can lead to NullReferenceException. NLogLogger.cs 50


Also, the developers used the ?. operator directly in the foreach loop's enumerable expression to avoid NullReferenceException. Maybe they'll get lucky, and the Configuration property will never return null. Otherwise, some time later this code may play a trick on you.


Roslyn


private ImmutableArray<char>
GetExcludedCommitCharacters(ImmutableArray<RoslynCompletionItem> roslynItems)
{
  var hashSet = new HashSet<char>();
  foreach (var roslynItem in roslynItems)
  {
    foreach (var rule in roslynItem.Rules?.FilterCharacterRules)
    {
      if (rule.Kind == CharacterSetModificationKind.Add)
      {
        foreach (var c in rule.Characters)
        {
          hashSet.Add(c);
        }
      }
    }
  }

  return hashSet.ToImmutableArray();
}

PVS-Studio warning: V3153 Enumerating the result of null-conditional access operator can lead to NullReferenceException. CompletionSource.cs 482


That's great, isn't it? I love it when PVS-Studio finds interesting cases in compilers or other analyzers.


PVS-Studio


And now it's time to admit that we are not perfect either. We've made the same mistakes. :)


0832_foreach_ConditionalAccess/image3.png


We regularly check PVS-Studio with PVS-Studio. This is how it works:


  • at night, we build a new version of the analyzer distribution. It includes changes we committed to the main branch during the day;
  • this new version checks various projects, including PVS-Studio itself;
  • the BlameNotifier utility notifies developers and managers about the warnings the analyzer issued;
  • then, we fix the warnings found.

And so, after we've improved V3153 and V3105, the analyzer issued several warnings on our code. Indeed, the analyzer detected cases when the foreach loop's enumerable expression contained the ?. operator. Also, we found indirect cases (when a variable takes a value). We were lucky that we hadn't been getting an exception. In any case, we've already taken the warnings into account and fixed the corresponding cases. ;)


Here's a code fragment that triggered a warning:


public override void
VisitAnonymousObjectCreationExpression(
  AnonymousObjectCreationExpressionSyntax node)
{
  foreach (var initializer in node?.Initializers)
    initializer?.Expression?.Accept(this);
}

Yeah, there's a bunch of ?. here. Try to find the one that will shoot you in the foot. It seems like ?. operators provide maximum safety (use the Crysis nanosuit voice effect while reading) for your code, but in fact, that's not true.


Is it possible to use the '?.' operator in the enumerable expression without exceptions?


Of course, you can do that. And we've seen such code examples. For example, the ?? operator can come to the rescue.


The following code is dangerous and may lead to NullReferenceException:


static void Test(IEnumerable<String> collection,
                 Func<String, bool> predicate)
{
  foreach (var item in collection?.Where(predicate))
    Console.WriteLine(item);
}

And this code will not throw an exception:


static void Test(IEnumerable<String> collection,
                 Func<String, bool> predicate)
{
  foreach (var item in    collection?.Where(predicate) 
                       ?? Enumerable.Empty<String>())
  {
    Console.WriteLine(item);
  }
}

While the ?. operator returns a null value, the ?? operator results in Enumerable.Empty<String>(). Therefore, there will be no exception. However, adding an explicit null check instead could be a good idea.


static void Test(IEnumerable<String> collection,
                 Func<String, bool> predicate)
{
  if (collection != null)
  {
    foreach (var item in collection.Where(predicate))
      Console.WriteLine(item);
  }
}

Obviously, it looks not so modern but clear and easy to read.


Let's solve the task discussed in the beginning


As you may remember, we started the article with the following task:


void ForeachTest(IEnumerable<String> collection)
{
  // #1
  foreach (var item in collection.NotNullItems())
    Console.WriteLine(item);

  // #2
  foreach (var item in collection?.NotNullItems())
    Console.WriteLine(item);
}

Now you know that option #2 is not safe at all. It won't help you avoid NullReferenceException. And what about option #1? At first glance, it seems that we'll have NullReferenceException when calling collection.NotNullItems(). But that's not necessarily true! Suppose NotNullItems is an extension method with the following body:


public static IEnumerable<T>
NotNullItems<T>(this IEnumerable<T> collection) where T : class
{
  if (collection == null)
    return Enumerable.Empty<T>();

  return collection.Where(item => item != null);
}

As we can see, the method checks collection for null. Since in this case the method returns the Enumerable.Empty<T>() value, there will be no exception. That is, loop #1 works successfully, even if collection is null.


But the second loop remains dangerous. If collection is null, the NotNullItems method is not called. Therefore, check for null does not work. As a result, we have the same situation which we kept seeing time and again — an attempt to call the GetEnumerator() method for a null reference.


That's one interesting case we have! Calling the collection.NotNullItems() method explicitly prevents NullReferenceException, but a "safe" call — collection?.NotNullItems() — does not.


Conclusion


We have several conclusions here:


  • do not use the ?. operator in the foreach loop's enumerable expression directly or indirectly. It only creates an illusion of safety;
  • use a static analyzer regularly.

We, as developers once again realized that it's important not only to develop new diagnostics but also to refine the existing ones.


PVS-Studio 7.13 includes the updates we discussed in this article. Do you want to know whether anyone uses the ?. operator in the enumerable expression in your code base? Then, feel free to download the analyzer from the website and check the code.


As usual, feel free to follow me on Twitter.

Комментарии (1)