Intermittent InvalidOperationException in Dependency Injection

I’m using CouchbaseNetClient v3.0.4 and Couchbase.Extensions.DependencyInjection v3.0.4.811, both from NuGet. I’m seeing intermittent errors when I call GetBucketAsync. If I retry the operation, it eventually succeeds.

My code is pretty simple: I call AddCouchbase(Configuration.GetSection("Couchbase")) in Startup.ConfigureServices. I inject IBucketProvider and then do var bucket = await _bucketProvider.GetBucketAsync("mybucket");

Here’s the stack trace:

System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
   at System.Collections.Generic.Dictionary`2.Enumerator.MoveNext()
   at System.Linq.Enumerable.ToDictionary[TSource,TKey,TElement](IEnumerable`1 source, Func`2 keySelector, Func`2 elementSelector, IEqualityComparer`1 comparer)
   at Couchbase.Core.DI.CouchbaseServiceProvider..ctor(IEnumerable`1 serviceFactories)
   at Couchbase.ClusterOptions.BuildServiceProvider()
   at Couchbase.Core.ClusterContext..ctor(ICluster cluster, CancellationTokenSource tokenSource, ClusterOptions options)
   at Couchbase.Cluster..ctor(ClusterOptions clusterOptions)
   at Couchbase.Cluster.ConnectAsync(ClusterOptions options)
   at Couchbase.Extensions.DependencyInjection.Internal.ClusterProvider.GetClusterAsync()
   at Couchbase.Extensions.DependencyInjection.Internal.BucketProvider.<GetBucketAsync>b__4_0(String name)

I’d really appreciate any help you can offer. Thanks, and have a good weekend.

Hi, jmurphy.

It looks like services are still being registered while that call to GetBucketAsync is being made, and our code isn’t doing any locking around that.

Are you doing anything asynchronous in your startup initialization, by any chance?
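
For anyone unfamiliar with this exception, here is a minimal sketch of the general mechanism, using only the standard library. It is not the SDK's code: the dictionary and its keys are made up for illustration, and it triggers the failure deterministically on one thread, whereas the stack trace above would involve a second thread adding entries while the constructor enumerates.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationDemo
{
    // Returns true if adding a key while enumerating the dictionary
    // raises InvalidOperationException.
    public static bool ThrowsWhenModifiedDuringEnumeration()
    {
        // Hypothetical stand-in for a shared registry of service factories.
        var factories = new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 };
        try
        {
            foreach (var pair in factories)
            {
                // Adding a new key invalidates the active enumerator,
                // so the next MoveNext() call throws.
                factories["added-during-enumeration"] = 3;
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main() =>
        Console.WriteLine(ThrowsWhenModifiedDuringEnumeration());
}
```

With two threads the same thing happens only when the timing lines up, which would fit the intermittent nature of the report.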

Hi Richard, thanks for your reply. I apologize for my slow response; I’ve been out of the office. I don’t have anything async in Startup.cs. I’ve done some testing and found something interesting. I am running my .NET Core app in Docker on Kubernetes. When I start two pods, one of them may throw this exception on every call. When I delete that pod, the replacement does not exhibit this behavior. Both pods connect to the same CB cluster using the same connection string and the same credentials. The appsettings.json looks like this:

  "Couchbase": {
    "Username": "myuser",
    "Password": "mypass",
    "ConnectionString": "couchbase://cb01.abc.com,cb02.abc.com,cb03.abc.com",
    "Buckets": [
      {
        "Name": "mybucket"
      }
    ]
  }

I don’t think it’s a race condition, because the pod has often been running for several minutes before I start using it. Is there some logging you’d recommend I enable to track this down? Thanks, John

@jmurphy

I have a theory. In our clusters at CenterEdge, we always make requests to Couchbase as part of the health check URL, and that’s wired to Kubernetes so the pod doesn’t get traffic until the health check passes. This means that, in our use case, only one request at a time will ever trigger bootstrapping to Couchbase. (Note that the Couchbase connection isn’t bootstrapped until you try to use it; just registering it with DI doesn’t do it.)

If you’re not doing that, then it would bootstrap on the first real request. If two of those came in at the same time, we could hit this concurrency problem. It seems like we need to add some locking to the DI system to prevent that. In the meantime, I’m wondering if using health checks, or some other way of making a request to Couchbase in Startup.cs, might work around the problem.
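
Another possible stopgap, sketched here under assumptions, is to gate the first bootstrap in application code so that concurrent callers share one in-flight attempt. `SingleFlight<T>` and `BootstrapDemo` are hypothetical names, not part of the Couchbase SDK, and the factory below only simulates a slow bootstrap with a delay. In a real app the factory would wrap the call that triggers the connection, such as `GetBucketAsync`.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: lets only one caller start the expensive
// operation and hands the same task to everyone else.
public sealed class SingleFlight<T>
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Func<Task<T>> _factory;
    private Task<T> _inFlight;

    public SingleFlight(Func<Task<T>> factory) => _factory = factory;

    public async Task<T> GetAsync()
    {
        Task<T> task;
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            // Start a new attempt only if none exists or the last one failed.
            if (_inFlight == null || _inFlight.IsFaulted || _inFlight.IsCanceled)
            {
                _inFlight = _factory();
            }
            task = _inFlight;
        }
        finally
        {
            _gate.Release();
        }
        // Await outside the gate so waiting callers do not block each other.
        return await task.ConfigureAwait(false);
    }
}

public static class BootstrapDemo
{
    // Fires `callers` concurrent requests and reports how many times
    // the simulated bootstrap actually ran.
    public static async Task<int> CountBootstrapsAsync(int callers)
    {
        var bootstraps = 0;
        var gate = new SingleFlight<string>(async () =>
        {
            Interlocked.Increment(ref bootstraps);
            await Task.Delay(50); // stand-in for connecting to the cluster
            return "bucket";
        });

        await Task.WhenAll(
            Enumerable.Range(0, callers).Select(_ => Task.Run(() => gate.GetAsync())));
        return bootstraps;
    }

    public static async Task Main() =>
        Console.WriteLine(await CountBootstrapsAsync(10));
}
```

The gate is held only while the attempt is started, and a failed attempt is retried by the next caller instead of being cached. Whether this fully avoids the exception depends on where the unsynchronized enumeration lives inside the SDK, so treat it as a workaround to test, not a fix.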

That sounds completely plausible. I do have a Kubernetes health check, but it’s not hitting Couchbase. I’m going to modify it to call collection.ExistsAsync and see whether the issue happens again. Thanks, John

Hi @jmurphy -

I created a ticket for tracking.

-Jeff