We have a public async method that contains the following code:

int page = 1;
int size = 50;

var activeDocuments = await GetDocuments(page, size);

while (activeDocuments.Count > 0)
{
    foreach (var document in activeDocuments)
    {
        // Re-calculate points
        document.Points = await CalculatePoints(document, settings);

        // Update in DB
        updatedDocument = await _feedRepository.UpdateFeedAsync(document);
    }

    // Increment page to get the next feeds
    page += 1;

    // Get active documents for the next page
    activeDocuments = await GetDocuments(page, size);
}


The problem is that records are being missed during the iteration. For example, with 600+ records, the code above only updates around 300.

It looks like an issue with parallel processing, but we have not been able to pin it down.

Any help would be highly appreciated.

Note: We are using Hangfire and the code above is a background job enqueued via Hangfire.


EDIT: [Solution] This seems to be a MongoDB issue with indexes. I created an index on the CreatedAt (DateTimeOffset) field, but in vain. If I don't sort the collection explicitly (so it is sorted automatically by the default index, i.e. _id), it works fine.

What I have tried:

Looked into various articles on Google about foreach implementations with respect to multi-threading, parallel processing, etc.
Updated 28-Feb-21 19:39pm
Comments
Gerry Schmitz 26-Feb-21 15:10pm
   
Overwriting the data source inside the while loop that tests the data source looks "intriguing".

1 solution

Difficult to answer, since we can't see what the GetDocuments or UpdateFeedAsync methods are doing. But at a guess, you're changing something on the document which causes it to move to a different "page" of data, so the next call to GetDocuments returns the same document again.

If that's the case, this isn't an async problem at all. You'd have the same issue with a purely synchronous method. You'll need to change your code so that the sequence doesn't change between calls to GetDocuments.
   
Comments
VICK 26-Feb-21 9:45am
   
We are using ReplaceOneAsync inside the update method, and the get method fetches records sorted by CreatedAt.

=====================

var filter = Builders<document>.Filter.Eq(x => x.Id, passedDocument.Id);

filter &= Builders<document>.Filter.Eq(x => x._status, (int)StatusEnum.Active);

var result = await _collection.ReplaceOneAsync(filter, passedDocument);

===================================

var documents = await _collection.AsQueryable()
    .Where(i => i._status == (int)StatusEnum.Active)
    .OrderBy(i => i.CreatedAt)
    .Skip((page - 1) * size)
    .Take(size)
    .ToListAsync();

==============================

Since we are sorting by CreatedAt, shouldn't the sequence stay the same?

One more thing to note: CreatedAt is of type DateTimeOffset in the .NET code, and the database is MongoDB.
Richard Deeming 26-Feb-21 9:52am
   
The sort order should remain the same, but I'm not familiar with the ReplaceOneAsync method, so I can't be sure.

You'd also need to make sure you're not changing the _status property on the document. If a document from page 1 is no longer active, documents which were previously on page 2 will now be on page 1 instead.

Have you tried creating a version of the code that doesn't use async? You can't be sure it's an async problem unless the issue disappears when you use synchronous methods.
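
The page-shift effect described in this comment can be reproduced without a database. The following is a minimal in-memory sketch, not the poster's code: Doc, GetPage, and Run are hypothetical names, and the assumption that every update removes the document from the filtered set is made purely to illustrate the effect.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Hypothetical stand-in for a stored document.
    public sealed class Doc
    {
        public int Id { get; set; }
        public bool Active { get; set; }
    }

    // Mimics the Skip/Take query from the thread against an in-memory list.
    public static List<Doc> GetPage(List<Doc> store, int page, int size) =>
        store.Where(d => d.Active)
             .OrderBy(d => d.Id)
             .Skip((page - 1) * size)
             .Take(size)
             .ToList();

    // Runs the paging loop and returns how many documents were processed.
    public static int Run(int total, int size, bool deactivateOnUpdate)
    {
        var store = Enumerable.Range(1, total)
            .Select(i => new Doc { Id = i, Active = true })
            .ToList();

        int page = 1;
        int processed = 0;
        var docs = GetPage(store, page, size);

        while (docs.Count > 0)
        {
            foreach (var d in docs)
            {
                processed++;
                // Assumption for the demo: the update makes the document
                // fall out of the "active" filter.
                if (deactivateOnUpdate) d.Active = false;
            }

            page++;
            docs = GetPage(store, page, size);
        }

        return processed;
    }

    public static void Main()
    {
        Console.WriteLine(Run(600, 50, false)); // prints 600
        Console.WriteLine(Run(600, 50, true));  // prints 300
    }
}
```

With 600 documents and a page size of 50, the loop processes all 600 when the filtered set stays unchanged, but only 300 when every processed document drops out of the filter. Each later page then skips documents that have shifted forward into positions already passed.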
VICK 26-Feb-21 9:57am
   
Tried without async and the result is the same. With a page size of 50, a few records do not come up, but if I fetch the collection in a single go with a larger page size, everything is fine.
Richard Deeming 26-Feb-21 9:59am
   
Which confirms my suspicion that you are either altering the property used to sort the documents, or you are altering the property used to filter the documents.

If you want to load the documents in pages, the filter and sort order need to be stable between requests.
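
One way to keep paging stable, sketched here under stated assumptions, is to page by a cursor over a unique ordered key instead of by Skip. This is an in-memory illustration only and not the MongoDB driver's API: Doc, GetPageAfter, and Run are hypothetical names, and Id is assumed to be unique so that it can break ties between documents sharing the same CreatedAt value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeysetDemo
{
    // Hypothetical stand-in for a stored document.
    public sealed class Doc
    {
        public int Id { get; set; }
        public DateTimeOffset CreatedAt { get; set; }
        public bool Active { get; set; }
    }

    // Returns the first `size` active documents strictly after the cursor,
    // ordered by (CreatedAt, Id). A null cursor means "start from the beginning".
    public static List<Doc> GetPageAfter(
        List<Doc> store, (DateTimeOffset CreatedAt, int Id)? cursor, int size) =>
        store.Where(d => d.Active)
             .Where(d => !cursor.HasValue
                      || d.CreatedAt > cursor.Value.CreatedAt
                      || (d.CreatedAt == cursor.Value.CreatedAt && d.Id > cursor.Value.Id))
             .OrderBy(d => d.CreatedAt)
             .ThenBy(d => d.Id)
             .Take(size)
             .ToList();

    // Runs the paging loop and returns how many documents were processed.
    public static int Run(int total, int size)
    {
        var start = new DateTimeOffset(2021, 2, 26, 0, 0, 0, TimeSpan.Zero);

        // Several documents share each CreatedAt value, so Id acts as tiebreaker.
        var store = Enumerable.Range(1, total)
            .Select(i => new Doc { Id = i, CreatedAt = start.AddMinutes(i / 10), Active = true })
            .ToList();

        (DateTimeOffset CreatedAt, int Id)? cursor = null;
        int processed = 0;
        var docs = GetPageAfter(store, cursor, size);

        while (docs.Count > 0)
        {
            foreach (var d in docs)
            {
                processed++;
                d.Active = false; // even if the document leaves the filter...
            }

            // ...the next page is defined by the last key seen, not by a count.
            var last = docs[docs.Count - 1];
            cursor = (last.CreatedAt, last.Id);
            docs = GetPageAfter(store, cursor, size);
        }

        return processed;
    }

    public static void Main()
    {
        Console.WriteLine(Run(600, 50)); // prints 600
    }
}
```

Because each page is defined by the last key already seen rather than by a count of rows to skip, documents that leave the filtered set behind the cursor do not shift the pages ahead of it. The sketch relies on the sort key not being modified by the update itself.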
VICK 1-Mar-21 1:37am
   
It seems that if we don't sort the records explicitly (so the default _id order applies), it works properly. But even if I add an index on the CreatedAt field and sort on that, it still doesn't work. So for now I am going with the default index sort, i.e. _id.

Thanks for your help.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



