I am trying to send batched requests to a website whose pages are addressed by a numeric index.
I got this working with Parallel.For, but I would like to split the work into batches and run each batch as tasks.

Here is my Parallel.For:

Parallel.For(0, 10, options, Sub(i)
                                 Call Get_HTML_data("https://www.website.com/" & i)
                             End Sub)


I tried to follow instructions on this website:
How to send many requests in parallel in ASP.Net Core – Michał Białecki Blog

Here is what I have so far:

Dim downloadTasksQuery As IEnumerable(Of Task(Of Integer)) = Enumerable.Range(10, 1).ToList
Dim downloadTasks As List(Of Task(Of Integer)) = downloadTasksQuery.ToList()

Dim batchSize = 100
Dim numberOfBatches As Integer = CInt(Math.Ceiling(CDbl(downloadTasksQuery.Count()) / batchSize))

For i As Integer = 0 To numberOfBatches - 1
    Dim currentbatch = downloadTasksQuery.Skip(i * batchSize).Take(batchSize)
    downloadTasks.Add(client.GetUsers(currentbatch))
Next

Dim finishedTask As Task(Of Integer) = Await Task.WhenAny(downloadTasks)
downloadTasks.Remove(finishedTask)


I am not sure what to put inside the For i As Integer = 0 To numberOfBatches - 1 loop.


Here is the code that I want the tasks to call:

Private Sub Get_HTML_data(website_str As String)

       Try

           Dim x As String
           Dim pos1 As Long, pos2 As Long, row_number As Long
           Dim request As WebRequest = WebRequest.Create(website_str)
           Dim response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
           Dim datastream As Stream = response.GetResponseStream
           Dim reader As New StreamReader(datastream)
           Dim strdata As String = reader.ReadToEnd

           SyncLock dt

               'add new row to datatable
               'Dim R As DataRow = dt.NewRow
               R = dt.NewRow

               'always add ID to the first column
               R("ID") = dt.Rows.Count + 1

               If strdata <> "" Then
                   For row_number = 0 To DataGridView1.Rows.Count - 2

                       x = DataGridView1.Rows(row_number).Cells(1).Value
                       pos1 = InStr(strdata, x)

                       If pos1 > 0 Then

                           pos1 = pos1 + Len(x) - 1
                           pos2 = InStr(pos1, strdata, DataGridView1.Rows(row_number).Cells(2).Value.ToString, vbTextCompare) - 1
                           R(dt.Columns(CInt(row_number + 1)).ColumnName) = Trim(strdata.Substring(pos1, pos2 - pos1))

                       End If
                   Next

                   'add new row once all data is collected
                   dt.Rows.Add(R)

               End If
           End SyncLock

       Catch ex As Exception
           'MsgBox(ex.Message)
           'need to make a list of numbers that did not work and display list of numbers at end of process

       End Try

   End Sub


What I have tried:

I have followed the linked article as closely as I can and looked into other methods, but I still cannot figure out how to create multiple tasks correctly.
Updated 4-Aug-21 23:38pm

1 solution

Start by making your retrieval code asynchronous. You'll probably want to use the HttpClient class for that:
HttpClient Class (System.Net.Http) | Microsoft Docs
VB.NET
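' Share one HttpClient across all requests; creating a new one per call can exhaust sockets.
Private Shared ReadOnly Client As New HttpClient()

Private Async Function LoadHtml(ByVal url As String) As Task
    Try
        Dim response As String = Await Client.GetStringAsync(url)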
        
        If InvokeRequired Then
            Dim d As Action(Of String) = AddressOf AddRow
            Dim ar As IAsyncResult = BeginInvoke(d, response)
            Await Task(Of Object).Factory.FromAsync(ar, AddressOf EndInvoke, Nothing)
        Else
            AddRow(response)
        End If
        
    Catch ex As Exception
        ' TODO: Log the error details...
    End Try
End Function

Private Sub AddRow(ByVal response As String)
    ' Only calling this from the UI thread, so there's no need to lock:
    Dim row As DataRow = dt.NewRow()
    row("ID") = dt.Rows.Count + 1
    If Not String.IsNullOrEmpty(response) Then
        ...
    End If
    
    dt.Rows.Add(row)
End Sub
Create a list of tasks, and then use Task.WhenAll to wait for them all to finish.
VB.NET
Dim tasks As List(Of Task) = Enumerable.Range(0, 10).Select(Function (i) LoadHtml(BaseUrl & i)).ToList()
Await Task.WhenAll(tasks)
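
If you still want the explicit batches from the question, one possible shape is to start one batch of tasks, wait for it with Task.WhenAll, and only then start the next. This is only a sketch: FetchAsync is a hypothetical stand-in for LoadHtml that simulates the download with Task.Delay, and the module name, method names, and batch size are assumptions for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Threading.Tasks

Module BatchSketch
    ' Hypothetical stand-in for LoadHtml: simulates a download and returns the index.
    Private Async Function FetchAsync(index As Integer) As Task(Of Integer)
        Await Task.Delay(10)
        Return index
    End Function

    ' Runs the indexes in consecutive batches; each batch finishes before the next starts.
    Public Async Function RunInBatchesAsync(indexes As IEnumerable(Of Integer),
                                            batchSize As Integer) As Task(Of List(Of Integer))
        Dim pending As List(Of Integer) = indexes.ToList()
        Dim completed As New List(Of Integer)()
        Dim numberOfBatches As Integer = CInt(Math.Ceiling(pending.Count / batchSize))

        For i As Integer = 0 To numberOfBatches - 1
            ' Take the next slice of indexes and start one task per index.
            Dim currentBatch = pending.Skip(i * batchSize).Take(batchSize)
            Dim batchTasks = currentBatch.Select(Function(n) FetchAsync(n)).ToList()

            ' Wait for the whole batch; results come back in the order the tasks were created.
            Dim results As Integer() = Await Task.WhenAll(batchTasks)
            completed.AddRange(results)
        Next

        Return completed
    End Function

    Sub Main()
        ' Blocking here is acceptable in a console sketch; in WinForms, Await the call instead.
        Dim done = RunInBatchesAsync(Enumerable.Range(0, 10), 4).GetAwaiter().GetResult()
        Console.WriteLine(String.Join(",", done))
    End Sub
End Module
```

With a batch size of 4 and ten indexes, the loop runs three batches (4, 4, and 2 items). Only one batch is in flight at a time, so the batch size caps the number of concurrent requests, at the cost of each batch waiting for its slowest request.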
 
Comments
Member 11856456 5-Aug-21 6:19am    
Richard, I noticed that memory usage climbs quickly with the async version. How can I slow the growth or keep it steadier? With Parallel.For, memory consumption stayed at about 45 MB; with async, a run of 1000 records, for example, jumps to about 520 MB.
Richard Deeming 5-Aug-21 6:40am    
You could try using a Semaphore, as described in this SO thread:
c# - How to throttle multiple asynchronous tasks? - Stack Overflow
Dim semaphore As New SemaphoreSlim(Environment.ProcessorCount * 2)
Dim tasks As IEnumerable(Of Task) = Enumerable.Range(0, 10).Select(Async Function (i)
    Await semaphore.WaitAsync()
    Try
        Await LoadHtml(BaseUrl & i)
    Finally
        semaphore.Release()
    End Try
End Function)

Await Task.WhenAll(tasks)
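
The question's code also notes a need to list the numbers that did not work. The throttled loop above could be extended along these lines. This is a rough sketch rather than a drop-in: FetchAsync is a hypothetical stand-in for the download that deliberately fails for some indexes, and it assumes the download lets exceptions propagate instead of swallowing them the way LoadHtml does.

```vb
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Linq
Imports System.Threading
Imports System.Threading.Tasks

Module FailedIndexSketch
    ' Hypothetical stand-in for the download: fails for every index divisible by 4.
    Private Async Function FetchAsync(index As Integer) As Task
        Await Task.Delay(10)
        If index Mod 4 = 0 Then Throw New InvalidOperationException("Simulated failure for " & index)
    End Function

    ' Runs count downloads with at most maxConcurrency in flight and returns the failed indexes.
    Public Async Function RunThrottledAsync(count As Integer,
                                            maxConcurrency As Integer) As Task(Of List(Of Integer))
        Dim failed As New ConcurrentBag(Of Integer)()

        Using semaphore As New SemaphoreSlim(maxConcurrency)
            Dim tasks = Enumerable.Range(0, count).Select(
                Async Function(i)
                    Await semaphore.WaitAsync()
                    Try
                        Await FetchAsync(i)
                    Catch ex As Exception
                        ' Remember the index instead of losing the failure.
                        failed.Add(i)
                    Finally
                        semaphore.Release()
                    End Try
                End Function).ToList()

            Await Task.WhenAll(tasks)
        End Using

        Return failed.OrderBy(Function(n) n).ToList()
    End Function

    Sub Main()
        Dim failedIndexes = RunThrottledAsync(10, 3).GetAwaiter().GetResult()
        Console.WriteLine("Failed: " & String.Join(",", failedIndexes))
    End Sub
End Module
```

ConcurrentBag is used because continuations may run on different threads in a console app. In a WinForms handler where everything resumes on the UI thread, a plain List(Of Integer) would probably do.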

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


