How To Obtain a Random Sub-datatable from Another Datatable





4.00/5 (1 vote)
How to get a random subset of data from a DataTable
Introduction
In this article, I'll show how to get a random subset of data from a DataTable
. This is useful when you already have queries that are filtered correctly, but it returns all the rows.
Analysis
I came across this situation when I wanted to display a random tag cloud. I already had the query to get the keywords ordered by number of clicks and I wanted to created a tag cloud. Tags that are the most popular should have more chance to get picked and should be displayed larger than the less popular ones.
Implementation
In this code snippet, there is everything you need:
' Min size, in pixels for the tag
Private Const MIN_FONT_SIZE As Integer = 9
' Max size, in pixels for the tag
Private Const MAX_FONT_SIZE As Integer = 14
' Basic function that retrieves Tags from a DataBase
Public Shared Function GetTags() As MediasTagsDataTable
' Simple call to the TableAdapter, to get the Tags ordered by number of clicks
Dim dt As MediasTagsDataTable = taMediasTags.GetDataValide
' If the query returned no result, return an empty DataTable
If dt Is Nothing OrElse dt.Rows.Count < 1 Then
Return New MediasTagsDataTable
End If
' Set the font-size of the group of data
' We are dividing our results into sub set, according to their number of clicks
' Example: 10 results -> [0,2] will get font size 9, [3,5]
' will get font size 10, [6,8] will get 11, ...
' This is the number of elements in one group
Dim groupLenth As Integer = CType(Math.Floor_
(dt.Rows.Count / (MAX_FONT_SIZE - MIN_FONT_SIZE)), Integer)
' Counter of elements in the same group
Dim counter As Integer = 0
' Counter of groups
Dim groupCounter As Integer = 0
' Loop through the list
For Each row As MediasTagsRow In dt
' Set the font-size in a custom column
row.c_FontSize = MIN_FONT_SIZE + groupCounter
' Increment the counter
counter += 1
' If the group counter is less than the counter
If groupLenth <= counter Then
' Start a new group
counter = 0
groupCounter += 1
End If
Next
' Return the new DataTable with font-size
Return dt
End Function
' Function that generates the random sub set
Public Shared Function GetRandomSampleTags(ByVal KeyCount As Integer) _
As MediasTagsDataTable
' Get the data
Dim dt As MediasTagsDataTable = GetTags()
' Create a new DataTable that will contain the random set
Dim rep As MediasTagsDataTable = New MediasTagsDataTable
' Count the number of rows in the new DataTable
Dim count As Integer = 0
' Random number generator
Dim rand As New Random()
While count < KeyCount
Randomize()
' Pick a random row
Dim r As Integer = rand.Next(0, dt.Rows.Count - 1)
Dim tmpRow As MediasTagsRow = dt(r)
' Import it into the new DataTable
rep.ImportRow(tmpRow)
' Remove it from the old one, to be sure not to pick it again
dt.Rows.RemoveAt(r)
' Increment the counter
count += 1
End While
' Return the new sub set
Return rep
End Function
Pros
This method is good because it doesn't require much work to get it working fast. It is a good concept when you are working with small tables, let's say less than 100 records.
Cons
If you have more than 100 records, an out of memory exception may occur since we are copying and duplicating rows. I would consider using a stored procedure instead.
Category: CodeProject
Published: 4/19/2009 10:04 PM