Hello everyone

I have a piece of code that I wrote today that sorts the edges of a 3D mesh for processing. Essentially, I have a linked list of sorted edges. I then process some edges, which become dirty (out of order in the list). I then use the following code to first delete the dirty edges from the list (leaving an incomplete but sorted list) and then insert the dirty edges back into the list in the right positions (leaving a complete sorted list). I iterate over this process some thousands of times.

The following code snippet works fine but is too slow. I've done a timing analysis and found that this code consumes over 90% of the processing time, which amounts to about 30 seconds for 8,000 iterations on a list of 15,000 edges. That is way too slow for my needs.

Now I'm trying to figure out what to do, and I was hoping that some of the more experienced programmers here could advise me on how to go about speeding this up. Some high-level pointers on how to optimize the sorting would be good.

I ask here because I've searched on sorting algorithms and there are a lot of simple algorithms that work (e.g. I've tried the shell sort algorithm), but they also seem too slow. However, maybe within the constraints I've outlined in this post there are some algorithmic tricks to speed this up.

Any help would be appreciated!!!

//////////////////////////////////////////////////////////////////////
struct error_node
{
   float err; // error value used to sort the list
   int ind;   // index of edge
};

list<error_node> edge_head;  // list of mesh edges
vector<list<error_node>::iterator> edge_map; // vector of iterators that point to each edge in edge_head
vector<error_node> sort_edge; // vector of dirty edges

// compare function passed to std::sort
bool compare(error_node first, error_node second)
{
    if (first.err >= second.err)
        return false;
    return true;
}
/////////////////////////////////////////////////////////////////////

// region of code that takes 90% of running time

if (!sort_edge.empty())
{
    std::sort(sort_edge.begin(), sort_edge.end(), compare); // sort dirty edges into ascending err order

    // remove the dirty edges from edge_head, leaving an incomplete but sorted list
    vector<error_node>::iterator vec;
    for (vec = sort_edge.begin(); vec != sort_edge.end(); vec++) {
        // erase the edge with index ind via edge_map, e.g. if (*vec).ind == 10 then
        // edge_map[10] is an iterator to the error_node with index 10 in edge_head
        edge_head.erase(edge_map[(*vec).ind]);
    }

    // insert the dirty edges back into the list in the right order
    vec = sort_edge.begin();
    list<error_node>::iterator lis;
    for (lis = edge_head.begin(); lis != edge_head.end(); lis++) {
        if ((*lis).err >= (*vec).err) {
            lis = edge_head.insert(lis, *vec);
            edge_map[(*vec).ind] = lis;
            vec++;
        }
        if (vec == sort_edge.end())
            break;
    }

    // insert any remaining dirty edges at the end of the list
    while (vec != sort_edge.end()) {
        edge_head.push_back(*vec);
        edge_map[(*vec).ind] = (--edge_head.end());
        vec++;
    }
    sort_edge.clear();
}
Comments
Niklas L 11-Feb-11 17:22pm    
If edge_head has about 15,000 elements, how many elements would sort_edge have on average?
FlurryKnox 12-Feb-11 3:18am    
sort_edge is typically small having less than 20 elements.

One thing though... I have tried using std::sort to sort the edge_head list and found it takes 4 times as long as a straightforward implementation of the shell sort algorithm. What I'll do is analyse how much of the processing time is consumed by std::sort and post the result later.
jswolf19 11-Feb-11 21:09pm    
Using a linked list is killing you, I think. Have you tried a tree or hash structure? A hash will be faster at a larger (probably much larger) memory footprint, but a tree should have only a slightly larger footprint than a linked list while being faster to keep sorted. Something like a red-black tree should work much better for you.
FlurryKnox 12-Feb-11 3:31am    
Yes, I think that the linked list is not the way to go. I'm looking into hash tables and trees now to determine which to try out first. I'll post again to let you know what I find.
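
For a concrete picture of the tree idea, here is a minimal sketch assuming std::multiset (typically a red-black tree underneath) ordered by err replaces the sorted list; edge_set_t, by_err and reinsert_dirty are illustrative names, not from the original code:

#include <set>
#include <vector>

struct error_node {
    float err; // error value used to order the edges
    int   ind; // index of the edge
};

// strict weak ordering on err, passed by const reference (no copies)
struct by_err {
    bool operator()(const error_node &a, const error_node &b) const {
        return a.err < b.err;
    }
};

typedef std::multiset<error_node, by_err> edge_set_t;

// edge_map[i] holds the multiset iterator for edge i; multiset iterators
// stay valid until the element they point to is erased.
void reinsert_dirty(edge_set_t &edge_set,
                    std::vector<edge_set_t::iterator> &edge_map,
                    const std::vector<error_node> &dirty)
{
    for (std::vector<error_node>::const_iterator it = dirty.begin(); it != dirty.end(); ++it) {
        edge_set.erase(edge_map[it->ind]);        // erase by iterator
        edge_map[it->ind] = edge_set.insert(*it); // re-insert in order, O(log n)
    }
}

The smallest edge is then always *edge_set.begin(), and there is no separate sort pass at all.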
Andrew Brock 12-Feb-11 0:04am    
If possible, can you please add the definition of error_node (is it a struct?) and list any constraints on its member variables.
Also provide the code for the compare() function, or at least tell us what is being compared.

In some circumstances it is possible to make a much more efficient sorting algorithm than any of the well known ones.

If we make a little program to compare your sort to some others, we can see if there is a more efficient algorithm.

#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
#include <mmsystem.h> //for timeGetTime()
#include <vector>
#include <list>
#include <algorithm>

using std::vector;
using std::list;

static DWORD gs_nStartTime = 0;

#pragma comment(lib, "winmm.lib") //for mmsystem.h

void StartTimer() {
	gs_nStartTime = timeGetTime();
}

void StopTimer(LPCSTR szName) {
	printf("%s sort took %ums\n", szName, timeGetTime() - gs_nStartTime);
}

struct error_node {
	float err; // error value used to sort the list
	int ind; // index of edge
};

//Create some variable types for convenience
typedef list<error_node> edge_head_t;
typedef vector<edge_head_t::iterator> edge_map_t;
typedef vector<error_node> sort_edge_t;

//This is your compare function
//Note that it is passed by value. This creates a copy of both
//"first" and "second" every time this function is called.
bool StdCompare(error_node first, error_node second) {
	if (first.err >= second.err) {
		return false;
	}
	return true;
}

int QSortCompare(const void *pElem1, const void *pElem2) {
	if (((error_node *)pElem1)->err < ((error_node *)pElem2)->err) { //less than
		return -1;
	} else if (((error_node *)pElem1)->err > ((error_node *)pElem2)->err) { //greater than
		return 1;
	}
	return 0; //equal
}

//This is your sorting algorithm:
void YourSort(float *pEdgesArray, unsigned nEdges, unsigned *pDirtyArray, unsigned nDirtyEdges) {
	edge_head_t edge_head;  // list of mesh edges
	edge_map_t edge_map; // vector of iterators that point to each edge in edge_head
	sort_edge_t sort_edge; // vector of dirty edges
	//Setup
	//edge_head
	for (unsigned nEdge = 0; nEdge < nEdges; ++nEdge) {
		error_node enNew;
		enNew.ind = nEdge;
		enNew.err = pEdgesArray[nEdge];
		edge_head.insert(edge_head.end(), enNew);
	}
	//edge_map
	for (edge_head_t::iterator iEdge = edge_head.begin(); iEdge != edge_head.end(); ++iEdge) {
		edge_map.push_back(iEdge);
	}
	//sort_edge
	for (unsigned nEdge = 0; nEdge < nDirtyEdges; ++nEdge) {
		error_node enNew;
		enNew.ind = pDirtyArray[nEdge];
		enNew.err = pEdgesArray[enNew.ind];
		sort_edge.push_back(enNew);
	}
	//Now start the timer and sort
	StartTimer();
	if (!sort_edge.empty()) {
		std::sort(sort_edge.begin(), sort_edge.end(), StdCompare); // sort edges into ascending `err` values
		// next for loop removes dirty edges from edge_head leaving an incomplete but sorted list
		sort_edge_t::iterator vec;
		for (vec = sort_edge.begin(); vec != sort_edge.end(); ++vec){ //NOTE: ++vec is faster than vec++, especially for iterators
			// erase edge with index ind using edge_map which is an indexing of the error_nodes in edge_head e.g. suppose (*vec).ind == 10 then edge_map[10] is an iterator to the error_node with index 10 in edge_head list
			edge_head.erase(edge_map[(*vec).ind]);
		}
		//next bit of code inserts the dirty edges back into the list in the right order
		vec = sort_edge.begin();
		for (edge_head_t::iterator lis = edge_head.begin(); lis != edge_head.end(); ++lis){
			if ((*lis).err >= (*vec).err){
				lis = edge_head.insert(lis, *vec);
				edge_map[(*vec).ind] = lis;
				++vec;
			}
			if (vec == sort_edge.end()) {
				break;
			}
		}
		// now insert the remaining dirty edges at the end of the list.
		while (vec != sort_edge.end()){
			edge_head.push_back(*vec);
			edge_map[(*vec).ind] = (--edge_head.end());
			++vec;
		}
		sort_edge.clear();
	}
	StopTimer("Your");
}

//Just a simple vector with no insert/remove
void SimpleStdVectorSort(float *pEdgesArray, unsigned nEdges) {
	sort_edge_t sort_edge; // vector of ALL edges
	//Setup
	//sort_edge
	for (unsigned nEdge = 0; nEdge < nEdges; ++nEdge) {
		error_node enNew;
		enNew.ind = nEdge;
		enNew.err = pEdgesArray[nEdge];
		sort_edge.push_back(enNew);
	}
	//Now start the timer and sort
	StartTimer();
	std::sort(sort_edge.begin(), sort_edge.end(), StdCompare); // sort edges into ascending `err` values
	StopTimer("std::vector");
}

//Using the ANSI qsort
void QSort(float *pEdgesArray, unsigned nEdges) {
	//Setup
	error_node *pNodes = new error_node[nEdges];
	for (unsigned nEdge = 0; nEdge < nEdges; ++nEdge) {
		pNodes[nEdge].ind = nEdge;
		pNodes[nEdge].err = pEdgesArray[nEdge];
	}
	//Now start the timer and sort
	StartTimer();
	qsort(pNodes, nEdges, sizeof(error_node), &QSortCompare);
	StopTimer("qsort");
	delete []pNodes;
}

//Testing
int SetupCompare(const void *pElem1, const void *pElem2) {
	if (*(float *)pElem1 < *(float *)pElem2) { //less than
		return -1;
	} else if (*(float *)pElem1 > *(float *)pElem2) { //greater than
		return 1;
	}
	return 0; //equal
}

int main() {
	printf("Setting up. Please wait.\n");
	//This is how many elements will be sorted.
	//You should change this to what you would expect to be your common case.
	//Results may vary widely if your common case is only 500, or 1,000,000
	const int nElements = 15000;
	//This is how many "dirty" elements will be in the array.
	//You should change this to what you would expect to be your common case.
	//Results may vary widely if your common case is 1% of the array or 95% of the array.
	const int nRandomElements = 1000; //1000 elements will be randomised again after the initial sort
	float *pSortArray = new float[nElements];
	unsigned *pDirtyArray = new unsigned[nRandomElements];
	for (int nElement = 0; nElement < nElements; ++nElement) {
		pSortArray[nElement] = rand() * 4.0f / RAND_MAX; //generate a bunch of random numbers in the range [0.0, 4.0]
	}
	qsort(pSortArray, nElements, sizeof(float), &SetupCompare);
	//We now have a completely sorted array.
	//Now inject a bit of randomness.
	for (int nElement = 0; nElement < nRandomElements; ++nElement) {
		bool bAlreadyExists;
		do {
			bAlreadyExists = false;
			pDirtyArray[nElement] = rand() % nElements;
			for (int nTest = 0; nTest < nElement; ++nTest) {
				if (pDirtyArray[nElement] == pDirtyArray[nTest]) {
					bAlreadyExists = true;
					break;
				}
			}
		} while (bAlreadyExists);
		pSortArray[pDirtyArray[nElement]] = rand() * 4.0f / RAND_MAX; //generate a bunch of random numbers in the range [0.0, 4.0]
	}
	printf("Starting sorts\n");
	YourSort(pSortArray, nElements, pDirtyArray, nRandomElements);
	SimpleStdVectorSort(pSortArray, nElements);
	QSort(pSortArray, nElements);
	delete []pDirtyArray;
	delete []pSortArray;
	return 0;
}


The output of the above on my computer was:
Setting up. Please wait.
Starting sorts
Your sort took 63ms
std::vector sort took 154ms
qsort sort took 4ms


These results were for an array of 15,000 elements with 1,000 "dirty" elements randomly spread throughout the data. The times are for the sorting only, not for setting up or pre-processing the data.
You should test with arrays of different sizes and different numbers of dirty elements, around what you expect to be the common case for your algorithm.
Also, I'm not sure of your situation; if you are relying heavily on the standard library, then qsort may not be the best option.

To my amazement, your algorithm was actually much faster than the plain std::vector sort; however, it was beaten by qsort by a factor of 16.

You may consider adding in other comparisons such as a Binary Tree, or the earlier mentioned Red-Black Tree.
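
One way to add a red-black-tree comparison to the harness above is the hypothetical MultisetSort below (not part of the original program; it assumes #include <set> and #include <utility> are added alongside the other headers). Note that for a std::multiset, building the container is the sort, so the insertion loop has to sit inside the timed region:

//Hypothetical extra comparison: build a std::multiset ordered by err.
//Requires #include <set> and #include <utility> with the other headers.
void MultisetSort(float *pEdgesArray, unsigned nEdges) {
	std::multiset<std::pair<float, unsigned> > edges; //pair<err, ind>, ordered by err first
	//For a tree container, building it IS the sort, so the insertions are timed
	StartTimer();
	for (unsigned nEdge = 0; nEdge < nEdges; ++nEdge) {
		edges.insert(std::make_pair(pEdgesArray[nEdge], nEdge)); //O(log n) per insert
	}
	StopTimer("std::multiset");
}

It could then be called from main() next to the other tests, e.g. MultisetSort(pSortArray, nElements);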
 
Comments
FlurryKnox 12-Feb-11 10:49am    
Thanks Andrew, your post makes very interesting reading, and thanks for your efforts to run the comparison. Yes, that's a big difference between my algorithm and qsort, and it may be good enough for my needs; I'll try it later tonight. Actually, in my situation the number of dirty elements is small, e.g. about 20 with each iteration, but there is potentially a large number of elements and thousands of iterations. Yes, I tried using std::sort and it is very slow, which is surprising considering my sorting implementation isn't even optimized, e.g. it would be possible to use a binary search when inserting the dirty elements (sort_edge) back into the edge_head list.

I was looking at std::set and std::multiset, which I think are based on a red-black tree. I was wondering whether I should test these containers to see how efficient they are. However, my hunch is that they are unlikely to match qsort. I'll try testing with std::set and post the results when I get them.
Andrew Brock 13-Feb-11 2:40am    
I agree that most structures will not match qsort for speed (the q does stand for quick, after all).
If your solution is based entirely on the STL then you may be better off with a set or a BST; otherwise you will need to read the container out into an array, sort it, then put it back into a container.
Hans Dietrich 12-Feb-11 12:58pm    
Andrew! Wow. That's almost an article right there.
A couple of things you might like to try before abandoning std::sort for std::qsort...

- Don't use a function as the predicate, use a functor (a short sketch of this follows after these points). In a quick test I knocked up using your data structure, I found that making the predicate a functor sped up the sort so that it completed in 21/41 of the time qsort took. I got the same speedup by implementing operator < in the structure definition. Using a function with std::sort, it still completed in 38/41 of the time qsort took.

- Don't pass by value to the predicate. Pass by const reference. This opens up a whole wheelbarrow full of optimisations the compiler can give you when it inlines the functor.

- Implement a swap method for your structure. Some compilers have an overload of std::sort for objects that implement swap, which if you're a bit cunning can remove a copy. However they usually do the same thing for PODs so you might not get anything - it could be worth a try though.

The first one is by far the most important: don't use functions when you can use functors. Even when compilers can do whole-program compilation, they still often can't optimise calls through function pointers, and this is doubly so when the function pointer is called from library code.
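
A minimal sketch of the first two points, reusing the error_node structure from the question (the functor name CompareErr is illustrative):

// Functor predicate: std::sort sees the complete type at the call site and can
// inline operator(), which it usually cannot do through a function pointer.
struct CompareErr {
	bool operator()(const error_node &first, const error_node &second) const {
		return first.err < second.err; // const references, so no copies are made
	}
};

// usage:
// std::sort(sort_edge.begin(), sort_edge.end(), CompareErr());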

Cheers,

Ash
 
Comments
FlurryKnox 14-Feb-11 11:30am    
I've just tried overloading operator < with little impact on the speed of the vector sort, e.g. I used Andrew's code and I get that std::sort takes 372ms without the overloaded operator and 367ms with it. Have I done something wrong?

struct error_node {
    float err; // error value used to sort the list
    int ind;   // index of edge

    bool operator <(const error_node &rhs) const {
        return err < rhs.err;
    }
};
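
For reference, with that member operator < in place, std::sort can also be called without a predicate at all, since it falls back to operator < by default:

std::sort(sort_edge.begin(), sort_edge.end()); // uses error_node::operator <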
Two methods:

1. Don't use a linked list; use some kind of sorting tree, so that it stays sorted after each processing step.

2. After each processing step, insert the edge back into the right place to keep the list in order.

The second seems easier for you to implement.

While processing, you hold a pointer to the edge, so you can move forward or backward from it. After processing, you can work out whether the edge should move forward or backward to put the list back in order: save the pointer to the edge, move forward or backward until you find the correct place, and insert it there (sketched below).

This saves the time spent iterating over the whole list to delete the dirty edges and iterating over it again to insert them.
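
A minimal sketch of the second suggestion, assuming the caller still holds the std::list iterator for the edge it has just reprocessed (reposition_edge and edge_list_t are illustrative names):

#include <list>

struct error_node {
	float err; // error value used to order the edges
	int   ind; // index of the edge
};

typedef std::list<error_node> edge_list_t;

// Move a single edge to its correct position by walking from where it already
// is, instead of scanning the whole list from the front.
edge_list_t::iterator reposition_edge(edge_list_t &edges, edge_list_t::iterator pos)
{
	error_node node = *pos;

	// start from the element after pos, then drop the old node;
	// std::list::erase only invalidates the erased iterator
	edge_list_t::iterator hint = pos;
	++hint;
	edges.erase(pos);

	if (hint != edges.end() && hint->err < node.err) {
		// the error grew: walk forward to the first element >= node.err
		do { ++hint; } while (hint != edges.end() && hint->err < node.err);
	} else {
		// the error shrank (or stayed put): walk backward while the
		// element before hint is still larger than node.err
		while (hint != edges.begin()) {
			edge_list_t::iterator prev = hint;
			--prev;
			if (prev->err <= node.err) break;
			hint = prev;
		}
	}

	// inserting before hint keeps the list sorted
	return edges.insert(hint, node);
}

If an edge_map of iterators is kept as in the original code, the returned iterator is what should be stored back into it.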
 
Just one thought about my initial problem....

I guess there are two ways to approach this problem. One approach is to select a more efficient sorting algorithm; there will be a lot of algorithms better than my straightforward linked-list approach. The other approach is to eliminate the sort whenever possible, which is something I have neglected so far. Eliminating sorting might be possible because all I really want is to find the smallest element in the list during each iteration of my algorithm.

Just as an example, it might be possible to keep a main sorted list and extract dirty elements into a separate, smaller list, keeping track of the smallest dirty element in that separate list. I could then continue to use the main list until its lowest-value element exceeds the smallest dirty element, at which point I would merge and sort both lists.

This is just an example to explain the idea. I'm sure that some of you have used far better strategies to eliminate unnecessary processing in similar situations. If so, then I'd like to hear your thoughts about this as it might give me some new ideas.

What do you think???
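
As a rough illustration of that idea, here is a minimal sketch assuming each iteration only needs the edge with the smallest err; pop_smallest, clean and dirty are illustrative names:

#include <algorithm>
#include <list>
#include <vector>

struct error_node {
	float err; // error value used to order the edges
	int   ind; // index of the edge
};

struct by_err {
	bool operator()(const error_node &a, const error_node &b) const {
		return a.err < b.err;
	}
};

// "clean" is the main sorted list, "dirty" a small unsorted vector of edges
// whose err has changed. Precondition: at least one of them is non-empty.
error_node pop_smallest(std::list<error_node> &clean, std::vector<error_node> &dirty)
{
	std::vector<error_node>::iterator d =
		std::min_element(dirty.begin(), dirty.end(), by_err());

	if (dirty.empty() || (!clean.empty() && clean.front().err <= d->err)) {
		error_node smallest = clean.front();
		clean.pop_front(); // head of the sorted list is still the global minimum
		return smallest;
	}
	error_node smallest = *d;
	dirty.erase(d); // linear in dirty.size(), cheap while the dirty set stays small
	return smallest;
}

A full merge of the dirty edges back into the main list would then only be needed once the dirty vector grows past some threshold.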
 