Fast and improved 2D Convex Hull algorithm and its implementation in O(n log h)

Omar Saad (IREQ)

4.98/5 (45 votes)

Oct 13, 2017

CPOL

58 min read

99276

986

Many improvements over a pretty new and unknown very fast 2D Convex Hull algorithm and much more.

Source code is available at: GitHub and binaries (executable) here: Download OuelletOnlineConvexHull.zip

Note: I got better printing results selecting landscape instead of portraits.

Introduction

This article is the 2^nd of a series of 3

Ouellet Algorithm article link

Summary

A Convex Hull Algorithm and its implementation in O(n log h)

Description of the inner working of the algorithm.

Fast and improved 2D Convex Hull algorithm and its implementation in O(n log h)

This article.

Show a C++ implementation. Describe and show a new implementation using an AVL tree as convex hull point container. Describe many added optimizations. Show results of multiple comparisons of Convex Hull algorithm implementations.

First and Extremely fast Online 2D Convex Hull Algorithm in O(Log h) per point

Added "Online" functionalities. Also added helper functions to make the algorithm more flexible and easier to use, without affecting original performance.

Little request

After reading this article, if you think this algorithm is good enough to be in Wikipedia – Convex hull algorithms, I would be grateful to add a link to Liu and Chen article (or any of the 2 articles I wrote, this one and/or A Convex Hull Algorithm and its implementation in O(n log h)). But please be sure to read this section first: Appendix B – My Wikipedia experience.

Reason for writing back a new article about almost the same algorithm

This article is the continuation of the first one: A Convex Hull Algorithm and its implementation in O(n log h). New article motivations are:

The initial reason was to show a newly developed implementation of Ouellet algorithm done in C++. According to provided performance tests, the new implementation is around 4 times faster than Chan implementation in C (also provided).
To add and show another implementation of Ouellet algorithm using an AVL tree as a container for Convex Hull points instead of an array based container. There is 2 implementations based on AVL tree: a simple one and a more efficient one with many added optimization.
To explain what optimizations had been added to the AVL version to get better performance (Ouellet AVL v2 vs Ouellet AVL).
To share my workbench tool to easily compare 2D algorithms using points like the Convex Hull algorithm or any other algorithm that take points as input and produce points as output. For example, the workbench can also be used to test the smallest enclosing circle algorithms (code provided).
To show and document few improvements over my first iteration of my algorithm which makes my C# Convex Hull implementation now faster than Pat Morin implementation of Chan algorithm written in C (at least with all "normal" random point generators that I used for testing).
To show many real life convex hull algorithm implementation and compare their performance.
To show the fastest implementation of a generator of all permutations possibilities of a set of points (code provided).
To share other thoughts/discoveries (Multithreading, Wikipedia, Smallest Enclosing Circle, …).

What is a Convex Hull

Definition

According to Wikipedia: "In mathematics, the Convex Hull or convex envelope of a set X of points in the Euclidean plane or Euclidean space is the smallest convex set that contains X. For instance, when X is a bounded subset of the plane, the Convex Hull may be visualized as the shape enclosed by a rubber band stretched around X."

Diagram 1: Example, in red of a convex null (from the provided Workbench)

Applications of Convex Hull in 2D and in 3D

Convex Hull algorithm is a fundamental algorithm in computation geometry, on which are many algorithms in computation geometry based. Also there are a lot of applications that use Convex Hull algorithm.

The Convex Hull in used in many areas where the path surrounding the space taken by all points become a valuable information. There is some example:

To determine the impedance zone of electrical public utility simulations of their network (IEEE). This is the reason why I met the convex hull few years ago.
It could serve for input to other algorithms, to improve their performance. For example, a common Smallest Enclosing Circle algorithm can have its performance greatly improved by pre-processing points by a fast Convex Hull algorithm implementation. Difference in performance can be shown in provided benchmark.
Convex hulls can be used in image processing.
Used in GIS system to create a feature class containing polygons which represent a specified minimum bounding geometry enclosing each input feature or each group of input features.
And many others ... 11 applications on Quora

The finding of the Convex Hull is also largely used and studied in universities.

Performance and complexity of Convex Hull algorithms

There are many Convex Hull algorithms with different complexities. Wikipedia - Convex hull algorithms list many of them. Ouellet Convex Hull (or Liu and Chen) algorithm is not in Wikipedia. For more information about why, read Appendix B – Wikipedia at the end of this paper. A good indicator of algorithm performance is usually related to its complexity. It is recommended to choose an algorithm that has the fastest complexity. In the case of convex hull, Results for Speed Test confirms that rule of thumb. Currently, the best known complexity order for a Convex Hull algorithm is O(n log h) where:

“n” is the number of all source points
“h” is the set of result points, among all source points, that delimits the path of the Convex Hull.

According to tests and personal experience, in general use cases, “h” is a lot smaller than “n”. For example, in my tests for a random set of 20 000 000 points in a circle, the Convex Hull is usually made of 200 to 600 points for regular random generators (circle or throw away). Considering the fact that it exists algorithm where the complexity is either: O(n²), O(n log n) and O(n log h). It is quite easy to realize that having a complexity based on “log h” instead of “log n” would improve the performance drastically. But it is also possible to have 2 algorithms in the same complexity where one is really faster than the other. That is the main reason of this article. But as you will see and could test yourself with provided code, results from general cases suggest that Ouellet algorithm is the fastest one and by a pretty good margin. It also has other advantages depending of your needs. When comparing, you should also consider the language the algorithm is implemented. In actual test results, you have to take into account that C# has the reputation to be slower than C. My personal test seems to be in accordance to the fact that C# is quite slower than C++, at least for the implementation of this algorithm.

Ouellet Convex Hull Algorithm

How it works

The best detailed description of the inner working of Ouellet algorithm could be found at

My first code project article about it: A Convex Hull Algorithm and its implementation in O(n log h)
Liu and Chen original Scientific Paper: A new algorithm for computing the convex hull of a planar point set.

Additional information can be found later in section: Differences of Ouellet VS. Liu and Chen Convex Hull Algorithm

Algorithm main idea

This is a quick Summary. More information can be found at A convex Hull Algorithm and its implementation in O(n log h)

The algorithm has 3 main steps that occur in sequence:

Step 1 - Identify virtual quadrant boundary
- We found the boundary (left, top, right, bottom) of all input points (first pass on all input points).
- We define 4 virtual quadrant based on boundary points (Q1 is based on top and most right points, its root is based on x of the top point and y of most right point). Same things apply for each quadrant.
- Each Hull Points will be kept per quadrant (virtual quadrant) and will always be ordered. Note: in all supplied implementations all points are stored counterclockwise for each quadrant.
Step 2 - Find Hull Points
- For each input points, we find if it is part of a quadrant and insert as Convex Hull point if applied.
  - Using dichotomy, we search where an input point should be inserted if apply.
  - We insert the input point if it is part of the Convex Hull. If it is the case, we also remove any neighbors for which the new Convex Hull point invalidated them.
Step 3 – Merging Quadrant results
- Merge the 4 quadrant Hull Points results, removing any duplicate at quadrant boundaries

Differences of Ouellet VS. Liu and Chen Convex Hull Algorithm

Before stepping into Convex Hull optimizations, it should be clear Liu and Chen Convex Hull algorithm and Ouellet algorithm are based on the same principle: virtual quadrant, at least according to what I understood from the article of Liu and Chen: A new algorithm for computing the convex hull of a planar point set. The article of Liu and Chen has no code or implementation detail and/or link to any of those. Because of that, I decided to rename my first implementation code: Liu and Chen. Many optimizations have been added since then which improve performance. Included benchmark results show those differences.

According to the date of publication: 2007-07-01, Liu and Chen are the first to discover the principle of using virtual quadrant. Optimizations is what could be seen as differences between "Liu and Chen" and "Ouellet".

Optimizations

Following optimizations are optional optimizations that could be applied to one or more of the Ouellet convex hull algorithm. Some are only used in the Ouellet CPP version while most of them were applied in the Ouellet AVL v2. None of them are included in Liu and Chen implementation.

Slope VS Cross Product

In the case of the Ouellet Convex Hull, to insert a new point as a potential convex hull point, it requires to know if it is to the left or right of 2 neighbor points where it should be inserted. It could be done in 2 ways:

Method for determining the side of a point from 2 other ones	Slope	Cross Product
Formula
Description	By comparing 2 slopes. For A, B, C points where A and B are points each side of new point to compare C. Compare slope of (AB) with slope of (AC). Note: The slope (AB) can be easily cached in order to save some calculation next time.	By using cross product directly. In previous formula for A, B, C: if (d>0) then C is to the left if (d=0) then C is on the same line if (d<0) then C is to the right.
Required calculation	2 x (2 subtractions and 1 division) = 4 subtractions and 2 divisions **	5 subtractions and 2 multiplications
Final result in the algorithm	Slightly slower	Slightly faster

Results in algorithm

Usage results in algorithm was a little gain in performance for Cross Product. Just enough to beat Chan algorithm in some cases. But it also had the effect of simplifying the selection of potential convex hull candidate. The cross product could be used exactly the same for all quadrants which enables three things:

Simplification and better code architecture (more generic) by raising point selection on the base class of each quadrant.
Simplification by removing the cache management of slopes into point structure
Smaller footprint by removing the cached slope from the point structure. Can use the already defined framework point structure.

Multiplication VS. Division

Benchmark results :

According to an included benchmark test, the multiplication is 1.14 faster than the division. To know more about testing condition, see Hardware used for testing. Difference in speed between the 2 partly explain why Cross Product is a little faster. Many other variables should be considered like CPU pipelining, memory proximity, and more.

Quick Discard (QD)

The quick discard was the first difference that exists between "Liu and Chen" algorithm and Ouellet algorithm. It is a simple check, added to Ouellet algorithm, which is done at each iteration of the dichotomy search, which in most cases, would quickly stop the search because the point will be rejected without going any further into the dichotomy search. The check to know if we can reject the currently evaluated point does not require any calculation, only one simple comparaison is required against the each point of the dicotomy travel. This advantage comes from the fact that we are using quadrants, and also, because quadrant Convex Hull points are always sorted. In general, the time to do this simple test will save time to perform the all the dichotomy steps because a point could be discarded quickly in first steps. Quick discard is made possible because dichotomy is merging to its insertion point. If we take into account that "h" (count of Convex Hull points) is, in general, a lot less than "n" (count of input points), it should save time in most cases. In other words, there is a great amount of point that should be discarded and the chance of discarding one quickly is high.

Tests have been done between Liu and Chen and Ouellet with the 2 most valuable (closer to real life) random generators: throw away and circles. Liu and Chen has no QD while Ouellet has QD. In bad case (throw away point random generator), QD is just a little bit slower (negligible) and pretty stable for any amount of input points. In good case (circle point random generator), QD is quicker by 1.2 times for 100 000 points and performance increase slowly with the number of points. It could be worse without QD, but so slightly that it is a nice addition and should improve performance in general usage. The same test could be done easily with the provided benchmark.

Opposite quadrant check (OQC)

In Ouellet algorithm, due to the way quadrant are defined, 2 adjacent quadrants can’t have any intersection zone. Odd and even quadrant are mutually exclusive while 2 opposite quadrant can share a zone.

Point quadrant belonging is calculated based on quadrant root point, in first versions of the algorithm. Opposite quadrant can have an intersection zone (see example below). Because of this, the algorithm should do opposite quadrant check (OQC) for any point in the intersection. OQC come from the necessity to check a point on its opposite quadrant although it has been identified as being part of a specific quadrant. That situation only occurs in some specific point distribution but should be taken into account.

The following example shows a situation where the distribution force the algorithm to do opposite quadrant check. In that example, P1 is part of 2 quadrants (Q1 and Q3) at the same time, according to each quadrant root point. This situation occurs mainly when the distribution of point is a narrow diagonal:

Q1 and Q3 share an intersection zone. In that case, it is required to check point in its opposite quadrant.
Without OQC, the resulting Convex Hull would have missed that point.

Here we have Virtual Q1 in pink and Q3 in green, with each quadrant root point as diamond and same color as its quadrant. As you can see, P1, the yellow point, which should be evaluated to see if it is part of the Convex Hull, is included in Q1 (pink zone) and Q3 (green zone). But if we would evaluate P1 for Q1 and discard it without verifying it against Q3, then we would have discarded a good candidate. To be sure to not miss any quadrant check, the lazy way would be to check each point against each quadrant. But it could easily be optimized by verifying only against an opposite quadrant when a point has been identified as being part of a quadrant. In most algorithm implementation, each point is checked against all possible quadrant in order to be sure to not miss any point in any quadrant. But in Ouellet CPP and Ouellet AVL v2, some optimization has been added (see specific information about each implementation of Ouellet Convex Hull). We can easily see that odd and even quadrants are mutually exclusive. In other words, if a point is part of Q1 or Q3, it can’t be part of Q2 or Q4 and vice-versa.

Opposite quadrant check bypass (OQCB)

Another optimization is as soon as a point is estimated to be a Convex Hull point (not just being part of a quadrant) in any dichotomy step, it is not required to check it against any other quadrant, either the opposite one. It can’t be part of convex hull points in 2 quadrants simultaneously. The check for an opposite quadrant is only mandatory for a point that has been identified has being part of a quadrant without being identified as being a potential hull point. Only quadrant limit points can be in 2 quadrants at the same time which would be merged into only one point at the 3^rd steps of the algorithm. That optimization is only done in Ouellet CPP and Ouellet AVL v2.

Disjoint quadrant check (DQC)

PCZ

First and last points are always defined counterclockwise per quadrant. PCZ is the "potential candidate zone" and is defined by the first and last quadrant points. It is the zone to the right of the first and last point of a quadrant. This zone is always unique per quadrant and define the zone of potential candidates to be added. Any other place outside that zone can’t contain any valid hull point for this quadrant.

DQC Using PCZ

There is a nice optimization that can also be done which is only used in Ouellet AVL v2. It is to verify at start if all quadrants are disjoint. To verify that, we only have to check if the root point of a quadrant is outside of its opposite quadrant PCZ. If it is the case for all quadrants, then if an input point lies into one quadrant, we can skip any other quadrant check, either its opposite one, always.

That optimization would require 2 things:

A verification to do before the 2^nd step of the algorithm to determine if quadrants are disjoint.
Different code for the 2^nd step of the algorithm depending if quadrants are disjoint or not:
- If disjoint. use DQZ
- If not disjoint, use original algorithm code
  
  Q3 is part of Q1 PCZ (yellow). Quadrants are not disjoint because at least one root point (diamond) is in its opposite quadrant PCZ
  
  All root points (diamond) are not part of any of their opposite quadrant PCZ (yellow). DQC optimization can be used.

PCZ point belonging (PCZB)

PCZ is explained at the beginning of disjoint quadrant check. Usage of the quadrant PCZ to determine current point quadrant belonging.

To quickly keep/discard potential candidate point, we can start by checking if it is part of the target quadrant PCZ by using Cross Product instead of using the quadrant root point. Doing cross product has a calculation has a cost but using it could prevent us to do many dichotomy iteration. It also has the advantage that being in PCZ of a quadrant, a point can’t be a hull point of any other quadrant. Ouellet AVL v2 uses this optimization.

Previous quadrant (PQ)

Another optimization only done in "Ouellet AVL v2", is to always start quadrant check with the last one where a point had been added, then cycle with remaining quadrant in order. That optimization could be useful in some situations where consecutive input point to evaluate have a high risk to be neighbors together, and hence being part of the same quadrant. Doing so will also enable us to skip unnecessary other quadrant checks and improve performance. Although that optimization does not bring any advantage in random point placement, it would not add any downside either. It’s an optimization that could be nice in some cases and have no impact in all others.

Algorithms

Algorithms Comparison

General description

Algorithm name	Author	Dimensions supported	In place*	Reference(s)
Monotone chain	M. Andrew	2	yes	Algorithm Implementation/Geometry/Convex hull/Monotone chain
Graham scan	Ronald Graham	2 (3?)	?	Graham scan
Delaunay/Voronoi	Boris Delaunay and Georgy Voronoi	Any	?	Delaunay triangulation and Voronoi diagram
Chan	Timothy M. Chan	3 (according to Wikipedia)	yes	Chan’s algorithm
Liu and Chen / Ouellet	Liu and Chen / Ouellet	2 (highly probable: any dimension but it has to be proven)	no	A new algorithm for computing the convex hull of a planar point set (Springer) or Code Project or this article

*In place mean that point are moved directly into its source container affecting original order. If you can’t accept that, a copy of source points should be done prior to call the algorithm.

Complexity

Algorithm name	Complexity
	Speed			Memory
	Best	Typical	Worst	Typical	Worst
Monotone chain	?	O(n log n)	?	O(n)	O(n)
Graham scan	?	O(n log n)	?	?	?
Delaunay/Voronoi	O(n log log n)	O(n log n)	O(n log n)	O(n)	?
Chan	O(n log h)	O(n log h)	O(n log h)	O(n)	O(n)
Liu and Chen / Ouellet	O(n log h)	O(n log h)	O(n log h)	O(n) *	O(n) *

* Given for AVL implementation. Exact: n+(h*2)+(h*2*sizeof(pointer))

Algorithm implementation information

This is a list of implementation details in order to better understand any differences.

Algorithm name	Implementation name	Implementation			Reference / Note
Algorithm name	Implementation name	Language	#Thread	Author	Reference / Note
Monotone chain	MonotoneChain	C#	1	David Piepgrass	Loyc – 2D Convex hull in C#: 40 lines of code
Graham scan	HeapHull	C	1	Pat Morin	The link to Pat Morin code had been removed after I advised him about a little bug it its Chan algorithm implementation. A copy of the code is provided with this article source code.
Delaynay/Voronoi	MI Convex Hull	C#	1	Design Engineering Lab	GitHub, although not in O(n log h). I included it because the code is public, highly known in 3D world and I wanted to show the importance in performance of complexity (Big O) of algorithms.
Chan*	Chan	C	1	Pat Morin	The link to Pat Morin code had been removed after I advised him about a little bug it its Chan algorithm implementation. A copy of the code is provided with this article source code.
Liu and Chen	Liu and Chen	C#	1	Eric Ouellet	Original first implementation of Ouellet (Liu and Chen) algorithm
Ouellet	Ouellet C++	C++	1	Eric Ouellet	Although it is quite the same algorithm as "Ouellet 1", it can’t be used to compare language directly because I made few optimization in C++ that does not exist into the C# version.
	Ouellet 1	C#	1	Eric Ouellet	Single thread (name is based on the second pass of the algorithm)
	Ouellet 4	C#	4, 4, 1	Eric Ouellet	The 3 passes of the algorithm are done in 4, 4, and 1 threads respectively.
	Ouellet multi	C#	M, 4, 1	Eric Ouellet	Same as 4 threads but the first pass is done in multithreaded using all available threads.
	Ouellet AVL	C#	1	Eric Ouellet	The AVL implementation which code is derived come from Bitlush

* The current implementation has a very little bug which appears rarely and mainly when there is a small number of points. The bug could be shown by the workbench pretty easily but it is random (I don’t know the reason why). Code provided is made by Pat Morin from University of Carleton.

Ouellet Convex Hull Algorithm Implementation comparison

Name	Implementation description	Lang	Threads (per step)	Container	Optimization						Notes
Name	Implementation description	Lang	Threads (per step)	Container	QD	OQC	OQCB	DQC	PCZPI	PQ	Notes
Liu & Chen	First implementation	C#	1-1-1	List (array)							I renamed my first implementation as Liu and Chen
Ouellet ST	First implementation with added QD optimization	C#	1-1-1	List (array)	x						Fast, simple. See array problem
Ouellet 4T	Second implementation to increase speed	C#	1-4-1	List (array)	x						Faster than 1T
Ouellet MT	3^rd implementation	C#	M-4-1	List (array)	x						Slightly faster than 4T
Ouellet C++	4^th implementation	C++	1-1-1	Array	x	x					Fastest in general cases. Many "goto", code redundancy … Speed in mind, the code look bad.
Ouellet AVL	5^th implementation	C#	1-1-1	AVL tree	x						Not affected by large "h" (convex hull points) because the underlying container for hull point is an AVL tree. It does not reserve more space in advance to minimize heap allocator request. Due to the nature of the algorithm, a tree seems to be the natural container (discovered recently).
Ouellet Array	6^th implementation (few tests of ways of copying array data)	C#	1-1-1	Array	x						Few tests to see if I can get better performance by using array instead of a "List". Also, when using an array, is there a way to move data faster.
Ouellet Array Immu	7^th implementation	C#	1-1-1	Array but immutable	x						Instead of only copying overlapping data after the insertion point of an array (to make room for new point or remove any), I copy the entire array. It enables me to keep data coherent in Multithread.
Ouellet MT	8^th implementation (not presented here)	C#	M-M-1	List (array)	x						Attempt to produce an algorithm in O(n). See note below: Convex Hull in O(n)
Ouellet AVL v2	9^th implementation	C#	1-1-1	AVL tree	x	x	x	x	x	x	A branch of the Ouellet AVL. I mainly changed the way to check if a point is part of a quadrant or not in order to save a little bit more time. Many optimization(s) has been included only in this implementation.

The Code

The benchmark is made with Visual Studio 2017. The framework is WPF with some MVVM. Algorithms are all made in C# except few which are clearly stated which are made of C/C++, see Algorithm implementation information for more information. All tests have been made with code optimized, compiled for "Release".

Note: The only missing algorithm that is in O (n log h) is "The ultimate planar convex hull algorithm". Because there is calculation to find the median, there is a very high probability to be slower than any other actual O(n log h) algorithms. If I would have found any implementation, I would have included it here.

Implementation choices

Array VS. List

A "List" class is a C# collection which uses an array as its underlying container. Using a "List" instead of an array should have similar performance. Tests confirm a very small increase in performance for managing an array directly. The difference is so small that it is hard to justify the lost in clarity using an array. Both collections have been used in different implementation and can be compared together.

Array vs Tree

Array based containers are used in all Ouellet (and Liu and Chen) implementations except for the Ouellet AVL version. Ouellet AVL and Ouellet AVL v2 uses an AVL tree to store potential candidate instead of an array based container. Using an array based container implies a manual dichotomy management. While the tree, by its nature, implement a dichotomy internally. Using dichotomy to get appropriate insertion point ensure good performance and is the main key to stay in O(n log h) . According to my tests and one real usage in life, I suspect that "h" should stay under 1000 in most cases which would not make any difference on using an array or a tree. But having a tree would be safer for cases where "h" could be very large (~ more than 500 000 points on my tests). Otherwise, the items shifting that should occur in array based container for insertion/deletion would become too much prominent and affect performance too much.

The advantages of arrays are (compared to a tree structure):

Data is contiguous together, achieving greater CPU cache hit.
In most cases, only one call to the heap allocator is required to get an array large enough to accommodate all hull points. That is based on many tests using random point generators as sources of input points as described in benchmark here. Actual implementation reserve 1000 Points at start to reduce heap allocator requests. For 1 000 000 points, results almost always stay under 800 Hull Points with all normal random generators (not "Arc" one).
Can use indexing, which enable direct access to point. In fact, direct access would enable to achieve O(n) performance in a multithreaded algorithm but some tests showed slower results than actual implementation. Slower performance appears to be caused by multithreaded locking mechanism. See Convex Hull in O(n) for a little bit more information on a quick test that I did which did not work as expected. Also because "h" is very low in most cases, the addition of multithreading mechanism which applies on "n", should be really low in order to not become more significant than the time related to "h". The O(n) test is not included in provided code.

The major drawback of using an array over a tree structure is:

When the amount of "h" become large, bigger than ~ 500 000 (Ouellet C# implementation), the array container solution become expensive for 2 reasons:

The main one is because we have to shift points every time we insert or delete a candidate.
Also, we have to allocate a new array each time we reach the reserved space boundary and copy all data to the new array.

The tree solution (in code using an AVL tree) is newer. I mean using an AVL tree instead of an array is the latest major implementation change made. AVL tree implementation appears to be a great choice in order to get an algorithm that is directly dependent on its output size, not higher. That would assure a more constant performance in all cases. But it should not be required in general use cases because "h" (hull points) is a lots smaller than "n" (source points) in all tested cases. The reason why the C++ implementation is not using a tree is because I just recently realized the negative impact of array when "h" become too large and I’m missing time to code another implementation in C++ .

There is 2 most popular tree management algorithms for balanced tree: AVL and "Red-Black" tree. I chose AVL over "Red-Black" tree for 2 reasons:

Laziness, it appears to me to be easier to implement AVL over Red-Black tree.
According to my tests, because "h" stay extremely small in regards to "n", there should be a lot more read than insertion into the tree which should favor AVL tree over Red-Black tree to get best performance.

Convex Hull in O(n)

I attempted to create an algorithm in O(n) instead of O(n log h). I think I it is possible with the help of threads, array as containers and good design. At least 2 threads are necessary to achieve the goal. One thread that filter potential candidate quickly and another one which insert potential candidate in the result Convex Hull container, an array.

Thread 1 - Filter candidate in O(1) per point. That thread reject bad candidates and put good one into a stack. To get approximate insertion place into the array of already found candidate convex hull points, the algorithm should use linear positioning between first and last points. That should not give exact location of insertion (linearization of an arc) but should be enough to quickly reject most points. This way, there is no more dichotomy, which corresponds to removing the "log h". That thread does its job in O (n).
Thread 2- Try to insert potential convex hull point filtered by thread one. That is done in O (~h++ log h). Insertion should occur less and less over time, when approaching n points and/or approaching the solution. That should give enough time to empty the stack of point to try to insert.

If thread 1 finishes always before thread 2 or very close to that (constant time), then we could say we have a Convex Hull in O(n).

The preferred container between the 2 threads is a stack because most recently evaluated points have a higher risk to be better convex hull candidate.

In fact it was working but the time to do the job was twice as slow as Ouellet Single threaded version. Please note that in most cases, the thread 1 finished with no point in the stack of potential candidates, which should indicate a success of being in O(n), at least for general cases.

Also, the algorithm alternate between 2 copies of immutable array in order to stay coherent and not slowdown the thread which filter candidate. That was not a real success for few estimated reasons:

In general case, it is twice as slow as any other of my implementations, probably due to locking mechanism required to access the shared stack of filtered candidate. Although I used a "SpinLock" instead of a standard "Lock", which should be faster, I feel that there were too much blocking situation which added delays. I should find a not blocking way to insert potential candidates.
It is highly dependent on a good random distribution of points. It could be pretty bad otherwise because filtering could have pretty bad discard hit, at least the way I implemented my algorithm. I would have to reworks everything in order to make it more efficient and less dependent on the data…if ever possible.
I fixed an amount of thread per Quadrant which would sleep instead of helping other quadrant to complete their task when their initial Quadrant/job were completed. After seeing results, I decided to stop dreaming about O(n) for real world usage. Because of simplicity and speed of each step in the regular algorithm, adding a locking mechanism slow things down too much. Perhaps in many years, when thousands of millions of points would be few points, and the amount of thread would be higher, then perhaps it would be much regarded and useful.
I think that in short, the slowness problem comes from the small "h" which in general stay under 1000 which is a tree level of 8-10. Doing 10 iterations is so fast that it competes well against any lock mechanism.

As a side note: Due to my lack of knowledge, I’m not sure if I have the right to say that an algorithm is in "O(n)" if it requires many threads to achieve that performance? I have to look for that also. Everything about finding Hull Points in O(n) would have to be proven and would have been really fun to do but time and money rules the world and I’m part of it. Because of bad performance and because I think I should improve the algorithm, I preferred to not publish that part.

C# vs C++

Although I did not implement C++ and C# version the same, it is easy to see that C++ flexibility, or machine language proximity, has an edge in terms of performance over C#. But there are few things to note about C++ implementation:

The code is awful.
It has one very big function with repetitive code.
There is many "goto".

The C++ code is hard to maintain. C# implementations are pretty much easier to read and easy to follow. There is also more generic functions in base class, and less repetitive code. There is still some room for improvement but most peoples should understand more easily C# implementation over C++ one.

Every choice made in C++ were intentional in order to beat "Chan". Base on test results, in most general cases, at least based on my 4 general generators (circles, 5 circles, rectangles, throw away), for one million of points, the Ouellet CPP implementation is around 4 times faster than Chan. Results come from my benchmark and are available at: Results for "Deep performance test".

Multithreading

According to benchmark results, it is easy to see that multithreaded version is more than twice as fast as the mono thread one. Because of the nature of Ouellet Hull algorithm, it does make it a good candidate for an easy implementation of multithreading, at least for the 2 first steps of the 3. The 2 first steps are dependent on "n" while the 3^rd is only dependent on "h". Of the 3 steps, the first could easily be fully multithreaded, the second could be implemented easily on 4 totally independent threads (no synchronous mechanism required) and the 3^rd step would be a little harder to multithreaded but it is not affected by "n" only by "h".

Although multithreading usage is not as important as the complexity (Big O) of an algorithm, it could help get nice boost in performance. In the case of Ouellet algorithm, it is very easy to add multithreaded and kept threads fully independent which it is an additional bonus.

C# - Safe code VS. Unmanaged code

Unsafe code usage instead of safe code would have probably resulted in better performance. But unsafe code requires a lot of artifacts. Decision has been made that potential speed gain of unsafe code would not suffice to counterparts the difficulty it would add to read and maintain the code.

Container usage remark

2 different types of container had been used:

Array (or list which uses array under its hood)
AVL tree

Both containers have pretty similar speed in general cases. I started with a "List" because it was simpler to me. But there is one main drawback with array derived container: when "h" (the count of hull point) become too big, its performance becomes quickly accursed. The number of points where array become a real problem of performance is around ~100000 - ~500000. That’s why I decided to create a new version with AVL tree for those pretty rare cases. By its nature, an AVL tree is the perfect container for Ouellet algorithm in single thread. It is not affected by "h" like array is. The main difference is

Array : Dicotomy is done externally. Insertion and deletion require moving the objects close the index implied (close because of invalidated neighbors).
AVL Tree: Dicotomy come from the internal way a tree is done. We need to rebalance the tree in order to stay in O(n log h).

When the number of result points (hull points) become too big, it is obvious that AVL tree win easily.

C# Array.Copy vs MSVCRT memmove vs MSVCRT memcpy (for immutable)

If you have to copy array of data quickly and you know that type of inner array object is of simple type (or struct of simple type), then you can use MSVCRT memcpy (not overlapping copy) or memmove (overlapping copy). The function MSVCRT - memcpy if the fastest but, according to MSDN, it can’t be used for overlapping data. Otherwise Array.Copy and memmove are identical with a very consistent but negligible time advantage for memmove. To use memcpy, we require to copy the entire array each time (same as using immutable array).

Please note, in a language with a garbage collector, moving any object (not simple type or struct containing simple type) require to pin that object in memory in order for them to not being moved by the garbage collector when being moved in memory by unmanaged code. An array is an object. An array itself should be pinned when used from non-GC aware code.

C# Buffer.Copy

I can’t use Buffer.Copy to move points. Buffer.Copy does not support struct either if they are made of simple types only. See exception below:

If anybody has a clean explanation of why it is like that, I would really appreciate to know it.

C# sizeof(T)

C# sizeof(T) can’t be used on a generic type because the size of object can’t be known at compile time. I wonder if it would not be possible to know the size of the object at compile time, for example for simple type and struct of simple type? Wouldn’t be possible to have a "where" clause that would apply on T which specifies "SimpleType"? If feasible, I think it would be a nice addition the C# language to have the possibility to use "sizeof" on T.

Compiling/running the code

The code has been done using Visual Studio 2013 and had been ported to Visual Studio 2015 and then to Visual Studio 2017. It currently uses some features of VS. 2017. It is advised to use that version. If you can’t do so, you will probably have some trouble with C++ code due to its specific binding to a version of MSVCRT.dll which come with Visual Studio 2017.

If you have an exception saying BadImageFormat, you should ensure to run in 64 bits. The problem is a wrong image CPU target (64 vs 32 bits). The code in C++ is made in 64 bits only. If you remove the C++ code, then you can build with "AnyCPU" setting.

Algorithm Benchmark

All benchmark results have been made using the code provided. Everything is testable and verifiable.

Random generators

The benchmark has five different type of point generator:

Generator	Description
Circle	Random points in a circle.
5 Circles	5 random circles with random points in it. Circles can overlap or not.
Rectangle	Random points in a rectangle.
Throw away	Set a random point, goes away in any random direction and to a random distance and set a new point, iterate from that new point.
Arc 4^th quadrant	All points are on an edge of an arc making all of them valid hull points. All points are in the arc of the 4^th quadrant. This generator is very special. It is made solely to check the worst case of Ouellet algorithms and should not reflect any possible real case usage. Because my algorithm in single thread do check each point for quadrant in order (1 ,2 ,3 ,4) processing points for the 4^th quadrant is a little worse than any other one. Also keeping each point as a convex hull point makes me realize that using a list is a major drawing in cases where the count of point forming the convex hull is very high. It shows very ugly bad performance for those cases when using an array based container. In those cases, a tree container is a lot better. See Array vs Tree for more details.

Hardware used for testing

Just as additional information, this is my machine specs:

OS Name:	Microsoft Windows 7 Enterprise Version 6.1.7601 Service Pack 1 Build 7601
System Manufacturer:	Hewlett-Packard
System Model	HP Z620 Workstation
System Type	x64-based PC
Processor	Intel(R) Xeon(R) CPU E5-1660 0 @ 3.30GHz, 6 Core(s), 12 Logical Processor(s)
BIOS Version/Date	Hewlett-Packard J61 v03.50, 2013-08-15
Installed Physical Memory (RAM)	16 GB

Results for "Speed Test"

Most results presented are for algorithms in O(n log h) to better differentiate performance of fastest ones. Most tests were done with 2 generators, “Circle” and “Throw away” which should be closer to real life usage. Any other combinations of algorithms, random generators can be easily tested with the code provided. You can also add your own generator or algorithm if you prefer.

Results for implementation made in C/C++ are taken inside native language, not in C# code to prevent to take results conversion time in account. That should give a better idea of real time taken by each algorithm/implementation.

A linear regression directly based on result points is also added in order to see the tendency and if it is linear or not.

A general overview of major algorithms and/or their implementations

A general overview of major algorithms and/or their implementations (minus Monotone chain implementation)

A general overview of major algorithms in O(n log h) and/or their implementations – Circle

A general overview of major algorithms in O(n log h) and/or their implementations – Throw away

Difference between array based mono thread implementation of Ouellet algorithms

Immutable array is more affected than regular array by a large result point count
("Arc" point generator result is made of all input points)

Effects of using array based containers when h become large (Ouellet C# ST and Ouellet CPP ST) vs non-array based (Chan and Ouellet Avl)

Differences between Chan and both version of Ouellet AVL (v1 and v2) with "Arc" point generator.
The "Arc" generator is specifically the worst case for Ouellet algorithms and should be far from any real-life case.
It mainly shows the linearity of algorithms performance either when the amount of convex hull points ("h") is very high. Here: h = n.

Conclusion – Comparison of major fastest versions of algorithm implementation provided – Circle

Conclusion – Comparison of major fastest versions of algorithm implementation provided – Throw away

Conclusion about "Speed Test" results

Observations

Advantages of O(n log h) is obvious.
Monotone chain algorithm is the slowest of evaluated algorithm implementation
Although MiConvexHull and Heap Algorithms are both in O(n log n), Heap is pretty much slower.
Speed depends on source points, their placement and order. Speed results could vary depending on point arrangement (random generator).
All convex hull algorithms in O(n log h) are faster but good differences exist between them.
Optimizations made in Ouellet, in regards to Liu and Chen, bring better performance.
Array based container implementations are dependent on very large "h". The problem appears when "h" reach about ~500 000 points.
Chan is more independent to the type of random generator than Ouellet. This difference could be seen when comparing results for different random generators.
Ouellet AVL and Ouellet AVL v2 are more affected by "h" than Chan. It can be seen by comparing results of "Arc generator" with any other generator. It should be caused by heap allocator delay and/or rebalancing the tree.
Multithreading is a real advantage.
C++ is a lot faster than C#. Although not being the exact same algorithm, the difference is lots higher than expected. I think it could be attributed to extensive memory read/write which could have less indirection in C++ and also due to array boundary checks which happen automatically in C# for safety but not in C++.
C# Single thread vs Chan is different depending on the type of generator. While for "Circle" point generator, C# is faster than Chan in C. When using the "Throw away" point generator, Chan wins easily.
Ouellet CPP is 4x faster than Chan where both are made in the same language (C/CPP). That is for a number of points higher than 100000.
Immutable array copy is affected by a lot of points a lot quicker than when array is used with overlapping copy (normal usage, like into a list).
Although Ouellet CPP clearly wins in normal usage, it performs badly when "h" become too large due to array based usage for storing hull points.
The difference between Ouellet AVL vs Chan is pretty big for the arc generator (where n=h). But this difference is reducing when the amount of point augment (become huge). See Results for "Deep performance test".
For Ouellet, an AVL tree seems to be the best container to ensure good performance in all cases. Although a little slower than an array in general cases, it has a more contant behavior when "h" become very large.

Warning about the "Arc" Point Generator

Some tests were done using the "Arc" random generator. Output from that generator is the worst case data generator that could be used to feed the Ouellet algorithm when it is based on an array container (array or list). Points generated by the "Arc" point generator will all be part of the convex hull, no exception. All points form an arc which is the border of an ellipse in the 4^th quadrant. This should not happen in real life but it was fun to test here what could be a worst case for Ouellet algorithm and how it could affect some of its implementations.

Derived Conclusions

Based on results, we can extrapolate that it would be possible to get a Ouellet AVL v2 "CPP" implementation that could be faster than Ouellet CPP. That implementation wouldn’t have the "Array" container effect that affects all Ouellet implementation based on array. But due to the usage of the heap allocator for each insertion of potential hull point candidate, the speed gain will probably not be as good as the one between Ouellet ST and Ouellet CPP.

Results for "Deep performance test"

This test mainly shows a comparison of algorithm speed. Results are divided in two sections:

The first table show raw speed results
The second table shows the same information as a ratio relative to the fastest algorithm on the same row

How "Deep performance test" are achieving

For each Point count (column: Points)
- Do 10 times each test
  - Create a random number of points (ptsCount)
  - Execute each algorithm calculating its time taken
- For each algorithm, do an average based on the 10 tests (to have a better normalized average of each algorithm)

An advantage of this type of test is that it is done on a specific amount of point but for each quantity, an average is made over 10 different sets of random points for which each algorithm is tested against the same set of points. Calculating an average on 10 tests helps to have more reliable results that better reflect reality. It lower the variations that could occur on performance of algorithm calculation.

Some sample results:

Circle generator
Points	Heap	MI ConvexHull	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	0,00108	1,87276	0,01983	1,68595	2,23541	2,31733	0,06784	0,00442
100	0,00928	0,07411	0,01553	0,01491	0,05198	0,04098	0,67324	0,00811
1000	0,12542	0,41404	0,12634	0,10726	0,19491	0,10209	0,09456	0,03884
10000	1,59029	1,4313	0,92123	0,88127	0,8602	0,47758	0,40407	0,20166
100000	22,79887	27,42013	8,81501	8,46962	7,20517	3,78218	11,55244	2,95017
1000000	541,474	456,7736	106,1159	86,48805	72,61888	32,67093	32,60175	19,12635
10000000	9704,916	3494,845	1123,966	904,2915	753,2944	429,324	321,1153	195,4884
50000000	59832,37	90730,66	5648,566	4321,777	3625,627	1858,696	1611,791	975,5726
Circle generator ratio
Points	Heap	MI ConvexHull	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	1	1734,037	18,36111	1561,065	2069,824	2145,676	62,81481	4,092593
100	1,144266	9,138101	1,91492	1,838471	6,409371	5,053021	83,01356	1
1000	3,229145	10,66014	3,252832	2,761586	5,01828	2,628476	2,434604	1
10000	7,885996	7,09759	4,568234	4,370078	4,265596	2,368244	2,003719	1
100000	7,727985	9,294424	2,987967	2,870892	2,44229	1,282021	3,915856	1
1000000	28,31037	23,8819	5,548151	4,521932	3,796798	1,708163	1,704546	1
10000000	49,64446	17,8775	5,749527	4,625806	3,853397	2,196161	1,642631	1
50000000	61,33052	93,00247	5,790001	4,42999	3,71641	1,905236	1,652149	1

5 Circles generator
Points	Heap	MI ConvexHUll	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	0,00114	0,06827	0,00182	0,00555	0,03001	0,05306	3,65214	1,56142
100	0,00972	1,4628	0,01716	0,01725	0,06456	0,03631	0,07146	0,00804
1000	0,12014	0,37807	0,1058	0,10401	0,15686	0,10403	0,23345	0,03499
10000	1,51341	2,6314	0,8092	0,84836	0,60956	0,48868	3,96307	0,98666
100000	24,82565	42,64931	8,4731	8,68809	12,91215	4,34475	3,96131	2,01127
1000000	566,718	381,0362	99,78135	84,84197	55,64279	38,98233	39,39097	21,28255
10000000	9823,763	2789,914	1124,163	883,7857	712,3286	500,4938	391,891	251,782
50000000	56538,56	14800,21	4875,033	4398,306	2597,754	1891,765	1937,045	1161,692
5 Circles generator ratio
Points	Heap	MI ConvexHUll	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	1	59,88596	1,596491	4,868421	26,32456	46,54386	3203,632	1369,667
100	1,208955	181,9403	2,134328	2,145522	8,029851	4,516169	8,88806	1
1000	3,433552	10,80509	3,023721	2,972564	4,482995	2,973135	6,671906	1
10000	3,096935	5,38471	1,655889	1,736024	1,24736	1	8,109745	2,019031
100000	12,34327	21,20516	4,212811	4,319703	6,419899	2,160202	1,969557	1
1000000	26,62829	17,90369	4,688411	3,986457	2,614479	1,831657	1,850858	1
10000000	39,01694	11,08067	4,464827	3,510123	2,829148	1,987806	1,55647	1
50000000	48,66916	12,74023	4,196495	3,786122	2,236182	1,628457	1,667435	1

Rectangle generator
Points	Heap	MI ConvexHUll	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	0,00102	7,4844	0,00263	0,00643	0,03033	0,0248	1,74169	0,00483
100	0,00997	0,07902	0,01451	0,02722	0,05932	0,03731	0,03714	0,01129
1000	0,12493	0,20427	0,09821	0,10101	0,17569	0,09951	0,09731	0,11553
10000	1,55069	3,7444	0,7152	0,81974	0,66621	0,47438	0,39172	0,18275
100000	22,27714	112,672	7,98381	9,76501	5,95338	3,21348	5,7474	1,53242
1000000	554,4348	406,3471	90,25243	83,9157	60,25598	46,47369	38,26555	16,35665
10000000	10041,87	2625,678	936,3977	834,0339	564,9664	375,76	297,6457	204,7239
50000000	56114,48	16609,91	14926,98	3645,529	2783,8	1483,013	1436,47	856,4593
Rectangle generator ratio
Points	Heap	MI ConvexHUll	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	1	7337,647	2,578431	6,303922	29,73529	24,31373	1707,539	4,735294
100	1	7,925777	1,455366	2,730191	5,94985	3,742227	3,725176	1,132397
1000	1,283835	2,099168	1,009249	1,038023	1,805467	1,022608	1	1,187237
10000	8,485308	20,48919	3,913543	4,485581	3,645472	2,595787	2,143475	1
100000	14,53723	73,5255	5,209936	6,37228	3,884953	2,096997	3,750538	1
1000000	33,8966	24,84293	5,517782	5,130372	3,683883	2,841272	2,339449	1
10000000	49,05078	12,82546	4,573954	4,073945	2,759651	1,835448	1,453888	1
50000000	65,51914	19,3937	17,42871	4,256511	3,250359	1,731562	1,677219	1

Throw away generator
Points	Heap	MI ConvexHUll	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	0,0012	0,06457	0,00163	0,00714	0,03473	0,03447	0,03586	0,00423
100	0,00915	0,07318	0,0149	0,01467	0,05557	0,03048	0,03261	0,00755
1000	0,13591	0,34241	0,0957	0,09783	0,66553	0,1031	0,08968	0,03367
10000	1,47477	1,19184	0,69639	0,8837	0,58769	0,48858	0,45522	0,20029
100000	20,05145	14,0944	6,9497	8,28934	4,80163	4,14814	6,55351	1,95432
1000000	352,7863	335,9496	90,7813	89,00771	48,50286	36,29886	39,43067	20,9981
10000000	5565,573	2037,657	900,9266	859,9736	560,264	357,0917	346,4088	242,5131
50000000	32520,03	22094,49	4213,683	4071,137	2194,05	1593,671	1605,114	872,252
Throw away generator ratio
Points	Heap	MI ConvexHUll	Chan	Ouellet C# ST	Ouellet C# Avl v2	Ouellet C# 4T	Ouellet C# MT	Ouellet CPP ST
10	1	53,80833	1,358333	5,95	28,94167	28,725	29,88333	3,525
100	1,211921	9,692715	1,97351	1,943046	7,360265	4,037086	4,319205	1
1000	4,036531	10,16959	2,842293	2,905554	19,76626	3,062073	2,663499	1
10000	7,363173	5,950572	3,476908	4,412102	2,934195	2,439363	2,272804	1
100000	10,26006	7,21192	3,556071	4,241547	2,456931	2,122549	3,353345	1
1000000	16,80087	15,99905	4,32331	4,238846	2,309869	1,728674	1,877821	1
10000000	22,94957	8,402252	3,71496	3,546091	2,310242	1,472463	1,428412	1
50000000	37,28284	25,3304	4,830809	4,667386	2,515385	1,827076	1,840195	1

Arc generator
Points	Chan	Ouellet C# Avl v2
10	0,00194	0,03007
100	0,02189	0,2406
1000	0,27112	4,38024
10000	3,33285	35,75693
100000	41,3635	490,43127
1000000	495,68418	4567,68341
10000000	5753,82461	36702,53548
50000000	31337,83897	111482,6417
Arc generator ratio
Points	Chan	Ouellet C# Avl v2
10	1	15,5
100	1	10,99132024
1000	1	16,15609324
10000	1	10,72863465
100000	1	11,85661924
1000000	1	9,214906576
10000000	1	6,378806788
50000000	1	3,557445099

About "Arc" random point generator, please read: Warning about the "Arc" point generator

Please keep in mind that most algorithms are written in C# which is a language that has a reputation to be quite slower than C (Chan and Ouellet CPP are in C).

What’s left

There are many things left:

Test the Ouellet Avl with a Red-Black Tree instead of an Avl tree and compare performance
Implement AVL versions in C++ to find the language advantage and see if the difference is the same as for Ouellet ST
Add some or all Optimizations in all versions
Do a multithreaded version in C++, like the 2 done in C#, with an AVL Tree
Improve the multithreaded version in order to use all threads in all cases on each pass.
Find better implementation choices and make more testing on an algorithm that run in O(n)
Adapt the algorithm to do 3D and/or any dimensions by making it more generic. It should be feasible. As an idea, we can use bit values as quadrant.
Combine any of the preceding to get the best algorithm implementation ever

Who has enough courage, time, money and knowledge to do any of them or all?

Appendix A - History of Ouellet Convex Hull

Date	Event	Comment
2013 – Spring	- Got a task to implement a convex hull functionality into a software developed for electric network simulation called EMTP.- Bought ceometric libray.	Ceometric library was slow and crashed for millions of points.
2013 – Summer	- Found Pat Morin code implementation of Chan algorithm.	Pat Morin implantation is coded in "C" but I preferred to stay in C#. Also, I found it difficult to switch between 32bits/64 bits.
2014- Spring	- An idea sprout in my mind about a new way to find the convex hull. I tested it quickly and came to the conclusion that it should work.- I created a benchmark, tested the algorithm and debugged it
2014-05-20	- Initial post at code project about the new algorithm
2014-06-17	- Attempt of adding a reference to my algorithm into Wikipedia – Convex Hull algorithms.	It stays there one day. David Eppstein removed it, see Wikipedia webpage history for details.
2015 – Summer	- I decided to try to publish in a "True" scientific journal. I communicate with teachers here in Quebec to help me and I tried to find an appropriate journal to publish. - I found a convex hull article from Liu and Chen article "A new algorithm for computing the convex hull of a planar point set" for which appears to me to be the exact same algorithm as mine, or I should say that my algorithm is the same as them.	Liu and Chen article discovery was a real choc to me.
2015-07-25	- Attempt to add a reference to Liu and Chen article into Wikipedia.	David Eppstein, "The Wikipedia Guardian of the Convex Hull algorithms pages", evaluated that Liu And Chen article was not published in an appropriate journal. The journal was too obscure for him, for Wikipedia (details in history of Wikipedia - convex hull algorithms).
2017 – September	- This article	Hoping it will be useful and appreciated by readers.

Appendix B – Wikipedia

On Wikipedia history page of the "Convex hull algorithms", you can see a trial, by myself, to add a reference to my first Code Project article and Liu and Chen article. Both attempt has been removed by David Eppstein. Up to now, there is no reference to this algorithm in Wikipedia which makes it harder to discover.

Appendix C - WPF Chart Control

In the included code, the graphic control used is OxyPlot. This control works fine and it is simple to use. But its performance is on the slow side. In fact, I never saw any WPF only graph control that could handle many thousands of points smoothly. There is one control that I use at work that deliver proper performance and it has a nice API too: "LightningChart" from Arction. I also think that SciChart make a really good control too. LightningChart uses DirectX under the hood and I think SciChart is also using DirectX to ensure acceptable performance. Both have C# control and are very easy to use.

Arction offer exceptional support and SciChart seems to offer something similar.

My experience of C# Chart control:

Company	Arction	SciChart	National Instrument	Oystein Bjorke (objorke on GitHub)	Microsoft Research
Control	LightningChart	WPF Charts	Measurement Studio WPF Graph Control	OxyPlot	DynamicDataDisplay
API (quality of design)	Excellent	So-so in the past. Didn’t try a recent one but I heard good words.	Good. Based on GDI32.dll. Need some tweaks to ensure proper usage with WPF.	Good	Good
Price	Expensive	?	Expensive	Free	Free
Performance	Great	? Expecting great performance because it uses Direct X	Good	Slow	Slow
Source code	Available $$$	?	No	Yes	Yes
Remarks	Excellent. Easy to recommend.	?	I would avoid it. Bad support.C++ COM control adapted to WPF. Not used for 2-3 years now.	GitHub project. Very nice graph if performance is not an issue.	Very old and not supported for many years now

Appendix D - Excel

Just as a side note, I used ClosedXML as the library to create my excel report. I really liked it. It was easy to learn and use and it’s free.

In our company, we are using SpreadSheetGear which I prefer but it cost money.

Appendix E - Smallest Enclosing Circle

Also called the minimum enclosing circle, the smallest enclosing circle is, as its name implies, the smallest circle of a set "S" of points that include all "S" points in it. A good definition could be found in Wikipedia: Smallest-circle problem. I added the code from Rod Stephens web page: Find a minimal bounding circle of a set of points in C# as an available algorithm. The smallest enclosing circle takes a set of points as its source and produce a result compose of a point, the center, and a length, the ray of the circle. By adding a small, and easy to do algorithm, we can transform the center point and the ray into a circle. In fact the circle is represented as an array of consecutive points forming that circle. The count of points should be enough to give the illusion of a circle. We then can use that algorithm with the workbench exactly the same as we use for the convex hull. The actual implementation code from Rod Stephens works well, but it is quite slow. I also added another available algorithm which is based on Rod algorithm. The difference is that "S" points are pre-filtered with Ouellet convex hull algorithm. The result of the convex hull is then used as a source for Rod algorithm. The algorithm become a lot faster because it drastically reduce the number of source points. This is another advantage of having an efficient convex hull algorithm. Please note that Rod algorithm seems to be in O(n²). You can easily test performance with the provided test workbench where I included both implementation. I recommend to stay under ~1000 points because of the complexity of the original Rod algorithm. See a quick sample of a benchmark test results between the 2 algorithms:

The effect of pre-filter a "Smallest enclosing circle" algorithm with an efficient "convex hull" algorithm (here Ouellet ST)

Appendix F - Generic implementation of the fastest algorithm generating all possible permutations of a set of items.

As a side note. I think I have made the quickest C# implementation for generating all possible permutations of a set of items. You can take a look at my answer at StackOverflow question generating permutations of a set (most efficiently) to have an idea of it. I did it in order to make some test on the Convex Hull. All the code is included in the provided source code of this article.

Main advantages over few algorithms found are:

Heap's algorithm (Single swap per permutation)
No multiplication (like some implementations seen on the web)
Inline swap
Generic
No unsafe code
In place (very low memory usage)
No modulo (only first bit compare)

Appendix G – The benchmark

The benchmark code has been made with Visual Studio 2017 in C#. It uses ".Net framework" 4.5.2 and newer version. All the source code for the benchmark and each algorithm implementation is provided. The binary could also be downloaded. The language is C# for all but 2 Convex Hull implementation, Chan done by Pat Morin and Ouellet CPP done by me.

It is not fully MVVM but uses some features of it. I had to decide to stop somewhere and it is like it is now. I feel it is working enough to be usable and easy to add your own algorithm.

How to Add Your Own Algorithm

Please look into the project "ConvexHullWorkbench" for the class "AlgorithmManager". Just add a line: "_algorithms.Add(..." at the end of the constructor.

Screenshots

The benchmark tool

Projects of the solution

Acknowledgment

Thanks to Omar Saad, my project manager, for explaining to me what a convex Hull is and left me the time to test my idea, write articles and code.
Thanks to Paul Labbé, a co-worker, for showing me Cross Product. It was one of the best improvements on my algorithm.
Thanks to Jacques Bherer and Chuma Francis Mugombozi, co-workers, to revise my article. My article would have been a killer one if I would have followed each and every of their recommendations! But I'm too lazy...

References

Wikipedia, Convex hull algorithms
Liu, G. & Chen, C. J. Zhejiang Univ. - Sci. A (2007) 8: 1210. A new algorithm for computing the convex hull of a planar point set (https://rd.springer.com/article/10.1631%2Fjzus.2007.A1210)
Eric Ouellet, IREQ, A Convex Hull Algorithm and its implementation in O(n log h)
Wikipedia, Smallest-circle problem
Rod Stephens, C# Helper, 2014-08-13, Find a minimal bounding circle of a set of points in C#
Wikipedia, Heap's algorithm
SimpleVar, Stack Overflow question/answer, Generating permutations of a set (most efficiently)
Qwertie (David Piepgrass), Monotone Chain implementation at GitHub and 2D Convex hull in C#: 40 lines of code at Loyc.net
Ceometric, .Net library geometry solutions
futura-sciences.com forums for Tryss answer about Cross product
bitlush, Efficient AVL Tree in C#

Version

Date of version submited. There could be a delay (day(s) ?) before the article become online.

2017-10-13, Initial version of this article.
2017-10-18, Corrected some algorithm complexity. Re-arrange convex hull algorithm/complexity tables in order to make them easier to read. Added some information. Minor corrections.
2017-10-28, Updated binaries executable files to inlcude MSVCRT. This is to ensure 'C' code works fine on any Windows.
2017-11-17, Minor syntax corrections or few hyperlinks added.
2017-11-30, Minor syntax corrections and added some clarifications. Some updated graphics with Delaunay/Voronoi information. Added 1 graphic (conclusion with circle generator).
2017-12-16, Minor corrections in text.
2018-01-30, Added support for 'Online' Convex Hull (dynamic add) in O (log h) per point. Also added support for Peek (IsHullPoint) in order to know if a point would be a hull point before inserting it. All of that only in code for the moment, included in project OuelletConvexHullAvl2Online in GitHub. REmoved some dead code. Slight improvement (standardization). Added a Graham Scan modified version implementation , from Rod Stephens.
2018-03-01, Update executable to reflect recent changes. Add table or related articles.

Thank you for taking time to let me know what you think about this article ...

Fast and improved 2D Convex Hull algorithm and its implementation in O(n log h)

Introduction

This article is the 2nd of a series of 3

Little request

Reason for writing back a new article about almost the same algorithm

What is a Convex Hull

Definition

Applications of Convex Hull in 2D and in 3D

Performance and complexity of Convex Hull algorithms

Ouellet Convex Hull Algorithm

How it works

The best detailed description of the inner working of Ouellet algorithm could be found at

Algorithm main idea

Differences of Ouellet VS. Liu and Chen Convex Hull Algorithm

Optimizations

Slope VS Cross Product

Results in algorithm

Multiplication VS. Division

Quick Discard (QD)

Opposite quadrant check (OQC)

Opposite quadrant check bypass (OQCB)

Disjoint quadrant check (DQC)

PCZ

DQC Using PCZ

PCZ point belonging (PCZB)

Previous quadrant (PQ)

Algorithms

Algorithms Comparison

General description

Complexity

Ouellet Convex Hull Algorithm Implementation comparison

The Code

Implementation choices

Array VS. List

Array vs Tree

Convex Hull in O(n)

C# vs C++

Multithreading

C# - Safe code VS. Unmanaged code

Container usage remark

C# Array.Copy vs MSVCRT memmove vs MSVCRT memcpy (for immutable)

C# Buffer.Copy

C# sizeof(T)

Compiling/running the code

Algorithm Benchmark

Random generators

Hardware used for testing

Results for "Speed Test"

Conclusion about "Speed Test" results

Observations

Warning about the "Arc" Point Generator

Derived Conclusions

Results for "Deep performance test"

How "Deep performance test" are achieving

Some sample results:

What’s left

Appendix A - History of Ouellet Convex Hull

Appendix B – Wikipedia

Appendix C - WPF Chart Control

Appendix D - Excel

Appendix E - Smallest Enclosing Circle

Appendix F - Generic implementation of the fastest algorithm generating all possible permutations of a set of items.

Appendix G – The benchmark

How to Add Your Own Algorithm

Screenshots

Acknowledgment

References

Version

This article is the 2^nd of a series of 3