I can give you a simple idea: this is very difficult, nearly impossible.—SA
I can tell, because I have long experience with photo editing, using different software, including the very best products. Many people are very impressed with the background removal/replacement feature, but I never was, because an experienced eye always sees the problems around the edge. Mind you, the software has gone a long way toward solving this problem better and better, but the results are still not impressive at all. By the way, the methods seem to be kept secret: if you look at the open-source GIMP (where some of the most sophisticated algorithms, like "Healing Brush", are implemented), there is no such thing. Well, nothing for a serious photographer.
I also tried to solve similar problems. My solution worked and had some benefits over other tools, but I'm still not enthusiastic — not at all.
I can give you an idea why. Set aside a busy background "before" or "after". Imagine you have a green background (in cinema, they routinely use a "blue screen" or "green screen") and want to replace it with red (black or transparent is just harder to explain, but the conclusion is the same). On a photograph, this means you have a human head with hair and skin. The problem is not the contour (strictly speaking, there is no contour, surprise!). The problem is that on every single hair, on every pore of the skin, you have fuzzy green highlights, each at its individual size and brightness depending on orientation and other factors; each highlight is blended with the individual material color and with light from other sources, not just the light reflected from the green background. This effect is always visible. So, the problem is to imitate all those highlights in a different color. But separating the light reflected from the background from the light coming from other sources is already a difficult image-recognition problem. It's much easier to fake with very simple matte objects, but even that is very difficult. Usually, I can spot the forgery at first glance.
I don't say I have never seen good forged images. Yes, I have; some works are amazing, but all the ones I know of were the manual craft of real masters who spent hours, weeks, or months on each work.
Only a cheap, low-resolution original picture can yield a result that even roughly matches the quality of the original. A very common cheap trick is to apply some Gaussian blur in a fuzzy area around the contour (the fuzziness of this region is critically important). If a very poor-quality image is enough for you, you can achieve it, but even this is not easy and does not work automatically in 100% of cases. Just look at the picture you posted: who would ever want such an eyesore?
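For what it's worth, the "cheap trick" described above can be sketched in a few lines of NumPy/SciPy. This is only an illustration of the idea (blur a hard cutout mask in a narrow band around its contour); the function name, the band width, and the blur radius are made up for the example:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, binary_dilation, binary_erosion

def feather_edge(mask, sigma=2.0, band=3):
    """Blur a hard 0/1 cutout mask only in a band around its contour.

    mask : 2-D array with values 0.0 (background) or 1.0 (foreground).
    Returns a soft alpha mask in [0, 1].
    """
    hard = mask > 0.5
    # The "fuzzy area": pixels within `band` pixels of the contour,
    # i.e. in the dilated mask but not in the eroded one.
    edge_band = binary_dilation(hard, iterations=band) & \
                ~binary_erosion(hard, iterations=band)
    blurred = gaussian_filter(mask.astype(float), sigma=sigma)
    # Keep the original hard mask everywhere except inside the edge band.
    return np.where(edge_band, blurred, mask.astype(float))
```

Exactly as the answer says, this only softens the staircase; it fakes nothing about the spilled highlights.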
So, you can get some solution, but I cannot believe the quality would be worth the effort. Of course, this is just my opinion, but I thought it would help you get an idea of the problem.
P.S.: I can imagine outrage caused by this post, as well as down-votes. You're welcome!
However, I would eagerly like to see anything that could seed a doubt in my opinion. Right now I think I can spot a forgery in every single image created via automated image processing. I will be highly excited if anyone can prove otherwise. Any suggestions?
Let us start with your statement that you can classify the pixels as background (in your example the white ones, which you painted red) or not background.
The effect along edges is indeed awful because of the sharp background/not-background decision. To achieve better results, you must use fuzzy logic and allow some intermediate degree between pure background and pure non-background, let us say foreground. In other words, assign every pixel an alpha opacity from 0 to 255 (instead of just 0/1).
Now how do you assign the alpha values? All your background pixels (close to white) get opacity 0. The pixels at least two pixels away from any background pixel will be arbitrarily considered foreground and get opacity 255. In image-processing terminology, you erode the non-background areas. We arbitrarily assume that edges are two pixels wide.
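That classification step can be sketched roughly as follows. Assumptions not in the answer: the image is an RGB NumPy array, "close to white" means all three channels above a threshold, and the two-pixel erosion is done with SciPy; the function name and constants are invented for the example:

```python
import numpy as np
from scipy.ndimage import binary_erosion

BACKGROUND, UNKNOWN, FOREGROUND = 0, 128, 255

def build_trimap(image, white_thresh=240):
    """Classify pixels as background, foreground, or unknown (edge)."""
    # Background: pixels close to white on all three channels.
    background = (image > white_thresh).all(axis=2)
    # Foreground: non-background pixels at least 2 px from any background
    # pixel, obtained by eroding the non-background area twice.
    foreground = binary_erosion(~background, iterations=2)
    trimap = np.full(image.shape[:2], UNKNOWN, dtype=np.uint8)
    trimap[background] = BACKGROUND
    trimap[foreground] = FOREGROUND
    return trimap
```

The pixels left at the UNKNOWN value are exactly the "remaining unclassified pixels" handled next.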
For the remaining unclassified pixels, you will consider the nearest background pixel and the nearest foreground pixel (nearest in the geometric sense) and assign an opacity computed as

    alpha = 255 * D(pixel, nearest background pixel) / D(nearest foreground pixel, nearest background pixel)

where D is a distance in color space.
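Assuming the nearest background and foreground pixels have already been found geometrically, the formula itself could look like this per pixel, with D taken as Euclidean distance in RGB space (an illustrative sketch with an invented function name, not a tuned implementation):

```python
import numpy as np

def alpha_from_distances(pixel, nearest_bg, nearest_fg):
    """Opacity in [0, 255] from color-space distances:
    alpha = 255 * D(pixel, bg) / D(fg, bg), clamped to [0, 255].

    pixel, nearest_bg, nearest_fg : RGB colors as length-3 sequences.
    """
    pixel, nearest_bg, nearest_fg = (np.asarray(c, dtype=float)
                                     for c in (pixel, nearest_bg, nearest_fg))
    d_fg_bg = np.linalg.norm(nearest_fg - nearest_bg)
    if d_fg_bg == 0:  # degenerate case: fg and bg colors coincide
        return 255
    d_px_bg = np.linalg.norm(pixel - nearest_bg)
    return int(np.clip(255 * d_px_bg / d_fg_bg, 0, 255))
```

A pixel matching the background color gets 0, one matching the foreground color gets 255, and blended edge pixels land in between.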
There can be different variations on this theme. The main challenge is to determine the true background and foreground colors locally and to find in what proportion they were mixed.
Also note that some intermediate pixels (in case of thin features) will have no foreground neighbor and their alpha value is arbitrary.
I don't mean that this will give you perfect matting as it is a rather crude method, but it should diminish the staircase effect.