Posted 13 Sep 2016

Stack Overflow - Performance Lessons (Part 2)

matt warren, 13 Sep 2016

In Part 1, I looked at some of the more general performance issues that can be learnt from Stack Overflow (the team/product). In Part 2, I’m looking at some examples of coding-level performance lessons.

Please don’t take these blog posts as blanket recommendations of techniques that you should go away and apply to your code base. They are specific optimizations that you can use if you want to squeeze every last drop of performance out of your CPU.

Also, don’t optimize anything unless you have measured and profiled first; otherwise you will probably optimize the wrong thing!

Battles with the .NET Garbage Collector

I first learnt about the performance work done at Stack Overflow (the site/company) when I read the post on their battles with the .NET Garbage Collector (GC). If you haven’t read it, the short summary is that they were experiencing page load times that would suddenly spike to the 100s of msecs, compared to the normal sub 10 msecs they were used to. After investigating for a few days, they narrowed the problem down to the behaviour of the GC. GC pauses are a real issue and even the new modes available in .NET 4.5 don’t fully eliminate them; see my previous investigation for more information.

One thing to remember is that to make this all happen, they needed the following items in place:

  • Monitoring in production - These issues would only show up under load, once the application had been running for a while, so they would be very hard to recreate in staging or during development.
  • Multiple measurements - They recorded both ASP.NET and IIS web server response times and were able to cross-reference them (see image below).
  • Storing outliers - These spikes happened rarely, so detailed metrics were needed; averages hide too much information.
  • Good knowledge of the .NET GC - According to the article, it took them 3 weeks to identify and fix this issue: “So Marc and I set off on a 3 week adventure to resolve the memory pressure.”
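To illustrate why averages hide outliers, here is a minimal sketch that compares the mean with percentiles. The class name, method names and timings are made up for illustration (they are not Stack Overflow's real data), and the nearest-rank percentile is just one of several common definitions.

```csharp
using System;
using System.Linq;

public static class LatencyStats
{
    // Nearest-rank percentile: the smallest sample such that at least
    // `percentile` percent of all samples are less than or equal to it.
    public static double Percentile(double[] samples, double percentile)
    {
        if (samples == null || samples.Length == 0)
            throw new ArgumentException("At least one sample is required.", nameof(samples));
        if (percentile <= 0 || percentile > 100)
            throw new ArgumentOutOfRangeException(nameof(percentile));

        double[] sorted = samples.OrderBy(x => x).ToArray();
        int rank = (int)Math.Ceiling(percentile * sorted.Length / 100.0);
        return sorted[rank - 1];
    }

    // Hypothetical data: 98 fast requests (8 ms) and 2 GC-style spikes (400 ms).
    public static double[] SampleTimings()
    {
        return Enumerable.Repeat(8.0, 98).Concat(new[] { 400.0, 400.0 }).ToArray();
    }
}
```

With these made-up numbers, `LatencyStats.SampleTimings().Average()` is about 15.8 ms, which looks only slightly worse than normal. The median (`Percentile(t, 50)`) is still 8 ms, while the 99th percentile (`Percentile(t, 99)`) is 400 ms. That last figure is the one that reveals the spikes.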

You can read all the gory details of the fix and the follow-up in the posts below, but the tl;dr is that they removed all of the work that the .NET Garbage Collector had to do, thus eliminating the pauses:
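The fix itself is described in the linked posts rather than here, so the following is only a sketch of the general idea of giving the GC less to do. It is not their code, and all type and member names are hypothetical. An array of class instances is N+1 heap objects that the GC must trace, whereas an array of structs with no reference fields is a single object with nothing inside it to follow.

```csharp
using System;

// Reference type: an array of these holds N references to N separate heap
// objects, all of which the GC has to trace.
public sealed class QuestionClass
{
    public int Id;
    public int Score;
}

// Value type with no reference fields: an array of these is one contiguous
// heap object that contains nothing for the GC to follow.
public struct QuestionStruct
{
    public int Id;
    public int Score;
}

public static class QuestionStore
{
    public static QuestionStruct[] Build(int count)
    {
        var items = new QuestionStruct[count];   // a single allocation
        for (int i = 0; i < count; i++)
        {
            items[i].Id = i;                     // written in place, no per-item 'new'
            items[i].Score = i % 10;
        }
        return items;
    }

    public static long TotalScore(QuestionStruct[] items)
    {
        long total = 0;
        for (int i = 0; i < items.Length; i++)
            total += items[i].Score;
        return total;
    }
}
```

The trade-off is that structs are copied on assignment, so code using this layout tends to work with array indexes rather than passing the items around.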

Jil - A Fast JSON (de)serializer, With a Number of Somewhat Crazy Optimization Tricks

But if you think that the struct-based code they wrote is crazy, their JSON serialisation library, Jil, takes things to a new level. This is all in pursuit of maximum performance and, based on their benchmarks, it seems to be working! Note: protobuf-net is a binary serialisation library and doesn’t support JSON; it’s only included as a baseline:

For instance, instead of writing code like this:

public T Serialise<T>(string json, bool isJSONP)
{
  if (isJSONP)
  {
    // code to handle JSONP
  }
  else 
  {
    // code to handle regular JSON
  }
}

They write code like this, which is a classic memory/speed trade-off.

public ISerialiser GetSerialiser(bool isJSONP)
{
  // The option is checked once, when the serialiser is chosen
  if (isJSONP)
    return new SerialiserWithJSONP();
  else
    return new Serialiser();
}

public class SerialiserWithJSONP : ISerialiser
{
  public T Serialise<T>(string json)
  {
    // code to handle JSONP
  }
}

public class Serialiser : ISerialiser
{
  public T Serialise<T>(string json)
  {
    // code to handle regular JSON
  }
}

This means that during serialisation, there doesn’t need to be any “feature switches”. They just emit the different versions of the code at creation time and, based on the options you specify, hand you the correct one. Of course, the classes (SerialiserWithJSONP and Serialiser in this case) are dynamically created just once and then cached for later re-use, so the cost of the dynamic code generation is only paid once.
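The "create once, then cache" idea can be sketched as follows. This is an illustration under stated assumptions, not Jil's actual mechanism: Jil generates code dynamically, whereas here two hand-written classes stand in for the generated ones. `SerialiserCache`, `PlainSerialiser`, `JsonpSerialiser` and `CreationCount` are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public interface ISerialiser
{
    string Serialise(string json);
}

public sealed class PlainSerialiser : ISerialiser
{
    public string Serialise(string json) => json;
}

public sealed class JsonpSerialiser : ISerialiser
{
    public string Serialise(string json) => "callback(" + json + ");";
}

public static class SerialiserCache
{
    // Counts how often the (stand-in for expensive) creation step runs.
    public static int CreationCount;

    // Lazy<T> ensures each serialiser is created at most once per option,
    // even if several threads ask for it at the same time.
    private static readonly ConcurrentDictionary<bool, Lazy<ISerialiser>> Cache =
        new ConcurrentDictionary<bool, Lazy<ISerialiser>>();

    public static ISerialiser Get(bool isJsonp)
    {
        return Cache.GetOrAdd(isJsonp, key => new Lazy<ISerialiser>(() => Create(key))).Value;
    }

    private static ISerialiser Create(bool isJsonp)
    {
        Interlocked.Increment(ref CreationCount);
        return isJsonp ? (ISerialiser)new JsonpSerialiser() : new PlainSerialiser();
    }
}
```

Calling `SerialiserCache.Get(true)` repeatedly returns the same instance, so the option is only examined when the serialiser is selected and never inside the `Serialise` call itself.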

By doing this, the code plays nicely with CPU branch prediction, because it has a nice predictable pattern that the CPU can easily work with. It also has the added benefit of making the methods smaller, which may make them candidates for in-lining by the .NET JITter.

For more examples of optimizations used, see the links below:

Jil - Marginal Gains

On top of this, they measure everything to ensure that the optimizations actually work! These tests are all run as unit tests, which makes it easy to generate the results. Take a look at ReorderMembers, for instance.

Note: All the times are in milliseconds, but they are totals over 1000s of runs, not per call.

Feature name                                  Original  Improved  Difference
ReorderMembers                                    2721      2712           9
SkipNumberFormatting                               166       163           3
UseCustomIntegerToString                           589       339         250
SkipDateTimeMathMethods                            108       100           8
UseCustomISODateFormatting                         399       269         130
UseFastLists                                       277       267          10
UseFastArrays                                      486       469          17
UseFastGuids                                       744       304         440
AllocationlessDictionaries                         134       127           7
PropagateConstants                                  77        35          42
AlwaysUseCharBufferForStrings                       63        56           7
UseHashWhenMatchingMembers                         141       131          10
DynamicDeserializer_UseFastNumberParsing            94        51          43
DynamicDeserializer_UseFastIntegerConversion       131       131           2
UseHashWhenMatchingEnums                            38        10          28
UseCustomWriteIntUnrolledSigned                   2182      1765         417

This is very similar to the “Marginal Gains” approach that worked so well for British Cycling in the last Olympics:

There’s fitness and conditioning, of course, but there are other things that might seem on the periphery, like sleeping in the right position, having the same pillow when you are away and training in different places. Do you really know how to clean your hands? Without leaving the bits between your fingers? If you do things like that properly, you will get ill a little bit less. “They’re tiny things but if you clump them together it makes a big difference.”
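The measure-each-optimization approach described in this section can be sketched with a small timing helper. This is a hypothetical harness, not Jil's actual test code: `FeatureTimer`, `TimeMs` and `Report` are made-up names. Like the table above, it reports totals over many runs rather than per-call times.

```csharp
using System;
using System.Diagnostics;

public static class FeatureTimer
{
    // Returns the total elapsed milliseconds for `iterations` calls, not per call.
    public static long TimeMs(Action action, int iterations)
    {
        action();  // warm-up call so JIT compilation is not included in the timing
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Times the original and the improved version and formats one result row.
    public static string Report(string featureName, Action original, Action improved, int iterations)
    {
        long before = TimeMs(original, iterations);
        long after = TimeMs(improved, iterations);
        return string.Format("{0}: original={1}ms improved={2}ms difference={3}ms",
            featureName, before, after, before - after);
    }
}
```

A single run like this is noisy, so real measurements would need many repetitions and a quiet machine before a difference of a few milliseconds can be trusted.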

Summary

All-in-all, there is a lot to be learnt from code and blog posts that have come from Stack Overflow developers. I’m glad they’ve shared everything so openly. Also, by having a high-profile website running on .NET, it stops the argument that .NET is inherently slow.

The post Stack Overflow - performance lessons (part 2) first appeared on my blog Performance is a Feature!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

matt warren
Software Developer
United Kingdom

Article Copyright 2016 by matt warren