Introducing Server Side Analytics for ASP.NET Core
Simple middleware to add server side analytics functions to ASP.NET Core
Introduction
I wanted to keep track of visitors and know the usual web analytics stuff: visitors, source, nationality, behaviour and so on.
Client side analytics are not that reliable:
- Ad blockers interfere with them
- Using a third party service means annoying the user with those huge cookie consent banners
- They drastically increase the loading time of the web application
- They don't register web API calls or any other non-HTML requests
So I developed a very simple server side analytics system for .NET Core, which is running on my website.
- Live demo: https://matteofabbri.org/stat
- GitHub repo: https://github.com/matteofabbri/ServerSideAnalytics
- NuGet: https://www.nuget.org/packages/ServerSideAnalytics
The Middleware
The idea is to implement a middleware that is invoked on every request, whether or not a route was specified.
This middleware is put into the request pipeline and set up using only fluent methods.
It writes each incoming request into a generic store after the processing of the request is completed.
The middleware is inserted into the pipeline by the UseServerSideAnalytics extension method at app startup.
This method requires an IAnalyticStore, the interface behind which our received requests are stored.
public void Configure(IApplicationBuilder app)
{
    app.UseServerSideAnalytics(new MongoAnalyticStore("mongodb://192.168.0.11/matteo"));
}
Inside the extension, I create a FluidAnalyticBuilder and bind it to the pipeline via the Use method.
public static FluidAnalyticBuilder UseServerSideAnalytics(this IApplicationBuilder app, IAnalyticStore repository)
{
    var builder = new FluidAnalyticBuilder(repository);
    app.Use(builder.Run);
    return builder;
}
The FluidAnalyticBuilder is a fluent class that handles the configuration of the analytics we want to collect (filtering out unwanted URLs, IP addresses and so on) and implements the core of the system in its Run method.
In this method, ServerSideAnalytics uses two methods of the store:
- ResolveCountryCodeAsync: retrieves the country code of the remote IP address, if it is known. If it is not, CountryCode.World is expected.
- StoreWebRequestAsync: stores the received request into the database
internal async Task Run(HttpContext context, Func<Task> next)
{
    // Pass the request to the next task in the pipeline
    await next.Invoke();

    // Should this request be filtered out?
    if (_exclude?.Any(x => x(context)) ?? false)
    {
        return;
    }

    // Let's build our structure with the collected data
    var req = new WebRequest
    {
        // When
        Timestamp = DateTime.Now,

        // Who
        Identity = context.UserIdentity(),
        RemoteIpAddress = context.Connection.RemoteIpAddress,

        // What
        Method = context.Request.Method,
        UserAgent = context.Request.Headers["User-Agent"],
        Path = context.Request.Path.Value,
        IsWebSocket = context.WebSockets.IsWebSocketRequest,

        // From where
        // Ask the store to resolve the country code of the given IP address
        CountryCode = await _store.ResolveCountryCodeAsync(context.Connection.RemoteIpAddress)
    };

    // Save the request into the store
    await _store.StoreWebRequestAsync(req);
}
(Maybe, I should add other fields to collected requests? Let me know. 😊)
Via the List<Func<HttpContext, bool>> _exclude, it also provides easy methods to filter out requests that we don't care about.
// Startup.cs
// This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
    app.UseDeveloperExceptionPage();
    app.UseBrowserLink();
    app.UseDatabaseErrorPage();
    app.UseAuthentication();

    // Let's create our middleware using Mongo DB to store data
    app.UseServerSideAnalytics(new MongoAnalyticStore("mongodb://localhost/matteo"))
        // Requests under these URL prefixes will not be recorded
        .ExcludePath("/js", "/lib", "/css")
        // Requests ending with these suffixes will not be recorded
        .ExcludeExtension(".jpg", ".ico", "robots.txt", "sitemap.xml")
        // I don't want to track my own activity on the website
        .Exclude(x => x.UserIdentity() == "matteo")
        // Nor requests coming from my home wifi
        .ExcludeIp(IPAddress.Parse("192.168.0.1"))
        // Requests coming from localhost will not be recorded
        .ExcludeLoopBack();

    app.UseStaticFiles();
}
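To make the filtering idea concrete, here is a minimal sketch of how such a builder could collect its predicates. It is not the library's implementation: RequestInfo is a hypothetical stand-in for the parts of HttpContext that the filters inspect, and ExclusionSketch and IsExcluded are made-up names. The assumption is simply that every Exclude-style call appends one predicate to the list and returns the builder, which is what makes the calls chainable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the parts of HttpContext the filters look at.
public sealed class RequestInfo
{
    public string Path { get; set; } = "";
    public string Identity { get; set; } = "";
}

// Hypothetical builder: every Exclude* call only appends a predicate to a list.
public sealed class ExclusionSketch
{
    private readonly List<Func<RequestInfo, bool>> _exclude =
        new List<Func<RequestInfo, bool>>();

    // Generic filter: any predicate over the request.
    public ExclusionSketch Exclude(Func<RequestInfo, bool> filter)
    {
        _exclude.Add(filter);
        return this; // returning the builder is what allows chaining
    }

    // Exclude requests whose path starts with one of the given prefixes.
    public ExclusionSketch ExcludePath(params string[] prefixes) =>
        Exclude(r => prefixes.Any(p =>
            r.Path.StartsWith(p, StringComparison.OrdinalIgnoreCase)));

    // Exclude requests whose path ends with one of the given suffixes.
    public ExclusionSketch ExcludeExtension(params string[] suffixes) =>
        Exclude(r => suffixes.Any(s =>
            r.Path.EndsWith(s, StringComparison.OrdinalIgnoreCase)));

    // Same check as in Run: a request is dropped if any predicate matches.
    public bool IsExcluded(RequestInfo request) =>
        _exclude.Any(f => f(request));
}
```

With this shape, adding a new kind of filter is just one more method that wraps a predicate, and a request is recorded only when no predicate matches.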
And that is all the middleware. 😀
The Store
As you have seen above, the middleware writes collected data into a generic store expressed by the interface IAnalyticStore, the component that handles all the dirty work of this job.
I wrote three stores:
- https://www.nuget.org/packages/ServerSideAnalytics.Mongo for Mongo DB
- https://www.nuget.org/packages/ServerSideAnalytics.SqlServer for Microsoft SQL Server
- https://www.nuget.org/packages/ServerSideAnalytics.Sqlite for SQLite
In the attached code, you will find a sample site using SQLite, so no external process is needed to run the example.
The store has to implement an interface with the two methods invoked by Server Side Analytics, plus some methods to query stored requests.
The query methods are explicit because isolating database types is cool, but it also means that you cannot cast an Expression<Func<MyType,bool>> to an Expression<Func<WebRequest,bool>>, no matter how similar MyType and WebRequest are.
We will see the use of those methods in the last part of the article, which covers exposing our data inside the web application.
public interface IAnalyticStore
{
/// <summary>
/// Store received request. Internally invoked by ServerSideAnalytics
/// </summary>
/// <param name="request">Request collected by ServerSideAnalytics</param>
/// <returns></returns>
Task StoreWebRequestAsync(WebRequest request);
/// <summary>
/// Return unique identities that made at least a request on that day
/// </summary>
/// <param name="day"></param>
/// <returns></returns>
Task<long> CountUniqueIndentitiesAsync(DateTime day);
/// <summary>
/// Return unique identities that made at least a request inside the given time interval
/// </summary>
/// <param name="from"></param>
/// <param name="to"></param>
/// <returns></returns>
Task<long> CountUniqueIndentitiesAsync(DateTime from, DateTime to);
/// <summary>
/// Return the raw number of request served in the time interval
/// </summary>
/// <param name="from"></param>
/// <param name="to"></param>
/// <returns></returns>
Task<long> CountAsync(DateTime from, DateTime to);
/// <summary>
/// Return distinct Ip Address served during that day
/// </summary>
/// <param name="day"></param>
/// <returns></returns>
Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime day);
/// <summary>
/// Return distinct IP addresses served during given time interval
/// </summary>
/// <param name="from"></param>
/// <param name="to"></param>
/// <returns></returns>
Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime from, DateTime to);
/// <summary>
/// Return any request that was served during this time range
/// </summary>
/// <param name="from"></param>
/// <param name="to"></param>
/// <returns></returns>
Task<IEnumerable<WebRequest>> InTimeRange(DateTime from, DateTime to);
/// <summary>
/// Return all the request made by this identity
/// </summary>
/// <param name="identity"></param>
/// <returns></returns>
Task<IEnumerable<WebRequest>> RequestByIdentityAsync(string identity);
/// <summary>
/// Add a geocoding ip range.
/// </summary>
/// <param name="from"></param>
/// <param name="to"></param>
/// <param name="countryCode"></param>
/// <returns></returns>
Task StoreGeoIpRangeAsync(IPAddress from, IPAddress to, CountryCode countryCode);
/// <summary>
/// Performs the geo IP resolution of an incoming request. Internally invoked by ServerSideAnalytics
/// </summary>
/// <param name="address"></param>
/// <returns></returns>
Task<CountryCode> ResolveCountryCodeAsync(IPAddress address);
/// <summary>
/// Remove all item in request collection
/// </summary>
/// <returns></returns>
Task PurgeRequestAsync();
/// <summary>
/// Remove all items in geo ip resolution collection
/// </summary>
/// <returns></returns>
Task PurgeGeoIpAsync();
}
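The remark about expressions is easier to see in a few lines of code. The sketch below uses two hypothetical types, WebRequestSketch and StoredRequestSketch, standing in for the public model and for a store's own entity. It only illustrates why the interface exposes explicit query methods with plain parameters instead of accepting a caller-supplied filter expression.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical public model and store-side entity with the same shape.
public sealed class WebRequestSketch { public string Identity { get; set; } = ""; }
public sealed class StoredRequestSketch { public string Identity { get; set; } = ""; }

public static class ExpressionSketch
{
    // A filter typed against the public model is a different type from
    // a filter typed against the store's entity, so it cannot be reused as-is.
    public static bool CanReuse(Expression<Func<WebRequestSketch, bool>> filter)
    {
        object boxed = filter;
        return boxed is Expression<Func<StoredRequestSketch, bool>>;
    }

    // What a store can do instead: take plain parameters and build
    // the expression against its own entity type.
    public static Expression<Func<StoredRequestSketch, bool>> ByIdentity(string identity) =>
        r => r.Identity == identity;
}
```

A method such as RequestByIdentityAsync(string identity) follows the second pattern: the caller passes a plain value and each store translates it into a query over its own types.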
Identities
You may have noticed that every WebRequest has a field named Identity. This is because the most important thing to know is who did what.
But how is it evaluated?
- If the request comes from a registered user, we use the username
- If not, we use the default AspNetCore cookie (ai_user in the code below)
- If that is not available, we use the connection id of the current context
- Then we try to save the result in our own identity cookie, which is checked first on later requests, so we don't have to do it again
In code:
public static string UserIdentity(this HttpContext context)
{
    var user = context.User?.Identity?.Name;
    const string identityString = "identity";
    string identity;

    if (!context.Request.Cookies.ContainsKey(identityString))
    {
        if (string.IsNullOrWhiteSpace(user))
        {
            identity = context.Request.Cookies.ContainsKey("ai_user")
                ? context.Request.Cookies["ai_user"]
                : context.Connection.Id;
        }
        else
        {
            identity = user;
        }
        context.Response.Cookies.Append(identityString, identity);
    }
    else
    {
        identity = context.Request.Cookies[identityString];
    }
    return identity;
}
IP Geocoding
One of the most interesting pieces of data in any analytics system is where your users come from.
So the IAnalyticStore of SSA provides methods for the IP address geocoding of incoming requests.
Sadly, in 2018, IPv6 is a well-established protocol, although Int128 is not a well-established data type, especially in databases.
So we need a workaround to query our database efficiently.
At least, this is the strategy that I used in my three stores; if you have a better idea, you can implement your own analytic store or, even better, contribute to the project.
We are going to save every IP address range as a pair of strings.
Algorithm:
- If the IP address is IPv4, it is mapped to IPv6 so both kinds can be stored together
- Then we take the bytes of the new IP address
- We reverse them, because BigInteger reads its bytes in little-endian order: this way "10.0.0.0" keeps the full value of "10.0.0.0" instead of collapsing to "10"
- Now we have an array of bytes that represents a very big number
- Let's print this number with every digit, padded with leading zeros, so that values can be correctly compared by the database (from 000000000000000000000000000000000000000 to 340282366920938463463374607431768211455)
Or in code:
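private const string StrFormat = "000000000000000000000000000000000000000";
public static string ToFullDecimalString(this IPAddress ip)
{
    // BigInteger reads little-endian bytes, so reverse the network-order bytes.
    // The trailing zero byte keeps the value non-negative for addresses whose first byte is 0x80 or higher.
    var bytes = ip.MapToIPv6().GetAddressBytes().Reverse().Concat(new byte[] { 0 }).ToArray();
    return new BigInteger(bytes).ToString(StrFormat);
}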
I implemented this function as ServerSideAnalytics.ServerSideExtensions.ToFullDecimalString, so if you want to reuse it, you don't have to go mad like I did.
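As a small worked example of the conversion steps, the sketch below rebuilds the key in a standalone helper. IpKeySketch and ToSortableKey are hypothetical names, not part of the library. Two choices here are my own assumptions: it uses the standard "D39" format to pad to 39 digits, and it appends a zero sign byte so that addresses starting with a high bit (for example ff02::1) do not become negative numbers.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Numerics;

public static class IpKeySketch
{
    // Hypothetical helper: turns an IP address into a fixed-width decimal string.
    public static string ToSortableKey(IPAddress ip)
    {
        // 16 bytes in network (big-endian) order; IPv4 becomes ::ffff:a.b.c.d
        byte[] bigEndian = ip.MapToIPv6().GetAddressBytes();

        // BigInteger expects little-endian bytes; the extra zero byte
        // marks the number as non-negative.
        byte[] littleEndian = bigEndian
            .Reverse()
            .Concat(new byte[] { 0 })
            .ToArray();

        // Pad to 39 digits, the length of 2^128 - 1 in decimal.
        return new BigInteger(littleEndian).ToString("D39");
    }
}
```

Because every key has the same length, comparing two keys as plain text gives the same answer as comparing the addresses as numbers, which is exactly what the database query relies on.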
Now that our IP address is normalized into a well-defined string format, finding the corresponding country saved in our database is really simple.
public async Task<CountryCode> ResolveCountryCodeAsync(IPAddress address)
{
    var addressString = address.ToFullDecimalString();
    using (var db = GetContext())
    {
        var found = await db.GeoIpRange.FirstOrDefaultAsync(
            x => x.From.CompareTo(addressString) <= 0 &&
                 x.To.CompareTo(addressString) >= 0);
        return found?.CountryCode ?? CountryCode.World;
    }
}
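The same lookup can be sketched in memory, which also shows why the fixed width matters: compared as text, "9" sorts after "10", while "09" sorts before it. GeoRangeSketch and RangeLookupSketch are hypothetical names, the country code is a plain string instead of the CountryCode enum, and the sketch assumes ordinal comparison, which may differ from the collation a real database applies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row of the geo IP table: both bounds are fixed-width decimal strings.
public sealed class GeoRangeSketch
{
    public string From { get; set; } = "";
    public string To { get; set; } = "";
    public string CountryCode { get; set; } = "";
}

public static class RangeLookupSketch
{
    // Returns the country of the first range containing the key, or "World".
    public static string Resolve(IEnumerable<GeoRangeSketch> ranges, string key)
    {
        var hit = ranges.FirstOrDefault(r =>
            string.CompareOrdinal(r.From, key) <= 0 &&
            string.CompareOrdinal(r.To, key) >= 0);
        return hit?.CountryCode ?? "World";
    }
}
```

The keys in a real table would be the 39-digit strings described above; shorter keys work the same way as long as every key has the same length.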
But to query the database, we need to fill it first.
Finding a reliable and cheap database of countries and their IP address ranges can be quite difficult.
For this reason, I wrote three more analytic stores that act as wrappers around an existing one to provide fallback geo-IP resolution.
If the first repository doesn't contain a valid IP range for the client, it asks the second one, and so on.
If, at the end of the chain, a valid geo-IP has been found, it is saved into the main store.
I wrote three of them; if you want to add more, please contribute on GitHub.
You can find these analytic stores in ServerSideAnalytics.Extensions:
- IpApiAnalyticStore: adds IP geocoding using IP API (ip-api.com)
- IpInfoAnalyticStore: adds IP geocoding using IP Info (ipinfo.io)
- IpStackAnalyticStore: adds IP geocoding using IP Stack (ipstack.com)
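The order of the fallback chain can be sketched without any real store or HTTP call. In the hypothetical helper below, the main store is just a dictionary and each fallback is a delegate. The library nests wrapper stores instead, so treat this flat loop as a simplified picture of the same idea, not as its implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class FallbackSketch
{
    public const string Unknown = "World";

    // Hypothetical flat version of the chain: main store first, then each fallback in order.
    public static async Task<string> ResolveAsync(
        string ip,
        IDictionary<string, string> mainStore,
        params Func<string, Task<string>>[] fallbacks)
    {
        // 1. The main store answers if it already knows the address.
        if (mainStore.TryGetValue(ip, out var known))
        {
            return known;
        }

        // 2. Otherwise ask each fallback in turn; a failing one is skipped.
        foreach (var fallback in fallbacks)
        {
            string code;
            try { code = await fallback(ip); }
            catch (Exception) { continue; }

            if (!string.IsNullOrEmpty(code) && code != Unknown)
            {
                // 3. Remember the answer so the next request needs no remote call.
                mainStore[ip] = code;
                return code;
            }
        }

        return Unknown;
    }
}
```

The point of writing the result back is that each remote service is queried at most once per address; after that, the main store resolves it locally.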
Personally, I'm using a pre-loaded IP range database with all three failovers enabled:
public IAnalyticStore GetAnalyticStore()
{
    var store = (new MongoAnalyticStore("mongodb://localhost/"))
        .UseIpStackFailOver("IpStackAPIKey")
        .UseIpApiFailOver()
        .UseIpInfoFailOver();
    return store;
}
Let's see how it works inside one of them, as an example:
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
namespace ServerSideAnalytics.Extensions
{
class IpApiAnalyticStore : IAnalyticStore
{
readonly IAnalyticStore _store;
public IpApiAnalyticStore(IAnalyticStore store)
{
_store = store;
}
public Task<long> CountAsync(DateTime from, DateTime to) => _store.CountAsync(from, to);
public Task<long> CountUniqueIndentitiesAsync(DateTime day) =>
_store.CountUniqueIndentitiesAsync(day);
public Task<long> CountUniqueIndentitiesAsync(DateTime from, DateTime to) =>
_store.CountUniqueIndentitiesAsync(from, to);
public Task<IEnumerable<WebRequest>> InTimeRange(DateTime from, DateTime to) =>
_store.InTimeRange(from, to);
public Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime day) =>
_store.IpAddressesAsync(day);
public Task<IEnumerable<IPAddress>> IpAddressesAsync(DateTime from, DateTime to) =>
_store.IpAddressesAsync(from,to);
public Task PurgeGeoIpAsync() => _store.PurgeGeoIpAsync();
public Task PurgeRequestAsync() => _store.PurgeRequestAsync();
public Task<IEnumerable<WebRequest>> RequestByIdentityAsync(string identity) =>
_store.RequestByIdentityAsync(identity);
public async Task<CountryCode> ResolveCountryCodeAsync(IPAddress address)
{
try
{
var resolved = await _store.ResolveCountryCodeAsync(address);
if(resolved == CountryCode.World)
{
var ipstr = address.ToString();
var response = await (new HttpClient()).GetStringAsync
($"http://ip-api.com/json/{ipstr}");
var obj = JsonConvert.DeserializeObject(response) as JObject;
// ip-api.com returns the ISO country code in the "countryCode" field
resolved = (CountryCode)Enum.Parse(typeof(CountryCode),
obj["countryCode"].ToString());
await _store.StoreGeoIpRangeAsync(address, address, resolved);
return resolved;
}
return resolved;
}
catch (Exception)
{
return CountryCode.World;
}
}
public Task StoreGeoIpRangeAsync(IPAddress from, IPAddress to, CountryCode countryCode)
{
return _store.StoreGeoIpRangeAsync(from, to, countryCode);
}
public Task StoreWebRequestAsync(WebRequest request)
{
return _store.StoreWebRequestAsync(request);
}
}
}
And that's all, folks! :)