Click here to Skip to main content
13,861,454 members
Click here to Skip to main content
Add your own
alternative version

Stats

3.1K views
31 downloads
2 bookmarked
Posted 20 Jan 2019
Licenced CPOL

Choose Your Poison on the rPi: Entity Framework, Linq2DB, or ADO.NET

, 20 Jan 2019
Rate this:
Please Sign up or sign in to vote.
Performance differences on the rPi

Contents

Introduction

In my earlier article on setting up an rPi including Postgres, I was dismayed to discover that Entity Framework (EF) Core took on average around 8 seconds to establish an initial connection and obtain some data from the Postgres database running on the rPi.  So I decided to dig into the issue further, testing EF Core, Linq2DB, and ADO.NET (the old DataTable / DataSet / DataAdapter since .NET 1.0).

Something to keep in mind: my rPi is set up with an external spinny hard drive (not SSD) over USB 2.0, which may or may not be affecting performance.

Regarding Linq2DB, do not think this is a replacement for Linq2SQL.  Linq2DB is awesome but does not include features like change tracking, so you're existing Linq2SQL code, where you simply update the property in the model and "SaveChanges" will not work.  Linq2DB is leaner and does partial updates using the Set method.

NuGet

The Linq2DB package:

For setting up Postgres on the rPi (or .NET Core in general), please see my earlier article.

After installing the necessary packages, your NuGet references should look similar to this:

The Console

I wrote the console to set up a variety of test scenarios, so you'll see a menu of options:

Creating 100,000 records takes a long time.

The Test Code

Below are the test routines I used for ADO.NET, Linq2DB, and EF Core.  They are all very similar.

Schema

As generated by pdgadmin4, these are the two test tables I'm using -- they are identical except for the table names:

CREATE TABLE public."TestTable"
(
  "ID" integer NOT NULL DEFAULT nextval('"TestTable_ID_seq"'::regclass),
  "FirstName" text COLLATE pg_catalog."default",
  "LastName" text COLLATE pg_catalog."default",
  CONSTRAINT "TestTable_pkey" PRIMARY KEY ("ID")
)
WITH (
  OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public."TestTable"
  OWNER to pi;

CREATE TABLE public."TestTable2"
(
  "ID" integer NOT NULL DEFAULT nextval('"TestTable_ID_seq"'::regclass),
  "FirstName" text COLLATE pg_catalog."default",
  "LastName" text COLLATE pg_catalog."default",
  CONSTRAINT "TestTable2_pkey" PRIMARY KEY ("ID")
)
WITH (
  OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public."TestTable2"
  OWNER to pi;

The Models

public class TestTable
{
  [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
  public int ID { get; set; }
  public string FirstName { get; set; }
  public string LastName { get; set; }
}

public class TestTable2
{
  [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
  public int ID { get; set; }
  public string FirstName { get; set; }
  public string LastName { get; set; }
}

public class Context : DbContext
{
  public Context(DbContextOptions options) : base(options) { }

  public DbSet<TestTable> TestTable { get; set; }
  public DbSet<TestTable2> TestTable2 { get; set; }
}

ADO.NET

static void TestPostgresAdoNet()
{
  string connStr = Configuration.GetValue<string>("ConnectionStrings:rpidb");
  DataTable dt = new DataTable();
  long total = 0;
  long ms;

  Console.WriteLine("Connection opening...");
  Stopwatch stopwatch = new Stopwatch();
  stopwatch.Start();
  var conn = new NpgsqlConnection(connStr);
  conn.Open();
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine(ms + "ms");

  var cmd = new NpgsqlCommand("select \"ID\", \"FirstName\", \"LastName\" from public.\"TestTable\"", conn);
  stopwatch.Restart();
  NpgsqlDataAdapter da = new NpgsqlDataAdapter(cmd);
  da.Fill(dt);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("First Query: " + ms + "ms");
  Console.WriteLine($"Subtotal: {total} ms");
  Console.WriteLine($"Count: {dt.Rows.Count}");
  conn.Close();

  stopwatch.Restart();
  conn = new NpgsqlConnection(connStr);
  conn.Open();
  dt = new DataTable();
  da.Fill(dt);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Second Query: " + ms + "ms");
  conn.Close();

  stopwatch.Restart();
  conn = new NpgsqlConnection(connStr);
  conn.Open();
  dt = new DataTable();
  da.Fill(dt);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Third Query: " + ms + "ms");
  conn.Close();

  cmd = new NpgsqlCommand("select \"ID\", \"FirstName\", \"LastName\" from public.\"TestTable2\"", conn);
  stopwatch.Restart();
  conn = new NpgsqlConnection(connStr);
  conn.Open();
  da = new NpgsqlDataAdapter(cmd);
  dt = new DataTable();
  da.Fill(dt);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Test Table 2: " + ms + "ms");
  Console.WriteLine($"Total: {total} ms");
  conn.Close();
}

ADO.NET + Model

static void TestPostgresAdoNetPopulateModel<T>() where T : new()
{
  string connStr = Configuration.GetValue<string>("ConnectionStrings:rpidb");
  DataTable dt = new DataTable();
  long total = 0;
  long ms;

  Console.WriteLine("Connection open...");
  Stopwatch stopwatch = new Stopwatch();
  stopwatch.Start();
  var conn = new NpgsqlConnection(connStr);
  conn.Open();
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine(ms + "ms");

  var cmd = new NpgsqlCommand("select \"ID\", \"FirstName\", \"LastName\" from public.\"TestTable\"", conn);
  stopwatch.Restart();
  NpgsqlDataAdapter da = new NpgsqlDataAdapter(cmd);
  da.Fill(dt);
  PopulateModel<T>(dt);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("First Query: " + ms + "ms");
  Console.WriteLine($"Subtotal: {total} ms");
  Console.WriteLine($"Count: {dt.Rows.Count}");
  conn.Close();

  stopwatch.Restart();
  conn = new NpgsqlConnection(connStr);
  conn.Open();
  dt = new DataTable();
  da.Fill(dt);
  PopulateModel<T>(dt);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Second Query: " + ms + "ms");
  conn.Close();

  stopwatch.Restart();
  conn = new NpgsqlConnection(connStr);
  conn.Open();
  dt = new DataTable();
  da.Fill(dt);
  PopulateModel<T>(dt);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Third Query: " + ms + "ms");
  Console.WriteLine($"Total: {total} ms");
  conn.Close();
}

static List<T> PopulateModel<T>(DataTable dt) where T : new()
{
  Type t = typeof(T);
  var props = t.GetProperties(BindingFlags.Instance | BindingFlags.Public);
  List<T> items = new List<T>();

  foreach (DataRow row in dt.Rows)
  {
    var item = new T();

    foreach (DataColumn col in dt.Columns)
    {
      props.Single(p => p.Name == col.ColumnName).SetValue(item, row[col]);
    }

    items.Add(item);
  }

  return items;
}

Linq2DB

static void TestPostgresLinq2Db()
{
  string connStr = Configuration.GetValue<string>("ConnectionStrings:rpidb");
  long total = 0;
  long ms;

  Console.WriteLine("Create Data Connection...");
  Stopwatch stopwatch = new Stopwatch();
  stopwatch.Start();
  var db = PostgreSQLTools.CreateDataConnection(connStr);
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine(ms + "ms");
  db.Close();

  stopwatch.Restart();
  db = PostgreSQLTools.CreateDataConnection(connStr);
  var recs = db.GetTable<TestTable>().ToList();
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("First Query: " + ms + "ms");
  Console.WriteLine($"Count: {recs.Count}");
  Console.WriteLine($"Subtotal: {total} ms");
  db.Close();

  stopwatch.Restart();
  db = PostgreSQLTools.CreateDataConnection(connStr);
  recs = db.GetTable<TestTable>().ToList();
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Second Query: " + ms + "ms");
  Console.WriteLine($"Count: {recs.Count}");
  db.Close();

  stopwatch.Restart();
  db = PostgreSQLTools.CreateDataConnection(connStr);
  recs = db.GetTable<TestTable>().ToList();
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Third Query: " + ms + "ms");
  Console.WriteLine($"Count: {recs.Count}");
  Console.WriteLine($"Total: {total} ms");
  db.Close();

  stopwatch.Restart();
  db = PostgreSQLTools.CreateDataConnection(connStr);
  var recs2 = db.GetTable<TestTable2>().ToList();
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine("Table 2 Query: " + ms + "ms");
  Console.WriteLine($"Count: {recs2.Count}");
  Console.WriteLine($"Total: {total} ms");
  db.Close();
}

EF Core

static long TestPostgresEfCore()
{
  long firstTotal = 0;
  long total = 0;
  long ms;

  var contextBuilder = new DbContextOptionsBuilder();
  // Database name is case-sensitive
  string connStr = Configuration.GetValue<string>("ConnectionStrings:rpidb");
  contextBuilder.UseNpgsql(connStr);

  Stopwatch stopwatch = new Stopwatch();
  stopwatch.Restart();
  using (var context = new Context(contextBuilder.Options))
  {
    var items = context.TestTable.ToList();
    ms = stopwatch.ElapsedMilliseconds;
    total += ms;
    firstTotal = ms;
    Console.WriteLine("First query: " + ms + "ms");
    Console.WriteLine($"Count: {items.Count}");
    Console.WriteLine($"Subtotal = {total} ms");
  }

  // Query again to see how long a second query takes.
  stopwatch.Restart();
  using (var context = new Context(contextBuilder.Options))
  {
    var items2 = context.TestTable.ToList();
    ms = stopwatch.ElapsedMilliseconds;
    total += ms;
    Console.WriteLine("Second query: " + ms + "ms");
    Console.WriteLine($"Count: {items2.Count}");
}

  // Query again to see how long a third query takes.
  stopwatch.Restart();
  using (var context = new Context(contextBuilder.Options))
  {
    var items2 = context.TestTable.ToList();
    ms = stopwatch.ElapsedMilliseconds;
    total += ms;
    Console.WriteLine("Third query: " + ms + "ms");
    Console.WriteLine($"Count: {items2.Count}");
    Console.WriteLine($"Total: {total} ms");
  }

  // Query again to see how long a different table query takes.
  total = 0;
  stopwatch.Restart();
  using (var context = new Context(contextBuilder.Options))
  {
    var items2 = context.TestTable2.ToList();
    ms = stopwatch.ElapsedMilliseconds;
    total += ms;
    Console.WriteLine("Table 2 query: " + ms + "ms");
    Console.WriteLine($"Count: {items2.Count}");
    Console.WriteLine($"Total: {total} ms");
  }

  return firstTotal;
}

Test Results

Before we get into the code, let's look at the test results.  The focus here is on the time it takes to perform the first query -- the query itself is essentially irrelevant, it's 10 rows consisting of an ID, first name, and last name.  Each test performs three queries effectively using the connection, either directly or via .NET's connection pooling.  All times are in milliseconds.

rPi Timing Results

The test "ADO.NET + Model" implements deserializing the DataTable records into a .NET object model that I implemented mainly out of curiosity.

Windows Timing Results

Now, just for comparison, here's the the for the first query running the same four tests from my Windows laptop connecting to Postgres on the rPi over my home WIFI (both laptop and rPi on the wireless network):

Given that the EF query running client code on the rPi takes 8000ms vs. running the same client code on a Windows box (1500ms), there is definitely a performance bottleneck on the rPi.  Second and third queries take about 3ms.

What is Entity Framework Doing?

Regardless of the approach you take, the time it takes to perform the first query on the rPi is abysmal.  None-the-less, I discovered something interesting about Entity Framework reading this SO post:

"With EF Core, you can cheat and load the model early...This will create the model of the dbcontext in another thread while the rest of the initialization of the application is done (and maybe other warmups) and the beginning of a request. This way, it will be ready sooner. When you need it, EFCore will wait for the Model to be created if it hasn't finished already. The Model is shared across all DbContext instances so it is ok to fire and forget this dummy dbcontext."

It doesn't really seem like the model is created in another thread.  Doing this first:

Console.WriteLine("Forcing model creating...");
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();

using (var dbContext = new Context(contextBuilder.Options))
{
  var model = dbContext.Model; //force the model creation
  ms = stopwatch.ElapsedMilliseconds;
  total += ms;
  Console.WriteLine(ms + "ms");
}

I note these results, running on the rPi:

=== Testing Postgres EF ===
Forcing model creating...
4265ms
First query: 4847ms
Count: 10
Subtotal = 9112 ms

Notice how the time (granted, this added one whole second to the entire process!) is now split between "model creating" and actually performing the first-time query.  I'm loathe to find out what happens if I have a hundred models that need to be "created."

Incidentally, further on in the SO post someone makes a comment about setting AutoDetectChangesEnabled and LazyLoadingEnabled to false.  This has no affect on the performance (or lack thereof, at least in EF Core.)

Is This Delay Per Table or Per Connection?

Modifying the code to query a different table TestTable2, we see that the query time is similar to the the second and third queries of TestTable:

=== Testing Postgres EF ===
First query: 4774ms
Count: 10
Subtotal = 4774 ms
Second query: 20ms
Count: 10
Third query: 5ms
Count: 10
Total: 4799 ms

Table 2 query: 4ms
Count: 10
Total: 4 ms

How Long Does the Connection Persist?

According to this writeup on SQL Server connection pooling in ADO.NET 2.0: "The connection pooler removes a connection from the pool after it has been idle for approximately 4-8 minutes..."  We can test this out by repeatedly testing a query and determining (with minute granularity) when the connection pool discards the connection.

int min = 1;
int maxWait = 10;

while (min <= maxWait)
{
  Console.WriteLine("=============================");
  Console.WriteLine($"Waiting {min} minute...");
  int sleepFor = min * 60 * 1000;
  Thread.Sleep(sleepFor);
  long q = TestPostgresEfCore();

  if (q > 1000)
  {
    break;
  }

  ++min;
}

if (min <= maxWait)
{
  Console.WriteLine($"Connection pool released all connections in approximately {min} minutes.");
}

The verdict is that even after waiting 10 minutes, the connection was still "live."  I can only assume that the connection pooling, either as implemented in .NET Core or in Postgres adapter, is implemented differently than from the docs.  Or, a third possibility is that there Thread.Sleep is not the right way to test this when everything is running on the same thread.

I also note, from the Microsoft page mentioned above: "ADO.NET 2.0 introduced two new methods to clear the pool: ClearAllPools and ClearPool." that executing these methods:

NpgsqlConnection.ClearAllPools();
NpgsqlConnection.ClearPool(conn);

has no affect on the 2nd and 3rd query timing.

Querying Large Sets of Data

Let's look at the performance of the average of the second and third queries, reading 100,000 records:

I'm actually quite surprised as to how fast Linq2DB is.

Multithreaded Queries

For this test, I'm doing something you normally wouldn't be doing on the "client", which is querying to return all 100,000 records in the test table.  Remember, the database is running on the rPi.

Simultaneous EF Core Queries

Note that in this code, I'm ensuring that the number of tasks do not exceed the number of logical processors.  I'm also priming EF and one connection with the "model build - connect - query" process (we'll play with this a bit later on.)

static void EfCoreMultithreading()
{
  var contextBuilder = new DbContextOptionsBuilder();
  string connStr = Configuration.GetValue<string>("ConnectionStrings:rpidb");
  contextBuilder.UseNpgsql(connStr);
  int numProcs = Environment.ProcessorCount;
  Console.WriteLine($"Logical processors = {numProcs}");
  var trial1 = new Task<(DateTime start, DateTime end)>[numProcs];
  var trial2 = new Task<(DateTime start, DateTime end)>[numProcs];

  var stopwatch = new Stopwatch();
  stopwatch.Start();

  // Priming: run a query to deal with how long it takes for the initial connection+query.
  using (var context = new Context(contextBuilder.Options))
  {
    var items = context.TestTable.Where(t => t.ID == -1).ToList();
  }

  long priming = stopwatch.ElapsedMilliseconds;
  stopwatch.Stop();
  Console.WriteLine($"connect + query took {priming}ms");

  var now = DateTime.Now;

  numProcs.ForEachWithIndex(idx => trial1[idx] = StartEfCoreTask());
  Task.WaitAll(trial1);

  numProcs.ForEachWithIndex(idx => trial2[idx] = StartEfCoreTask());
  Task.WaitAll(trial2);

  Console.WriteLine("Trial 1:");
  trial1.ForEach(t =>
  {
    Console.WriteLine($" Start: {(t.Result.start - now).TotalMilliseconds}ms Duration: {(t.Result.end - t.Result.start).TotalMilliseconds}ms");
  });

  Console.WriteLine("Trial 2:");
  trial2.ForEach(t =>
  {
    Console.WriteLine($" Start: {(t.Result.start - now).TotalMilliseconds}ms Duration: {(t.Result.end - t.Result.start).TotalMilliseconds}ms");
  });
}

static Task<(DateTime start, DateTime end)> StartEfCoreTask()
{
  return Task.Run(() =>
  {
    var contextBuilder = new DbContextOptionsBuilder();
    string connStr = Configuration.GetValue<string>("ConnectionStrings:rpidb");
    contextBuilder.UseNpgsql(connStr);
    var start = DateTime.Now;

    using (var context = new Context(contextBuilder.Options))
    {
      var items = context.TestTable.ToList();
    }

    var end = DateTime.Now;

    return (start, end);
  });
}

Running on Windows

Recall that it takes about 2200ms for one "primed" query of 100,000 records to run on a single thread.

Logical processors = 4
Priming took 1589ms
Trial 1:
Start: 3.3745ms Duration: 3269.5232ms
Start: 3.3745ms Duration: 5263.6895ms
Start: 3.3745ms Duration: 2403.0416ms          <=== re-use of connection?
Start: 3.3745ms Duration: 5459.1072ms
Trial 2:
Start: 5462.846ms Duration: 5404.7242ms
Start: 5462.8351ms Duration: 5399.7877ms
Start: 5462.8351ms Duration: 5403.8799ms
Start: 5462.846ms Duration: 5411.174ms

Running on rPI

Recall that it takes about 4000ms for one "primed" query of 100,000 records to run on a single thread.

Logical processors = 4
Priming took 9221ms
Trial 1:
Start: 24.6442ms Duration: 9613.1905ms
Start: 24.8993ms Duration: 9636.8815ms
Start: 24.6441ms Duration: 9673.5425ms
Start: 24.8986ms Duration: 9655.8864ms
Trial 2:
Start: 9699.3142ms Duration: 9103.8668ms
Start: 9699.3141ms Duration: 8743.3299ms
Start: 9699.314ms Duration: 8969.2488ms
Start: 9699.3142ms Duration: 9024.2545ms

What if we don't "prime" the connection pool with a single connection? 

  • Windows making the queries:
Logical processors = 4
Trial 1:
Start: 6.0236ms Duration: 9810.3092ms
Start: 6.0236ms Duration: 10585.4694ms
Start: 6.0236ms Duration: 10558.1815ms
Start: 6.0236ms Duration: 10564.1273ms
Trial 2:
Start: 10591.8437ms Duration: 5625.977ms
Start: 10591.8437ms Duration: 4376.5792ms
Start: 10591.8595ms Duration: 5570.2906ms
Start: 10591.8595ms Duration: 1770.9706ms       <=== re-use of connection?
  • rPi making the queries:
Logical processors = 4
Trial 1:
Start: 27.7581ms Duration: 20175.8031ms
Start: 27.7581ms Duration: 20175.7824ms
Start: 27.8305ms Duration: 20263.7383ms
Start: 27.7581ms Duration: 20175.778ms
Trial 2:
Start: 20292.7195ms Duration: 10079.8114ms
Start: 20292.7195ms Duration: 9850.5694ms
Start: 20292.7195ms Duration: 10071.8997ms
Start: 20293.1121ms Duration: 10213.0811ms

An Analysis of the Results

To understand these results we need to remember that we can prime EF Core with a model building step that reduces the "connect + query" time.  I'm essentially doing that in the "priming" tests by making an initial query on the main thread. 

  1. It seems that this connection is re-used by one thread when running from Windows
  2. When the model "building" priming is omitted, all four threads running from Windows appear to require building the model, but on the second trial, this model building isn't necessary but again we see that only one thread is taking advantage of a pooled connection even though we should, theoretically, have four pooled connections from the first trial.
  3. Running the test on the rPi yields similar results except that in this case, and I'm not sure what to conclude about connection pooling.  The timing improvement we're seeing in the second trial appears to be the result of EF Core having built the model but seems to indicate that it isn't smart enough to know that it's being built by other threads. 

The supposition in #3 can be proved by looking at the timing with performing just model building on the rPi.  For this, we replace the full connect + query priming with this code:

using (var dbContext = new Context(contextBuilder.Options))
{
  dbContext.ChangeTracker.AutoDetectChangesEnabled = false; // <----- trick
  dbContext.ChangeTracker.LazyLoadingEnabled = false; // <----- trick
  var model = dbContext.Model; //force the model creation
}

long priming = stopwatch.ElapsedMilliseconds;
stopwatch.Stop();
Console.WriteLine($"Model building took {priming}ms");

Results:

Logical processors = 4
Model building took 4330ms
Trial 1:
Start: 25.583ms Duration: 14884.9089ms
Start: 25.1461ms Duration: 14754.3145ms
Start: 25.1461ms Duration: 14797.9605ms
Start: 25.4297ms Duration: 14854.3302ms
Trial 2:
Start: 14911.8329ms Duration: 11586.8008ms
Start: 14911.6562ms Duration: 10127.6391ms
Start: 14911.6561ms Duration: 11580.0408ms
Start: 14911.6561ms Duration: 11671.2985ms

By priming with the model "build", the trial 1 timing is reduced by 5000ms and the trial 2 timing may or may not be re-using the connection pools from trial 1.  Interestingly, the timing of trial 2 is 1 - 2 seconds longer than when the "build+connect+query" priming was used.  I'm not going to make any guesses as to why that is.

Similarly on Windows, trial 2 takes longer when we perform just the "build" priming step, but there's very little difference between trial 1 and trial 2.

Logical processors = 4
Model building took 816ms
Trial 1:
Start: 3.9416ms Duration: 7166.4065ms
Start: 4.244ms Duration: 8147.1518ms
Start: 3.9416ms Duration: 4244.7402ms                <== why is this one faster?
Start: 3.9416ms Duration: 7848.992ms
Trial 2:
Start: 8151.7212ms Duration: 7786.576ms
Start: 8151.7212ms Duration: 6795.6511ms
Start: 8151.722ms Duration: 6242.4902ms
Start: 8151.722ms Duration: 8786.9947ms

We do see a reduction in trial 1 timing, but we don't see a reduction in time in trial 2, though oddly, in trial 1, we see one thread that took half the time.  It's odd that we don't see trial 2 taking less time when running on Windows as it's the same code and packages, so there must be something else going on.

Furthermore, when running this 100,000 record query on all 4 cores simultaneously, we see that each core doubles the amount of time it takes as compared to a single thread query.  On a single thread, the rPi averages 4000ms, and 4 multithreaded queries took around 9000ms.  What happens if we use only 2 cores, theoretically leaving 2 cores free for maybe the internals of Postgres?

Logical processors = 2
Model + connect + query priming took 9567ms
Trial 1:
Start: 25.1111ms Duration: 5637.642ms
Start: 25.1111ms Duration: 5612.4357ms
Trial 2:
Start: 5663.9105ms Duration: 5467.9455ms
Start: 5663.9104ms Duration: 5528.9189ms

Much better.  We're suffering a 28% penalty running on 2 threads instead of a 56% penalty running the query on all four threads.

Creating Disaster

Let's ignore the number of logical processors and force 8 threads to execute the query.  We'll focus just on the rPi.

And they're off!  The first 4 threads are out of their gates and heading down the straightaway while there seems to be some problem with the other 4 threads leaving the gate!

Logical processors = 8
Model + connect + query priming took 9452ms
Trial 1:
Start: 24.623ms Duration: 10444.4752ms
Start: 24.623ms Duration: 11347.9301ms
Start: 24.623ms Duration: 11646.0459ms
Start: 24.623ms Duration: 11261.6374ms
Start: 4130.0397ms Duration: 12144.0146ms
Start: 9234.6732ms Duration: 9702.6407ms
Start: 10479.0248ms Duration: 9170.6971ms
Start: 11313.1258ms Duration: 8702.2047ms
Trial 2:
Start: 20016.4353ms Duration: 17668.6654ms
Start: 20016.4353ms Duration: 17616.3623ms
Start: 20016.4353ms Duration: 15696.1627ms
Start: 20016.4352ms Duration: 17027.2403ms
Start: 20017.9037ms Duration: 15747.4193ms
Start: 20017.9037ms Duration: 17643.9682ms
Start: 26699.7751ms Duration: 12650.1307ms
Start: 34472.6065ms Duration: 6260.0021ms

First, by noticing the start times, we see the .NET thread pool throttling kicking in but getting a bit confused in trial 2 as it starts off 6 threads simultaneously.  At it's worse, a query that should take around 4000ms is taking 17,500ms, a penalty of 77% as compared to a single thread query, mainly because in trial 2, the thread pool started of six threads simultaneously.

Simultaneous Linq2DB Queries

Priming

var db = PostgreSQLTools.CreateDataConnection(connStr);
var recs = db.GetTable<TestTable>().Take(1).ToList();

The Linq2DB Query Task

static Task<(DateTime start, DateTime end)> StartLinq2DBTask()
{
  return Task.Run(() =>
  {
    string connStr = Configuration.GetValue<string>("ConnectionStrings:rpidb");
    var start = DateTime.Now;
    var db = PostgreSQLTools.CreateDataConnection(connStr);
    var recs = db.GetTable<TestTable>().ToList();
    var end = DateTime.Now;

    return (start, end);
  });
}

Let's do the Numbers

Recall that a single thread query of 100,000 records tool Linq2DB a little less than 1400ms.  Running on 4 cores we see around a 65% penalty.

Logical processors = 4
priming took 6943ms
Trial 1:
Start: 14.6684ms Duration: 3829.2075ms
Start: 14.6683ms Duration: 3872.5ms
Start: 14.8168ms Duration: 3850.6253ms
Start: 8.7451ms Duration: 3840.4848ms
Trial 2:
Start: 3888.2774ms Duration: 3646.7527ms
Start: 3888.4093ms Duration: 3609.2581ms
Start: 3888.2774ms Duration: 3623.5665ms
Start: 3888.3238ms Duration: 3134.8659ms

Running on 2 cores, we see no penalty in trial 1 and a 33% penalty in trial 2.  No idea why that is but it seems consistent.

Logical processors = 2
priming took 7095ms
Trial 1:
Start: 8.7998ms Duration: 1458.5924ms
Start: 8.7998ms Duration: 1467.26ms
Trial 2:
Start: 1477.2797ms Duration: 2092.9932ms
Start: 1477.4778ms Duration: 2085.4642ms

Conclusion

I consider the initial connection and query time to be unacceptable regardless of whether we're connecting to Postgres on rPi from Windows or the rPi itself.  I don't think this is related to my having installed the OS and apps on an external USB drive and utilizing the USB 2.0 interface on the rPi but rather the underlying implementation in .NET Core.  Furthermore, EF Core clearly has a lot of overhead when setting up that initial connection + query, which can only be partially mitigated by forcing EF Core to create the model.  My conclusion that it's a .NET Core "problem" is also based on the fact that I've never experienced these kinds of delays using the Postgres pgadmin tools.

Workarounds

  1. When the application starts up, open the connection and query a table -- any query will do, even one that doesn't return any records.  Even though I didn't experience the connection pool removing connections, even after 10 seconds, it might be wise to perform this "open & query" activity at some regular interval.
  2. Avoid long-running queries if at all possible.  This might involve optimizing the query, for example, let the database sum or average a value, don't do it on the client-side.
  3. On the rPi, don't exceed 2 cores for queries.  This means you'll have to manage your own thread pool, but the idea is that we need the other 2 cores for other processes.
  4. Consider not using EF Core and instead using Linq2DB.  This is easiest accomplished if you're starting a project fresh rather than porting an existing EF project. 

Probably the most important workaround is not to run a database server like Postgres on the same rPi as is making the queries.  rPi's are $35, cheap enough that you can run the database on one rPi and the application (web server, whatever) on a second rPi.  At some point I'll give that a try, but I need another rPi, or maybe I'll use one of the Beaglebone's I have.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Marc Clifton
Architect Interacx
United States United States
Blog: https://marcclifton.wordpress.com/
Home Page: http://www.marcclifton.com
Research: http://www.higherorderprogramming.com/
GitHub: https://github.com/cliftonm

All my life I have been passionate about architecture / software design, as this is the cornerstone to a maintainable and extensible application. As such, I have enjoyed exploring some crazy ideas and discovering that they are not so crazy after all. I also love writing about my ideas and seeing the community response. As a consultant, I've enjoyed working in a wide range of industries such as aerospace, boatyard management, remote sensing, emergency services / data management, and casino operations. I've done a variety of pro-bono work non-profit organizations related to nature conservancy, drug recovery and women's health.

You may also be interested in...

Comments and Discussions

 
QuestionThank you for this Pin
AFell28-Feb-19 10:42
memberAFell28-Feb-19 10:42 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Cookies | Terms of Use | Mobile
Web02 | 2.8.190214.1 | Last Updated 20 Jan 2019
Article Copyright 2019 by Marc Clifton
Everything else Copyright © CodeProject, 1999-2019
Layout: fixed | fluid