Wednesday, June 30, 2021

Avoiding index fragmentation with sequential GUIDs

Using a non-sequential GUID in an index is not a good idea, as it leads to index fragmentation and decreased performance. We could switch to an identity field, but this is not ideal in a highly distributed (micro)services architecture, where each service should be able to generate ids without a round trip to the database.

RT.Comb to the rescue!

RT.Comb implements the “COMB” technique, as described by Jimmy Nilsson, which replaces the portion of a GUID that is sorted first with a date/time value. This guarantees (within the precision of the system clock) that values will be sequential, even when the code runs on different machines.
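To make the idea concrete, here is a minimal sketch of the COMB technique in C#. It is not RT.Comb's implementation: the class name CombSketch and its methods are hypothetical, and it assumes a simplified layout in which the last six bytes of the GUID (the bytes SQL Server compares first when sorting uniqueidentifier values) hold a Unix timestamp in milliseconds.

```csharp
using System;

// Hypothetical illustration of the COMB idea; RT.Comb's own strategies differ in detail.
public static class CombSketch
{
    // Starts from a random GUID and overwrites bytes 10..15 with a 48-bit
    // big-endian Unix timestamp (milliseconds), so later ids sort after earlier ones.
    public static Guid Create(DateTime utcNow)
    {
        byte[] bytes = Guid.NewGuid().ToByteArray();
        DateTime utc = DateTime.SpecifyKind(utcNow, DateTimeKind.Utc);
        long ms = new DateTimeOffset(utc).ToUnixTimeMilliseconds();
        for (int i = 0; i < 6; i++)
        {
            // Most significant byte first, so byte-wise comparison follows time.
            bytes[10 + i] = (byte)((ms >> (8 * (5 - i))) & 0xFF);
        }
        return new Guid(bytes);
    }

    // Reads the embedded timestamp back out of a GUID produced by Create.
    public static DateTime GetTimestamp(Guid comb)
    {
        byte[] bytes = comb.ToByteArray();
        long ms = 0;
        for (int i = 0; i < 6; i++)
        {
            ms = (ms << 8) | bytes[10 + i];
        }
        return DateTimeOffset.FromUnixTimeMilliseconds(ms).UtcDateTime;
    }
}
```

Calling CombSketch.Create(DateTime.UtcNow) a few milliseconds apart gives ids whose last six bytes increase with time, while the other ten bytes stay random.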

RT.Comb is available as a NuGet package and provides different strategies for generating the timestamp optimized for different database platforms:

  • RT.Comb.Provider.Legacy: The original technique. Only recommended if you need to support existing COMB values created using this technique.
  • RT.Comb.Provider.Sql: This is the recommended technique for COMBs stored in Microsoft SQL Server.
  • RT.Comb.Provider.PostgreSql: This is the recommended technique for COMBs stored in PostgreSQL.
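As a rough sketch of how these static providers can be used, assuming the RT.Comb NuGet package is referenced and that each provider exposes Create() and GetTimestamp(Guid) as described in the project's README (the wrapper class CombUsage is hypothetical):

```csharp
using System;

// Assumes the RT.Comb NuGet package is installed (dotnet add package RT.Comb).
public static class CombUsage
{
    // Creates a COMB GUID with the strategy recommended for SQL Server.
    public static Guid NewSqlId() => RT.Comb.Provider.Sql.Create();

    // Extracts the timestamp that was embedded when the COMB was created.
    public static DateTime CreatedAt(Guid id) => RT.Comb.Provider.Sql.GetTimestamp(id);
}
```

For PostgreSQL, the same calls would go through RT.Comb.Provider.PostgreSql instead.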

This technique works well for most scenarios. The exception is a write-heavy workload where records are inserted faster than the timestamp resolution allows: the SQL Server strategy stores the timestamp with a precision of 1/300th of a second, and DateTime.UtcNow itself is limited by the resolution of the system clock. In that case you will still not have collisions, but the ids may not sort in insertion order.

But don’t worry, a solution is provided by RT.Comb itself. It has a timestamp provider called UtcNoRepeatTimestampProvider, which ensures that each timestamp is at least X ms greater than the previous one.
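A sketch of how UtcNoRepeatTimestampProvider might be wired in, assuming the RT.Comb NuGet package is referenced and that SqlCombProvider accepts a date/time strategy plus an optional timestamp delegate, as I understand it from the project's README. The wrapper class OrderedCombs is hypothetical; check the constructor signature against the version you install.

```csharp
using System;
using RT.Comb;

// Hypothetical wrapper; assumes the RT.Comb NuGet package is installed.
public static class OrderedCombs
{
    // Keep a single shared instance: the provider remembers the last
    // timestamp it handed out and pushes the next one forward if needed.
    private static readonly UtcNoRepeatTimestampProvider Timestamps =
        new UtcNoRepeatTimestampProvider();

    // SQL Server flavoured COMB provider that takes its timestamps
    // from the no-repeat provider instead of DateTime.UtcNow.
    private static readonly ICombProvider Combs =
        new SqlCombProvider(new SqlDateTimeStrategy(), Timestamps.GetTimestamp);

    public static Guid NewId() => Combs.Create();

    public static DateTime TimestampOf(Guid id) => Combs.GetTimestamp(id);
}
```

Because the provider holds state, creating a new instance per id would defeat its purpose. The trade-off is that under sustained load the embedded timestamps can run slightly ahead of the real clock.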
