I’ve been occasionally writing posts about old design patterns or techniques that are still occasionally useful despite the decades long backlash to the old “Gang of Four” book:
- The Lowly Strategy Pattern is Still Useful
- The Occasionally Useful State Pattern
- The Decorator Pattern is sometimes helpful
- Linked Lists in Real Life (this post)
A linked list structure is a simple data structure where each element has a reference to the next element. At its simplest, it’s no more than this single linked list implementation:
public abstract class Item
{
public Item Next { get; set; }
}
Now, I got into software development through Shadow IT rather than a formal Computer Science degree, so I can’t really get into what Big O notation stuff a linked list gives you for sorting or finding or whatnot (and don’t really care either). What I can tell you that I have occasionally used linked list structures to great effect, with at least two samples within the greater “Critter Stack” / JasperFx codebases.
For the first example, consider the complex SQL generation from this LINQ query in Marten using Marten’s custom “Include Related Documents” feature:
// theSession is an IDocumentSession from Marten
var holders = await theSession.Query<TargetHolder>()
.Include<Target>(x => x.TargetId, list, t => t.Color == Colors.Green)
.ToListAsync();
Behind the scenes, Marten is generating this pile of lovely SQL:
drop table if exists mt_temp_id_list1;
create temp table mt_temp_id_list1 as (select d.id, d.data from public.mt_doc_targetholder as d);
select d.id, d.data from public.mt_doc_target as d where (d.id in (select CAST(d.data ->> 'TargetId' as uuid) from mt_temp_id_list1 as d) and CAST(d.data ->> 'Color' as integer) = $1);
$1: 2
If you squint really hard, you can notice that Marten is actually executing four different SQL statements in one logical query. Internally, Marten is using a linked list structure to “plan” and then generate the SQL from the raw LINQ Expression
tree using the Statement
type, partially shown below:
public abstract partial class Statement: ISqlFragment
{
public Statement Next { get; set; }
public Statement Previous { get; set; }
public StatementMode Mode { get; set; } = StatementMode.Select;
/// <summary>
/// For common table expressions
/// </summary>
public string ExportName { get; protected internal set; }
public bool SingleValue { get; set; }
public bool ReturnDefaultWhenEmpty { get; set; }
public bool CanBeMultiples { get; set; }
public void Apply(ICommandBuilder builder)
{
configure(builder);
if (Next != null)
{
if (Mode == StatementMode.Select)
{
builder.StartNewCommand();
}
builder.Append(" ");
Next.Apply(builder);
}
}
public void InsertAfter(Statement descendent)
{
if (Next != null)
{
Next.Previous = descendent;
descendent.Next = Next;
}
if (object.ReferenceEquals(this, descendent))
{
throw new InvalidOperationException(
"Whoa pardner, you cannot set Next to yourself, that's a stack overflow!");
}
Next = descendent;
descendent.Previous = this;
}
/// <summary>
/// Place the descendent at the very end
/// </summary>
/// <param name="descendent"></param>
public void AddToEnd(Statement descendent)
{
if (Next != null)
{
Next.AddToEnd(descendent);
}
else
{
if (object.ReferenceEquals(this, descendent))
{
return;
}
Next = descendent;
descendent.Previous = this;
}
}
public void InsertBefore(Statement antecedent)
{
if (Previous != null)
{
Previous.Next = antecedent;
antecedent.Previous = Previous;
}
antecedent.Next = this;
Previous = antecedent;
}
public Statement Top()
{
return Previous == null ? this : Previous.Top();
}
// Find the selector statement at the very end of
// the linked list
public SelectorStatement SelectorStatement()
{
return (Next == null ? this : Next.SelectorStatement()).As<SelectorStatement>();
}
// And a some other stuff...
}
The Statement
model is a “double linked list,” meaning that each element is aware of both its direct ancestor (Previous
) and the next descendent (Descendent
). With the Statement
model, I’d like to call out a couple wrinkles that made the linked list strategy a great fit for complex SQL generation.
First off, the SQL generation itself frequently requires multiple statements from the top level statement down to the very last statement, and you can see that happening in the Apply()
method above that writes out the SQL for the current Statement
, then calls Next.Apply()
all the way down to the end of the chain — all while helping Marten’s (really Weasel’s) batch SQL generation “know” when it should start a new command (see NpgsqlBatch for a little more context on what I mean there).
Also note all the methods for inserting a new Statement
directly before or after the current Statement
. Linked lists are perfect for when you frequently need to insert new elements before or after or even at the very end of the chain rather than at a known index. Here’s an example from Marten that kicks in sometimes when a user uses the LINQ Distinct()
operator:
For more context if you’ve lived a more charmed life and have never run across them, here’s an explanation of Common Table Expressions from PostgreSQL (this is commonly supported in other databases).
internal class DistinctSelectionStatement: Statement
{
public DistinctSelectionStatement(SelectorStatement parent, ICountClause selectClause, IMartenSession session)
{
parent.ConvertToCommonTableExpression(session);
ConvertToCommonTableExpression(session);
parent.InsertAfter(this);
selectClause.OverrideFromObject(this);
var selector = new SelectorStatement { SelectClause = selectClause };
InsertAfter(selector);
}
protected override void configure(ICommandBuilder sql)
{
startCommonTableExpression(sql);
sql.Append("select distinct(data) from ");
sql.Append(Previous.ExportName);
sql.Append(" as d");
endCommonTableExpression(sql);
}
}
When Marten has to apply this DistinctSelectionStatement
, it modifies its immediate parent statement that probably started out life as a simple select some_field from some_table
query into a common table expression query, and appends the DistinctSelectionStatement
behind its parent to do the actual SQL distinct
mechanics.
In this particular case of the SQL generation, it’s been frequently necessary for a Statement
to “know” about its immediate ancestor, as you can see in the sample code above where the DistinctSelectStatement
picks off the ExportName
(the CTE name of the preceding statement) to use in generating the right SQL in DistinctSelectStatement.Apply()
.
For another example, both Marten and Wolverine use a runtime code generation model to build a lot of their “glue” code between the framework and user’s application code (you can see an example of that runtime code generation in this post). One of the core conceptual abstractions in the shared code generation model is the abstract Frame
class which roughly equates to logical step in the generated code — usually just a single line of code. During the code generation process, the Frame
objects are assembled in a single linked list structure so each Frame
.
When actually writing out the generated source code, a typical Frame
will write its code, then tell the next Frame
to write out its code and so on — as shown by this sample class:
// This is from Wolverine, and weaves in code
// to add selected tags from incoming messages to
// message handlers into the current Open Telemetry Activity
public class AuditToActivityFrame : SyncFrame
{
private readonly Type _inputType;
private readonly List<AuditedMember> _members;
private Variable? _input;
public AuditToActivityFrame(IChain chain)
{
_inputType = chain.InputType()!;
_members = chain.AuditedMembers;
}
public override IEnumerable<Variable> FindVariables(IMethodVariables chain)
{
_input = chain.FindVariable(_inputType);
yield return _input;
}
public override void GenerateCode(GeneratedMethod method, ISourceWriter writer)
{
writer.WriteComment("Application-specific Open Telemetry auditing");
foreach (var member in _members)
writer.WriteLine(
$"{typeof(Activity).FullNameInCode()}.{nameof(Activity.Current)}?.{nameof(Activity.SetTag)}(\"{member.OpenTelemetryName}\", {_input!.Usage}.{member.Member.Name});");
// Tell the next frame to write its code too!
Next?.GenerateCode(method, writer);
}
}
Where the linked list structure really comes into play with the source generation is when you need to wrap the inner Next
Frame
with some kind of coding construct like a using
block or a try/finally
block maybe. Here’s an example of doing just that where the following CatchStreamCollisionFrame
places a try/catch
block around the code generated by the Next
frame (and all of the other frames after the Next
frame as well):
// This is the actual middleware that's injecting some code
// into the runtime code generation
internal class CatchStreamCollisionFrame : AsyncFrame
{
public override void GenerateCode(GeneratedMethod method, ISourceWriter writer)
{
writer.WriteComment("Catches any existing stream id collision exceptions");
writer.Write("BLOCK:try");
// Write the inner code here
Next?.GenerateCode(method, writer);
writer.FinishBlock();
writer.Write($@"
BLOCK:catch({typeof(ExistingStreamIdCollisionException).FullNameInCode()} e)
await {typeof(StreamCollisionExceptionPolicy).FullNameInCode()}.{nameof(StreamCollisionExceptionPolicy.RespondWithProblemDetails)}(e, httpContext);
return;
END
");
}
The ugly generated code above will catch a Marten exception named ExistingStreamIdCollisionException
in the generated code for a Wolverine.HTTP endpoint and return a ProblemDetails
result explaining the problem and an HTTP status code of 400 (Invalid) instead of letting the exception bubble up.
By having the linked list structure where each Frame
is aware of the next Frame
, it makes it relatively easy to generate code when you need to wrap the inner code in some kind of C# block structure.
Summary
I spent a lot of time as a kid helping my Dad on his construction crew and helping my grandfather on his farm (you can’t possibly imagine how often farming equipment breaks). Both of them obviously had pretty large toolboxes — but there’s some tools that don’t come out very often, but man you were glad you had them when you did need them. The average developer probably isn’t going to use linked lists by hand very often, but I’ve found them to be very helpful when you need to either model a problem as outer/inner handlers or ancestor/descendents. Linked lists are also great when you need to be able to easily insert items into the greater collection relative to another item.
Anyway, dunno if these examples are too involved or too specious, but there the only two times I’ve used a linked list in the past decade.
I hope it’s obvious, but the JasperFx Software logo is meant to be a tractor tire around a cattle brand. The company name and logo is a little tribute to my family heritage, such as it is:)