Make your LINQ Queries as readable as a good book
In my most recent article on how to write readable code, I mentioned that good LINQ queries are far more readable than loops and can replace them in most cases. Beginners might need a few days to learn LINQ, but I find that I am still improving how I write my queries six years after first learning it.
In this article, I am going to present a few rules to stick to when writing LINQ queries. These rules will make your queries as readable as a book.
Breaking your code into lines can change its look from a pile of garbage to a beautiful artwork. I found that you can read code faster vertically than horizontally. As a result, I break at least every line of code that would go beyond half the screen. In LINQ queries, I even break before every dot. This makes them much more readable and the logic clearer, as you can see below:
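As a minimal sketch (the Order record and the OrderQueries class are made up for illustration), the same query written on one line and broken before every dot could look like this:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, used only for illustration.
public record Order(string Customer, decimal Total, bool IsPaid);

public static class OrderQueries
{
    // Hard to scan: every operation sits on the same line.
    public static List<string> TopCustomersOneLine(IEnumerable<Order> orders) =>
        orders.Where(o => o.IsPaid).OrderByDescending(o => o.Total).Select(o => o.Customer).Distinct().Take(3).ToList();

    // Same query, one operation per line, broken before every dot.
    public static List<string> TopCustomers(IEnumerable<Order> orders) =>
        orders
            .Where(o => o.IsPaid)
            .OrderByDescending(o => o.Total)
            .Select(o => o.Customer)
            .Distinct()
            .Take(3)
            .ToList();
}
```

Both methods return the same result; only the second one can be read top to bottom like a list of steps.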
After breaking your code, you sometimes need to adjust the indentation of the new lines, as your IDE might not do it for you. I find it practical to indent each nested layer by one more level, as shown here:
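A possible illustration, assuming a hypothetical RawMessage input type and a ReceivedMessage record with three constructor arguments (only the name ReceivedMessage comes from the text):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types; their members are assumptions for this sketch.
public record RawMessage(string Sender, string Body, DateTime Timestamp);
public record ReceivedMessage(string Sender, string Body, DateTime ReceivedAt);

public static class MessageMapper
{
    public static List<ReceivedMessage> Map(IEnumerable<RawMessage> rawMessages) =>
        rawMessages
            .Where(m => !string.IsNullOrWhiteSpace(m.Body))
            .Select(m =>
                new ReceivedMessage(    // new logical layer: one level deeper
                    m.Sender,           // arguments broken and indented once more
                    m.Body.Trim(),
                    m.Timestamp
                )                       // closes at the level of "new ReceivedMessage("
            )                           // closes at the level of ".Select("
            .ToList();
}
```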
The constructor of ReceivedMessage is a new logical layer, which gets a new indentation level. Optionally, you can also break and indent its arguments.
Don’t forget to put your closing brackets on their own line and move them to the same indentation level as the line with the opening bracket. This way, you see very quickly where a statement ends and where a new one with the same indentation level begins.
Sometimes, you might have complicated transformation rules in your SelectMany clauses. You might be tempted to write the whole code inside your lambda function. However, the next developer will most likely not understand what is happening there. So why not extract this part of the code into a separate method? You can even drop the braces and the lambda boilerplate, as you can write short lambdas like this:
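One way to sketch this is to pass the extracted method as a method group. The Team record and the ToBadgeLabels method below are hypothetical names chosen for this example:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, used only for illustration.
public record Team(string Name, IEnumerable<string> Members);

public static class TeamQueries
{
    public static IEnumerable<string> AllBadgeLabels(IEnumerable<Team> teams) =>
        teams
            .SelectMany(ToBadgeLabels);   // method group instead of t => { ... }

    // The transformation rule lives in a named method of its own.
    private static IEnumerable<string> ToBadgeLabels(Team team) =>
        team.Members
            .Where(m => !string.IsNullOrWhiteSpace(m))
            .Select(m => $"{m.Trim()} ({team.Name})");
}
```

The query now reads as "select many badge labels", and the details are one click away for anyone who needs them.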
A similar case to number three happens in Where clauses. Everything is fine if you evaluate one boolean. However, if there are two or more, your intention fades, and the next developer will struggle to understand what you want to filter. To fix this, extract the lambda into a local function and give it a good name, like below:
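A small sketch of the idea, assuming a hypothetical User record and a made-up reminder rule with three conditions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, used only for illustration.
public record User(string Name, bool IsActive, bool EmailConfirmed, DateTime LastLogin);

public static class UserQueries
{
    public static List<User> UsersToRemind(IEnumerable<User> users, DateTime now)
    {
        return users
            .Where(IsDormantButReachable)
            .ToList();

        // Local function: its name states what the three conditions mean together.
        bool IsDormantButReachable(User u) =>
            u.IsActive
            && u.EmailConfirmed
            && u.LastLogin < now.AddDays(-30);
    }
}
```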
LINQ already comes with extremely useful extension methods. However, there might still be some useful functions that are not yet included in the default library. For example, I created the extension method DistinctBy() long before Microsoft included it in .NET 6.
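A hand-written version of such a method might look roughly like the sketch below. It is deliberately named DistinctByKey (a made-up name) so that it cannot clash with the built-in DistinctBy() on .NET 6 and later:

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Yields the first element for every distinct key, in source order.
    public static IEnumerable<T> DistinctByKey<T, TKey>(
        this IEnumerable<T> source,
        Func<T, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet.Add returns false if the key is already present.
            if (seenKeys.Add(keySelector(item)))
            {
                yield return item;
            }
        }
    }
}
```

Like most LINQ operators, this version is lazy: nothing is evaluated until the result is enumerated.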
What I especially miss in LINQ is better handling of Tasks and asynchronous code. To make life a little easier, I implemented the following extension methods:
They make asynchronous LINQ way easier to handle, as you can see here:
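To give an idea of what such helpers can look like, here is a sketch with two hypothetical extension methods, SelectAsync and WhereAsync. Their names and signatures are assumptions made for this example, as is the PriceService class that uses them:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncLinqExtensions
{
    // Starts one task per element and awaits all of them together.
    public static Task<TResult[]> SelectAsync<T, TResult>(
        this IEnumerable<T> source,
        Func<T, Task<TResult>> selector) =>
        Task.WhenAll(source.Select(selector));

    // Awaits the pending results first, then filters them.
    public static async Task<IEnumerable<T>> WhereAsync<T>(
        this Task<T[]> source,
        Func<T, bool> predicate) =>
        (await source).Where(predicate);
}

public static class PriceService
{
    // Stand-in for a real asynchronous call, e.g. an HTTP request.
    private static async Task<decimal> LoadPriceAsync(string sku)
    {
        await Task.Yield();
        return sku.Length * 10m;
    }

    public static async Task<List<decimal>> LoadExpensivePricesAsync(IEnumerable<string> skus) =>
        (await skus
            .SelectAsync(LoadPriceAsync)
            .WhereAsync(p => p > 20m))
        .ToList();
}
```

Task.WhenAll returns the results in the order of the input tasks, so the query stays deterministic even though the calls run concurrently.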
In nearly every clean code guideline, you will read that you should provide well-named variables and methods. In this case, however, I dare to disagree. A LINQ query is often a little longer horizontally than your usual code. If you try to give every lambda variable a descriptive name, your lines get considerably longer, especially since you often use the variable repeatedly, and the code becomes harder to break.
I find it practical to use just the first letter of the class that is passed to the function. The intention of the variable should be clear from the source enumerable anyway, which is why you should pay special attention to the name of that enumerable. Have a look at this example:
The type and intention of the variable become clear from its source enumerable, we save a few characters, and we can even see directly which variable belongs to the lambda function.
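A short sketch with hypothetical Customer and Invoice records, where the well-named source unpaidInvoices carries the meaning and the lambda variables stay short:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types, used only for illustration.
public record Customer(string Name, bool IsPremium);
public record Invoice(Customer Customer, decimal Amount);

public static class InvoiceQueries
{
    public static List<string> PremiumCustomerNames(IEnumerable<Invoice> unpaidInvoices) =>
        unpaidInvoices
            .Select(i => i.Customer)    // i: an Invoice
            .Where(c => c.IsPremium)    // c: a Customer
            .Select(c => c.Name)
            .Distinct()
            .ToList();
}
```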
Every method in your project that expects some kind of sequence should declare that it expects an IEnumerable. Never require a List or another concrete type when you don’t really need it (callers can still pass a List to the method). The only exceptions I found in my projects are Dictionaries. Also, when you explicitly want an already enumerated IEnumerable, you can require a specific type to your liking.
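As an illustration, the hypothetical Statistics class below has one method that accepts any sequence and one that explicitly asks for a materialized collection. Choosing IReadOnlyList<T> for the second one is just one possible option:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Statistics
{
    // Accepts a List<T>, an array, a HashSet<T>, or a lazy query alike.
    public static decimal Total(IEnumerable<decimal> amounts) =>
        amounts.Sum();

    // Needs Count and index access, so it asks for an already enumerated collection.
    public static decimal MiddleOrZero(IReadOnlyList<decimal> sortedAmounts) =>
        sortedAmounts.Count == 0
            ? 0m
            : sortedAmounts[sortedAmounts.Count / 2];
}
```

A caller holding a List<decimal> can pass it to both methods without any conversion.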
You should establish a rule to never pass null as an IEnumerable. You can pass an empty Enumerable.Empty<T>() instead. This way, you will drastically reduce NullReferenceExceptions, and your code will not break. Normally, you don’t even need to handle empty sequences separately, as your LINQ queries work with them as well.
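A minimal sketch of this rule, using a hypothetical TagRepository with made-up data:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TagRepository
{
    // Made-up sample data for this sketch.
    private static readonly Dictionary<string, List<string>> TagsByArticle = new()
    {
        ["linq-readability"] = new List<string> { "csharp", "linq" },
    };

    // Unknown articles yield an empty sequence, never null.
    public static IEnumerable<string> GetTags(string articleId) =>
        TagsByArticle.TryGetValue(articleId, out var tags)
            ? tags
            : Enumerable.Empty<string>();

    // No null check needed: the query simply runs over zero elements.
    public static int CountLongTags(string articleId) =>
        GetTags(articleId)
            .Count(t => t.Length > 4);
}
```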
Immutability is an extremely important concept for clean coding. It basically means that, by design, you cannot change the state of any object. Sometimes, of course, you technically can, but you must not! Especially in your Select() queries, you have to take care not to change any state of your source. This way, you make sure you can enumerate the query multiple times without changing the outcome. Have a look at the following example:
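The problematic pattern can be sketched as follows. The StreamContents class is hypothetical, and the behavior described in the comments is what I would expect for MemoryStream sources:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class StreamContents
{
    // Problematic: the Select mutates its source by disposing every stream.
    public static IEnumerable<string> ReadAll(IEnumerable<Stream> streams) =>
        streams
            .Select(s =>
            {
                using (s)   // side effect: the source element is disposed here
                {
                    var copy = new MemoryStream();
                    s.CopyTo(copy);   // fails on an already disposed stream
                    return Encoding.UTF8.GetString(copy.ToArray());
                }
            });
}
```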
If you enumerate this for the first time, the streams are read and then disposed, which changes the state of the source! If you try to enumerate it a second time, you will get an ObjectDisposedException, and it is not obvious where you went wrong.
Last but not least, you should keep in mind how your LINQ query performs. I admit, this point may reduce the readability for the sake of performance, which is why I put it in parentheses.
It can easily happen that you iterate through whole IEnumerables a few more times than you really need to. Many LeetCode exercises are about optimizing iterations and thereby reducing time and space complexity. A good question to ask yourself when you want to optimize a query is:
Can I use a HashSet or a Dictionary here?
Oftentimes, you can reduce the time complexity just by using one of those, as their lookups take O(1) time on average.
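For example, a filter against a block list could be sketched in two ways. The CustomerFilter class and its methods are hypothetical:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CustomerFilter
{
    // O(n * m): List.Contains scans the whole block list for every customer.
    public static List<string> AllowedSlow(IEnumerable<string> customers, List<string> blocked) =>
        customers
            .Where(c => !blocked.Contains(c))
            .ToList();

    // O(n + m) on average: build the set once, then every lookup is O(1) on average.
    public static List<string> AllowedFast(IEnumerable<string> customers, IEnumerable<string> blocked)
    {
        var blockedSet = new HashSet<string>(blocked);
        return customers
            .Where(c => !blockedSet.Contains(c))
            .ToList();
    }
}
```

The price is the extra memory for the set, which is usually a good trade for larger inputs.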
Another way to improve the performance of a LINQ query is to call .AsParallel(). This turns your LINQ query into a PLINQ query, which can run in parallel on multiple CPU cores. Here you should pay special attention to immutability and not depend on other changing state, as parallelization is only effective for independent operations.
You should use PLINQ only for very long-running queries, as its overhead can slow down queries that would otherwise be quicker.
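A small sketch of a CPU-bound query with independent elements, where PLINQ is a reasonable candidate. The PrimeCounter class is hypothetical, and whether the parallel version is actually faster depends on the input size and the machine, so measure before you commit to it:

```csharp
using System.Linq;

public static class PrimeCounter
{
    // Pure function without shared state, so it is safe to call in parallel.
    private static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (var d = 2; (long)d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    public static int CountSequential(int upTo) =>
        Enumerable.Range(0, upTo)
            .Count(IsPrime);

    // Same query; AsParallel() lets PLINQ spread the work across cores.
    public static int CountParallel(int upTo) =>
        Enumerable.Range(0, upTo)
            .AsParallel()
            .Count(IsPrime);
}
```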