How to create your own lightweight parallel library the easy way
Yes, yes, I know. .NET ships with Parallel.ForEach out of the box. But let me explain: I have found myself in situations where I need a smaller piece of code that simply performs an action (work) over a set of inputs.
Parallel.ForEach tries to be very careful and is prepared to deal with many edge cases, and for me, that was just noise.
So I wondered: what if we wrote a smaller library that simply takes care of executing an action over a bunch of inputs in parallel, lets the action take care of everything (even error handling), and stays as dumb as possible?
Why? Well, this was the most common scenario for me. For example: “Send a notification to each of these users” or “Contact this service and write the result to a file for each of these entries.”
Essentially, “do this task for each of these inputs,” where the tasks (work) can be executed in parallel.
Let’s take my current job as an example of a good fit for this library. I do ETL; to describe my role in a few words: based on a config file, I query a single source system and write to some data destination.
From the config file I get, for example, a long list of tables, and each object required to process a single table is created either through a DI framework or, when needed, quickly by some factory.
With this in mind, what I need is something to help me orchestrate all that work. So far I have developed two algorithms: an aggressive one based on signals, and a passive one, which I will show here, based on the idea of a lightweight tool that simply performs one task after another.
Based on my requirements, let’s create an extension method for IEnumerable<T> that accepts an Action<T> to execute for each item. As you can see, this first version can be as simple as building a list of Actions, starting each one as a Task, and leveraging Task.WhenAll() to make sure they all run to completion.
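A minimal sketch of that first version might look like this (the method name ForEachParallel and the use of Task.Run here are my assumptions; the original code may differ in detail):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class EnumerableExtensions
{
    // Runs the given action for each item in parallel and
    // returns a Task that completes when every item is done.
    public static Task ForEachParallel<T>(this IEnumerable<T> items, Action<T> action)
    {
        // One Action per input entry, each wrapping the original action.
        var actions = items.Select(item => (Action)(() => action(item)));

        // Start each Action as a Task and keep the references in a list.
        var tasks = actions.Select(a => Task.Run(a)).ToList();

        // Task.WhenAll lets us await them all at once.
        return Task.WhenAll(tasks);
    }
}
```

Usage would then be as simple as `await userIds.ForEachParallel(id => SendNotification(id));`.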
To prepare the work, we use the Action that was passed in to create a list of Actions, one for each entry. Once each Action is set up, the next step is to run it inside a Task. To keep track of all the Tasks, we store their references in a List<Task>. This way we can leverage Task.WhenAll() to await them all. It effectively just starts the tasks in parallel and nothing else.
What about the case where we need to cap the number of Threads the program can use? Perhaps the source system is unstable and cannot handle many requests in parallel. In that case we cannot simply start all the tasks and hope for the best; we must keep track of how many are active.
For this, we change the loop that starts the Actions. Once again we initialize the list of Actions, but this time we begin by starting only a subset of them. We will name this subset workInProgress. While there is work in progress, we assume there are still active Tasks and keep the loop alive, waiting for any of those Tasks to complete, this time leveraging Task.WhenAny().
Two important things happen in that loop to make this work so well. We keep the program blocked awaiting WhenAny, and when it lets the program continue:
- We know a task has completed, so we can start a new one.
- WhenAny() also returns a reference to the completed Task, so we can simply Remove exactly that Task from workInProgress by its reference.
Once workInProgress is empty, we know every item in our input list has been processed.
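The bounded variant described above could be sketched like this (the names ForEachParallel and maxDegreeOfParallelism, and the Queue-based bookkeeping of pending work, are my assumptions):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BoundedEnumerableExtensions
{
    // Runs the action for each item, but never more than
    // maxDegreeOfParallelism tasks at a time.
    public static async Task ForEachParallel<T>(
        this IEnumerable<T> items, Action<T> action, int maxDegreeOfParallelism)
    {
        // One Action per input entry, queued until we have capacity.
        var pending = new Queue<Action>(
            items.Select(item => (Action)(() => action(item))));

        // Start only an initial subset: this is our workInProgress.
        var workInProgress = new List<Task>();
        while (workInProgress.Count < maxDegreeOfParallelism && pending.Count > 0)
            workInProgress.Add(Task.Run(pending.Dequeue()));

        // While there is work in progress, block on WhenAny. When it
        // returns, remove the completed Task by its reference and
        // start the next pending Action, if any remain.
        while (workInProgress.Count > 0)
        {
            var completed = await Task.WhenAny(workInProgress);
            workInProgress.Remove(completed);
            if (pending.Count > 0)
                workInProgress.Add(Task.Run(pending.Dequeue()));
        }
    }
}
```

When workInProgress finally empties, every item has been processed, so the method simply returns.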
As you can see, I don’t deal with errors here. In real life, I take care of logging and error handling in multiple places, thanks to patterns like IoC and factories.
I hope these ideas are useful in your own work; let’s keep the creative ideas flowing.