The problem
Many allocation problems can be reduced to the maximum bipartite matching: assigning workers to jobs, students to courses or residencies, tasks to available servers, and so on. This series of articles walks through a concrete example and then extends it to more advanced scenarios that you might encounter in the real world. I’ll assume you’re familiar with graphs and flow networks. If not, you can always learn these subjects — there is a lot of excellent material out there.
For everything I talk about here there is working code you can download and play with.
Example: matching applicants to jobs
Suppose we have a single job with these requirements:
And three applicants whose skills look like this:
Our goal is to assign applicants to roles so that the maximum number of positions is filled.
If, for a moment, we ignore how many people each role needs, the problem can be modelled as a bipartite graph:
Nodes on the left represent the applicants. Nodes on the right represent the roles. The edges that connect them signify that an applicant has the skills required for the role. For example A-1 has skill-1 and skill-2 and that’s why they are connected to S-1 and S-2.
There are numerous algorithms that find the maximum matching in bipartite graphs. Some of them do it via flow networks — the problem is reduced to finding a maximum flow in a network and then the flow is converted back into matching. For our problem I prefer the flow network approach as it allows us to easily encode that we need 2 people with skill-2 and 2 more people with skill-3. Here’s how our problem would look like in a flow network form.
The new nodes are the source and the sink. The flow emanates from the source and needs to find its way to the sink via directed edges without exceeding the capacity of any of the edges.
The edges that go from the source to the nodes that represent applicants all have a capacity of one. This represents the fact that we have only one of each of the applicants. Edges that connect applicants to the role nodes also have a capacity of one — each applicant can only do one role. If applicant-1 is assigned the role that requires skill-1 then that’s it, they can’t be assigned anything else. The capacities of edges that connect role nodes to the sink represent the number of people required for that role. For example, S-2 needs two people and so has the capacity to take the flow from both A-1 and A-2.
Once we have the flow network the goal is to find the path the flow should take in order to reach its maximum. Just as there could be several maximum matchings in a bipartite graph, there could be several maximum flows in the network. Here’s one of them:
Here’s another:
In both cases the flow is 3, meaning that all three applicants are assigned roles. It’s easy to convert the flow back to role assignments. Using the first flow: applicant-1 is assigned the role that requires skill-1; applicant-2 is assigned skill-3; applicant-3 is assigned skill-2.
Try it yourself
If you want to explore the examples above and experiment with matching applicants to jobs, you can download the code here: https://github.com/denissudak/applicant-job-matching
Continue to Part 2 that talks about how to calculate what roles are still in demand (what skills are still required to complete the team). 🙂