Module trase.tools.matching.fair_allocation

Fair allocation deals with the question: if we have a list of candidate matches, some of which may be overlapping, how do we most fairly distribute the matches? For us the "distribution" typically refers to volume.

Functions

def direct_sourcing(matches: Union[pandas.core.indexes.multi.MultiIndex, Iterable[Tuple]], source_capacity: Union[pandas.core.series.Series, dict], sink_demand: Union[pandas.core.series.Series, dict], relative_tolerance=0, run_qa=True) ‑> Tuple[pandas.core.series.Series, pandas.core.series.Series]

Allocates volumes between sinks and sources proportional to sink capacities without exhausting supply.

Suppose that we have:

  • A number of sinks, each of which has a demand
  • A number of sources which each have a (supply) capacity
  • A list matching sources to sinks ("matches")

This algorithm will allocate volume to each match in such a way that demand is met at the sinks, supply is not exhausted at the sources, and demand is distributed proportional to the capacity of the sources.

Example

Suppose we have two sources A and B, two sinks X and Y, and all are matched to all:

┌───┐        ┌───┐
│ A ├─┬──────┤ X │
│300│ │┌─────┤300│
└───┘ ││     └───┘
      ││
┌───┐ ││     ┌───┐
│ B │ └│─────┤ Y │
│150├──┴─────┤300│
└───┘        └───┘

Each sink now splits its demand among the sources proportional to the relative capacities of the sources. For example X is connected to A (300) and B (150) so it apportions 300 / (300 + 150) = 66% of its volume to A and 150 / (300 + 150) = 33% of its volume to B:

┌───┐        ┌───┐
│ A ├─┬──200─┤ X │
│300│ │┌─100─┤300│
└───┘ ││     └───┘
      ││
┌───┐ ││     ┌───┐
│ B │ └│─200─┤ Y │
│150├──┴─100─┤300│
└───┘        └───┘

Now we calculate the total demand on each source and we see that both A and B are overprovisioned. A has a total demand of 200 + 200 = 400, of which it can only supply 300 (75%). B has a total demand of 100 + 100 = 200 of which it can only supply 150 (75%).

Each sink is reduced equally: we reduce all of the flows to A to 75% of their amount and all of the flows to B to 75% of their amount:

┌───┐        ┌───┐
│ A ├─┬──150─┤ X │
│300│ │┌─ 75─┤300│
└───┘ ││     └───┘
      ││
┌───┐ ││     ┌───┐
│ B │ └│─150─┤ Y │
│150├──┴─ 75─┤300│
└───┘        └───┘

In code the above example would look like this:

>>> sources = {"A": 300, "B": 150}
>>> sinks = {"X": 300, "Y": 300}
>>> matches = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y")]
>>> allocation, _ = direct_sourcing(matches, sources, sinks)
>>> allocation
A  X    150
   Y    150
B  X     75
   Y     75

Example

This function also supports a relative tolerance. This is designed to allow for a percentage discrepancy between the volumes of the sources and the sinks.

The procedure is as follows: first, try to make the original sinks demand fit in the sources capacities. Then, reduce the sink demand by a fixed percentage and try again. Finally, take the greater of the two allocations.

Let's work through an example to demonstrate this. Suppose that the relative tolerance is 0.1 (i.e. 10%) and we have three sources and three sinks:

┌───┐     ┌───┐
│ A ├─────┤ X │  # Source is overprovisioned (500 > 100), even with the sink
│100│     |500 │  # adjusted (500 - 10% > 100)
└───┘     └───┘
┌───┐     ┌───┐
│ B ├─────┤ Y │  # Source is overprovisioned (500 > 460). However, it's OK when you
│460│     |500│  # adjust the sink (500 - 10% < 460).
└───┘     └───┘
┌───┐     ┌───┐
│ C ├─────┤ Z │  # Source is not overprovisioned (500 < 900).
│900│     |500│
└───┘     └───┘

In this example, the following matches will be made:

┌───┐     ┌───┐
│ A ├─111─┤ X │  # We seek the most volume that can be assigned. 111 is the largest
│100│     |500│  # number that fits into the tolerance, since 111 - 10% = 100
└───┘     └───┘
┌───┐     ┌───┐
│ B ├─500─┤ Y │  # Source is able to meet the demand since it's within the
│460│     |500│  # tolerance
└───┘     └───┘
┌───┐     ┌───┐
│ C ├─500─┤ Z │  # Source is able to meet unadjusted demand
│900│     |500│
└───┘     └───┘

In code the above example would look like this. Note that the function returns two dataframes. The dataframes contain volumes in different units. The first dataframe is in the units of the source volume. For example, since A was exhausted, it supplied 100 source-units. The second dataframe is in the units of the sink volume. For example, X received 111 sink-units from A.

>>> sources = {"A": 100, "B": 460, "C": 900}
>>> sinks = {"X": 500, "Y": 500, "Z": 500}
>>> matches = [("A", "X"), ("B", "Y"), ("C", "Z")]
>>> source_allocation, sink_allocation = direct_sourcing(
...     matches, sources, sinks, relative_tolerance=0.1
... )
>>> source_allocation
A  X    100
B  Y    460
C  Z    500
>>> sink_allocation
A  X    111
B  Y    460
C  Z    500

Args

matches
an iterable of pairs (source_index, sink_index) each containing a match of of a source to a sink. The values should correspond to the index of the sources and sinks respectively.
source_capacity
the capacities of the sources, which should not be exceeded.
sink_demand
the capacities of the sinks which should be met.
relative_tolerance : optional
an optional factor by which to allow the sink demands to be reduced by to increase matched volume. For example, 0.1 would allow the sink demands to reduce by up to 10%.
run_qa : optional, bool
run additional checks that sources/sinks have not been exceeded. You can skip these checks to improve the performance of the function.

Returns

A tuple of Pandas Series (source_allocation, sink_allocation) representing the volumes that have been allocated. The values in the two series will be equal unless relative_tolerance is non-zero. The index of both series is a multiindex of (source, sink) which matches the matches argument exactly, albeit with duplicates removed.

def proportional_sourcing(matches, sink_demand, source_capacity, on_zero_capacity='even')

Allocates volumes between sinks and sources proportional to sink capacities.

Suppose that we have:

  • A number of sinks, each of which has a demand
  • A number of sources which each have a (supply) capacity
  • A list matching sources to sinks ("matches")

This algorithm will allocate volume to each match in such a way that demand is met at the sinks and demand is distributed proportional to the capacity of the sources.

Example

Suppose we have two sources A and B, two sinks X and Y, and all are matched to all:

┌───┐        ┌───┐
│ A ├─┬──────┤ X │
│300│ │┌─────┤300│
└───┘ ││     └───┘
      ││
┌───┐ ││     ┌───┐
│ B │ └│─────┤ Y │
│150├──┴─────┤300│
└───┘        └───┘

Each sink now splits its demand among the sources proportional to the relative capacities of the sources. For example X is connected to A (300) and B (150) so it apportions 300 / (300 + 150) = 66% of its volume to A and 150 / (300 + 150) = 33% of its volume to B:

┌───┐        ┌───┐
│ A ├─┬──200─┤ X │
│300│ │┌─100─┤300│
└───┘ ││     └───┘
      ││
┌───┐ ││     ┌───┐
│ B │ └│─200─┤ Y │
│150├──┴─100─┤300│
└───┘        └───┘

Note that if we allocate volume in this way, the capacities of both A and B are exceeded. The function does not take this into consideration.

Args

matches
an iterable of pairs (source_index, sink_index) each containing a match of of a source to a sink. The values should correspond to the index of the sources and sinks respectively.
source_capacity
the capacities of the sources.
sink_demand
the capacities of the sinks which should be met.
on_zero_capacity
what to do if none of the sources matched to a particular sink have any capacity. One of "raise" (raise an error), "even" (distribute evenly) or "zero" (assign zero volume).

Returns

A Pandas Series representing the volumes that have been allocated. The index of the series matches the matches argument exactly, albeit with duplicates removed.