Monday, October 31, 2016

Dynamic Programming in 5 easy steps - Examples - Set partitioning

Please find the updated version of this post here: https://piotr.westfalewicz.com/blog/2016/10/dynamic-programming-in-5-easy-steps---examples---set-partitioning/


Let's continue Dynamic Programming series. Last time we have covered Text Justification problem, this time we will do something harder. The problem statement is as follows:
Given a set S of positive integers, divide it into two sets S1 and S2 such that the absolute difference between their sums is minimum.

If there is a set S with n elements, then if we assume S1 has m elements, S2 must have n-m elements and the value of abs(sum(S1) – sum(S2)) should be minimum.
The 5 easy steps are:
  1. define subproblems, count number of subproblems
  2. guess (part of solution), count number of choices
  3. relate subproblem solutions, compute time per subproblem
  4. recurse + memoize OR build DP table bottom-up check subproblems acyclic/topological order, time = time per subproblem · number of subproblems
  5. solve original problem: = a subproblem OR by combining subproblem solutions ⇒ extra time 
Taking an example, assume a set S={ 1, 6, 11, 5}.
Step 1. As previously, we want to divide the problem to subproblems.
How this can be done? Suffixes, prefixes or the mix of those two are usually the answer. It's natural to imagine that in the middle of processing the set looks at some point as follows .., 11, 5. At this very point we already assumed we are working on suffixes. 
In that situation we can either include the 11 in the set S1 or in the set S2. I'll say that we take or leave the number, because we have only two choices.
Related, very important aspect - we have to track only one sum! The other one is just totalSum-chosenSetSum.
Getting back to our situation, what is the "context" of the subproblem? Does it matter how we got to this point? Not at all, what matters is only the chosenSetSum. Obviously, we have to also keep track of our current item index. Here we have it, the subproblems.

Step 2. How our guess will look like? We can increase the sum and progress with the problem,
or we can just progress with the problem, assuming we leave the item and it will be in the second set.

Step 3. Now we have to glue two pieces togeter. Our DP function will have two arguments as noted above, chosenSetSum and index.
We want to optimize for the minimum difference between the sets and that's what will be returned by the DP function. All in all, the equation is as follows:
DP(chosenSetSum, i) = min from:
  • DP(chosenSetSum, i+1) //when skipping the i'th item
  • DP(chosenSetSum + inputSet[i], i+1)) //when taking the i'th item
The corner case is DP(X, inputSet.Length) = totalSum-X, because that's the end of partitioning.
The time complexity of this approach is:
time = time per subproblem · number of subproblems, so
time = O(1) * O(sum * n) = O(sum * n)
Yey, we have pseudo-polynomial time solution.

Step 4. Is it actually an acyclic graph of DP function invocations? My reasoning here is that we have two dimensions (chosenSetSum and item index) where both increase and there is an end when the second variable reaches input set cardinality, thus it's finite algorithm.

Step 5. We start with chosenSetSum=0 and at the first index, therefore our solution is at DP(0,0).

The Code

Having the 5 steps figured out, it's relatively easy to code the solution.

Pure DP algorithm coded without memorization

private class Partitioner
{
    private readonly int[] _inputSet;
    private readonly int _inputSetSum;

    public Partitioner(int[] inputSet)
    {
        _inputSet = inputSet;
        _inputSetSum = _inputSet.Sum();
    }

    public void Solve()
    {
        DP(0, 0);
    }

    private int DP(int chosenSetSum, int i)
    {
        if (i == _inputSet.Length)
            //Note: returning the difference between chosenSetSum and skippedSetSum, 
            //not the skippedSetSum (which would be _inputSetSum - chosenSetSum)
            return Math.Abs(_inputSetSum - 2 * chosenSetSum);

        var chooseCurrentNumberScore = DP(chosenSetSum + _inputSet[i], i + 1);
        var skipCurrentNumberScore = DP(chosenSetSum, i + 1);
        var minDiff = Math.Min(skipCurrentNumberScore, chooseCurrentNumberScore) ;

        return minDiff;
    }
}
How cool is that? 6 lines of code for almost working solution!

Memoization added

private class Partitioner
{
    private const int Marker = -1;
    private readonly int[] _inputSet;
    private readonly int _inputSetSum;
    private readonly int[][] _memo;

    public Partitioner(int[] inputSet)
    {
        _inputSet = inputSet;
        _inputSetSum = _inputSet.Sum();
        _memo = new int[_inputSetSum][];
        for (var i = 0; i < _inputSetSum; i++)
        {
            _memo[i] = new int[_inputSet.Length];
            for (var j = 0; j < _inputSet.Length; j++)
                _memo[i][j] = Marker;
        }
    }

    public void Solve()
    {
        DP(0, 0);
    }

    private int DP(int chosenSetSum, int i)
    {
        if (i == _inputSet.Length)
            //Note: returning the difference between chosenSetSum and skippedSetSum, 
            //not the skippedSetSum (which would be _inputSetSum - chosenSetSum)
            return Math.Abs(_inputSetSum - 2 * chosenSetSum);

        if (_memo[chosenSetSum][i] != Marker)
            return _memo[chosenSetSum][i];

        var chooseCurrentNumberScore = DP(chosenSetSum + _inputSet[i], i + 1);
        var skipCurrentNumberScore = DP(chosenSetSum, i + 1);
        var minDiff = Math.Min(skipCurrentNumberScore, chooseCurrentNumberScore);

        _memo[chosenSetSum][i] = minDiff;
        return minDiff;
    }
}

Full solution with parent pointers and examples

using System;
using System.Collections.Generic;
using System.Linq;

public class DP_SetPartition
{
    static void Main(String[] args)
    {
        PrintSolution(new int[0]);
        PrintSolution(new[] { 3 });
        PrintSolution(new[] { 3, 3 });
        PrintSolution(new[] { 3, 2 });
        PrintSolution(new[] { 1, 4, 2 });
        PrintSolution(new[] { 1, 2, 4 });
        PrintSolution(new[] { 1, 1, 5, 1, 1, 1 });
        PrintSolution(new[] { 1, 6, 11, 5 });
        PrintSolution(new[] { 1, 6, 11, 5 });
        PrintSolution(new[] { 25, 21, 20, 17, 8 });
        PrintSolution(new[] { 1, 5, 6, 10 });
        PrintSolution(new[] { 3, 1, 4, 2, 2, 1 });
        PrintSolution(new[] { 200, 200, 300, 300, 400, 400, 1000, 1000 });
        PrintSolution(new[] { 100, 100, 200, 200, 300, 300, 400, 400, 1000, 1000 });
        Console.ReadLine();
    }

    private static void PrintSolution(int[] set)
    {
        Console.WriteLine($"Set: {string.Join(", ", set)}");
        int[] set1;
        int[] set2;
        new Partitioner(set).Solve(out set1, out set2);
        Console.WriteLine($"A: {string.Join(", ", set1)} \t\t B: {string.Join(", ", set2)}, sumdiff: {Math.Abs(set1.Sum() - set2.Sum())}");
        Console.WriteLine();
    }

    private class Partitioner
    {
        private const int Marker = -1;
        private readonly int[] _inputSet;
        private readonly bool[][] _parentPointers;
        private readonly int _inputSetSum;
        private readonly int[][] _memo;

        public Partitioner(int[] inputSet)
        {
            _inputSet = inputSet;
            _inputSetSum = _inputSet.Sum();
            _memo = new int[_inputSetSum][];
            _parentPointers = new bool[_inputSetSum][];
            for (var i = 0; i < _inputSetSum; i++)
            {
                _memo[i] = new int[_inputSet.Length];
                _parentPointers[i] = new bool[_inputSet.Length];
                for (var j = 0; j < _inputSet.Length; j++)
                {
                    _memo[i][j] = Marker;
                }
            }
        }

        public void Solve(out int[] set1, out int[] set2)
        {
            DP(0, 0);
            var chosenSet = new List<int>();
            var skippedSet = new List<int>();
            var sum = 0;
            var i = 0;
            while (i != _inputSet.Length)
            {
                var choose = _parentPointers[sum][i];
                if (choose)
                {
                    chosenSet.Add(_inputSet[i]);
                    sum += _inputSet[i];
                }
                else
                {
                    skippedSet.Add(_inputSet[i]);
                }
                i++;
            }
            set1 = chosenSet.ToArray();
            set2 = skippedSet.ToArray();
        }

        private int DP(int chosenSetSum, int i)
        {
            if (i == _inputSet.Length)
                return Math.Abs(_inputSetSum - 2 * chosenSetSum);

            if (_memo[chosenSetSum][i] != Marker)
                return _memo[chosenSetSum][i];

            var chooseCurrentNumberScore = DP(chosenSetSum + _inputSet[i], i + 1);
            var skipCurrentNumberScore = DP(chosenSetSum, i + 1);
            var minDiff = skipCurrentNumberScore;
            if (chooseCurrentNumberScore < skipCurrentNumberScore)
            {
                minDiff = chooseCurrentNumberScore;
                _parentPointers[chosenSetSum][i] = true;
            }

            _memo[chosenSetSum][i] = minDiff;
            return minDiff;
        }
    }
}
It's working!

No comments:

Post a Comment