Subtracting Sets

A couple times now I’ve wanted to subtract one set (`HashSet`) from another. `HashSet` has a method ExceptWith that does just this, except that it modifies the current set in place. What if you don’t want that? You need to copy it first.

Or you could use Linq’s Except method, except that it returns `IEnumerable` instead of a `HashSet`. What if you don’t want that? You need to convert it.

So… which method is faster? Let’s find out!

static class Program
{
    public static HashSet<T> ToSet<T>(this IEnumerable<T> collection)
    {
        return new HashSet<T>(collection);
    }

    public static HashSet<T> Subtract<T>(this HashSet<T> set, IEnumerable<T> other)
    {
        var clone = set.ToSet();
        clone.ExceptWith(other);
        return clone;
    }

    static void Main(string[] args)
    {
        var A = new HashSet<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        var B = new HashSet<int> { 2, 4, 6, 8, 10 };
        var sw = new Stopwatch();

        sw.Restart();
        for (int i = 0; i < 1000000; ++i)
        {
            var C = A.Except(B).ToSet();
        }
        sw.Stop();
        Console.WriteLine("Linq: {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < 1000000; ++i)
        {
            var C = A.Subtract(B);
        }
        sw.Stop();
        Console.WriteLine("Native: {0} ms", sw.ElapsedMilliseconds);

        Console.ReadLine();
    }
}

On my computer, this outputs:

Linq: 1291 ms
Native: 756 ms

So, clearly the second method is faster. About 40% for you math types.

I should note that “clone” isn’t really a clone, it’s a shallow copy. Only the container class itself is “cloned”; the elements still refer their originals.

Posted in

Leave a comment

Leave a Reply

Your email address will not be published. Required fields are marked *