Archives

Archives / 2017
  • Top k algorithm revisited

    3 years ago, I implemented top K operator in LINQ. I was asked recently why I chose Min Heap since there are faster algorithms. To recap, we try to select top k element from a sequence of n elements. A min-heap has the following property:

    1. find-min takes O(1) time.
    2. extract-min takes O(ln k) time where k is the size of the heap.
    3. insert takes O(ln k) time.

    For each number in the sequence, I first compare the number to find-min. If the number is smaller, the number is tossed away. If the number is bigger, we do a extract-min followed by an insert. So in the worst scenario, the algorithm runs with time complexity of O(n ln k) and the space complexity of O(k).

    If we use max-heap instead, we can heapify n elements in O(n) time. Then we do k extract-max so we have the total time complexity of O(n + k ln n) and a space complexity of O(n).

    We could also use Quick Select. It is very similar to Quick Sort that we randomly select a pivot and move it to the right position. Unlike Quick Sort, we can discard the left side of the pivot whenever we have greater then k elements on the right side. This algorithm converge fairly quickly and we have the average time complexity of O(n) and space complexity of O(n). In average case, the space requirement by Quick Select is less than the max heap approach.

    So both max-heap and quick select are likely faster than the min-heap approach. Why do I used min-heap then? The reason is that the min-heap approach uses minimum amount of memory and I assume that I will work with large dataset so . Also, if we work with a stream, the min-heap provides a running top k.

  • Ever wonder on which platform Amazon AWS Lambda in C# is running?

    In last December, AWS announced C# support for AWS Lambda using .NET Core 1.0 runtime. Ever wonder on which platform is it running? I am curious too and I did not see it in any official documentation. So I decided to write a small AWS Lambda function to detect the platform:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;
    using System.Runtime.InteropServices;
    using Amazon.Lambda.Core;
    using Amazon.Lambda.Serialization;
     
    // Assembly attribute to enable the Lambda function's JSON input to be converted into a .NET class.
    [assemblyLambdaSerializerAttribute(typeof(Amazon.Lambda.Serialization.Json.JsonSerializer))]
     
    namespace SysInfoLambda
    {
        public class Function
        {
     
            /// <summary>
            /// A simple function that takes a string and does a ToUpper
            /// </summary>
            /// <param name="input"></param>
            /// <param name="context"></param>
            /// <returns></returns>
            public RuntimeInfo FunctionHandler(ILambdaContext context)
            {
                return new RuntimeInfo()
                {
                    FrameworkDescription = RuntimeInformation.FrameworkDescription,
                    OSArchitecture = RuntimeInformation.OSArchitecture,
                    ProcessArchitecture = RuntimeInformation.ProcessArchitecture,
                    OSDescription = RuntimeInformation.OSDescription,
                    OSPlatform = RuntimeInformation.IsOSPlatform(OSPlatform.Linux) ? OS.Linux :
                        RuntimeInformation.IsOSPlatform(OSPlatform.OSX) ? OS.OSX : OS.Windows
                };
            }
     
        }
     
        public class RuntimeInfo
        {
            public string FrameworkDescription { getset; }
            public Architecture OSArchitecture { getset; }
            public string OSDescription { getset; }
            public Architecture ProcessArchitecture { getset; }
            public OS OSPlatform { getset; }
        }
     
        public enum OS
        {
            Linux,
            OSX,
            Windows
        }
     
    }
    

    The result? The AWS C# Lambda runs in 64 bit Linux. The extract OS description is: Linux 4.4.35-33.55.amzn1.x86_64 #1 SMP Tue Dec 6 20:30:04 UTC 2016.