Contents tagged with .NET
-
Porting a C# Windows application to Linux
I own a Windows application. To expand our customer base, we need to create a Linux edition. Anticipating this demand, we previously decided to place the majority of the logic in a few .net standard libraries, and this has paid off well. However, there are still a few things we need to do so that the same code works on both Windows and Linux.
- Path separator is different between Windows and Linux. Windows uses “\” as separator while Linux uses “/” as separator. The solution is to always use Path.Combine to concatenate paths. Similarly, use Path.GetDirectoryName and Path.GetFileName to split the paths.
- Linux file system is case sensitive. The solution is to be very consistent with path names and always use constants when a path is used in multiple places.
- In text files, Windows uses \r\n to end lines while Linux uses \n. The solution is to use TextReader.ReadLine and TextWriter.WriteLine. TextReader.ReadLine reads Windows text files correctly on Linux and vice versa. If we have to handle line-ending characters explicitly, use Environment.NewLine.
- Different locations for program files and program data. By default, Windows stores programs in the “c:\Program Files” folder and program data in “c:\ProgramData”. The exact locations can be determined from the %ProgramFiles% and %ProgramData% environment variables. Linux has a different convention: one often installs programs under /opt and writes program data under /var. For a complete reference, see: http://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/. This is an area where we have to branch the code, detecting the operating system with RuntimeInformation.IsOSPlatform.
- Lack of registry in Linux. The solution is to just use configuration files.
- Windows has services while Linux has daemons. The solution is to create a Windows Service application on Windows and a console application on Linux. RedHat has a good article on creating a Linux daemon in C#: https://developers.redhat.com/blog/2017/06/07/writing-a-linux-daemon-in-c/. For additional information on systemd, also see: https://www.digitalocean.com/community/tutorials/understanding-systemd-units-and-unit-files.
- Packaging and distribution. Windows applications are usually packaged as an msi or a Chocolatey package. Linux applications are usually packaged as an rpm or deb. This will be the subject of another blog post.
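A few of the points above can be sketched in code. This is only a minimal illustration, not code from my product: the class name PlatformPaths, the method names, and the choice of /var/lib as the Linux data root are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class PlatformPaths
{
    // Returns a per-application data directory.
    // The /var/lib location is an assumed convention for this sketch.
    public static string GetDataDirectory(string appName)
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            // Resolves to the folder behind %ProgramData% on Windows.
            string programData = Environment.GetFolderPath(
                Environment.SpecialFolder.CommonApplicationData);
            return Path.Combine(programData, appName);
        }
        return Path.Combine("/var", "lib", appName);
    }

    // Builds a path without hard-coding "\" or "/".
    public static string GetLogFile(string appName)
    {
        return Path.Combine(GetDataDirectory(appName), "logs", "app.log");
    }
}
```

Calling PlatformPaths.GetLogFile("MyApp") (with "MyApp" as a hypothetical product name) yields a path with the correct separators on either platform, and Path.GetFileName on the result returns "app.log" on both.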
-
Building .net core on an unsupported Linux platform
Introduction
I need to port a product that I own from Windows to Amazon Linux. However, Amazon Linux is not a platform supported by Microsoft for running .net core. Although there is an Amazon Linux 2 image with .net core 2.1 preinstalled and it is possible to install the CentOS version of .net core on Amazon Linux 1, I went on a journey to build and test .net core on Amazon Linux myself, to have confidence that my product will not hit a wall.
.net core requires LLVM 3.9 to build. However, we can only get LLVM 3.6.3 from the yum repository, so we have to build LLVM 3.9. LLVM 3.9 in turn needs a newer CMake (I used 3.11), but we can only get CMake 2.8.12 from the yum repository. So we have to start by building CMake.
Building CMake
The procedure to build CMake can be found in https://askubuntu.com/questions/355565/how-do-i-install-the-latest-version-of-cmake-from-the-command-line.
Here is what I did:
sudo yum groupinstall "Development Tools"
sudo yum install swig python27-devel libedit-devel
version=3.11
build=1
mkdir ~/temp
cd ~/temp
wget https://cmake.org/files/v$version/cmake-$version.$build.tar.gz
tar -xzvf cmake-$version.$build.tar.gz
cd cmake-$version.$build/
./bootstrap
make -j4
sudo make install
Building Clang and LLVM
With CMake installed, we can build LLVM. My procedure of building Clang and LLVM is similar to the procedure in https://github.com/dotnet/coreclr/blob/master/Documentation/building/buildinglldb.md.
Please also refer to https://releases.llvm.org/3.9.1/docs/CMake.html for additional information.
cd $HOME
git clone http://llvm.org/git/llvm.git
cd $HOME/llvm
git checkout release_39
cd $HOME/llvm/tools
git clone http://llvm.org/git/clang.git
git clone http://llvm.org/git/lldb.git
cd $HOME/llvm/tools/clang
git checkout release_39
cd $HOME/llvm/tools/lldb
git checkout release_39
Before we start building, we need to patch the LLVM source code to add the Amazon Linux triplet. Otherwise, LLVM cannot find the C++ compiler on Amazon Linux.
To patch, open the file ./tools/clang/lib/Driver/ToolChains.cpp and find an array that looks like:
"x86_64-linux-gnu", "x86_64-unknown-linux-gnu", "x86_64-pc-linux-gnu",
"x86_64-redhat-linux6E", "x86_64-redhat-linux", "x86_64-suse-linux",
"x86_64-manbo-linux-gnu", "x86_64-linux-gnu", "x86_64-slackware-linux",
"x86_64-linux-android", "x86_64-unknown-linux"
Append "x86_64-amazon-linux" to the last line.
Similarly, append "i686-amazon-linux" to the array that contains "i686-montavista-linux", "i686-linux-android", "i586-linux-gnu".
Now we can build:
mkdir -p $HOME/build/release
cd $HOME/build/release
cmake -DCMAKE_BUILD_TYPE=release $HOME/llvm
make -j4
sudo make install
Building CoreCLR and CoreFx
With Clang/LLVM 3.9 installed, we can now build CoreCLR and CoreFx.
https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/developer-guide.md
We need to install the prerequisites first:
sudo yum install lttng-ust-devel libunwind-devel gettext libicu-devel libcurl-devel openssl-devel krb5-devel libuuid-devel libcxx
sudo yum install redhat-lsb-core cppcheck sloccount
mkdir ~/git
cd ~/git
git clone https://github.com/dotnet/coreclr.git
git clone https://github.com/dotnet/corefx.git
Go to each directory and check out a version, for example:
git checkout tags/v2.0.7
Now just follow https://github.com/dotnet/coreclr/blob/master/Documentation/building/linux-instructions.md to build.
./clean.sh -all
./build.sh -RuntimeOS=linux
./build-tests.sh
Also look at: https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/developer-guide.md and
https://github.com/dotnet/corefx/issues/22509
Conclusions
With the steps above, I was able to build and test .net core on Amazon Linux 1 and 2.
Note that .net core requires GLIBC_2.14 to run. To find the version of GLIBC on your version of Amazon Linux, run:
strings /lib64/libc.so.6 | grep GLIBC
If you don’t see 2.14 on the list, .net core will not run. Try “sudo yum update” to see if you can update to a later version of GLIBC.
Additionally, since many newer programming languages are built on LLVM, this exercise also allows us to build other languages that require a newer version of LLVM than the one in the yum repository.
-
Missing methods in LINQ: MaxWithIndex and MinWithIndex
The LINQ library has Max methods and Min methods. However, sometimes we are interested in the index location in the IEnumerable<T> rather than the actual values. Hence the MaxWithIndex and MinWithIndex methods.
These methods return a Tuple. The first item of the Tuple is the maximum or minimum value, just like the result of the Max and Min methods. The second item of the Tuple is the index location.
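To make the shape of the result concrete, here is a minimal sketch of how such methods could be written. It is an illustration only and not necessarily how the SkyLinq package implements them; the class name WithIndexSketch and the choice to return the first occurrence on ties are assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class WithIndexSketch
{
    // Returns (maximum value, index of its first occurrence).
    public static Tuple<T, int> MaxWithIndex<T>(this IEnumerable<T> source)
        where T : IComparable<T>
    {
        return Extreme(source, (candidate, best) => candidate.CompareTo(best) > 0);
    }

    // Returns (minimum value, index of its first occurrence).
    public static Tuple<T, int> MinWithIndex<T>(this IEnumerable<T> source)
        where T : IComparable<T>
    {
        return Extreme(source, (candidate, best) => candidate.CompareTo(best) < 0);
    }

    // Single pass over the sequence, remembering the best value and its index.
    private static Tuple<T, int> Extreme<T>(IEnumerable<T> source, Func<T, T, bool> isBetter)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        bool found = false;
        T best = default(T);
        int bestIndex = -1;
        int index = 0;
        foreach (T item in source)
        {
            if (!found || isBetter(item, best))
            {
                best = item;
                bestIndex = index;
                found = true;
            }
            index++;
        }
        if (!found) throw new InvalidOperationException("Sequence contains no elements.");
        return Tuple.Create(best, bestIndex);
    }
}
```

For the sequence 3, 9, 2, 9, MaxWithIndex returns the value 9 at index 1 and MinWithIndex returns the value 2 at index 2.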
As usual, you can get my LINQ extensions from NuGet:
PM>Install-Package SkyLinq.Linq
Usage examples are in the unit tests.
-
A simple LINQPad query host
I am a big fan of LINQPad. I use LINQPad routinely during my work to test small, incremental ideas. I use it so much that I bought myself a premium license.
I have always wished I could run queries designed in LINQPad in my own program. Before 4.52.1 beta, there was only a command line interface. In LINQPad v4.52.1 beta, there is finally a Util.Run method that allows me to run LINQPad queries in my own process. However, I felt that I did not have sufficient control over how the results are dumped. So I decided to write a simple host myself.
As in the example below, a .linq file starts with an xml meta data section followed by a blank line and then the query or the statements.
<Query Kind="Expression">
  <Reference>&lt;RuntimeDirectory&gt;\System.Web.dll</Reference>
  <Reference>&lt;ProgramFilesX86&gt;\Microsoft ASP.NET\ASP.NET MVC 4\Assemblies\System.Web.Mvc.dll</Reference>
  <Namespace>System.Web</Namespace>
  <Namespace>System.Web.Mvc</Namespace>
</Query>

HttpUtility.UrlEncode("\"'a,b;c.d'\"")
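The file layout described above (XML header, blank line, query text) can be split with a few lines of code. The following is a minimal sketch and not the code of my host; the class name LinqFileSketch is hypothetical, and it assumes the XML header itself contains no blank line.

```csharp
using System;
using System.Xml.Linq;

public static class LinqFileSketch
{
    // Splits the text of a .linq file into its XML header and the query body.
    public static (XElement Header, string Query) Split(string text)
    {
        // Normalize Windows line endings so the blank line is always "\n\n".
        string normalized = text.Replace("\r\n", "\n");
        int blank = normalized.IndexOf("\n\n", StringComparison.Ordinal);
        if (blank < 0)
            throw new FormatException("No blank line found after the XML header.");

        XElement header = XElement.Parse(normalized.Substring(0, blank));
        string query = normalized.Substring(blank + 2).Trim();
        return (header, query);
    }
}
```

After the split, header.Attribute("Kind") gives the query kind, and header.Elements("Reference") and header.Elements("Namespace") give the assembly references and namespace imports that the compiler needs.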
The article “http://www.linqpad.net/HowLINQPadWorks.aspx” on the LINQPad website gives good information on how LINQPad compiles and executes queries. LINQPad uses CSharpCodeProvider (or VBCodeProvider) to compile queries. Although I was tempted to use Roslyn like ScriptCS, I decided to use CSharpCodeProvider to ensure compatibility with LINQPad.
We only need 3 lines of code to use the LINQPad host:
using LINQPadHost;
...
string file = @"C:\Users\lichen\Documents\LINQPad Queries\ServerUtility.linq";
Host host = new Host();
host.Run<JsonTextSerializer>(file);
As I mentioned at the beginning, I would like to control the dumping of the results. JsonTextSerializer is one of the three serializers that I supplied. The other two serializers are IndentTextSerializer and XmlTextSerializer. Personally, I find JsonTextSerializer and IndentTextSerializer the most useful.
The source code could be found here.
Examples could be found here.
-
Why every .net developer should learn some PowerShell
It has been 8 years since PowerShell v.1 was shipped in 2006. I had not looked into PowerShell closely except for using it in the NuGet Console. Recently, I was forced to have a much closer look at PowerShell because we use a product that exposes its only interface in PowerShell.
Then I realized that PowerShell is such a wonderful product that every .net developer should learn some. Here are some reasons:
- PowerShell is a much better language than the DOS batch language. PowerShell is a real language with variables, conditions, looping and function calls.
- According to Douglas Finke in Windows Powershell for Developers by O’Reilly, PowerShell is a stop ship event, meaning no Microsoft server products ship without a PowerShell interface.
- PowerShell now has a pretty good Integrated Scripting Environment (ISE). We can create, edit, run and debug PowerShell scripts. Microsoft has released OneScript, a script browser and analyzer that can be run from PowerShell ISE.
- We can call .NET and COM objects from PowerShell. That is an advantage over VBScript.
- PowerShell has a wonderful pipeline model with which we can filter, sort and convert results. If you love LINQ, you would love PowerShell.
- It is possible to call PowerShell script from .net, even ones on a remote machine.
Recently, I had to call some PowerShell scripts on a remote server. There are many scattered pieces of information on the internet, but not many good examples. So I put a few pointers here:
- When connecting to remote PowerShell, the uri is: http://SERVERNAME:5985/wsman.
- It is possible to run PowerShell under a different credential by passing the optional credential.
- PowerShell remoting only runs in PowerShell 2.0 or later. So download the PowerShell 2.0 SDK (http://www.microsoft.com/en-us/download/details.aspx?id=2560). When installed, it actually updates the 1.0 reference assemblies. On my machine, they are in: C:\Program Files (x86)\Reference Assemblies\Microsoft\WindowsPowerShell\v1.0
So the complete code runs like:
using System;
using System.Collections.ObjectModel;           // For Collection<T>
using System.Management.Automation;             // Windows PowerShell namespace
using System.Management.Automation.Runspaces;   // Windows PowerShell namespace
using System.Security;                          // For the secure password
using Microsoft.PowerShell;

Runspace remoteRunspace = null;
//System.Security.SecureString password = new System.Security.SecureString();
//foreach (char c in livePass.ToCharArray())
//{
//    password.AppendChar(c);
//}
//PSCredential psc = new PSCredential(username, password);
//WSManConnectionInfo rri = new WSManConnectionInfo(new Uri(uri), schema, psc);
WSManConnectionInfo rri = new WSManConnectionInfo(new Uri("http://SERVERNAME:5985/wsman"));
//rri.AuthenticationMechanism = AuthenticationMechanism.Kerberos;
//rri.ProxyAuthentication = AuthenticationMechanism.Negotiate;
remoteRunspace = RunspaceFactory.CreateRunspace(rri);
remoteRunspace.Open();
using (PowerShell powershell = PowerShell.Create())
{
    powershell.Runspace = remoteRunspace;
    // AddCommand expects a command name; use AddScript if scriptText holds script text.
    powershell.AddCommand(scriptText);
    Collection<PSObject> results = powershell.Invoke();
    remoteRunspace.Close();
    foreach (PSObject obj in results)
    {
        foreach (PSPropertyInfo psPropertyInfo in obj.Properties)
        {
            Console.Write("name: " + psPropertyInfo.Name);
            Console.Write("\tvalue: " + psPropertyInfo.Value);
            Console.WriteLine("\tmemberType: " + psPropertyInfo.MemberType);
        }
    }
}
-
Converted ASP Classic Compiler project from Mercurial to Git
Like some other open source project developers, I picked Mercurial as my version control system. Unfortunately, Git is winning in the Visual Studio ecosystem. Fortunately, it is possible to contact the Codeplex admin for a manual conversion from Mercurial to Git. I have done exactly that for my open source ASP Classic Compiler project. Now I can add new examples in response to forum questions and check them in using Visual Studio 2013. Now I am all happy.
-
LINQ query optimization by query rewriting using a custom IQueryable LINQ provider
In my SkyLinq open source project so far, I have tried to make LINQ better and created several extension methods (for example, see GroupBy and TopK) that are more memory efficient than the standard methods in some scenarios. However, I do not necessarily want people to bind to these extension methods directly, for the following reasons:
- Good ideas evolve. New methods may come and existing methods may change.
- There is a parallel set of methods for IEnumerable<T>, IQueryable<T> and, to some degree, IObservable<T>. Using custom LINQ extensions breaks the parallelism.
A better approach would be to write the LINQ queries normally using the standard library methods and then optimize the queries using a custom query provider. Query rewriting is not a new idea. SQL server has been doing this for years. I would like to bring this idea into LINQ.
If we examine the IQueryable<T> interface, we will find there is a parallel set of LINQ methods that accept Expression<Func> instead of Func.
There is a little C# compiler trick here. If the C# compiler sees a method that accepts Expression<Func>, it creates an expression tree from the lambda instead of compiling the lambda into a delegate. At run time, these expressions are passed to an IQueryable implementation and then to the underlying IQueryProvider implementation. The query provider is responsible for executing the expression tree and returning the results. This is how the magic of LINQ to SQL and Entity Framework works.
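The compiler trick is easy to see in a small experiment. In this sketch the same lambda text is assigned once to a Func and once to an Expression<Func>; the class and method names are made up for the illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionVsDelegate
{
    // Assigned to a delegate type: the compiler emits executable code.
    public static readonly Func<int, bool> Compiled = x => x > 5;

    // Assigned to Expression<Func<...>>: the compiler emits code that builds a tree.
    public static readonly Expression<Func<int, bool>> Tree = x => x > 5;

    // The tree can be inspected: its body is a binary "greater than" node.
    public static ExpressionType BodyKind()
    {
        return Tree.Body.NodeType;
    }

    // The tree can also be turned into a delegate at run time and executed.
    public static bool RunTree(int value)
    {
        return Tree.Compile()(value);
    }
}
```

A query provider receives trees like Tree, which it can inspect, translate or rewrite before anything is executed. That is not possible with an already compiled delegate like Compiled.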
The .net framework already has a class called EnumerableQuery<T>. It is a query provider that turns IQueryable queries into LINQ to Objects queries. In this work, I am going one step further by creating an optimizing query provider.
A common perception of writing a custom LINQ provider is that it requires a lot of code. The reason is that even for the most trivial provider we need to visit every possible expression in the System.Linq.Expressions namespace. (Note that we only need to handle the expressions that existed as of .net framework 3.5; we do not need to handle the new expressions added as part of the dynamic language runtime in .net 4.x.) There are reusable frameworks, such as the IQToolkit project, that make it easier to create a custom LINQ provider.
In contrast, creating an optimizing query provider is fairly easy. The ExpressionVisitor class already has the framework that I need. I only needed to create a subclass of ExpressionVisitor called SkyLinqRewriter and override a few methods. The query rewriter basically replaces all calls to the Queryable class with equivalent calls to the Enumerable class, and rewrites queries for optimization when an opportunity presents itself.
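To illustrate the rewriting idea, here is a deliberately tiny visitor that handles only Queryable.Where. It is a sketch under that assumption and is not the SkyLinqRewriter from the project; the class name QueryableToEnumerableRewriter and the Run helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class QueryableToEnumerableRewriter : ExpressionVisitor
{
    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType == typeof(Queryable) && node.Method.Name == "Where")
        {
            // Rewrite the source first so chained calls are handled bottom-up.
            Expression source = Visit(node.Arguments[0]);
            // Queryable passes the lambda quoted; Enumerable wants the lambda itself.
            Expression predicate = StripQuote(node.Arguments[1]);
            return Expression.Call(typeof(Enumerable), "Where",
                node.Method.GetGenericArguments(), source, predicate);
        }
        return base.VisitMethodCall(node);
    }

    private static Expression StripQuote(Expression e)
    {
        return e.NodeType == ExpressionType.Quote ? ((UnaryExpression)e).Operand : e;
    }

    // Rewrites the query's expression tree, compiles it and executes it.
    public static IEnumerable<T> Run<T>(IQueryable<T> query)
    {
        Expression rewritten = new QueryableToEnumerableRewriter().Visit(query.Expression);
        return Expression.Lambda<Func<IEnumerable<T>>>(rewritten).Compile()();
    }
}
```

For example, passing new[] { 1, 2, 3, 4 }.AsQueryable().Where(x => x % 2 == 0) to Run yields 2 and 4, but the call that actually executes is Enumerable.Where. An optimizing rewriter would use the same hook to substitute a cheaper method where it recognizes a pattern.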
It is fairly easy to consume the optimizing query provider. All we need is to call AsSkyLinqQueryable() to convert IEnumerable<T> to IQueryable<T>, and the remaining code can stay intact.
An end-to-end example can be found at https://skylinq.codeplex.com/SourceControl/latest#SkyLinq.Example/LinqToW3SVCLogExample.cs.
To conclude this post, I would recommend that we always code to IQueryable<T> instead of IEnumerable<T>. This way, the code has the flexibility to be optimized by an optimizing query provider, or to be converted from a pull-based query to a push-based query using Reactive Extensions, without rewriting any code.
-
Expression tree visualizer for Visual Studio 2013
Expression tree visualizer, as the name indicates, is a Visual Studio visualizer for visualizing expression trees. It is a must if you work with expressions frequently. Expression Tree Visualizer is a Visual Studio 2008 sample. There is a Visual Studio 2010 port available on codeplex. If you want to use it with a later version of Visual Studio, there is no port available. Fortunately, porting it to another version of Visual Studio is fairly simple:
- Download the original source code from http://exprtreevisualizer.codeplex.com/.
- Replace the existing reference to the Microsoft.VisualStudio.DebuggerVisualizers assembly with the version from the Visual Studio you want to work with. For Visual Studio 2013, I found it in C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\ReferenceAssemblies\v2.0\Microsoft.VisualStudio.DebuggerVisualizers.dll on my computer.
- Compile the ExpressionTreeVisualizer project and copy ExpressionTreeVisualizer.dll to the visualizer directory. To make it usable by one user, just copy it to My Documents\VisualStudioVersion\Visualizers. To make it usable by all users of a machine, copy it to VisualStudioInstallPath\Common7\Packages\Debugger\Visualizers.
If you do not want to walk through the process, I have one readily available at http://weblogs.asp.net/blogs/lichen/ExpressionTreeVisualizer.Vs2013.zip. Just do not sue me if it does not work as expected.
-
An efficient Top K algorithm with implementation in C# for LINQ
The LINQ library currently does not have a dedicated top K implementation. Programs usually use the OrderBy function followed by the Take function. For N items, an efficient sort algorithm would scale O(N) in space and O(N log N) in time. Since we are only interested in the top K, I believe that we could devise an algorithm that scales O(K) in space.
Among all the sorting algorithms that I am familiar with, the heapsort algorithm came to my mind first. A heap is a tree-like data structure that can often be implemented using an array. A heap can either be a max-heap or a min-heap. In the case of a min-heap, the minimum element is at the root of the tree and all the child elements are greater than their parents.
Binary heap is a special case of heap. A binary min-heap has the following characteristics:
- find-min takes O(1) time.
- delete-min takes O(log n) time.
- insert takes O(log n) time.
So here is how I implement my top K algorithm:
- Create a heap with an array of size K.
- Insert items into the heap until the heap reaches its capacity. This takes O(K log K) time.
- For each of the remaining elements, if an element is greater than find-min of the heap, do a delete-min and then insert the element into the heap.
- Then repeat delete-min until the heap is empty and arrange the deleted elements in reverse order to get the top K list.
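The steps above can be sketched compactly. This sketch is not the BinaryHeap.cs implementation from my project: it swaps in the PriorityQueue class that ships with .NET 6 and later as the min-heap, which is an assumption about the runtime, and the class name TopKSketch is made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class TopKSketch
{
    // Returns the k largest items in descending order using O(k) extra space.
    public static List<T> TopK<T>(IEnumerable<T> source, int k) where T : IComparable<T>
    {
        if (k <= 0) return new List<T>();

        // Min-heap: the smallest of the current top-k candidates sits at the root.
        var heap = new PriorityQueue<T, T>(k);
        foreach (T item in source)
        {
            if (heap.Count < k)
            {
                heap.Enqueue(item, item);            // fill the heap to capacity
            }
            else if (item.CompareTo(heap.Peek()) > 0)
            {
                heap.EnqueueDequeue(item, item);     // insert, then drop the minimum
            }
        }

        // Drain in ascending order, then reverse to get the largest first.
        var result = new List<T>(heap.Count);
        while (heap.Count > 0) result.Add(heap.Dequeue());
        result.Reverse();
        return result;
    }
}
```

For the input 5, 1, 9, 3, 7, 8 with k = 3, the heap never holds more than three items and the result is 9, 8, 7.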
The time it takes to find top K items from an N item list is:
O(N) * t1 + O((K + log N - log K) * log K) * t2
Here t1 is the time to compare an element in step 3 to find-min and t2 is the combined time of delete-min and the subsequent insert. So this algorithm is much more efficient than a call to OrderBy followed by a call to Take.
I have checked in my implementation to my Sky LINQ project. The LINQ Top and Bottom methods are in LinqExt.cs. The internal binary heap implementation is in BinaryHeap.cs. The LINQ example can be found in HeapSort.cs. The Sky Log example has also been updated to use the efficient Top K algorithm.
Note that this algorithm only combines OrderBy and Take. A preceding GroupBy still uses O(N) space, where N is the number of groups. There are probabilistic approximation algorithms that can be used to further reduce the memory footprint. That is something I could try in the future.
-
A c# implementation of duck typing
Eric Lippert’s blogs (and his old blogs) have been my source for the inner workings of VBScript and C# in the past decade. Recently, his blog on “What is ‘duck typing’?” has drawn lots of discussion and generated additional blogs from Phil Haack and Glenn Block. The discussion injected new ideas into my thoughts on programming by composition. I previously argued that interfaces are too big a unit for composition because they often live in different namespaces and only a few of the well-known interfaces are widely supported. I was in favor of lambdas (especially those that consume or return IEnumerable<T>) as the composition unit. However, as I worked through my first end-to-end example, I found that lambdas are often too small (I will blog about this in detail later).
Duck typing may solve the interface incompatibility problem in many situations. As described in Wikipedia, the name came from the following phrase:
“When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.”
If I would translate it into software engineering terms, I would translate it as:
The client expects certain behavior. If the server class has the behavior, then the client can call the server object even though the client does not explicitly know the type of the server objects.
That leads to the following implementation in a strongly-typed language like c#, consisting of:
- An interface defined by me to express my expected behaviors of duck, IMyDuck.
- An object from others that I want to consume. Let’s call its type OtherDuck.
- A function that checks if OtherDuck has all the members of IMyDuck.
- A function (a proxy factory) that generates a proxy that implements IMyDuck and wraps around OtherDuck.
- In my software, I only bind my code to IMyDuck. The proxy factory is responsible for bridging IMyDuck and OtherDuck.
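The pieces listed above can be sketched with the DispatchProxy class from System.Reflection. This is a swapped-in technique for illustration only: DispatchProxy forwards each call through reflection at run time, whereas my DuckTypingProxyFactory generates a proxy by code generation. The names DuckProxy, CanWrap and Wrap, and the Quack member, are assumptions made for the example.

```csharp
using System;
using System.Linq;
using System.Reflection;

// My expected behavior of a duck.
public interface IMyDuck { string Quack(); }

// Someone else's type; it does not implement IMyDuck.
public class OtherDuck { public string Quack() { return "Quack!"; } }

public class DuckProxy : DispatchProxy
{
    private object _target;

    // Checks that the target has a public method matching every interface method.
    public static bool CanWrap<TInterface>(object target)
    {
        return typeof(TInterface).GetMethods()
            .All(m => FindMatch(target.GetType(), m) != null);
    }

    // Creates a proxy that implements TInterface and forwards calls to the target.
    public static TInterface Wrap<TInterface>(object target) where TInterface : class
    {
        if (!CanWrap<TInterface>(target))
            throw new InvalidCastException(
                target.GetType().Name + " does not have all members of " + typeof(TInterface).Name);
        TInterface proxy = Create<TInterface, DuckProxy>();
        ((DuckProxy)(object)proxy)._target = target;
        return proxy;
    }

    // Called for every interface method invoked on the proxy.
    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        return FindMatch(_target.GetType(), targetMethod).Invoke(_target, args);
    }

    // Matches by name, parameter types and return type.
    private static MethodInfo FindMatch(Type type, MethodInfo m)
    {
        Type[] parameterTypes = m.GetParameters().Select(p => p.ParameterType).ToArray();
        MethodInfo match = type.GetMethod(m.Name, parameterTypes);
        return match != null && match.ReturnType == m.ReturnType ? match : null;
    }
}
```

With this in place, IMyDuck duck = DuckProxy.Wrap<IMyDuck>(new OtherDuck()) gives an object the rest of the code can use through IMyDuck alone, and duck.Quack() is forwarded to OtherDuck.Quack().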
This implementation takes the expected behaviors as a whole. Except for the one-time code generation for the proxy, everything else is strongly typed at compile time. It is different from the “dynamic” keyword in c#, which implements late binding at the call site.
I have checked my implementation into my Sky LINQ project under the c# project SkyLinq.Composition. The class is named DuckTypingProxyFactory.
I have also provided an example called DuckTypingSample.cs in the SkyLinq.Sample project.
So when would duck-typing be useful? Here are a few:
- When consuming libraries from companies that implement the same standard.
- When different teams in a large enterprise each implement the company’s entities in their own namespace.
- When a company refactors software (see the evolution of asp.net for example).
- When calling multiple web services.
Note that:
- I noticed the impromptu-interface project, which does the same thing, after I completed my project. Also, someone mentioned TypeMock and Castle in the comments on Eric’s blog. So I am not claiming to be the first person with the idea. I am still glad to have my own code generator because I am going to use it to extend the idea of strongly-typed wrappers as compared to data-transfer-objects.
- Also, nearly all Aspect-Oriented-Programming (AOP) and many Object-Relational-Mapping (ORM) frameworks have a similar proxy generator. In general, those in AOP frameworks are more complete because ORM frameworks are primarily concerned with properties. I did look at the Unity implementation when I implemented mine (thanks to the Patterns & Practices team!). My implementation is very bare-metal and thus easy to understand.