Custom configuring an Azure SSIS integration runtime

I've been doing some Power BI work recently where my customers had on-premises databases that were fairly large. After building a data model in Power BI Desktop that accessed the data directly, we published it up to PowerBI.Com, installed and configured a Power BI Data Gateway, and configured scheduled data refreshes. 

In one instance, the on-premises data was hosted by Postgres, and for the other, Sybase SQL Anywhere. 

At some point, we decided in each case that we wanted to build a data warehouse (or data mart, if you wish) and use it for the Power BI data model, rather than the source databases directly. The primary reason was that some of the tables were quite large, and most of the data didn't change, so we wanted to load it incrementally. Sales data, in this case - we just load the warehouse staging table with the past 24-48 hours of transactions and incrementally merge it into the main table.

Neither of these customers had SQL Server, but I wanted to use SSIS packages to do the loading, and use Azure SQL database as the target location. 

Now that Azure Data Factory V2 supports running SSIS packages, I delved into that configuration. Most of the docs for doing this are pretty good

However, when it comes to customizing the Azure-SSIS integration runtime, the instructions are ok but I certainly learned a few things along the way. I've done it twice now, and I'm going to show you the things I learned here. 

The Azure-SSIS integration runtime is just a docker container running Windows Server of some sort. What I mean by docker container is that it is essentially a virtual machine provisioned and started when necessary, from a template containing the OS and whatever other software it needs. You don't really get any console or remote access to this container, but you can have custom configuration steps to install other software as needed.

Be prepared for a long debug cycle though. It takes about 30-40 minutes for the runtime to start, and your custom steps happen at the very end of that. Hopefully the log files that are gathered will show you enough of what happened, or what didn't happen, or what failed, for you to fix your mistake, update your scripts, and try again. 

Your configuration happens via an Azure blog storage account, containing a main.cmd file that you write and whatever other files your main.cmd references. 

This is what my main.cmd file looks like:

@echo off
powershell.exe -ExecutionPolicy RemoteSigned -File %~dp0\InstallNetFx35.ps1
msiexec /i %~dp0\sqlany12-64bit.msi /passive /l %CUSTOM_SETUP_SCRIPT_LOG_DIR%\sqlanyclient.txt
powershell [reflection.assembly]::Load('iAnywhere.Data.SQLAnywhere.v4.5, Version=, Culture=neutral, PublicKeyToken=f222fc4333e0d400') > %CUSTOM_SETUP_SCRIPT_LOG_DIR%\sqlanyverify.log
There are two calls to Powershell and one execution of MSIEXEC, the Microsoft installer tool. I'm going to come back to the first PowerShell call in a bit.
The MSIEXEC call runs the setup for the file sqlany12-64bit.msi. The command-line options for MSIExec are here
/i installs the package file 
/passive means the install process won't stop for any user input
/l defines the logging level and location
There's also some magic tokens in this file: %~dp0 is the current drive and directory of this script. See StackOverflow for some discussions of magic tokens.
The environment variable %CUSTOM_SETUP_SCRIPT_LOG_DIR% is a folder in the docker configuration that will be copied to your Azure Storage log folder, so anything your script puts there you will see. 
In my first try, I didn't have the first powershell call (I'll let you guess what it does), and I just had the msiexec line. 
When you start the Azure SSIS Integration Runtime (from the ADF portal), after about 30-40 minutes, you'll see a folder in your Azure storage container with the name main.cmd.log. In there will be a folder for the Azure runtime you just tried to start. This screen shot shows the folders created for several attempts at starting my runtime. The first part of the folder is the name of the container and its timestamp, and the part after the ## is another timestamp. 
Inside these folders are any files that were put in the magic log directory as well as a stderr.log and stdout.log. 
This is what showed up in the sqlanyclient.txt file the first time:
=== Logging started: 10/31/2018  20:41:31 ===
Action start 20:41:31: INSTALL.
Action start 20:41:32: AppSearch.
Action ended 20:41:32: AppSearch. Return value 1.
Action start 20:41:32: LaunchConditions.
MSI (s) (08:64) [20:41:32:477]: Product: SQL Anywhere 12 Deployment -- .NET Framework 2.0 is required by this product.
.NET Framework 2.0 is required by this product.
Action ended 20:41:32: LaunchConditions. Return value 3.
Action ended 20:41:32: INSTALL. Return value 3.
MSI (s) (08:64) [20:41:32:482]: Product: SQL Anywhere 12 Deployment -- Installation failed.
MSI (s) (08:64) [20:41:32:483]: Windows Installer installed the product. Product Name: SQL Anywhere 12 Deployment. Product Version: 12.01.4403. Product Language: 1033. Manufacturer: iAnywhere Solutions. Installation success or error status: 1603.
=== Logging stopped: 10/31/2018  20:41:32 ===
The Az-SSIS docker container also failed to start, telling me that an error occured during the custom installation script. (Thanks!)
So, the setup log tells me that sqlany12-64bit.msi has a dependency on .NET Framework 2.0. 
In some ways, this error message is deceiving. I went on a half-day rabbit trail, looking for the install packages for .NET Framework 2.0. It's not hard to find, but it didn't install properly, and it was very quiet about it.
At some point, my subconsious surfaced a vague memory that .NET Framework 3.5 actually includes support for the 2.0 framework. I'm glad it did, but that wasn't the end of my rabbit trail. I tried the full installer for 3.5 and it would not work either, and not give me any log entries telling me why. 
After a bit more searching, I tried this:
DISM /Online /Enable-Feature /FeatureName:NetFx3 /All > %CUSTOM_SETUP_SCRIPT_LOG_DIR%\dism.log
My dism.log file contained this:

Deployment Image Servicing and Management tool
Version: 10.0.14393.0
Image Version: 10.0.14393.0
Enabling feature(s)
[                           0.1%                           ]
[                           1.1%                           ]
Error: 0x800f081f
The source files could not be found.
Use the "Source" option to specify the location of the files that are required to restore the feature. For more information on specifying a source location, see
The DISM log file can be found at C:\Windows\Logs\DISM\dism.log
My next try was this:
powershell Install-WindowsFeature Net-Framework-Core > %CUSTOM_SETUP_SCRIPT_LOG_DIR%\dotnetfx.log
Thankfully, I also get stderr.log, which showed this:
Install-WindowsFeature : The request to add or remove features on the
specified server failed.
Installation of one or more roles, role services, or features failed.
The source files could not be found.
Use the "Source" option to specify the location of the files that are required
to restore the feature. For more information on specifying a source location,
see Error: 0x800f081f
At line:1 char:1
+ Install-WindowsFeature Net-Framework-Core
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (@{Vhd=; Credent...Name=localh
   ost}:PSObject) [Install-WindowsFeature], Exception
    + FullyQualifiedErrorId : DISMAPI_Error__Failed_To_Enable_Updates,Microsof

Ok, in both those tries, it tells me it needs source files. Where might they be? What's actually in this virtual machine?

I decided to do some sleuthing to get a list of folders in that container:

dir /s /w /b 

This is what came back (after 40 minutes):

c:\License Terms
c:\Program Files\Attunity
c:\Program Files\Common Files
c:\Program Files\internet explorer
c:\Program Files\Microsoft Analysis Services
c:\Program Files\Microsoft Office
c:\Program Files\Microsoft OLE DB Provider for DB2
c:\Program Files\Microsoft SQL Server
c:\Program Files\Microsoft Visual Studio 10.0
c:\Program Files\SharePoint Client Components
c:\Program Files\Uninstall Information
c:\Program Files\Windows Defender
c:\Program Files\WindowsApps
c:\Program Files\WindowsPowerShell
c:\Program Files (x86)\Attunity
c:\Program Files (x86)\Common Files
c:\Program Files (x86)\internet explorer
c:\Program Files (x86)\Microsoft Analysis Services
c:\Program Files (x86)\Microsoft Help Viewer
c:\Program Files (x86)\Microsoft SQL Server
c:\Program Files (x86)\Microsoft Visual Studio 10.0
c:\Program Files (x86)\Microsoft Visual Studio 14.0
c:\Program Files (x86)\Microsoft.NET
c:\Program Files (x86)\MSECache
c:\Program Files (x86)\Uninstall Information
c:\Program Files (x86)\Windows Defender
c:\Program Files (x86)\WindowsPowerShell
c:\SSIS64Unique\License Terms

Actually, that's a heavily redacted list - there was actually 1.5 MB of folders listed there. Anyway, it's interesting to see what all is installed in this container. But I'm still not sure where the Windows source files could be. 

THen my sub-conscious reminded me that I'd seen something about customization samples somewhere. Sure enough, there is a set of sample scripts available, it's mentioned on the How To Configure Azure SSIS IR Custom Setup page. Once you plug this URL into Azure Storage Explorer, you get some scenarios that are likely quite helpful.

Hey look! A scenario that installs .NET Framework 3.5! I think that's what I want to do. 

That folder contains a Powershell script, InstallNetFx35.ps1, and a file 

The script looks like this:

if ((Get-WindowsFeature -Name Net-Framework-Core).Installed)
    Write-Output ".NET framework 3.5 has already been installed."
    if ((Install-WindowsFeature -Name Net-Framework-Core -Source (Get-Location).Path -LogPath %CUSTOM_SETUP_SCRIPT_LOG_DIR%\install.log).Success)
        Write-Output ".NET framework 3.5 has been installed successfully"
        throw "Failed to install .NET framework 3.5"

I have no idea how I would have found that cab file on my own, but I'm glad it was there. 

So after I added the first Powershell to call that script, the setup log for the SQL Anywhere MSI was happy:

=== Logging started: 11/1/2018  18:25:01 ===
Action start 18:25:01: INSTALL.
Action start 18:25:01: AppSearch.
Action ended 18:25:01: AppSearch. Return value 1.
<bunch of lines deleted>
Action ended 18:25:40: InstallFinalize. Return value 1.
Action ended 18:25:40: INSTALL. Return value 1.
MSI (s) (B4:80) [18:25:40:710]: Product: SQL Anywhere 12 Deployment -- Installation completed successfully.
MSI (s) (B4:80) [18:25:40:711]: Windows Installer installed the product. Product Name: SQL Anywhere 12 Deployment. Product Version: 12.01.4403. Product Language: 1033. Manufacturer: iAnywhere Solutions. Installation success or error status: 0.
=== Logging stopped: 11/1/2018  18:25:40 ===

My final Powershell call is a sanity check that the data provider is actually installed correctly. 

The SQL Anywhere provider is an ADO.NET Provider, so it uses ADO.NET and the .NET Framework to run in. I can use .NET reflection to instantiate the class from PowerShell, to prove that it loads ok. 

powershell [reflection.assembly]::Load('iAnywhere.Data.SQLAnywhere.v4.5, Version=, Culture=neutral, PublicKeyToken=f222fc4333e0d400') > %CUSTOM_SETUP_SCRIPT_LOG_DIR%\sqlanyverify.log

The trick is to get the string inside the Load() method correct. That's pretty easy, once you know how. 

In SSDT where I've built my SSIS package, I have my SQL Anywhere connection managers defined at the project level. So if I open the MyConnection.connmgr file (using View Code rather than View Designer), I see something like this:

The magic string is in line 5. After "ADO.NET:" the string is a .NET class name ("iAnywhere.Data.SQLAnywhere.SAConnection") and assembly information ("iAnywhere.Data.SQLAnywhere.v4.5, Version=, Culture=neutral, PublicKeyToken=f222fc4333e0d400") for where to load the class from. We use Powershell to attempt to load the assembly and it either raises an exception if the assembly is not installed or registered properly, or it succeeds. 

DTS:CreationName="ADO.NET:iAnywhere.Data.SQLAnywhere.SAConnection, iAnywhere.Data.SQLAnywhere.v4.5, Version=, Culture=neutral, PublicKeyToken=f222fc4333e0d400">

Here is the line in main.cmd:

powershell [reflection.assembly]::Load('iAnywhere.Data.SQLAnywhere.v4.5, Version=, Culture=neutral, PublicKeyToken=f222fc4333e0d400') > %CUSTOM_SETUP_SCRIPT_LOG_DIR%\sqlanyverify.log

This is what the sqlanyverify.log file contained:

GAC    Version        Location                                                
---    -------        --------                                                
True   v4.0.30319     C:\Windows\Microsoft.Net\assembly\GAC_MSIL\iAnywhere.D...

At this point, my SSIS package will be able to create the connection manager for Sybase successfully. Assuming that all my networking configuration for this container is done correctly (which I didn't touch on in this blog, but the instructions are fairly straightforward), my package should be able to run.

I'll make another blog post to show a few details around installing the Postgres ADO.NET provider in a container.

I hope this helps someone who is trying to configure their SSIS Integration Runtime with customized components. 


1 Comment

  • Nice Article
    <a href=""> DOT NET training in bangalore </a>

    <a href=""> dot net training institute in bangalore</a>

    <a href="">dot net course in bangalroe</a>

    <a href="">best dot net training institute in bangalore </a>

Comments have been disabled for this content.