Creating Zip archives in .NET (without an external library like SharpZipLib)

Overview

SharpZipLib is the best free .NET compression library, but what if you can't use it due to the GPL license? I'll look at a few options, ending with my favorite - System.IO.Packaging.

SharpZipLib is good, but there's that GPL thing

SharpZipLib includes good support for zip. I've written about it a few times, and I think it's great. Unfortunately, it's under a wacky "GPL but pretty much LGPL" license - it's GPL, but includes a clause that exempts you from the "viral" effects of the GPL:

Linking this library statically or dynamically with other modules is making a combined work based on this library. Thus, the terms and conditions of the GNU General Public License cover the whole combination. As a special exception, the copyright holders of this library give you permission to link this library with independent modules to produce an executable, regardless of the license terms of these independent modules, and to copy and distribute the resulting executable under terms of your choice, provided that you also meet, for each linked independent module, the terms and conditions of the license of that module.

Bottom line: In plain English, this means you can use this library in commercial closed-source applications.

I'm pretty sure that the reason for this odd "sort-of-GPL" license is that some of SharpZipLib is based on GPL'd Java code. However, most companies have policies which forbid or greatly restrict their use of GPL code, and for very good reason: GPL has been set up as an alternative to traditional commercial software licensing, and while it's possible to use GPL code in commercial software, it's something that requires legal department involvement. So, my bottom line is that I can't use your code due to your license.

.NET Zip Library

UPDATE: DotNetZip has been released on CodePlex, and the one issue I ran into has been fixed. I'd recommend giving this a try instead of System.IO.Packaging (as I'd originally recommended), because it's a lot easier to use.

The Zip format allows for several different compression methods, but the most common is Deflate. System.IO.Compression includes a DeflateStream class. You'd think that System.IO would include Zip, but... no. The problem is that, while System.IO.Compression.DeflateStream can compress data to a stream, it doesn't write the file headers that Zip tools need in order to read the result.
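
To see the missing headers for yourself, here's a minimal sketch that compresses a buffer with DeflateStream and checks whether the output begins with the signature found at the start of a typical non-empty Zip archive. The class and method names (RawDeflateDemo, Compress, Decompress, StartsWithZipSignature) are my own, made up for illustration; only DeflateStream, CompressionMode and MemoryStream come from the framework.

```csharp
using System.IO;
using System.IO.Compression;

public static class RawDeflateDemo
{
    // Compresses a buffer with DeflateStream and returns the raw Deflate bytes.
    public static byte[] Compress(byte[] input)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // The DeflateStream has to be closed before the result is read so
            // that it flushes its final block; leaveOpen keeps 'output' usable.
            using (DeflateStream deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    // Reverses Compress, to show that the data itself round-trips fine.
    public static byte[] Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (DeflateStream inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = inflate.Read(buffer, 0, buffer.Length)) != 0)
            {
                output.Write(buffer, 0, bytesRead);
            }
            return output.ToArray();
        }
    }

    // A typical non-empty Zip archive starts with a local file header whose
    // signature is the bytes 'P', 'K', 3, 4. Raw Deflate output has no header.
    public static bool StartsWithZipSignature(byte[] data)
    {
        return data.Length >= 4
            && data[0] == 0x50 && data[1] == 0x4B
            && data[2] == 0x03 && data[3] == 0x04;
    }
}
```

The compressed bytes decompress back to the original, but a Zip tool has nothing to go on: no file name, no sizes, no checksum, no central directory. That bookkeeping is what a Zip library has to add around the Deflate data.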

The Microsoft Interop blog posted a .NET Zip Library which adds the correct headers to the output of a System.IO.Compression DeflateStream.

ZipFile zip = new ZipFile("MyNewZip.zip");
zip.AddDirectory("My Pictures", true); // AddDirectory recurses subdirectories
zip.Save();


This works, but with some caveats. First of all, adding files causes an identical structure to be created in the zip. For instance, if I use the following:

zip.AddFile(@"C:\My Documents\Sample\File.txt");

The resulting Zip will contain File.txt, but it will be within the \My Documents\Sample\ hierarchy. There's no way to control the structure of the zip file when you add individual files, unless you want to modify the zip library (which is under the Ms-PL license). That proved to be a big problem in my case, because the zip structure I'm creating is pretty rigid. So, if you're just zipping an entire folder full of files, this library may work for you, but if you need more control you may need to modify the library. I'm guessing if this were published on CodePlex it would have been fixed a while ago.

Another, larger problem to keep in mind is that stream-based compression is much less efficient than file-based compression. File-based compression can optimize the compression used based on the content of all included files; stream-based compression compresses data as it comes in, so it can't take advantage of data it hasn't seen yet.

The J# Zip Library

J# has included zip since day one, to keep compatible with the Java libraries. So, if you're willing to bundle the appropriate Java library (specifically, vjslib.dll), you can use the zip classes in java.util.zip. It works, but it seems like a really goofy hack to distribute a 3.6 MB DLL just to support zip.

System.IO.Packaging includes Zip support

In .NET 3.0, you can use the System.IO.Packaging ZipPackage class in WindowsBase.dll. It's just 1.1 MB, and it just seems to fit a lot better than importing Java libraries. It's not very straightforward, but it does work. The "not straightforward" part comes from the fact that this isn't a generic Zip implementation; it's a packaging library for formats like XPS that happen to use Zip.

First, you'll need to find WindowsBase.dll so you can add a reference to it. If it's not on your .NET references, you'll probably find it in C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\WindowsBase.dll.

Here's a sample that creates a Zip archive and adds two files:

 

using System;
using System.IO;
using System.IO.Packaging;

namespace ZipSample
{
    class Program
    {
        static void Main(string[] args)
        {
            AddFileToZip("Output.zip", @"C:\Windows\Notepad.exe");
            AddFileToZip("Output.zip", @"C:\Windows\System32\Calc.exe");
        }

        private const int BUFFER_SIZE = 4096;

        private static void AddFileToZip(string zipFilename, string fileToAdd)
        {
            // OpenOrCreate lets repeated calls add files to the same archive.
            using (Package zip = Package.Open(zipFilename, FileMode.OpenOrCreate))
            {
                // Part names are URIs relative to the root of the package.
                string destFilename = ".\\" + Path.GetFileName(fileToAdd);
                Uri uri = PackUriHelper.CreatePartUri(new Uri(destFilename, UriKind.Relative));

                // Replace the part if it is already in the archive.
                if (zip.PartExists(uri))
                {
                    zip.DeletePart(uri);
                }

                PackagePart part = zip.CreatePart(uri, "", CompressionOption.Normal);
                using (FileStream fileStream = new FileStream(fileToAdd, FileMode.Open, FileAccess.Read))
                using (Stream dest = part.GetStream())
                {
                    CopyStream(fileStream, dest);
                }
            }
        }

        private static void CopyStream(Stream inputStream, Stream outputStream)
        {
            byte[] buffer = new byte[BUFFER_SIZE];
            int bytesRead;
            // Read returns 0 at end of stream; write only the bytes actually read.
            while ((bytesRead = inputStream.Read(buffer, 0, buffer.Length)) != 0)
            {
                outputStream.Write(buffer, 0, bytesRead);
            }
        }
    }
}

 


One weird side-effect of using the ZipPackage to create Zips is that Packages contain a content type manifest named "[Content_Types].xml". If you create a ZipPackage, it will automatically include "[Content_Types].xml", and if you try to read from a Zip file which doesn't contain a file called "[Content_Types].xml" in the root, it will fail.
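
Reading works the same way in reverse, as long as the archive was written by the packaging library. Here's a rough sketch that copies every part of such a package out to a folder. PackageExtractor and ExtractAll are hypothetical names of mine, not part of the framework, and I'm assuming targetDirectory is a non-empty path.

```csharp
using System;
using System.IO;
using System.IO.Packaging;

public static class PackageExtractor
{
    // Copies every part of a package created by System.IO.Packaging into
    // targetDirectory (assumed non-empty). Returns the number of parts written.
    public static int ExtractAll(string zipFilename, string targetDirectory)
    {
        int count = 0;
        using (Package zip = Package.Open(zipFilename, FileMode.Open, FileAccess.Read))
        {
            foreach (PackagePart part in zip.GetParts())
            {
                // Part URIs look like "/Notepad.exe": strip the leading slash
                // and map '/' to the platform's directory separator.
                string relativePath = Uri.UnescapeDataString(part.Uri.OriginalString)
                    .TrimStart('/')
                    .Replace('/', Path.DirectorySeparatorChar);
                string destination = Path.Combine(targetDirectory, relativePath);
                Directory.CreateDirectory(Path.GetDirectoryName(destination));

                using (Stream source = part.GetStream(FileMode.Open, FileAccess.Read))
                using (FileStream target = new FileStream(destination, FileMode.Create, FileAccess.Write))
                {
                    byte[] buffer = new byte[4096];
                    int bytesRead;
                    while ((bytesRead = source.Read(buffer, 0, buffer.Length)) != 0)
                    {
                        target.Write(buffer, 0, bytesRead);
                    }
                }
                count++;
            }
        }
        return count;
    }
}
```

Don't expect this to open an arbitrary Zip produced by another tool; Package.Open is where the missing manifest gets in the way.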

You'll notice that the compression in my test is not that great. In fact, it's pretty bad - Notepad.exe got bigger. Binary files don't compress nearly as well as text-based files (for example, I tested on a 55KB file and it compressed to less than 1KB), but the compression in this library doesn't appear to be fully implemented yet. For example, the CompressionOption enum includes CompressionOption.Maximum, but that setting is ignored. Normal is the best you'll get right now.
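
If you want to check how the CompressionOption values behave on your own machine and framework version, a small probe like this one can help. It's only a sketch: CompressionOptionProbe, ArchiveSize and the "probe.bin" part name are made up for the example, and the content type is just a placeholder.

```csharp
using System;
using System.IO;
using System.IO.Packaging;

public static class CompressionOptionProbe
{
    // Writes data as a single part into a fresh package using the given
    // CompressionOption and returns the size of the resulting archive in bytes.
    public static long ArchiveSize(byte[] data, CompressionOption option)
    {
        string tempFile = Path.GetTempFileName();
        try
        {
            // FileMode.Create truncates the empty temp file and starts a new package.
            using (Package zip = Package.Open(tempFile, FileMode.Create))
            {
                Uri uri = PackUriHelper.CreatePartUri(new Uri("probe.bin", UriKind.Relative));
                PackagePart part = zip.CreatePart(uri, "application/octet-stream", option);
                using (Stream dest = part.GetStream())
                {
                    dest.Write(data, 0, data.Length);
                }
            }
            return new FileInfo(tempFile).Length;
        }
        finally
        {
            File.Delete(tempFile);
        }
    }
}
```

Calling ArchiveSize with the same buffer for CompressionOption.NotCompressed, CompressionOption.Normal and CompressionOption.Maximum and comparing the three numbers should show whether Maximum buys you anything over Normal.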

Another possible reason for low compression ratios in this sample is that I'm adding files separately rather than adding several files at a time. As I mentioned earlier, Zip compression works better when it has access to the entire file or group of files when creating the archive.

You can use the packaging library for your own file format, for example to store object state by using XmlWriters to write to a Zip stream.
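
A minimal sketch of that idea, assuming a made-up format with a single "settings.xml" part: the SettingsPackage class, the part name, the element names and the "text/xml" content type are all choices I've made for the example, not anything the framework requires.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Xml;

public static class SettingsPackage
{
    // Hypothetical part name and content type, chosen for this sketch only.
    private static readonly Uri SettingsUri =
        PackUriHelper.CreatePartUri(new Uri("settings.xml", UriKind.Relative));
    private const string XmlContentType = "text/xml";

    // Saves a single name/value pair as XML inside a Zip-based package.
    // 'name' must be a valid XML element name.
    public static void Save(string packagePath, string name, string value)
    {
        using (Package package = Package.Open(packagePath, FileMode.Create))
        {
            PackagePart part = package.CreatePart(SettingsUri, XmlContentType, CompressionOption.Normal);
            // The writer is disposed first, which flushes the XML into the part stream.
            using (Stream stream = part.GetStream())
            using (XmlWriter writer = XmlWriter.Create(stream))
            {
                writer.WriteStartElement("settings");
                writer.WriteElementString(name, value);
                writer.WriteEndElement();
            }
        }
    }

    // Reads the value back from a package written by Save, or null if absent.
    public static string Load(string packagePath, string name)
    {
        using (Package package = Package.Open(packagePath, FileMode.Open, FileAccess.Read))
        using (Stream stream = package.GetPart(SettingsUri).GetStream(FileMode.Open, FileAccess.Read))
        {
            XmlDocument document = new XmlDocument();
            document.Load(stream);
            XmlNode node = document.DocumentElement.SelectSingleNode(name);
            return node == null ? null : node.InnerText;
        }
    }
}
```

The resulting file can be renamed to .zip and opened in Explorer, where you'll see settings.xml sitting next to the content type manifest.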

But where's System.IO.Zip?

That's a good question. All the Zip handling in System.IO.Packaging lives in internal classes in the MS.Internal.IO.Zip namespace. It would have been a lot more useful to implement a public System.IO.Zip which was used by System.IO.Packaging so that we could directly create and access Zip files without pretending we were creating XPS packages with manifests and Uris.

Published Thursday, October 25, 2007 1:07 AM by Jon Galloway

Comments

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

Jon, if you think a CodePlex project would be in order for the Zip library, I am happy to launch it.  

Let me know.

Thursday, October 25, 2007 11:12 AM by Dino

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

I cannot believe it has taken until .NET 3.0 for this to have been added. You have to wonder whether, if you end up having to write wrappers around this just to make the objects easier to use, people will just stick with SharpZipLib "'cause it works".

Thursday, October 25, 2007 12:04 PM by Dave

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

If I'm not mistaken, the initial (.NET 1.1 and .NET 2.0) compression libraries available in System.IO.Compression are not based on the Zip algorithm, but rather on the GZip standard. Hence the problems with compatibility in deflation.

Friday, October 26, 2007 1:09 AM by Jon Limjap

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

I can't believe the basis for this whole article is the author's terrible understanding of the GPL. I don't write GPL software, but I've certainly incorporated both modified and unmodified GPL components in several .NET applications that I've deployed within the authoring organization (i.e., NOT for resale outside the organization). If you aren't redistributing GPL components outside your organization, you don't have to honor the GPL. It's a *distribution* license, not an EULA! If you are planning to redistribute outside your org, the GPL kicks in and you need to honor the license. About sharpziplib, you wrote:

"I'm pretty sure that the reason for this odd "sort-of-GPL" license is because some of the SharpZipLib is based on some GPL's Java code."

This claim is baseless, and serves only to intentionally FUD sharpziplib. The actual terms of the license--which you quote and then completely contradict--are eminently agreeable, with no need to involve the legal department. Further, you wrote:

"...most companies have policies which forbid or greatly restrict their use of GPL code, and for very good reason: GPL has been set up as an alternative to traditional commercial software licensing, and while it's possible to use GPL code in commercial software, it's something that requires legal department involvement."

What the hell is "commercial software" in the context of these ramblings? To me, commercial software is something I want to sell to other people. Dealing with GPL code in my product isn't different than any other license I might have to honor for 3rd-party components I seek to resell, all of which I WANT MY LAWYER TO LOOK AT.

These solutions don't make a lot of sense if your goal is actually COMPRESS DATA, and seem more about not paying money than dealing with compression in a sensible way. Why not suggest licensing a 3rd party compression framework? What's the real motivation here?

Saturday, October 27, 2007 4:55 AM by Good luck

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

@Anonymous "Good luck" ranter -

I've posted a few times about SharpZipLib, including some working code samples. I've also put in a lot of work in support of Mono, by co-founding the Monoppix Linux Live CD project.

Yes, there are good external libraries (GPL, products like Xceed, etc.), and we're both happy to use them when appropriate.

If your goal is to compress data for code you're not going to distribute, the title of the post should tell you that you've come to the wrong place. This post is about creating zip archives without external libraries.

There are times when you want to distribute your application without someone else dictating your license, and that doesn't just include selling them. I'm sure that you've encountered this when you were selecting your license in your projects on CodePlex, SourceForge, Google Code, etc., right?

Your FUD comment is silly rhetoric - it's a fact that SharpZipLib is under GPL, and I was speculating on why that might be. I can't conceive of how that speculation could cause fear, uncertainty, or doubt - the fact is that it's under GPL.

Saturday, October 27, 2007 1:01 PM by Jon Galloway

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

@Jon Limjap:

Some clarifications (my ex-coworkers at Xceed know I hate when people mix compression algorithm and archive format):

- DEFLATE is a compression algorithm, which transforms data X to data Y without adding any headers or footers (other than checksums and markers to help decompress)

- GZip is an archive format, made of a header, compressed data and a footer. The header contains a field that tells what compression algorithm was used. It happens that it's ALWAYS deflate.

- Zip is another (better) archive format, also made of headers, compressed data, footers et al. The header also contains a field that tells what the compression method is. It happens that Deflate is the most popular and supported, though better compression algorithms exist.

- .NET 1.1 did not have compression

- .NET 2.0 has the DeflateStream, which compresses data using the Deflate algorithm, but does not add a header or footer.

- .NET 2.0 also has the GZipStream, which uses the DeflateStream underneath, and also creates a header and footer.

So it's not a question of compatibility of a GZip compression versus Zip compression. These FORMATS both use the SAME deflate compression, but different header/footer formats. In short, if .NET 2.0 did not have a ZipStream or equivalent, the ONLY reason is that it wasn't a priority for MS (or they think Xceed Zip is too awesome q;-) )

Saturday, October 27, 2007 2:24 PM by Martin Plante

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

Great post! I was writing the same post last week but yours is way better! Thanks for doing the homework.

Sunday, October 28, 2007 3:45 AM by Scott Hanselman

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

The DotNetZip library is now a CodePlex project.  

www.codeplex.com/DotNetZip

Wednesday, October 31, 2007 1:14 AM by Dino

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

The 1.2 release of the DotNetZip library on codeplex addresses the first concern you raised, Jon, which is that

"There's no way to control the structure of the zip file when you add individual files"

There's now a way to control the structure of the directory path with each additional file or directory you add to the archive.

www.codeplex.com/.../ProjectReleases.aspx

Friday, November 02, 2007 1:19 PM by Dino

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

Please read my article:

www.codeproject.com/.../ZipStorer.asp

Tuesday, November 27, 2007 2:11 AM by Jaime

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

What about if you use the .Net framework 2

Thursday, December 20, 2007 5:48 AM by Werner

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

I tried using DotNetZip but unfortunately it just doesn't currently have enough options. For example, it doesn't support password-protecting the file and also doesn't stream the compressed contents to disk until it is done. In my case, I'm trying to programmatically compress and encrypt  multi-gigabyte database backups but DotNetZip's limitations currently don't work for me.

I do hope though that the creators of DotNetZip continue to develop it because all the other options are just not ideal.

Friday, January 25, 2008 8:40 PM by Ken McNamee

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

How can one unzip programmatically with the "WindowsBase.dll" file?

Monday, January 28, 2008 3:50 PM by Mark Kamoski

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

The zip is OK and works. Nice. However, if one unzips the created zip file (using >Windows XP Pro, >Windows Explorer, >Right click, >Extract All), then one will note that it puts all the files into 1 output directory. That is, the folder structure is not preserved in any way. Is there a way to preserve the folder structure?

Monday, January 28, 2008 4:46 PM by Mark Kamoski

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

BTW, I could not agree more about the GPL. Furthermore, as a contractor, my clients have consistently felt the same way-- it goes too far. What is particularly disingenuous about the GPL is that they tag it as somehow "free", (as in "Free Software Foundation"), when, in fact, it is just a set of rules about what one can do, cannot do, and must do. That's not "free" at all. As Wikipedia notes "In order to preserve the freedom to use, study, modify, and redistribute free software, most free software licences carry requirements and restrictions which apply to distributors. There exists an ongoing debate within the free software community regarding the fine line between restrictions which preserve freedom and those which reduce it." Furthermore, just to spill a little more gasoline on it, this "copy left" idea, which states "that when modified versions of free software are distributed, they must be distributed under the same terms as the original software" is another joke. Let's face facts here. "Free software" is something with "NO STRINGS ATTACHED AT ALL". Period. That's about as clear as it gets. Oh well, back to topic, thanks for the article. I see the unzip stuff is here... msdn2.microsoft.com/.../ms771414(VS.85).aspx ...so that is great.

Tuesday, February 05, 2008 5:48 AM by Mark Kamoski

# re: Creating Zip archives in .NET (without an external library like SharpZipLib)

I ended up doing this trick with J# myself.  Seeing those java namespaces in my C# was peanut butter in my chocolate.

www.codepraxis.com/...

Wednesday, July 16, 2008 11:24 PM by Derek
