Jaycent Drysdale

File System search via LINQ to Objects

LINQ provides a standard way for developers to query data in diverse locations, ranging from in-memory objects to XML data to relational data living in a SQL Server database. Let's take a look at a scenario where we use LINQ to Objects to query a directory on the local drive for files that match a given extension, and show the results in a DataGridView control.

Here is the plan of action for the application:

  1. Create a function to return a list of the files in a directory that match a given extension. We will call this method GetFiles; it takes two string parameters, the base search directory and the file extension, and returns a strongly typed list of FileInfo objects.
  2. Use LINQ to Objects to query the returned list and sort the results by name.
  3. Bind the results to a grid.

Let's start by taking a look at the GetFiles function:

// Requires: using System.IO;
public static System.Collections.Generic.List<FileInfo> GetFiles(string sPath, string sFileExtension)
{
    // Search sPath and all of its subdirectories for files ending in sFileExtension (for example ".doc").
    DirectoryInfo _dirInfo = new DirectoryInfo(sPath);
    return System.Linq.Enumerable.ToList(_dirInfo.GetFiles(string.Format("*{0}", sFileExtension), SearchOption.AllDirectories));
}

The GetFiles() function uses objects from the System.IO namespace to do the heavy lifting of searching the file system. Results are returned in a strongly typed list of FileInfo Objects.
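
To see the recursive search on its own, here is a minimal console sketch. The temporary folder and the file names in it (a.doc, sub\b.doc, notes.txt) are made up for illustration, and CreateSampleTree is a hypothetical helper, not part of the original application. GetFiles behaves like the function above, written with the ToList() extension method.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class GetFilesDemo
{
    // Same behavior as the GetFiles function in the post.
    public static List<FileInfo> GetFiles(string sPath, string sFileExtension)
    {
        DirectoryInfo dirInfo = new DirectoryInfo(sPath);
        return dirInfo.GetFiles("*" + sFileExtension, SearchOption.AllDirectories).ToList();
    }

    // Hypothetical helper: builds a throwaway folder tree and returns its root path.
    public static string CreateSampleTree()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(Path.Combine(root, "sub"));
        File.WriteAllText(Path.Combine(root, "a.doc"), "a");
        File.WriteAllText(Path.Combine(root, "sub", "b.doc"), "b");
        File.WriteAllText(Path.Combine(root, "notes.txt"), "n");
        return root;
    }

    public static void Main()
    {
        string root = CreateSampleTree();
        try
        {
            // Lists a.doc and sub\b.doc; notes.txt does not match the pattern.
            foreach (FileInfo file in GetFiles(root, ".doc"))
                Console.WriteLine(file.FullName);
        }
        finally
        {
            // Clean up the sample folder and everything in it.
            Directory.Delete(root, true);
        }
    }
}
```

SearchOption.AllDirectories is what makes the file in the sub folder show up; with SearchOption.TopDirectoryOnly only a.doc would be returned.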

Next we will look at the code that calls the GetFiles method defined above and uses LINQ to query and sort the resulting list. You would typically put this code in the click event handler for a button control on a Windows form.

//get all files contained in the path supplied by the user
System.Collections.Generic.List<FileInfo> _theFiles = GetFiles(@"c:\myDirectory", ".doc");

//we now have a list of files...next we use LINQ to query the file list and sort the results by name
var _files = from file in _theFiles
             orderby file.Name
             select file;

this.dataGridView1.DataSource = _files.ToList();

And there you have it. Note the use of the orderby clause to sort the results before you bind the results to the grid. 
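
If the extension check is to happen in the LINQ query itself rather than in the search pattern, a where clause can be added. One reason to do so: on .NET Framework, the documentation for GetFiles notes that a pattern with a three-character extension such as "*.doc" also returns files whose extension merely begins with it (for example .docx). The sketch below is one possible way to write such a query, not the code of the original application. FilterAndSort is a hypothetical helper, and the file names are invented; FileInfo does not need the files to exist for Name and Extension to work.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileQueryDemo
{
    // Keeps only files whose extension matches exactly (ignoring case),
    // sorted by file name.
    public static List<FileInfo> FilterAndSort(IEnumerable<FileInfo> files, string extension)
    {
        var query = from file in files
                    where file.Extension.Equals(extension, StringComparison.OrdinalIgnoreCase)
                    orderby file.Name
                    select file;
        return query.ToList();
    }

    public static void Main()
    {
        // Invented file names, used only to have something to query.
        var files = new List<FileInfo>
        {
            new FileInfo("report.DOC"),
            new FileInfo("summary.docx"),
            new FileInfo("agenda.doc"),
            new FileInfo("notes.txt")
        };

        // Prints agenda.doc, then report.DOC.
        foreach (FileInfo file in FilterAndSort(files, ".doc"))
            Console.WriteLine(file.Name);
    }
}
```

Because the result is an ordinary sequence, the other standard query operators still apply. For example, replacing the orderby clause with "orderby file.LastWriteTime descending" and appending .Take(10) to the query would keep only the ten most recently modified files.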

Posted: Feb 15 2008, 11:21 AM by jaycent | with 26 comment(s) |

Comments

Cyril Gupta said:

Haven't you filtered only for .doc files? Hmm... Since all the extensions are the same, how does sorting on extensions help? Did I miss something here?

# February 15, 2008 1:34 PM

jaycent said:

Cyril, you are correct: sorting by file extension would be meaningless in this particular scenario. Sorting by name or last modified date would be a better option. Thanks for your feedback!

# February 15, 2008 2:02 PM

Judah Himango said:

FYI, calling ToLower() will allocate a new string. Since this is called for every file in the directory, this could get expensive.

A more efficient way would be to write the query like so:

var _files = from file in _theFiles
             where file.Extension.Equals(".doc", StringComparison.InvariantCultureIgnoreCase)
             orderby file.Name
             select file;

# February 15, 2008 2:09 PM

Chris Pietschmann said:

Nice, but I wonder how this scales when there are thousands of files in the folder. I would think it still performs nicely, but have you done any performance testing?

# February 15, 2008 2:46 PM

tobsen said:

Chris, somebody said "performance isn't an issue until it is an issue." Since it isn't an issue, don't worry about performance ;)

# February 15, 2008 4:56 PM

Andemann said:

This is like the hello world of Linq.

(and if you want performance, an enterprise or desktop search tool is what you're looking for)

# February 16, 2008 2:05 AM

Andrey Shchekin said:

Your GetFiles function looks strange to me. Why not:

protected IList<FileInfo> GetFiles(string sPath)
{
     try
     {
           var _dirInfo = new DirectoryInfo(sPath);
           return _dirInfo.GetFiles("*.*", SearchOption.AllDirectories);
     }
     catch (Exception ex)
     {
           MessageBox.Show(ex.Message.ToString());
           return new FileInfo[0]; // every code path must return a value
     }
}

Even this way it is bad, since functions should not translate exception to message boxes, especially if it is a linq provider function to be used anywhere.

# February 16, 2008 3:32 AM

jaycent said:

To Andemann and Chris: The intent is not to provide an efficient file search tool, but to demonstrate how one could use LINQ to query a list of objects, filter the list, sort the items, and bind the results to a grid.

To Andrey Shchekin: The MessageBox.Show code in the catch block for the GetFiles() function is a good way to get my point across. In production code, you would perhaps want to re-throw the error, or return a status code to the calling app, or something to that effect. And yes, the code in the GetFiles function could have been written otherwise...the code for that function was written about 2 years ago when type inference wasn't around. But thanks for your suggestion!

# February 16, 2008 6:38 AM

Andrew Robinson said:

Never "re-throw" exceptions. Just "throw". You will preserve your stack trace.

# February 16, 2008 7:45 AM

Will Asrari said:

Hey Andrew!

@Andrey: why not take it one step further and write like:

try
{
   return new DirectoryInfo(path).GetFiles("*.*", SearchOption.AllDirectories);
}
catch (Exception exception)
{
   throw;
}

# February 16, 2008 3:18 PM

Andrey Shchekin said:

@Will, in this case you should remove try catch as well. ;)

# February 16, 2008 4:15 PM

jaycent said:

Hey Andrew, thanks for pointing that out (rethrowing exceptions). Old habits die hard I suppose :)

# February 19, 2008 10:04 AM

James Curran said:

While we are on the subject, you're using Linq on one end of this, but not the other.  We can reduce GetFiles to:

public static System.Collections.Generic.List<FileInfo> GetFiles(string sPath)
{
    DirectoryInfo _dirInfo = new DirectoryInfo(sPath);
    return System.Linq.Enumerable.ToList(_dirInfo.GetFiles("*.*", SearchOption.AllDirectories));
}

Note, also, that you explicitly use System.Collections.Generic, thereby assuming that it's not in a "using" (even though Visual Studio automatically inserts it), but DO assume that there's a "using System.IO;"  (which must be manually added)

# February 20, 2008 8:29 AM

Glenn said:

To Andrey Shchekin:

What are the benefits of using IList rather than List?

# August 20, 2008 4:03 AM

ashutosh said:

It is really good.

# December 11, 2008 7:15 AM

Richard Thomas said:

Nice example, thanks!

However, it doesn't seem to scale very well.  Trying to get a list of files (over 600) by descending LastWriteTime takes forever or doesn't complete at all:

Dim di As New DirectoryInfo(path)
Dim result = (From files In di.GetFiles(wildCard) _
              Order By files.LastWriteTime Descending _
              Select files.Name, files.LastWriteTime).First()

For Each myFile In result
    Console.WriteLine("{0} {1}", result.Name, result.LastWriteTime.ToString)
Next

What I was trying to do is get the file with the latest LastWriteTime.  Any LINQ or non-LINQ ideas?

Thanks.

# August 26, 2009 11:32 AM

john kelvin said:

This is helpful.

Can you select the top 10 files?

That would be great.

# February 22, 2012 4:08 PM
