Omar AL Zabir blog on ASP.NET Ajax and .NET 3.5

Working hard to enrich millions of peoples' lives

Sponsors

News

I was
Co-Founder and CTO of Pageflakes, acquired by LiveUniverse - founded by MySpace founder.

I am
Chief Architect, SaaS Platform, British Telecom

I will be
Chief Architect, Mi...

Follow omaralzabir on Twitter

My Public Page
www.pageflakes.com/omar

View Omar AL Zabir's profile on LinkedIn

Read my blog on:

Omar AL Zabir

www.oazabir.com



Views:

Open source projects

A significant part of sql server process memory has been paged out. This may result in performance degradation

If you are using SQL Sever Server standard edition 64 bit on a Windows 2003 64bit, you will frequently encounter this problem where SQL Server says:

A significant part of sql server process memory has been paged out. This may result in performance degradation. Duration 0 seconds. Working set (KB) 25432, committed (KB) 11296912, memory utilization 0%

The number in working set and duration will vary. What happens here is SQL Server is forced to release memory to operating system because some other application or OS itself needs to allocate RAM.

We went through many support articles like:

  • 918483: How to reduce paging of buffer pool memory in the 64-bit version of SQL Server 2005
  • 905865: The sizes of the working sets of all the processes in a console session may be trimmed when you use Terminal Services to log on to or log off from a computer that is running Windows Server 2003
  • 920739: You may experience a decrease in overall system performance when you are copying files that are larger than approximately 500 MB in Windows Server 2003 Service Pack 1

But nothing solved the problem. We still have the page out problem happening every day.

The server has 16 GB RAM where 12 GB is maximum limit allocated to SQL Server. 4 GB is left to OS and and other application. We have also turned off antivirus and any large backup job. 12 GB RAM should be plenty because there's no other app running on the dedicated SQL Server box. But the page out still happens. When this happens, SQL Server becomes very slow. Queries timeout, website throws error, transactions abort. Sometimes this problems goes on for 30 to 40 minutes and website becomes slow/unresponsive during that time.

I have found what causes SQL Server to page out. File System cache somehow gets really high and forces SQL Server to trim down.

clip_image002

You see the System cache resident bytes are very high. During this time SQL Server gets much less RAM than it needs. Queries timeout at very high rate like 15 per sec. Moreover, there's high SQL Lock Timeout/sec (around 15/sec not captured in screen shot).

clip_image004

SQL Server max memory is configured 12 GB. But here it shows it’s getting less than 8 GB.

While the file system cache is really high, there’s no process that’s taking significant RAM.

clip_image006

After I used SysInternal’s CacheSet to reset file system cache and set around 500 MB as max limit, memory started to free up.

clip_image008

SQL Server started to see more RAM free:

clip_image010

Then I hit the “Clear” button to clear file system cache and it came down dramatically.

clip_image012

Paging stopped. System cache was around 175 MB only. SQL Server lock timeout came back to zero. Everything went back to normal.

So, I believe there's either some faulty driver or the OS itself is leaking file system cache in 64bit environment.

What we have done is, we have a dedicated person who goes to production database servers every hour, runs the CacheSet program and clicks "Clear" button. This clears the file system cache and prevents it from growing too high.

There are lots of articles written about this problem. However, the most informative one I have found is from the SQL Server PSS team:

http://blogs.msdn.com/psssql/archive/2007/05/31/the-sql-server-working-set-message.aspx

UPDATE - THE FINAL SOLUTION!

The final solution is to run this program on Windows Startup:

SetSystemFileCacheSize 128 256

This sets the lower and higher limit for the System Cache. You need to run this on every windows startup because a restart will undo the cache setting to unlimited.

You can run the program without any parameter to see what is the current setting.

Download the program from this page:

http://www.uwe-sieber.de/ntcacheset_e.html

Go to the end and you will get the link to the SetSystemFileCacheSize.zip

Posted: Oct 19 2007, 08:31 PM by oazabir | with 2 comment(s)
Filed under:

Comments

Rolf said:

Thanks for such a well thought out and presented article.

We have a customer with SQL 2005 Std Ed x64 with 32GB of RAM. This issue is killing their performance.

Based on this issue I cannot see how x64 Std Ed is even a viable and stable production platform.

We'll try a few options and further research.

Thanks again.

# October 21, 2007 4:18 AM

Larry said:

I second what Rolf said.  I just walked into a customer today after they moved to x64 Standard Edition over the weekend, and they now have the very same problem.  Is there a command line version of the CacheSet tool?  Upgrading to Ent. Ed. is out of the question, but having a person hit a 'clear' button every hour is not practical.  

# October 23, 2007 10:18 PM

D Weiner said:

We had this exact problem, and it was solved via premium MS support with a hotfix related to Broadcom drivers....a bug, plain and simple.  The hotfix was from a couple of months ago, and at the time was not fully regression tested..but it solved that problem..this was after weeks of exhausting all of our resources.

FYI

# November 5, 2007 7:09 PM

P Dyball said:

D Weiner said : We had this exact problem, and it was solved via premium MS support with a hotfix related to Broadcom drivers....a bug, plain and simple

Do you have any further details - link to the hotfix etc. We have excatly the issue here with SQL 2005 Standard build 3054 x64 on Win 2003 Enterprise x64 and we use Broadcom too. I am pursuing PSS but any info you have would be greatly appreciated.

Thanks in advance for any help you can give.

# November 6, 2007 9:20 AM

P Dyball said:

Re by previous post:

For info we are currently on this level of Broadcom drivers.

Broadcom NetXtreme Gigabit Ethernet dated 28/08/2006 9.81.0.0

Broadcom BCM5708C NetXtreme II GigE (NDIS VBD Client) dated 25/05/2007 3.4.10.0

BASP Virtual Adapter dated 26/01/2007 6.2.24.0

# November 6, 2007 9:33 AM

Alfred Sehmueller said:

Same problem over here (we also have a Broadcom BCM5708C NetXtreme II GigE (NDIS VBD Client)). But I'm not sure if this is the reason.

# November 8, 2007 7:44 AM

woods said:

D Weiner  - what was the hot fix that MS supplied.

# March 18, 2008 3:29 AM

Zafer said:

We have the same issue.We have two HP DL580 G5 servers with 4 quad core CPU and 64 GB of ram running in Active-passive cluster.Both servers have the same issue.Hp has the same brand of ethernet card(Broadcom). If you tell us the hotfix or the article number we can check it out.

# June 12, 2008 9:44 AM

Paurav Chudasama said:

We use the same servers as Zafer and are facing exactly the same symptoms.

Please provide us further information on the hotfix...

Thanks

Paurav

# July 3, 2008 7:39 AM

oazabir said:

I updated the post with the final solution. Please check the bottom part of the post.

# July 3, 2008 10:23 AM

Willem said:

Does anyone know if this issue applies to the 64bit Enterprise Edition as well? We have a server, 64 bit and Enterprise edition and I noticed these messages showing up in the Event Viewer/Application Log since yesterday.

# July 18, 2008 3:26 PM

oazabir said:

It happes on Enterprise Edition as well unless you use the SetSystemFileCacheSize to cap the file cache on every windows restart.

# July 19, 2008 1:56 AM

philhege said:

We have this message appear on 32-bit instances.  Does the fix work for this flavor also?

# August 6, 2008 12:34 PM

Rem said:

What is the hotfix?  Where I can find it?

Rem

# September 2, 2008 10:34 AM

csmeutah said:

The memory/Processor/disk specs you list, is that a report I can print from SQL server?

TIA

# January 6, 2009 11:48 AM

Beat said:

@csmeutah: That is the performance monitoring management snap-in Microsoft provides (perfmon.msc, run it via Administrative Tools | Performance). Switch to "Report" view (Button 'View Report' in toolbar, or press CTRL + R) to see the numbers instead of the graphs.

# January 22, 2009 9:31 AM

Rowland Gosling said:

We've had this identical problem. The support engineer identified the Broadcom issue, so we changed the NICs to Intel. Still a huge problem!!!

We'e running the 64bit Enterprise Edition on a pair of Dell r900s with 120GB of RAM--non clustered, Windows 2003 Server R2 SP2.

The issue is still not resolved and causing us major problems in our production environment.

Interestingly enough one of the two servers seems to be ok while the other just about tanks!

# February 11, 2009 11:38 PM

Rowland Gosling said:

We'll give it a try..thanks!

# February 13, 2009 7:19 AM

dambuster said:

Nice article.

Could you tell me how you generated the stats about the memory usage, physicaldisk and sql server (gray background image)?

Sorry for the novice question but I can't find anything about it.

Thanks

# March 5, 2010 12:31 PM

oazabir said:

It's available with Windows. Control Panel -> Administrative Tools -> Performance

# March 6, 2010 12:52 AM