SharePoint statistics: the sources
The first issue in SharePoint statistics is where to find the data to base the statistics on.
Normally for a web application I grab the IIS log files and start from there. SharePoint is a different case: besides the IIS log files there are also the STS log files. The name STS dates back to SharePoint Team Services, the predecessor of Windows SharePoint Services.
IIS log files: used to log ALL activities on a web site
STS log files: used to log all activities on Windows SharePoint Services (WSS) sites, also the basis for SPS areas
There is a good reason why you need logs in two different places for SharePoint: although all web access is logged in the IIS logs, many accesses to SharePoint go through the FrontPage Server Extensions. Yes, most of SharePoint is still running on FSE, and still implemented in COM. For these URL accesses there is no detailed information available on what exactly is requested. In the IIS logs you find entries like:
2004-12-31 23:58:06 SRV-P-INTRA-3 10.10.4.15 POST /_vti_bin/_vti_aut/author.dll - 443 domain\username 10.10.4.102 HTTP/1.1 MSFrontPage/6.0 - - hostname 200 0 0 1061 614 0
2005-01-01 00:08:08 SRV-P-INTRA-3 10.10.4.15 POST /_vti_bin/_vti_aut/author.dll - 443 domain\username 10.10.4.102 HTTP/1.1 MSFrontPage/6.0 - - hostname 200 0 0 1061 614 140
(I removed the author, because no one should have to know this guy does not have a life: editing SharePoint pages when everyone in the world is celebrating the new year!!!)
As you can see, FrontPage does all page accesses through author.dll, but no information is available on which page is edited using FrontPage. Access to documents in WSS also goes through an FSE DLL.
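To get a feel for how much traffic hides behind these DLLs, a small Python sketch can flag the IIS entries whose URL points into /_vti_bin/. It assumes the field order of the sample lines above (date, time, server name, server IP, method, URL, ...). The function name is_vti_bin_request is my own, and a real log should take its field order from its own header lines.

```python
# Sketch: spot IIS log entries that go through a DLL below /_vti_bin/.
# Assumption: fields are space-separated, in the order of the sample lines.
SAMPLE = (
    r"2005-01-01 00:08:08 SRV-P-INTRA-3 10.10.4.15 POST "
    r"/_vti_bin/_vti_aut/author.dll - 443 domain\username 10.10.4.102 "
    r"HTTP/1.1 MSFrontPage/6.0 - - hostname 200 0 0 1061 614 140"
)


def is_vti_bin_request(line):
    """Return True when the URL field of an IIS log line points into /_vti_bin/."""
    fields = line.split()
    if len(fields) < 6:
        return False  # too short to contain a URL field
    url = fields[5]  # date, time, server name, server IP, method, URL
    return url.lower().startswith("/_vti_bin/")


print(is_vti_bin_request(SAMPLE))
```

Entries flagged this way tell you that something was requested through FSE, but not what. That detail has to come from the STS log.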
In the following example we access the homepage and a document test.doc in the document library docs in the site test:
IIS log (stripped down a bit to save space):
2005-01-01 00:52:22 SRV-P-INTRA-3 GET /default.aspx - 443 domain\username
2005-01-01 00:52:22 SRV-P-INTRA-3 10.10.4.15 GET /_layouts/1033/owsbrows.js - 443 domain\username
2005-01-01 00:52:22 SRV-P-INTRA-3 10.10.4.15 GET /_layouts/1033/styles/ows.css - 443 domain\username
2005-01-01 00:52:26 SRV-P-INTRA-3 10.10.4.15 GET /_layouts/images/logo_macaw.jpg - 443 domain\username
: goes on and on and on for all stylesheets, javascript files and pictures
2005-01-01 00:52:26 SRV-P-INTRA-3 10.10.4.15 GET /_vti_bin/owssvr.dll - 443 domain\username
STS log (stripped down a bit to save space):
01:52:22,1,200,2758144,1,0BAD41D9-D7D6-4892-A42F-61E4BB7AAEED,domain\username,https://servername,,default.aspx
01:52:27,1,200,1670913,1,040D5AB9-3072-45E3-975F-40C6B28CF132,domain\username,https://servername/sites/test,,docs/test.doc
So in the IIS log the access to the page and the access to all its embedded and linked content is logged, while in the STS log only the access to the page is logged.
In the IIS log accessing a document is logged as /_vti_bin/owssvr.dll, while the STS log specifies exactly which document is loaded from which document library in which site.
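To make the difference concrete, here is a minimal Python sketch that reads one converted STS line and rebuilds the full document URL. The column positions are taken from the sample lines above, not from the format documentation, so treat them as an assumption. I only name the columns whose meaning is obvious from the sample; parse_sts_line and the dictionary keys are my own names.

```python
import csv

# Sketch: read one converted (comma-separated) STS log line.
# Assumption: columns are in the order of the sample lines; columns whose
# meaning is not obvious from the sample are left uninterpreted.
SAMPLE = (
    r"01:52:27,1,200,1670913,1,040D5AB9-3072-45E3-975F-40C6B28CF132,"
    r"domain\username,https://servername/sites/test,,docs/test.doc"
)


def parse_sts_line(line):
    """Return the time, user, site URL and document path of an STS entry."""
    row = next(csv.reader([line]))
    return {
        "time": row[0],
        "user": row[6],
        "site_url": row[7],
        "document": row[9],
    }


entry = parse_sts_line(SAMPLE)
print(entry["site_url"] + "/" + entry["document"])
```

The IIS entry for the same request only shows /_vti_bin/owssvr.dll, so this is information the IIS log cannot give you.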
For more information on the STS log format, have a look at the MSDN article: Usage Event Logging in Windows SharePoint Services.
Looking at the IIS and STS logs, there are some important observations to make (some directly visible, others from the literature):
- IIS logs have a log timestamp in GMT
- STS logs have a log timestamp in local server time (honouring daylight saving time)
- IIS log timestamps do not take daylight saving time into account
- STS logs are in a binary format, and must be converted to a usable format before processing
- IIS logs write “header lines” on each IISRESET, so special processing is needed
- After each page access the information is written directly to the IIS log
- STS uses caching when writing to the log file; do an IISRESET while investigating to make sure the cached log entries are written
- The timestamp written to the IIS and STS logs can be different for the same page access. See the last line in the example above for both the IIS log and the STS log: the IIS log entry is written at 00:52:26 (so at 26 seconds), while the STS log entry is written at 01:52:27 (so at 27 seconds)
- In the STS log only successful requests are logged (information streamed back to the client)
- In the IIS log ALL requests are logged: requests for the /_layouts “in site context” pages, but also requests for missing pages
- The STS log only logs requests for pages and documents in sites, not requests for content in, for example, the /_layouts directory
- The STS log entries only have a time, no date. The date is given by the folder structure where the STS log files are stored
- The available fields in the STS log files are different from the available fields in the IIS log files
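A few of these observations can be sketched in Python: skipping the repeated IIS header lines, bringing both timestamps to GMT, and matching entries that are a second apart. This is a rough sketch under assumptions of my own. The server is taken to run at a fixed GMT+1 offset (the samples show 00:52 in the IIS log and 01:52 in the STS log), the STS folder date is taken to be a local date, and the two-second tolerance is a guess. All function names are mine, and the header line in the usage part is only illustrative.

```python
from datetime import datetime, date, time, timedelta, timezone

# Assumption: fixed GMT+1 server offset. Following daylight saving time
# properly needs a real time zone database instead of a fixed offset.
SERVER_OFFSET = timezone(timedelta(hours=1))


def iis_data_lines(lines):
    """Yield IIS entries, skipping the '#' header lines written after each IISRESET."""
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            yield line


def iis_timestamp(line):
    """IIS entries start with a date and a time in GMT."""
    day, clock = line.split()[:2]
    stamp = datetime.strptime(day + " " + clock, "%Y-%m-%d %H:%M:%S")
    return stamp.replace(tzinfo=timezone.utc)


def sts_timestamp(folder_date, clock):
    """STS entries only carry a local time; the date comes from the folder."""
    local = datetime.combine(folder_date, time.fromisoformat(clock), tzinfo=SERVER_OFFSET)
    return local.astimezone(timezone.utc)


def same_access(iis_ts, sts_ts, tolerance=timedelta(seconds=2)):
    """Treat two entries as the same access when they are close enough in time."""
    return abs(iis_ts - sts_ts) <= tolerance


# Usage with the last lines of the example above (header line is illustrative).
iis_log = [
    "#Fields: date time s-computername s-ip cs-method cs-uri-stem",
    r"2005-01-01 00:52:26 SRV-P-INTRA-3 10.10.4.15 GET /_vti_bin/owssvr.dll - 443 domain\username",
]
entries = list(iis_data_lines(iis_log))
iis_ts = iis_timestamp(entries[0])
sts_ts = sts_timestamp(date(2005, 1, 1), "01:52:27")
print(same_access(iis_ts, sts_ts))
```

Matching on time alone is fragile on a busy server, so in practice you would probably also compare the user name before deciding that two entries belong together.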
Where to go from here? I'll save that for my next post!