Apparently to some, it is.
I was helping a co-worker with the same issue that I blogged about in my last post on how to improve SharePoint list item delete calls. When I Googled (Binged if that makes you feel better) on the subject trying to find a clear answer to her question I found another blogger who had written on the same subject. Cool! Maybe he has the missing piece. I opened the link to find … my own post. There it was, in its entirety. With no clear attribution as to its original source. There was a link on the page back to my site, but why would anyone click it since the entire post was there for their reading pleasure.
So, tonight I did some more searches and found at least two other copies of the same post, one with absolutely no link back at all to my blog. More searches turned up other copies of my posts. So, am I upset? Heck yeah! What can I do about it? Probably nothing. I did a WhoIs search on one of the domains and practically had to download a new font to display the name of the registered domain owner. I won’t mention the sites here as I am loathe to do anything to raise their SEO rankings.
My approach then will be (for now): to not care. I will make a small gesture of adding a copyright notice to the bottom of each posting, even though legally this is not necessary, and, perhaps, adding a “Posted By: Andreas Zenker” tag as well. I’m not really thinking that this will actually stop these sites from scraping my content into their little pay-per-click campaign but maybe there stupid little servers will have to waste a few cycles parsing out some text before they copy my blog over. I’m sure the increased power consumption will cripple them. Take that! ;-)
Update: 9/19/2009
Actually the one site is in English, just hosted elsewhere. Funny thing is I see a new track-back from that site this morning. Once again my post is copied in full, the only reason there is a link to me at all is that I embedded a link to my blog in this post. Maybe that is the secret! I will make the copyright tag a link back to me. Also, they didn't waste any cycles on removing the copyright tag, it got copied as well.
Posted on ASP.NET Weblogs © Copyright 2009 - Andreas Zenker
If you had to choose one of these methods to call to delete an item in a large list which would you choose? Here are your options in code
1: myListItem.Delete()
2: mylist.Items.DeleteItemById(deleteItemId)
Option 2 sounds good here doesn’t it? You have the id, the key to the item in the list, so giving the SPLIstItemCollection this valuable piece of information should make it a simple matter to kill off said SPListItem, no? Well – no. It would be great if, behind the scenes, this was translated to a SQL query like “delete mylist where iitemId = id”. That, however, is not the case. This is the SharePoint object model where men are men and lists are searched in memory. Thanks to Reflector we can see that the DeleteById() method call follows this thread:
Public Sub DeleteItemById(ByVal id As Integer)
Me.GetItemById(id).Delete
End Sub
Which calls …
Public Function GetItemById(ByVal id As Integer) As SPListItem
Dim itemIndexById As Integer = Me.GetItemIndexById(id)
If (itemIndexById < 0) Then
Throw New ArgumentException
End If
Me.EnsureListItemIsValid(itemIndexById)
If (Me.m_iColection Is Nothing) Then
Return New SPListItem(Me, itemIndexById)
End If
Return Me.m_iColection.ItemFactory(itemIndexById)
End Function
First line of GetItemById() needs to resolve the index of the item. Only one way to do that (emphasis in red) …
Friend Function GetItemIndexById(ByVal id As Integer) As Integer
Me.EnsureListItemsData
Me.EnsureFieldMap
Dim num As Integer = 0
Dim columnNumber As Integer = Me.m_mapFields.GetColumnNumber("ID")
Do While (num < Me.m_iRowCount)
If (Convert.ToInt32(Me.m_arrItemsData(columnNumber, num), CultureInfo.InvariantCulture) <> id) Then
num += 1
Continue Do
End If
Return num
Loop
Return -1
End Function
So, Item.Delete() must be the way to go? Well … yes … if you already have the item reference, because then the work is already done. However, if you need to get the item first using GetItemById() then guess where you would end up again? That’s right, GetItemIndexById() and we are back to where we started. It really comes down to a “pay me now or pay me later” situation when choosing between these two methods.
So, how to make it better? Replace either the Get or the Delete with a CAML query which pushes the heavy lifting back to the database (what a strange and mysterious concept, maybe it will catch on one day). Here is a method that constructs a CAML query to delete an item. The method accepts in the itemFileRef, which is what the CAML query needs to pass in to indentify the item in the Document Library to delete. Note: the query syntax is different based on list type, what works for a normal list will not necessarily work on a document library and vice-versa.
Edit 9/18/2009: the owsfileref value that is needed for this query to work can be obtained from the SPListItem object. If the SpListItem is an item in a document library you can call: myItem.Item("FileRef"), which will return the needed value.
Public Sub DeleteReport(ByVal itemFileRef As String)
Dim site As SPSite = Nothing
Dim web As SPWeb = Nothing
Dim list As SPDocumentLibrary = Nothing
Try
site = New SPSite(SPContext.Current.Site.Url)
web = site.RootWeb
list = web.GetList(MySettings.RelativeListURL(web) & MyListConfig.ListName)
Dim sbDelete As New System.Text.StringBuilder
sbDelete.Append("<?xml version=""1.0"" encoding=""UTF-8""?><Batch>")
Dim delCount As Integer = 0
sbDelete.Append("<Method>")
sbDelete.Append("<SetList Scope='Request'>" & list.ID.ToString & "</SetList>")
sbDelete.Append("<SetVar Name='Cmd'>DELETE</SetVar>")
sbDelete.Append("<SetVar Name='ID'>1</SetVar>")
sbDelete.Append("<SetVar Name='owsfileref'>" & itemFileRef & "</SetVar>")
sbDelete.Append("</Method>")
sbDelete.Append("</Batch>")
Try
web.AllowUnsafeUpdates = True
web.ProcessBatchData(sbDelete.ToString())
Catch ex As Exception
Throw New Exception("Error deleting items from list: " & ex.GetBaseException.ToString)
Finally
web.AllowUnsafeUpdates = False
End Try
Catch ex As Exception
Throw New Exception("Error deleting items from list: {0}", ex)
Finally
If Not web Is Nothing Then
web.Dispose()
End If
If Not site Is Nothing Then
site.Dispose()
End If
End Try
End Sub
This Fiddler trace shows the non-trivial performance boost you can get from re-factoring a delete from using the object model to using a CAML query as shown above.

Now, if your lists are small you will not see this kind of improvement, and if you have no expectation that your lists will ever get large then you may be safe using the object model provided delete options. However, if that is not the case check out CAML before you go to production.
© Copyright 2009 - Andreas Zenker