Saturday, May 2, 2009 8:00 AM Kazi Manzur Rashid

For Us By Us

No, this post is not about FubuMVC at all; I just borrowed the phrase for the title.

Jeff Atwood and Joel Spolsky think it is a compliment that their site design was copied by a Chinese site, and I completely agree, especially when the copy serves the same community that I belong to. I also agree with Joel Spolsky that most developers who speak English as a second language (at least in my case) prefer their local language when communicating among themselves. We made KiGG an open source project from the very beginning, and since v2.0 was released people have really been picking it up to create local .NET community sites in their own languages:

 

dotnetomaniak_pl
Polish version

 

progg_ru
Russian version

 

9efish_com
Chinese version (I guess it is still under development)

And this is the power of open source: the existing version contains absolutely zero support for localization, but the community picked it up and localized it themselves, with the same features and power as the original version. My sincere thanks to everyone behind these sites. As a side note, localization is the highest-priority feature that we will be adding in v3.0.

Recently, I got a few requests about the story publishing process from people who are also planning to launch sites based on KiGG. Instead of answering them individually, I decided to write a post explaining it.

As on many social news/link sites, which stories appear on the front page is decided by a set of algorithms, and KiGG has complete support for adding, replacing, or removing any of them. By default, it comes with 6 different algorithms.

The story publish process starts when the Publish method of StoryService is called.

public virtual void Publish()
{
    using(IUnitOfWork unitOfWork = UnitOfWork.Begin())
    {
        DateTime currentTime = SystemTime.Now();

        IList<PublishedStory> publishableStories = GetPublishableStories(currentTime);

        if (!publishableStories.IsNullOrEmpty())
        {
            // First penalize the users who marked the story as spam;
            // it is obvious that the moderator has already reviewed the story
            // before it gets this far.
            PenaltyUsersForIncorrectlyMarkingStoriesAsSpam(publishableStories);

            //Then Publish the stories
            PublishStories(currentTime, publishableStories);

            unitOfWork.Commit();
        }
    }
}

As you can see, it first calls GetPublishableStories to prepare the list of stories that are publishable at this moment, next it reduces the score of the users who incorrectly marked any of those stories as spam (I will explain this later), and finally it calls another method, PublishStories, to actually publish them.

private IList<PublishedStory> GetPublishableStories(DateTime currentTime)
{
    List<PublishedStory> publishableStories = new List<PublishedStory>();

    DateTime minimumDate = currentTime.AddHours(-_settings.MaximumAgeOfStoryInHoursToPublish);
    DateTime maximumDate = currentTime.AddHours(-_settings.MinimumAgeOfStoryInHoursToPublish);

    int publishableCount = _storyRepository.CountByPublishable(minimumDate, maximumDate);

    if (publishableCount > 0)
    {
        ICollection<IStory> stories = _storyRepository.FindPublishable(minimumDate, maximumDate, 0, publishableCount).Result;

        foreach (IStory story in stories)
        {
            PublishedStory publishedStory = new PublishedStory(story);

            foreach (IStoryWeightCalculator strategy in _storyWeightCalculators)
            {
                publishedStory.Weights.Add(strategy.Name, strategy.Calculate(currentTime, story));
            }

            publishableStories.Add(publishedStory);
        }
    }

    return publishableStories;
}

GetPublishableStories first gets a list of active stories from the database (the stories that have been voted on, viewed, or commented on since the last publish; there is also an age factor, which means only stories in a specific age range qualify, by default 4 to 240 hours). Next, it applies each algorithm to calculate the story's weights. The algorithm (Strategy pattern) is defined by the IStoryWeightCalculator interface, and the implementations are injected into StoryService by the DI container. Once the calculation is done, the PublishStories method is called.
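To make the Strategy pattern concrete, here is a minimal, self-contained sketch of the idea. The type names (StorySketch, IWeightCalculatorSketch and the two calculators) are hypothetical stand-ins, not KiGG's real types; only the Name and Calculate members mirror what the code above actually calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for KiGG's IStory; it holds only what this sketch needs.
public class StorySketch
{
    public int VoteCount { get; set; }
    public int CommentCount { get; set; }
}

// Mirrors the two members used in GetPublishableStories: Name and Calculate.
public interface IWeightCalculatorSketch
{
    string Name { get; }
    double Calculate(DateTime publishingTimestamp, StorySketch story);
}

public class VoteCountCalculator : IWeightCalculatorSketch
{
    public string Name { get { return "Vote"; } }

    public double Calculate(DateTime publishingTimestamp, StorySketch story)
    {
        return story.VoteCount;
    }
}

public class CommentCountCalculator : IWeightCalculatorSketch
{
    public string Name { get { return "Comment"; } }

    public double Calculate(DateTime publishingTimestamp, StorySketch story)
    {
        return story.CommentCount;
    }
}

public static class WeightSketch
{
    // Runs every strategy against the story and records each result under the strategy's name.
    public static IDictionary<string, double> CalculateWeights(
        IEnumerable<IWeightCalculatorSketch> calculators, DateTime now, StorySketch story)
    {
        Dictionary<string, double> weights = new Dictionary<string, double>();

        foreach (IWeightCalculatorSketch calculator in calculators)
        {
            weights.Add(calculator.Name, calculator.Calculate(now, story));
        }

        return weights;
    }
}
```

Because the caller only sees the interface, swapping a calculator in or out changes the resulting weights without touching the loop.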

private void PublishStories(DateTime currentTime, IList<PublishedStory> publishableStories)
{
    // Now sort it based upon the score
    publishableStories = publishableStories.OrderByDescending(ps => ps.TotalScore).ToList();

    // Now assign the Rank
    publishableStories.ForEach(ps => ps.Rank = (publishableStories.IndexOf(ps) + 1));

    // Now take the stories for front page
    ICollection<PublishedStory> frontPageStories = publishableStories.OrderBy(ps => ps.Rank).Take(_settings.HtmlStoryPerPage).ToList();

    if (!frontPageStories.IsNullOrEmpty())
    {
        foreach (PublishedStory ps in frontPageStories)
        {
            ps.Story.Publish(currentTime, ps.Rank);
        }

        _eventAggregator.GetEvent<StoryPublishEvent>().Publish(new StoryPublishEventArgs(frontPageStories, currentTime));
    }

    // Mark the Story that it has been processed
    publishableStories.ForEach(ps => ps.Story.LastProcessed(currentTime));
}

PublishStories first ranks the stories by total weight, then takes the first 20 stories (the number is defined in the web.config) for the front page and marks them as front-page stories by updating their rank and published date. Finally, it updates the last processed date of all the stories, regardless of their publish status, so that we can verify their active status for the next publish.
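The ranking step on its own can be sketched as follows. RankedStorySketch is a hypothetical type, and treating the total score as the plain sum of all weights is my assumption for this sketch; the real PublishedStory may combine them differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for PublishedStory.
public class RankedStorySketch
{
    public RankedStorySketch(string title)
    {
        Title = title;
        Weights = new Dictionary<string, double>();
    }

    public string Title { get; private set; }
    public IDictionary<string, double> Weights { get; private set; }
    public int Rank { get; set; }

    // Assumption: the total score is the plain sum of all calculator weights.
    public double TotalScore
    {
        get { return Weights.Values.Sum(); }
    }
}

public static class RankingSketch
{
    // Sorts by total score (highest first), assigns 1-based ranks,
    // and returns the stories that fit on the front page.
    public static IList<RankedStorySketch> RankAndTake(
        IEnumerable<RankedStorySketch> stories, int storiesPerPage)
    {
        List<RankedStorySketch> ordered = stories
            .OrderByDescending(s => s.TotalScore)
            .ToList();

        for (int i = 0; i < ordered.Count; i++)
        {
            ordered[i].Rank = i + 1;
        }

        return ordered.Take(storiesPerPage).ToList();
    }
}
```

The sketch assigns ranks with an index loop rather than IndexOf, which gives the same ranks while avoiding a list search for every story.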

As you can see, whether a story qualifies is decided entirely by those weight calculators, and by adding, removing, or replacing calculators you can tweak the whole story publishing process. Now let's see how the default calculators work.

VoteWeight: Returns a higher value if the vote was cast from a different IP address than the one the story was submitted from, for example 10 for a vote from a different IP address and 5 for one from the same IP address. If you do not like this, you can level the values in the web.config.

<type name="vote" type="IStoryWeightCalculator" mapTo="VoteWeightCalculator">
    <lifetime type="PerWebRequest"/>
    <typeConfig extensionType="Microsoft.Practices.Unity.Configuration.TypeInjectionElement, Microsoft.Practices.Unity.Configuration">
        <constructor>
            <param name="voteRepository" parameterType="IVoteRepository">
                <dependency/>
            </param>
            <param name="sameIPAddressWeight" parameterType="System.Single">
                <value type="System.Single" value="5"/>
            </param>
            <param name="differentIPAddressWeight" parameterType="System.Single">
                <value type="System.Single" value="10"/>
            </param>
        </constructor>
    </typeConfig>
</type>
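As a rough sketch of the rule, assuming the per-vote values are simply summed (the post does not spell out how they are combined, and the real VoteWeightCalculator reads votes from IVoteRepository), the calculation could look like this. The class and parameter names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class VoteWeightSketch
{
    // Adds sameIpWeight for every vote cast from the submitter's IP address
    // and differentIpWeight for every other vote.
    public static double Calculate(
        string submitterIp,
        IEnumerable<string> voterIps,
        double sameIpWeight,
        double differentIpWeight)
    {
        double total = 0;

        foreach (string voterIp in voterIps)
        {
            bool sameIp = string.Equals(voterIp, submitterIp, StringComparison.Ordinal);
            total += sameIp ? sameIpWeight : differentIpWeight;
        }

        return total;
    }
}
```

With the default values, a story submitted from 10.0.0.1 that got one vote from 10.0.0.1 and two votes from other addresses would score 5 + 10 + 10 = 25.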

CommentWeight: Returns a higher value if the comment was posted from a different IP address and not by the actual submitter. For example, 4 for a different IP, 2 for the same IP, and 1 for the actual submitter, no matter which IP it was posted from. This can be changed in the web.config:

<type name="comment" type="IStoryWeightCalculator" mapTo="CommentWeightCalculator">
    <lifetime type="PerWebRequest"/>
    <typeConfig extensionType="Microsoft.Practices.Unity.Configuration.TypeInjectionElement, Microsoft.Practices.Unity.Configuration">
        <constructor>
            <param name="commentRepository" parameterType="ICommentRepository">
                <dependency/>
            </param>
            <param name="ownerWeight" parameterType="System.Single">
                <value type="System.Single" value="1"/>
            </param>
            <param name="sameIPAddressWeight" parameterType="System.Single">
                <value type="System.Single" value="2"/>
            </param>
            <param name="differentIPAddressWeight" parameterType="System.Single">
                <value type="System.Single" value="4"/>
            </param>
        </constructor>
    </typeConfig>
</type>
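The same idea applies to comments, with one extra rule for the submitter. This is a sketch under the same assumption that per-comment values are summed; CommentSketch is a hypothetical record, since KiGG's real IComment is not shown in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comment record holding only what the sketch needs.
public class CommentSketch
{
    public CommentSketch(string authorName, string ipAddress)
    {
        AuthorName = authorName;
        IpAddress = ipAddress;
    }

    public string AuthorName { get; private set; }
    public string IpAddress { get; private set; }
}

public static class CommentWeightSketch
{
    public static double Calculate(
        string submitterName,
        string submitterIp,
        IEnumerable<CommentSketch> comments,
        double ownerWeight,
        double sameIpWeight,
        double differentIpWeight)
    {
        double total = 0;

        foreach (CommentSketch comment in comments)
        {
            if (comment.AuthorName == submitterName)
            {
                // The submitter's own comment always gets the owner weight, whatever the IP.
                total += ownerWeight;
            }
            else if (comment.IpAddress == submitterIp)
            {
                total += sameIpWeight;
            }
            else
            {
                total += differentIpWeight;
            }
        }

        return total;
    }
}
```

Checking the author before the IP address is what makes the owner weight apply "no matter which IP".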

ViewWeight: Returns the number of views from unique IP addresses (a view means a user clicked the link that leads to the original source) multiplied by a factor. This can also be changed in the web.config.

<type name="view" type="IStoryWeightCalculator" mapTo="ViewWeightCalculator">
    <lifetime type="PerWebRequest"/>
    <typeConfig extensionType="Microsoft.Practices.Unity.Configuration.TypeInjectionElement, Microsoft.Practices.Unity.Configuration">
        <constructor>
            <param name="storyViewRepository" parameterType="IStoryViewRepository">
                <dependency/>
            </param>
            <param name="weightMultiply" parameterType="System.Single">
                <value type="System.Single" value="0.1"/>
            </param>
        </constructor>
    </typeConfig>
</type>
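A minimal sketch of the view rule, assuming each view is represented just by the viewer's IP address (the real ViewWeightCalculator works through IStoryViewRepository; the names here are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ViewWeightSketch
{
    // Counts each IP address once, however many times it viewed the story,
    // and scales the count by the configured multiplier.
    public static double Calculate(IEnumerable<string> viewerIps, double weightMultiply)
    {
        int uniqueViews = viewerIps.Distinct().Count();

        return uniqueViews * weightMultiply;
    }
}
```

With the default multiplier of 0.1, it takes about ten unique views to match the weight of a single comment by the submitter.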

UserScoreWeight: Returns the sum of the scores of the users who voted for the story, multiplied by a factor, so that stories voted for by higher-scoring users have a much better chance of appearing on the front page. The factor can also be changed in the web.config.

Freshness: Returns a value depending on the story's age: the younger the story, the higher the value. The freshness threshold and interval can also be changed in the web.config.

KnownSource: Returns a value based on the source grade. This ensures that stories from a well-known source get a higher chance of appearing on the front page. For example, a blog post from Gu should take precedence over a blog post of mine. These sources are maintained in the KnownSource database table. If the source is not known, it returns nothing.
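UserScoreWeight and KnownSource can be sketched in a few lines each. Both methods below are illustrations under my own assumptions, not KiGG's implementation: sources are matched by URL host, an in-memory dictionary stands in for the KnownSource table, and an unknown source contributes 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OtherWeightSketches
{
    // UserScoreWeight: the sum of the voters' scores, scaled by a factor.
    public static double UserScoreWeight(IEnumerable<double> voterScores, double factor)
    {
        return voterScores.Sum() * factor;
    }

    // KnownSource: looks up the grade of the story's host.
    // Assumptions: storyUrl is an absolute URL, gradeByHost stands in for the
    // KnownSource table, and an unknown source adds nothing (0) to the total.
    public static double KnownSourceWeight(string storyUrl, IDictionary<string, double> gradeByHost)
    {
        string host = new Uri(storyUrl).Host;

        double grade;
        return gradeByHost.TryGetValue(host, out grade) ? grade : 0;
    }
}
```

Note that new Uri(...) throws for a relative or malformed URL, so a real calculator would need to validate the story URL first.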

So, as you can see, with the above algorithms even a story that got fewer votes might rank better than a story with more votes. Another important thing I would like to mention is that publishing does not update a specific part (top or bottom) of the front page; it updates the whole page. A story that is already on the front page might appear there again, at the same or a different position, based on its recent rank. The benefit is that a much more popular but older story will never be replaced by a less popular new one.

Okay, whether you think the above is a solid publishing process, or the process su*ks, or you need a much simpler algorithm, my recommendation is to create a new strategy inheriting from StoryWeightBaseCalculator. Let us see how to create a dumb strategy that ranks stories by the number of votes and comments they got.

public class SimpleWeightCalculator : StoryWeightBaseCalculator
{
    private readonly IVoteRepository _voteRepository;
    private readonly ICommentRepository _commentRepository;

    public SimpleWeightCalculator(IVoteRepository voteRepository, ICommentRepository commentRepository) : base("Simple")
    {
        _voteRepository = voteRepository;
        _commentRepository = commentRepository;
    }

    public override double Calculate(DateTime publishingTimestamp, IStory story)
    {
        ICollection<IVote> votes = _voteRepository.FindAfter(story.Id, story.LastProcessedAt ?? story.CreatedAt);
        ICollection<IComment> comments = _commentRepository.FindAfter(story.Id, story.LastProcessedAt ?? story.CreatedAt);

        return (votes.Count + comments.Count);
    }
}

Next, remove the existing strategies and add the new one in the web.config (I am assuming you know how to configure the MS Unity Application Block):

<type name="simple" type="IStoryWeightCalculator" mapTo="SimpleWeightCalculator">
    <lifetime type="PerWebRequest"/>
    <typeConfig extensionType="Microsoft.Practices.Unity.Configuration.TypeInjectionElement, Microsoft.Practices.Unity.Configuration">
        <constructor>
            <param name="voteRepository" parameterType="IVoteRepository">
                <dependency/>
            </param>
            <param name="commentRepository" parameterType="ICommentRepository">
                <dependency/>
            </param>
        </constructor>
    </typeConfig>
</type>

Now, when you publish stories, they will be ranked by the votes and comments they got since the last publish.

I hope the above clarifies the story publish process of KiGG. Do let me know which part of KiGG you want me to highlight next.


Comments

# re: For Us By Us

Saturday, May 2, 2009 3:22 AM by Vladimir (progg.ru)

great post, thanks!

# re: For Us By Us

Sunday, May 3, 2009 9:29 AM by xgluxv

hello, if you can achieve this functionality: after submitting a new story, different people can title and introduce it in different languages ,I think i will close my site and  support for your site!