ShowUsYour<Blog>

Irregular expressions regularly
Is RegExLib full of "it"?

Today I read a comment from Randal L. Schwartz on Jeffrey Schoolcraft's regex blog that I felt I needed to respond to.  As I started writing my reply I realized that this is probably news that needs to be publicly visible, so I'm posting it to my blog and cross-referencing the original comment.  First, here is Randal's comment:


Yup. I continue to downvote and negative-comment nearly every entry at "regex lib".

Do not validate email addresses with a regex (unless it's the full regex, as you point out).

Do not parse HTML with a regex. HTML is surprisingly complex.

Do not validate a date with a regex. All these regex I see that try to compute the number of days of february based on the year number just have me going "WTF!".

These are NOT regex tasks. These are dedicated tool tasks.

And yet, "regexlib" is full of them. And full of "it", if you know what I mean.


Randal...

I hear your pain.  As the lead developer of RegExLib I also see the problems that you mention and, at present, we haven't provided a good enough toolset for newbies to help themselves properly.  Should newbies be randomly using regexes that they find on the site?  Dunno.  That's for another argument.

We implemented the rating and comment system in the middle of last year to try to give some indication of the value of individual patterns - so I'm extremely grateful that diligent members of the community such as yourself are helping out by casting your votes.  We also implemented an RSS feed for the comments so that comments such as yours are given public visibility - http://www.regexlib.com/RssComments.aspx

It's a hard battle to win as RegExLib continues to grow; as of today it contains nearly 1000 expressions.  There's good news though.  Over the past couple of months a lot of effort has gone into solving these problems and, as a result, users of the site will see a vastly improved set of tools for dealing with some of the problems that you've mentioned.

To give you a quick example, one of the new features will give users a shortcut for finding useful AND ACCURATE expressions by offering a box which says: "Enter N examples of what you want to match and N that you shouldn't and we'll provide you with a list of patterns that match your requirements".  This will help to remove the hit-and-miss element of a NOOB scanning through 1000 patterns to find the veritable needle in the haystack.
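As a rough illustration of the idea (a minimal sketch only, not the actual RegExLib implementation), a filter of this kind could keep each candidate pattern that matches every positive example and none of the negative ones.  The class and method names here are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: not part of RegExLib.
public static class PatternFinder
{
    // Returns the candidate patterns that match every "should match" example
    // and none of the "should not match" examples.
    public static List<string> FindPatterns(
        IEnumerable<string> candidates,
        ICollection<string> shouldMatch,
        ICollection<string> shouldNotMatch)
    {
        List<string> results = new List<string>();

        foreach (string pattern in candidates)
        {
            Regex regex;
            try
            {
                regex = new Regex(pattern);
            }
            catch (ArgumentException)
            {
                continue; // skip patterns that fail to compile
            }

            bool accepted = true;

            foreach (string example in shouldMatch)
            {
                if (!regex.IsMatch(example)) { accepted = false; break; }
            }

            if (accepted)
            {
                foreach (string example in shouldNotMatch)
                {
                    if (regex.IsMatch(example)) { accepted = false; break; }
                }
            }

            if (accepted)
            {
                results.Add(pattern);
            }
        }

        return results;
    }
}
```

Given the candidates `^\d+$`, `^[a-z]+$` and `^\w+$`, the positive examples "123" and "42", and the negative example "abc", only `^\d+$` survives: the second pattern fails the positive examples and the third also matches "abc".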

The tools that allow users to manage their expressions are also getting an improvement, so hopefully pattern authors will be more responsive in adjusting their patterns based on the feedback they receive.

I hope that, once you see the new features for yourself, you will agree with me that RegExLib has become a much more valuable resource than it is today.

Posted: Apr 01 2005, 05:46 PM by digory | with 17 comment(s)
Smart UI agents and inductive UI

Continuing on from my blog entries last week about automated UI agents, I've started building a small prototype which will hopefully lead to an actual implementation.  In my prototype I have several agents accessing a shared context, through which they reach shared resources such as logging tools and reporting agents.  As I mentioned, the output from my prototype will be an implementation, but I'm also preparing a whitepaper covering some of the lower-level details.

Today I'd like to write about a small mental exercise that I've developed which should help you get into thinking mode about agents:

Update: when you do this exercise, try to envisage the objective as an actual objective of a user on your own site - such as a visitor searching for an article, a site admin adding a new item, or a tech support staff member replying to an action item:


Mapping out an Objective

1) Starting Scope: Browsing

2) What is your current Objective within that scope? 
       - To learn about something that the author wrote.

3) What operations can you perform within that scope?
       - Submit feedback
       - User defined search
       - Pre-canned Search
           - View the author's favourite authors
           - View other entries within the current category
           - View similar entries
       - Learn about the author

4) Which of the items in 2 and 3 are related?

5) Do you have a history within this scope?
       - Has read blog entries?
       - Has submitted feedback?
       - Has used search?

So, in this exercise, you can see that not only can we determine the "objective" of a user (given a current scope), but we are also in a position to ascertain how capable they are of reaching that objective on their own.  If the user has no history within this scope then they will probably need assistance from us to "lead" them to tools which can help them achieve their underlying objective.  It is also likely that, within a given Scope, there can be more than one objective.
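To make the history check concrete, here is a minimal sketch of how an agent might decide which operations to point a user towards.  It is not taken from my prototype; the type names, method names and operation strings are all hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record of what a user has already done within one scope.
public class ScopeHistory
{
    private readonly Dictionary<string, int> _useCounts = new Dictionary<string, int>();

    public void Record(string operation)
    {
        int count;
        _useCounts.TryGetValue(operation, out count); // count stays 0 when absent
        _useCounts[operation] = count + 1;
    }

    public bool HasUsed(string operation)
    {
        return _useCounts.ContainsKey(operation);
    }
}

public static class AssistanceAgent
{
    // Returns the operations relevant to the objective that the user has
    // never performed; these are the candidates for a cue from the agent.
    public static List<string> OperationsToSuggest(
        ScopeHistory history, IEnumerable<string> operationsForObjective)
    {
        List<string> suggestions = new List<string>();
        foreach (string operation in operationsForObjective)
        {
            if (!history.HasUsed(operation))
            {
                suggestions.Add(operation);
            }
        }
        return suggestions;
    }
}
```

For a visitor in the Browsing scope who has only ever recorded "Read entry", asking for suggestions across "User defined search" and "View similar entries" would return both, which is the signal that this user may need to be led to the search tools.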


In this entry I have discussed: Scope, Operations, Objectives and Agents.  I've also given an example of creating a high-level mapping for an objective.  If you read this and did the mental exercise, please take the time to develop a similar mapping for a different objective and forward it to me as either a comment or an e-mail.

We live, as we dream alone

Every now and then you will see blog entries which read along the lines of:

   "I think that product Y is the suckiest thing on the planet
    because every time I press the foo button everything
    chokes and I have to restart my PC"

Reading that causes an adjustment to my in-built impartiality-o-scope. 

Don't get me wrong, some things that fail to meet expectations really do suck, for example:

    Imagine that, when the next automatic computer update
    comes down the line it breaks something really stupid -
    i.e. it forces you to restart Windows every time you press
    the Start button.  That just flat out sucks!  They should've
    tested that, somebody needs to get chewed-out for that.

On the other hand, what if product Y or the foo button is relatively new?  Does it NOT suck that pressing the new foo button forces you to restart your PC... hrmm, no, it still sucks but... rather than coming down on the whole of product Y because the foo button is fundamentally broken, why not use this as an opportunity to tell me something of the sense of wonderment that you felt as you were about to press that shiny new foo button for the first time?  Tell me what plans you had for it.  Tell me how you were about to use it.  Was it creative?  Was it new?

I'm much more likely to get engaged with your blog writing if you can help me to live the software dream.



"We live, as we dream alone"
- Joseph Conrad (English novelist, 1857-1924)

"When I hear somebody sigh, "Life is hard,"
I am always tempted to ask, "Compared to what?""
- Sydney J. Harris
 
"Between the conception and the creation
between the emotion and the response
Falls the shadow"
- T. S. Eliot (American-born English poet, 1888-1965)

"It occurred to me that my speech or my silence, indeed any action of mine, would be a mere futility"
- Joseph Conrad (English novelist, 1857-1924)

"I dream for a living"
- Steven Spielberg

Quotes found on http://en.thinkexist.com/

 

Posted: Mar 31 2005, 02:07 PM by digory | with 4 comment(s)
Using generics to build generic data logic layers

Consider exposing raw Generic collections from your data logic layers, such as:

   public class PersonManager {
        ...
        public List<Person> ListPeople(...) { ... }
   }

When I started messing around with building applications in 2.0 I quickly wrapped Generic collections like so:

  public class PersonCollection : List<Person> {
     ... 
  }

This was normally done so that I could hang a Sort method off of them:

  public class PersonCollection : List<Person>, IBidirectionalSort {
     ... 
     public void Sort( string sortExpression, bool isAscending ) { ... }
  } 

The downside of this approach is that you end up writing fiddly code around calls to generic helper methods.

As an example, let's say that I write a nice generic helper method to page my collections:

    // Returns one page of rows from data; the input list is left unmodified.
    public static List<T> Page<T>(List<T> data, int maximumRows, int startRowIndex)
        where T : IDataObject, new() {

        // Nothing to page, or no valid paging arguments: hand back the full list.
        if (data.Count == 0 || maximumRows <= 0 || startRowIndex < 0) {
            return data;
        }

        int remainingRowCount = data.Count - startRowIndex;
        int count = Math.Min(maximumRows, remainingRowCount);

        // A start index past the end of the list yields an empty page.
        if (count <= 0) {
            return new List<T>();
        }

        // GetRange copies the requested rows into a new List<T>.
        return data.GetRange(startRowIndex, count);
    }

If I’ve wrapped my collection – as per the PersonCollection example – then using the generic Page method will require temporary object creation when I’m calling it, something like:

   public PersonCollection ListPeople(...) { 
       ...
       PersonCollection people = FillList( reader ) ;
       List<Person> tmp = Page( people, 20, 0 ) ;
       PersonCollection peopleToReturn = new PersonCollection() ;
       peopleToReturn.AddRange( tmp ) ;
       return peopleToReturn ;
   }

So, you can see that by the time we have many generic methods, working with temporary objects becomes cumbersome.  Exposing List<Person> from this method encourages you to build the surrounding methods – such as FillList and Page – around the same generic type, which also leads to leaner code:

    public List<Person> ListPeople(...) { 
       ...
       List<Person> people = FillList<Person>( reader ) ;
       return Page<Person>( people, 20, 0 ) ;
    }
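The Sort method that originally motivated the wrapper class can be moved into a generic helper in the same way.  The following is only a sketch of one way to do it, using reflection to look up the property named by the sort expression; ListHelper is a hypothetical name, and it assumes that the sort expression is the name of a public property whose values implement IComparable:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper class; the name is an assumption for illustration.
public static class ListHelper
{
    // Sorts the list in place by the public property named in sortExpression.
    public static void Sort<T>(List<T> data, string sortExpression, bool isAscending)
    {
        PropertyInfo property = typeof(T).GetProperty(sortExpression);
        if (property == null)
        {
            throw new ArgumentException(
                "Unknown property: " + sortExpression, "sortExpression");
        }

        data.Sort(delegate(T x, T y)
        {
            object left = property.GetValue(x, null);
            object right = property.GetValue(y, null);

            // Swap the operands for a descending sort rather than negating the result.
            return isAscending
                ? Comparer.Default.Compare(left, right)
                : Comparer.Default.Compare(right, left);
        });
    }
}
```

A call such as ListHelper.Sort( people, "LastName", true ) would then work directly against a List<Person>, assuming Person exposes a LastName property.  Reflection on every comparison is slow for large lists, so treat this as a starting point rather than a finished design.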

 

Posted: Mar 29 2005, 07:12 PM by digory | with 5 comment(s)
Using the SiteMapDataSource to display lists of links

Danny Chen just blogged about the SiteMap and showed some interesting ways to make use of custom attributes:

   http://weblogs.asp.net/dannychen/archive/2005/03/28/396099.aspx

There's another one that I'd like to add to this list.  People commonly use UL elements to create navigational links because they require less HTML to be emitted in the page and are easily styled into nice-looking links.  This is the approach that modern applications such as CommunityServer and ProjectDistributor use for their lists of links.  So how would you do that with a SiteMap?  The answer is actually pretty simple because you can bind a SiteMapDataSource directly to a Repeater.

  <ul>
    <asp:Repeater DataSourceId="myDataSource" ...>
      <ItemTemplate>
        <li>
          <a href='<%# Eval("Url") %>'><%# Eval("Title") %></a>
        </li>
      </ItemTemplate>
    </asp:Repeater>
  </ul>

<asp:SiteMapDataSource id="myDataSource" runat="server" ShowStartingNode="false" />

That will give you a nice list of clickable links.

On my site I only needed a subset of the items in the SiteMap to be rendered on a specific menu.  For example, I had a left navigation menu which displays only some of the items.  In this case I can use the technique that Danny showed off by adding a custom attribute to my SiteMapNodes like so:

  <?xml version="1.0" encoding="utf-8" ?>
  <siteMap xmlns="http://schemas.microsoft.com/AspNet/SiteMap-File-1.0" >
    <siteMapNode url="Home.aspx" title="Home"  DisplayOnLeft="true">
      <siteMapNode url="Work.aspx" title="Work" />
      <siteMapNode url="School.aspx" title="School" DisplayOnLeft="true" />
    </siteMapNode>
  </siteMap>

So, you can see that 2 of the nodes contain a DisplayOnLeft attribute which is set to true.  Now, to conditionally display those items in my sidebar navigation list I can hook the Repeater's ItemDataBound event and write logic like so...

  SiteMapNode node = e.Item.DataItem as SiteMapNode ;
  // DataItem is null for items that are not bound to a node, so check first.
  if( node != null ) {
     string display = node["DisplayOnLeft"];
     if( string.IsNullOrEmpty( display ) || display != "true" ) {
        e.Item.Visible = false ;
     }
  }

Autonomous Interface Agents

Web applications that are context-aware will be able to make greater use of autonomous agents to directly manipulate graphical objects and affect the user's display.  The MIT paper titled "Autonomous Interface Agents" says of autonomous agents:

An autonomous agent is an agent program that operates in parallel with the user. Autonomy says that the agent is, conceptually at least, always running. The agent may discover a condition that might interest the user and independently decide to notify him or her. The agent may remain active based on previous input long after the user has issued other commands or has even turned the computer off.

How many times have you built an application and then, when speaking to users about how they use it, been confounded to learn that they don't know about some of the best little features that you included in it?

So through better use of agents, we can envisage a system where users are offered cues to increase their productivity with a tool.  Some simple illustrations of this might include systems which:

  • know which features a user hasn't used and offer some ad-hoc advice about those features
  • can offer help whilst a user is using a particular feature
  • remember past screens that a user has used and offer them as "favourite" short-cuts
  • offer quick-fix solutions for filling-in forms with valid data


Autonomous Interface Agents
http://web.media.mit.edu/~lieber/Lieberary/Letizia/AIA/AIA.html
Getting ASP.NET Membership running against your own database
 

If you want to install the ASP.NET V2 tables and procedures for things such as Membership, Personalization, etc you need to run the aspnet_regsql.exe tool against your database.  The tool can be found in the %windir%\Microsoft.NET\Framework\{FRAMEWORKVERSION} folder.

 

Once you've run that tool, it's just a matter of replacing the LocalSqlServer connection string entry in your web.config file so that it points to your database:

 

  <configuration xmlns="http://schemas.microsoft.com/.NetConfiguration/v2.0">

    <connectionStrings>
      <remove name="LocalSqlServer" />
      <add connectionString="CONN STRING" name="LocalSqlServer" providerName="System.Data.SqlClient" />
    </connectionStrings>

  </configuration>

After that, the built-in APIs will read from and write to your database for that application.
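A quick way to confirm that the application is now talking to your database is to create and validate a user through the Membership API.  This is a minimal sketch: the MembershipSmokeTest class is a hypothetical name, and it assumes the code runs inside the ASP.NET application with the default membership provider still pointing at the LocalSqlServer connection string:

```csharp
using System.Web.Security;

// Hypothetical helper; call it from a page or handler inside the web application.
public static class MembershipSmokeTest
{
    // Creates a user via the configured provider and then checks the credentials.
    // Membership.CreateUser throws MembershipCreateUserException if creation fails,
    // for example when the user name already exists.
    public static bool CreateAndValidate(string userName, string password, string email)
    {
        Membership.CreateUser(userName, password, email);
        return Membership.ValidateUser(userName, password);
    }
}
```

If the call returns true, the new row should be visible in the membership tables that aspnet_regsql.exe created in your database.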

Posted: Mar 26 2005, 08:33 PM by digory | with 10 comment(s)
Out of context - how to know whether Tweety can fly

In the excellent whitepaper titled "Out of context: Computer systems that adapt to, and learn from, context", there's a section near the end titled "The view of context from other fields", containing the following subsections:

   Mathematical and formal approaches to AI.
   Context in the human-computer interface field
   Context in sociology and behavioral studies

In the section on the mathematical approach, the document uses the following example to highlight the problems with making contextual assumptions:

Several areas of mathematics, and formal approaches to artificial intelligence (AI), have tried to address context in reasoning. When formal axiomatizations of commonsense knowledge were first used as tools for reasoning in AI systems, it quickly became clear that they could not be used blindly. Simple inferences: “If Tweety is a bird, then conclude that Tweety can fly” seemed plausible until the possibility that Tweety might be a penguin or an ostrich, a stuffed bird, an injured bird, a dead bird, etc., was considered.

While having a better understanding of context seems to be the next major landmark in building better software, we've already seen that making poor assumptions can lead to even greater user dissatisfaction.
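The Tweety problem can be sketched in a few lines of code.  This is my own illustration rather than anything from the paper, and the enum and method names are hypothetical: the default rule "birds fly" only holds when none of the known exceptions apply, and when the context is unknown the method declines to answer rather than guessing:

```csharp
using System;

// Hypothetical facts that would defeat the default "birds fly" conclusion.
[Flags]
public enum FlightDefeaters
{
    None = 0,
    FlightlessSpecies = 1, // e.g. a penguin or an ostrich
    Stuffed = 2,
    Injured = 4,
    Dead = 8
}

public static class DefaultReasoning
{
    // Returns true or false when the context is known, and null when it is not,
    // so that callers cannot silently treat "unknown" as "can fly".
    public static bool? BirdCanFly(FlightDefeaters? knownContext)
    {
        if (!knownContext.HasValue)
        {
            return null;
        }
        return knownContext.Value == FlightDefeaters.None;
    }
}
```

The point of the nullable return value is that a poor assumption is avoided by construction: the calling code has to decide explicitly what to do when it doesn't know enough about Tweety.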


"Out of context: Computer systems that adapt to, and learn from, context":
http://www.research.ibm.com/journal/sj/393/part1/lieberman.html

Coaching end-users

Today I had a conversation with a friend who is a musician.  We were discussing some of the similarities between music and software and even extending many of them to any creative pursuit where the output is consumed by others.  One of the things that we noted was that, as with software, end-users of music do not always share the feelings and experiences envisaged by the architects of the product.  I'm not sure how a musician can give corrective advice to an end-user about such a discrepancy at "runtime" but, in software we are fortunate that we can use context and UI elements to teach a user about the intended usage of a system. 

Providing the user with a shared understanding of a system's capabilities is an area that will increasingly be solved by software agents that can use contextual information to know when and how to offer explicit and implicit cues.  Examples of this in current Microsoft software include: 

  • AutoCorrect functionality in Office
  • AutoComplete and Refactoring in VS2005
  • Clippy and friends in Office
Grokking Information technology

My days of using handcrafted Access database applications to automate inventory reconciliation seem to be nothing but a distant blur.  Too soon, it seems, I was whisked away from my accounting world of Office applications and surrounded by millions of rows' worth of raw data.  There's something about real, raw data that seems to make my nerve endings jingle in a merry way.

After the accounting world, I moved into the world of "fetch and format".  This often feels like the world of high demand, low-cost, digital paintings.  There's an art to it that I haven't yet mastered so onward I plod.  The frustration here is that I often feel that I'm not empowered to take control of information and really build something that creates valuable knowledge for a creative, energetic culture.  Is it my mind that is limiting me?  My skills?

Things started to change recently with some of the new applications and frameworks that the Office team have created; OneNote and Information Bridge Framework (IBF) certainly appeal to me, as does InfoPath to a slightly lesser extent.

In IBF I can clearly see the ability to put useful information at the fingertips of empowered knowledge workers and industrial consumers of data.  Having worked in an environment where access to information is vital while preparing financial models under time pressure, I can see the benefits of opening up systems, and this seems like a framework that is ready and able to make that a reality.

Once again it is Office products that have my blood surging and my head spinning with plans for putting data through a technological treadmill and producing artefacts of value at the other end.
