Rippe Tech Talk: November 2012

Wednesday, November 14, 2012

Code Refactoring - Improving speed without changing external behavior

As part of our Contract Management solution Contract Guardian, we have developed a reporting tool to enable us to enter complex search terms and fetch contracts satisfying the search terms. Our Rippe & Kingston development team wanted this to be a tool that we could use on any of our custom projects and products. This tool was developed using ExtJS as the front end and C# ASHX pages to process the query. This sounds really simple except when it was used in real situations that fetch many thousand contracts. When this project was deployed to a client that had more than 5000 contracts, we got the dreaded IE error “A script on this page is causing Internet Explorer to run slowly”.

On further research we determined that a script on our program was taking a long time to finish executing. This is a note from Microsoft¹ for Internet Explorer - Because some scripts may take an excessive amount of time to run, Internet Explorer prompts the user to decide whether they would like to continue running the slow script. Some tests and benchmarks may use scripts that take a long time to run and may want to increase the amount of time before the message box appears. "

Now there were a couple of options to resolve this error. One of the options provided in the Microsoft Knowledge Base includes modification of a registry value on client machines to wait for more time before throwing the notification message. That option would introduce issues during mass deployment across large client installations. Further, the same script could also be running slower in other browsers as well.

The other option was to go back to the drawing board and refactor² the script that was causing the issue. Refactoring is basically recoding the program without affecting the end result of the program.

As noted, the front end was ExtJS . This consists of a Contract grid (A ExtJS Grid Panel) with default columns. The users have options to choose the display fields that will be displayed as columns on the grid. These display fields are saved on the Cache engine behind our reporting tool. The Cache saves all the filters (Departments, Companies, Users, Contract Types) used to filter the contracts and the display fields that show up on the Contract Grid.

Our script calls a corresponding ASHX page that processes the query and the query in turn uses the Cache to get the fields. The filters selected/entered by the user generates the JSON object to be returned to the script. The JSON object consists of the headers that will recreate the ExtJS Grid.Panel columns and the data that will go to the ExtJS store that populates the grid.

Since the fields are not known until run time, the script was processing every header field, recreating columns, filters, creating a store, paging tool bar and many other things and finally adding the data to the store in the Grid Panel. This took a lot of time.

// Comment


function populateContractGrid(data) {

   // Code omitted .....
        reconfigure_grid_test(newHeaders, newHeaderTypes, newData, dataStore);
   }

//pass the information to the next method from here
function reconfigure_grid_test(headers, types, data, dataStore) {
    // Code omitted .........
    reconfigure_grid(headerFields, headerTypes, dataFields, dataStore);
}
 
function reconfigure_grid(fields, types, data, dataStore) {
    // Code omitted...
    //loop will create grid columns and grid fields, adding them to the above arrays
            
        // Code omitted 
        columnArray.push(column);

       // Code omitted 
        fieldArray.push(field);
    }

    //build the column model that will be used in the contract grid
   
    // Code omitted
    var reader = new Ext.data.ArrayReader(
    {
        totalProperty: ''
    }, record);

    var memoryProxy = new Ext.ux.data.PagingMemoryProxy(data);

    var store3 = new Ext.data.ArrayStore(
    {
        remoteSort: true,
        reader: reader,
        fields: fieldArray,
        baseParams:
        {
            lightWeight: true,
            ext: 'js'
        },
        data: data,
        cm: columnModel,
        proxy: memoryProxy
    });

   // Create the filters for the grid
   // Code omitted
    pagingTBar.bindStore(store3, true);
    contractGrid.reconfigure(store3, columnModel);

    if (contractGrid.plugins) {
        for (var index = 0; index <= contractGrid.plugins.length; index = index + 1) {
            contractGrid.plugins.pop();
        }
    }

    filterPlugin.init(contractGrid);
    summary.init(contractGrid)
    contractGrid.plugins.push(filterPlugin);
    contractGrid.plugins.push(summary);

   
    // Code omitted
}

Our solution to this was to recreate the Grid Panel before fetching the JSON data from the Query_Handler ASHX page and then replacing the store in the Ext.Grid.Panel with the store recreated from the JSON Object returned by the ASHX page.

So basically when the user selected the fields for the Grid, the fields were saved in the Cache by the DisplayFields_ASHX page and in return a JSON Object with empty data rows was returned by this ASHX page. Our script used this JSON Object to run the code displayed above with empty data (instead of more than 5000 records).

Then when the query was run, the store of the ExtJS Grid Panel was just replaced with the new store returned by the JSON object without recreating the grid. That resulted in a dramatic change and the script was much faster in IE. Instead of calling the original code, we called a new function demonstrated below, where the store in the Contract Grid was replaced with the data from the JSON object.

// Comment
function populateContractGridWithData(data) {
  // Code Omitted
    var newData = [];
    for (var i = 0; i < arrayRows.length; i++) {
        newData[i] = arrayRows[i].value;
    }
    var store4 = contractGrid.getStore();
    var memoryProxy = new Ext.ux.data.PagingMemoryProxy(newData);
    store4.proxy = memoryProxy;
    pagingTBar.bindStore(store4, true);

    store4.load(
    {
        params:
        {
            start: 0,
            limit: 100
        }
    });

This design provided better user experience and also minimizes the need on system resources.

In summary, for slow operation of any code, one of the best options to resolve these options is to go back to the drawing board and refactor the code to improve speed without changing the external result or behavior of the code.

For more information check our website

References:

1. http://support.microsoft.com/kb/175500

2. http://en.wikipedia.org/wiki/Code_refactoring

Tuesday, November 13, 2012

Advanced Techniques with Database Migrations

While Entity Framework's database migrations will automatically pick up on structure changes, there are times where we want to do a little bit more. Consider the following model:

public class Attachment
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime CreatedOn { get; set; }
    public DateTime ModifiedOn { get; set; }
    public string Path { get; set; }
    public long Size { get; set; }
    public string MimeType { get; set; }
}

We'll assume that this model has already been added to the database via a previous migration. Let's say that we'd like to give users the ability to change the file in a given attachment. More than that, let's say that instead of just keeping track of the current file, we would like to have a history of all files that a user has uploaded for versioning. We'll modify our model:

public class Attachment
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime CreatedOn { get; set; }
    public DateTime ModifiedOn { get; set; }

    public virtual List<AttachmentFile> AttachmentFiles { get; set; }
}

public class AttachmentFile
{
    public int Id { get; set; }
    public DateTime CreatedOn { get; set; }
    public string Path { get; set; }
    public long Size { get; set; }
    public string MimeType { get; set; }

    public Attachment { get; set; }
    public int AttachmentId { get; set; }
}

Instead of the Attachment model containing information about the file, it has a one-to-many relationship with AttachmentFiles. We will assume that the AttachmentFile with the latest date will be used as the "primary" attachment.

Now we need to carry this change to our database. Using the package manager console, we run "add-migration AddAttachmentFiles", and we come up with this:

public partial class AddAttachmentFiles : DbMigration
{
    public override void Up()
    {
        CreateTable(
            "AttachmentFiles",
            c => new
                {
                    Id = c.Int(nullable: false, identity: true),
                    CreatedOn = c.DateTime(nullable: false),
                    Path = c.String(),
                    Size = c.Single(nullable: false),
                    MimeType = c.String()
                })
            .PrimaryKey(t => t.Id)
            .ForeignKey("Attachments", t => t.AttachmentId, cascadeDelete: true)
            .Index(t => t.AttachmentId);
        DropColumn("Attachments", "Path");
        DropColumn("Attachments", "Size");
        DropColumn("Attachments", "MimeType");
    }

    public override void Down()
    {
        AddColumn("Attachments", "MimeType", c => c.String());
        AddColumn("Attachments", "Size", c => c.Single(nullable: false));
        AddColumn("Attachments", "Path", c => c.String());
        DropIndex("AttachmentFiles", new[] { "AttachmentId" });
        DropForeignKey("AttachmentFiles", "AttachmentId", "Attachments");
        DropTable("AttachmentFiles");
    }
}

This looks acceptable. This migration will create the new table and remove the desired columns from the Attachments table. But what about our attachment data in the Attachments table? There is no way for Entity Framework to know that we want to do anything with the data in the columns we're deleting from the Attachments table. So, we do it manually. The DbMigration class has a Sql() method we can use to execute raw SQL against our database. If we modify our migration:

public partial class AddAttachmentFiles : DbMigration
{
    public override void Up()
    {
        CreateTable(
            "AttachmentFiles",
            c => new
                {
                    Id = c.Int(nullable: false, identity: true),
                    CreatedOn = c.DateTime(nullable: false),
                    Path = c.String(),
                    Size = c.Single(nullable: false),
                    MimeType = c.String()
                })
            .PrimaryKey(t => t.Id)
            .ForeignKey("Attachments", t => t.AttachmentId, cascadeDelete: true)
            .Index(t => t.AttachmentId);

        Sql("");

        DropColumn("Attachments", "Path");
        DropColumn("Attachments", "Size");
        DropColumn("Attachments", "MimeType");
    }

    public override void Down()
    {
        AddColumn("Attachments", "MimeType", c => c.String());
        AddColumn("Attachments", "Size", c => c.Single(nullable: false));
        AddColumn("Attachments", "Path", c => c.String());

        Sql("");

        DropIndex("AttachmentFiles", new[] { "AttachmentId" });
        DropForeignKey("AttachmentFiles", "AttachmentId", "Attachments");
        DropTable("AttachmentFiles");
    }
}

we can write SQL queries that will be executed in the flow of commands. The first query will move data from the to-be-deleted columns in the Attachments table to the AttachmentFiles table:

INSERT INTO AttachmentFiles (CreatedOn, Path, Size, MimeType, AttachmentId)
SELECT CreatedOn, Path, Size, MimeType, Id AS AttachmentId FROM Attachments

This query selects only the relevant fields from the Attachments table and inserts them into the AttachmentFiles. Since there will be exactly one set of file data per Attachment record, there will be exactly one AttachmentFile per Attachment. This means that the "primary" AttachmentFile for each Attachment will by default be the previous contents of the Attachments table simply because it will be the only record.

The second query will move data from the AttachmentFiles table back into the re-created columns in the Attachments table:

UPDATE Attachments
SET Attachments.Path = af1.Path, Attachments.Size = af1.Size, 
    Attachments.MimeType = af1.MimeType
FROM Attachments
INNER JOIN AttachmentFiles af1 on af1.Id =
    (SELECT TOP 1 Id FROM AttachmentFiles af2
    WHERE af2.AttachmentId = Attachments.Id 
        ORDER BY af2.CreatedOn DESC)

This query is much more complicated, because we have many AttachmentFiles per Attachment, and we need to select out only one per Attachment. We do this by using a subquery that selects out the most recent AttachmentFile for a given AttachmentId.

Our final migration looks like this:

public partial class AddAttachmentFiles : DbMigration
{
    public override void Up()
    {
        CreateTable(
            "AttachmentFiles",
            c => new
                {
                    Id = c.Int(nullable: false, identity: true),
                    CreatedOn = c.DateTime(nullable: false),
                    Path = c.String(),
                    Size = c.Single(nullable: false),
                    MimeType = c.String()
                })
            .PrimaryKey(t => t.Id)
            .ForeignKey("Attachments", t => t.AttachmentId, cascadeDelete: true)
            .Index(t => t.AttachmentId);

        Sql("INSERT INTO AttachmentFiles " +
            "(CreatedOn, Path, Size, MimeType, AttachmentId) " +
            "SELECT CreatedOn, Path, Size, MimeType, Id AS AttachmentId " +
            "FROM Attachments");

        DropColumn("Attachments", "Path");
        DropColumn("Attachments", "Size");
        DropColumn("Attachments", "MimeType");
    }

    public override void Down()
    {
        AddColumn("Attachments", "MimeType", c => c.String());
        AddColumn("Attachments", "Size", c => c.Single(nullable: false));
        AddColumn("Attachments", "Path", c => c.String());

        Sql("UPDATE Attachments " +
            "SET Attachments.Path = af1.Path, Attachments.Size = af1.Size, " +
                "Attachments.MimeType = af1.MimeType " +
            "FROM Attachments " +
            "INNER JOIN AttachmentFiles af1 on af1.Id = " +
                "(SELECT TOP 1 Id FROM AttachmentFiles af2 " +
                "WHERE af2.AttachmentId = Attachments.Id " +
                    "ORDER BY af2.CreatedOn DESC)");"

        DropIndex("AttachmentFiles", new[] { "AttachmentId" });
        DropForeignKey("AttachmentFiles", "AttachmentId", "Attachments");
        DropTable("AttachmentFiles");
    }
}

For more information, check out our website.

Rippe Tech Talk

Wednesday, November 14, 2012

Code Refactoring - Improving speed without changing external behavior

Tuesday, November 13, 2012

Advanced Techniques with Database Migrations

Followers

Blog Archive