James McCaffrey: Blog


    May 31

    Web Testing, Part III

    An extremely powerful, but rarely used, technique to test Web applications' functionality is to write test automation which exercises the app using the IE Document Object Model. Click on the image at the bottom of this entry to see what I mean. The image shows that I've written a test harness using HTML. There is a master HTML page which really is just a container for two frames; let's call them left and right. The right frame houses the Web application under test. In this case it is a simple calculator app. The left frame houses a bit of UI for the test harness (a button to launch the automation, a text area to display messages and so on). The left frame also holds JavaScript code which manipulates the Web app in the right frame through the IE DOM. For example, one of the lines of code is:

     

    parent.rightFrame.document.theForm.TextBox1.value = "7";

     

    which simulates a user typing 7 into the first text box. The idea of this testing technique is simple in principle but the details are somewhat tricky. In particular, synchronizing the actions on the Web app is rather delicate. Explaining exactly how this Web testing technique works was one of the main motivators for me to write ".NET Test Automation Recipes" and I dedicate a chapter to all the details.

     

    An advantage of testing a Web application through its UI using JavaScript to manipulate the IE DOM, compared with other testing strategies, is that the technique is relatively simple once you know the key tricks. A second advantage is that the technique does not require you to instrument or modify the Web app under test in any way -- you use it exactly as is. A disadvantage of this technique is that because the IE DOM is specific to IE, you have to write separate scripts if you want to test your Web app on different clients such as Netscape Navigator.

    Web Testing, Part II

    The most basic way to test a Web application is to manually test the app through the client software (IE in a Microsoft environment). In the scenario described in the previous entry, you would type in a search term, click the submit/search button in IE, and wait for the HTTP request to be received and processed by the target Web server. Then, after the resulting HTML page is returned to the client as an HTTP response and rendered by IE, you would visually verify whether the result page is correct. Manual testing is essential but gets very tedious, very quickly.

    The most fundamental way to test a Web application using test automation is shown by the screenshot at the bottom of this page. Instead of sending a request through IE, you create a test harness client and programmatically send an HTTP request to the Web server. The request is received by IIS, and an HTML page is returned to the test harness client. Instead of rendering the HTML in friendly form, the test harness examines the HTML response, looking for values which indicate whether the response is correct or not.
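Here is a minimal sketch of that request-and-examine idea, written in Python rather than .NET so it can run anywhere. Everything in it is an assumption made for illustration: the tiny `BookSearchHandler` server stands in for the Web app under test (it is not IIS), and the `/search` path, the `term` parameter, and the test case IDs are hypothetical. The harness sends an HTTP request, reads the returned HTML, and looks for a value that indicates a correct response.

```python
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class BookSearchHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the Web app under test."""

    def do_GET(self):
        query = urllib.parse.urlparse(self.path).query
        term = urllib.parse.parse_qs(query).get("term", [""])[0]
        body = f"<html><body><p>Results for: {term}</p></body></html>".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the harness output quiet


def run_case(base_url, term, expected_fragment):
    """Send one HTTP request and examine the raw HTML instead of rendering it."""
    url = base_url + "/search?" + urllib.parse.urlencode({"term": term})
    # An empty ProxyHandler makes sure the request goes straight to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        status = response.status
        html = response.read().decode("utf-8")
    return status == 200 and expected_fragment in html


def run_harness():
    """Start the stand-in server on a free port, run the cases, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), BookSearchHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_port}"
    try:
        return {
            "001": run_case(base_url, "test automation", "Results for: test automation"),
            # Deliberately wrong expectation, to show what a failure looks like.
            "002": run_case(base_url, "cribbage", "Results for: poker"),
        }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for case_id, passed in run_harness().items():
        print(case_id, "Pass" if passed else "FAIL")
```

Against a real application you would drop the stand-in server and point `base_url` at the server you are testing.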

    In principle, this testing technique is quite simple, but there are a lot of details. Luckily, if you are working in a Microsoft environment, the .NET Framework makes this technique relatively easy. The technique is centered around HTTP. Now HTTP is a protocol which rides on top of TCP/IP. So, an alternative approach is to programmatically send the request from client to server using TCP/IP. I've heard mixed opinions on whether using this approach is necessary, or even a good idea. One argument goes, "why bother?" Another argument goes, "you easily get additional testing and also validate your HTTP-based test results." In general, because there is rarely enough time to test any software system thoroughly, I rarely use the TCP/IP technique, but it has come in handy several times.
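To make the TCP/IP alternative concrete, here is a rough sketch, again in Python and again under stated assumptions: `HelloHandler` is a hypothetical stand-in for the server under test, and the request is a hand-built HTTP/1.0 GET written directly to a socket. The point is only that the harness composes the request text and parses the response text itself, with no HTTP library on the client side.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the Web server under test."""

    def do_GET(self):
        body = b"<html><body>Hello from the app</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the harness output quiet


def send_raw_request(host, port, path):
    """Write an HTTP/1.0 request to a TCP socket and read until the server closes."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}:{port}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Headers and body are separated by a blank line.
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body.decode("utf-8")


def run_tcp_check():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        status_line, html = send_raw_request("127.0.0.1", server.server_port, "/")
    finally:
        server.shutdown()
        server.server_close()
    status_code = status_line.split()[1]
    return status_code == "200" and "Hello from the app" in html


if __name__ == "__main__":
    print("Pass" if run_tcp_check() else "FAIL")
```

Comparing the outcome of a check like this with the HTTP-level result for the same request is one way to get the cross-validation mentioned above.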

    May 25

    Web Testing, Part I

    If you are a software tester, you must know how to test Web applications. And in order to understand how to test Web applications, you must have a basic familiarity with the client-server process. Click on the image at the bottom of this entry to see a diagram which describes the process. Imagine you are at a machine, with Internet Explorer running. Your machine is the client, and IE is the client software. Suppose you're at some Web site which sells books. You type in some search term such as "test automation" and click on the search button, indicated by step #1 in the diagram. This generates an HTTP request which is sent to the Web server (step #2). The HTTP request says in essence, "please construct a Web page which lists all the books which are related to the text 'test automation'". The Web server accepts this request. In a Microsoft environment, the Web server is running Internet Information Services (IIS) and this is the server software. IIS then executes a script -- typically VBScript (for ASP technology), or C# or VB.NET (for ASP.NET) -- which dynamically builds an HTML page (step #3). After this HTML page has been created, the page is returned as an HTTP response to the client (step #4). The client software, IE, accepts the HTML page and then renders the page in human-friendly form (step #5).

    The process is pretty straightforward, but it's absolutely essential to keep this process in mind when you are writing Web application software test automation. By the way, in the process above, in step #3, when IIS is executing a script/program which creates the HTML response page, IIS will often need to send an intermediate request to a database server in order to get the data IIS needs to build the page. Interestingly, this added complexity does not always influence writing your test automation.

    May 23

    Visual Studio Team System

    Let me start by stating I'm kind of a crusty, old-school software engineer. I'm always skeptical about the "next great product" because I've seen too many products and technologies fail to live up to their hype. I've used dozens of bug tracking tools over the years. So, when I decided to investigate the new Microsoft Visual Studio Team System (VSTS), my expectations were low. Very, very low. But let me cut to the chase and say that VSTS absolutely impressed me. This is one awesome product. VSTS has a ton of features but here I'm going to just describe how VSTS is used for bug tracking. Click on the image at the bottom of this blog entry to see a screenshot of the new bug tracking feature.

    Here's a bit of background context. When I was first working at Microsoft in the mid-1990s, the standard bug tracking tool was an internally developed tool named RAID. The tool's name was not an acronym; it was a reference to the insecticide product which kills bugs. Nobody I've ever talked to knew exactly when RAID was created but the consensus is that it was developed around 1993. RAID was a great tool for the time, when Microsoft was a relatively small company, with relatively few products. The problem with RAID was that it was somewhat difficult for different, but related, product groups to coordinate their bugs. Next came a tool named Product Studio about mid-2001. So that it could work between groups, Product Studio was completely integrated into the internal Microsoft network -- so much so that the product actually required changes to the Active Directory schema! Although Product Studio was an improvement over RAID in some ways, it was troublesome to manage and had quite a few bugs early on. It was clear as early as 2002 that Product Studio was not going to be a long term bug-tracking solution at Microsoft. In March of 2006, Product Studio was officially discontinued in favor of the new VS Team System.

    Visual Studio 2005 shipped in, not surprisingly, 2005. One of the editions is Visual Studio Team System. The bug tracking components of VSTS are called Visual Studio Team Foundation Server (on the server side), and Visual Studio Team Explorer (on the client side). It's not quite clear what to call this system -- is it Foundation Server (FS), or is it Team Explorer (TE), or is it Team System (TS)?  Anyway, based on my early experience with the system, I am particularly impressed by three things. First, I really like the way Team Explorer is integrated into Visual Studio. If you check out the screenshot at the bottom of this entry you'll see that TE is hosted inside VS. No more need for a completely separate tool. Second, I really like the way FS is integrated with Windows. It is really easy to manage users, groups, and permissions because they're all based on Windows and Active Directory. No more separate user lists, password lists, group lists, individual permissions management, and so on. Third, I really like the way that FS is almost totally customizable via XML-based template configuration files. For example, I was stunned that the default configuration setting did not include a bug severity field. But I quickly figured out how to add a severity field.

    Just so you don't think I'm a Microsoft marketing hack, let me point out that VSTS isn't perfect. In particular, installation could have been a little bit better documented with some screenshots, the system's overall performance is adequate at best, and I am disappointed that there isn't an out-of-the-box Web interface to Foundation Server. Based on the modular architecture, I have no doubt that I can pretty easily craft a simple Web page interface to FS. However, I shouldn't have to do that from scratch. So anyway, bottom line: for working in a Microsoft environment, Visual Studio Team System is (so far) clearly the best system for bug tracking I've ever seen.

    May 17

    Open Source Test Frameworks

    One approach to software test automation is to use an Open Source test framework. A test framework is a set of prewritten software modules plus, optionally, a master program of some kind that engineers can use to create test automation. In other words, if lightweight test automation is a short program, written from scratch in a language such as C# or Perl, which tests a software system, then an Open Source test framework is much the same except that a lot of code modules are already available to you. Or put another way, an Open Source test automation framework is similar to a commercial test framework, except that an Open Source framework is free, typically (but not always) at the expense of features, quality and support. Bret Pettichord, a well-known writer and speaker on software testing, also pointed out to me that an Open Source framework can be considered a library which enables you to more easily write test automation. There are hundreds of Open Source test frameworks available. Some examples are JUnit for Java unit testing, NUnit for C# unit testing, and Watir for Web testing. Click on the image at the bottom of this blog entry to see a screenshot of JUnit.

    Like all testing approaches, using an Open Source framework has pros and cons. The obvious advantage of using an Open Source framework is that, once you learn how to use the framework, you can potentially save time. One disadvantage of using an Open Source framework is that most require a moderate to significant amount of time to learn. Another disadvantage of Open Source frameworks is that they tend to be changing constantly as new features are added and bugs are fixed. And a third disadvantage is that when you use an Open Source framework, there is an extremely strong tendency to test your system based on what the framework can do rather than what needs to be tested.

    As always, there is no one best approach to software test automation. Some members of the Open Source community tend to be very strong advocates of all things Open Source, even to the extent of claiming that using Open Source frameworks is always a better approach than writing custom lightweight test automation. That is just not true; different software development scenarios require different testing approaches.

    May 16

    Software Risk Analysis

    If you are a software tester, sooner or later you'll have to deal with software risk analysis and management. Risk is loosely defined as the possibility that an unwanted event can happen. Risk analysis is the process of identifying possible risks and estimating their likelihood. Here's how it works. First you list possible risk events, which depend on the level at which the analysis happens. At a management level, risk events are things like key personnel quitting. At an implementation level, risk events are things like choosing a multi-threaded architecture for improved performance vs. choosing a single-threaded architecture for simplicity. And for a tester, a typical risk event is failing to discover a particular type of bug. Let's suppose you are worried about 3 arbitrary risk events, A, B, C. Now you must estimate the probability of each of these events. Suppose you guess that the probabilities of risk events A, B, and C are 0.3, 0.4, and 0.1 respectively. Notice that because the risk events are not necessarily related (they are not mutually exclusive alternatives), their probabilities need not sum to 1. Now you must estimate the risk impact of each risk event. Let's imagine that you assign the costs of events A, B, and C as $1000, $3000, and $6000 respectively. You can now compute the so-called risk exposure of each event, which is just the probability of the event times the impact:

    A: (0.3)($1000) = $300

    B: (0.4)($3000) = $1200

    C: (0.1)($6000) = $600

    You can use this information to help prioritize where you spend your resources. Next you can compute your overall risk exposure, which is just the sum of the individual events' exposures: $300 + $1200 + $600 = $2100. You can use this value to compare against previous risk analysis data from other projects, and you can use the value to estimate your likely cost overruns from your base budget.
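The arithmetic above is easy to script. Here is a small Python sketch that uses the same three events and the same guessed probabilities and costs; the `RiskEvent` class and its field names are just illustrative choices, not part of any standard risk tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEvent:
    name: str
    probability: float  # estimated likelihood, between 0 and 1
    impact: float       # estimated cost in dollars

    @property
    def exposure(self):
        """Risk exposure = probability of the event times its impact."""
        return self.probability * self.impact


events = [
    RiskEvent("A", 0.3, 1000),
    RiskEvent("B", 0.4, 3000),
    RiskEvent("C", 0.1, 6000),
]

# Highest exposure first, to help decide where to spend resources.
ranked = sorted(events, key=lambda e: e.exposure, reverse=True)

# Overall exposure is the sum of the individual exposures.
total_exposure = sum(e.exposure for e in events)

if __name__ == "__main__":
    for e in ranked:
        print(f"{e.name}: ${e.exposure:.0f}")
    print(f"Total: ${total_exposure:.0f}")
```

Sorting by exposure puts B first, then C, then A, which matches the ordering you would read off the three lines above.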

    I am not a fan of the risk analysis procedure described above. All the numbers give a very false sense of the accuracy and precision of the process. Here's an alternative approach I prefer. Instead of assigning numerical probabilities to each risk event, you assign a likelihood attribute of L, M, or H (for low, medium, and high). Then instead of assigning a numeric (money) cost for the impact, you assign one of the same L, M, H attributes. So, the example above might translate to the following where the first attribute is the likelihood, and the second attribute is the impact:

    A: M x L -> ML

    B: H x M -> HM

    C: L x H -> LH

    Each risk event will be described by one of 9 attribute pairs: LL, LM, LH, ML, MM, MH, HL, HM, HH. You should pay most attention to the HH and HM risk events. You can often ignore LL and LM risk events. How you treat the other five risk types will depend on your particular situation.
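The qualitative scheme can be sketched the same way. In this Python sketch the pair codes and the two priority groups come from the text above; the label wording ("address first", "often ignorable", "judgment call") and the function name are my own placeholders.

```python
LEVELS = ("L", "M", "H")

# Pairs are written likelihood first, impact second.
PRIORITY = {
    "HH": "address first",
    "HM": "address first",
    "LL": "often ignorable",
    "LM": "often ignorable",
}


def classify(likelihood, impact):
    """Return the attribute pair and a suggested treatment for one risk event."""
    if likelihood not in LEVELS or impact not in LEVELS:
        raise ValueError("likelihood and impact must each be L, M, or H")
    pair = likelihood + impact
    # The remaining five pairs depend on your particular situation.
    return pair, PRIORITY.get(pair, "judgment call")


# The three example events: (likelihood, impact).
example = {"A": ("M", "L"), "B": ("H", "M"), "C": ("L", "H")}

if __name__ == "__main__":
    for name, (likelihood, impact) in example.items():
        pair, treatment = classify(likelihood, impact)
        print(f"{name}: {pair} -> {treatment}")
```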

    May 11

    Lightweight Software Test Automation

    Lightweight software test automation is the process of creating and using relatively short and simple computer programs, called lightweight test harnesses, designed to test a software system. Lightweight test automation harnesses are not tied to a particular programming language but are most often implemented with the Java, Perl, Visual Basic .NET, and C# programming languages. Lightweight test automation harnesses are generally four pages of source code or less, and are generally written in four hours or less. Lightweight test automation is often associated with Agile software development methodology. Click on the image at the bottom of this blog entry to see a screenshot of a typical lightweight test automation run which is testing methods in a library of cribbage card game routines. 

    The three major alternatives to the use of lightweight software test automation are commercial test automation frameworks, Open Source test automation frameworks, and heavyweight test automation. The primary disadvantage of lightweight test automation is manageability. Because lightweight automation is relatively quick and easy to implement, a test effort can be overwhelmed with harness programs, test case data files, test result files, and so on. However, lightweight test automation has significant advantages. Compared with commercial frameworks, lightweight automation is less expensive in initial cost and is more flexible. Compared with Open Source frameworks, lightweight automation is more stable because there are fewer updates and external dependencies. Compared with heavyweight test automation, lightweight automation is quicker to implement and modify. Lightweight test automation is generally used to complement, not replace, these alternative approaches.

    Lightweight test automation is most useful for regression testing, where the intention is to verify that new source code added to the system under test has not created any new software failures. Lightweight test automation may be used for other areas of software testing such as performance testing, stress testing, load testing, security testing, code coverage analysis, mutation testing, and so on.
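To give a feel for the scale involved, here is a rough Python sketch of a lightweight harness. It is not the harness in the screenshot: the `hand_total` function is a hypothetical stand-in for a method in a cribbage library, and the colon-delimited test case format (ID, input, expected) is just one simple convention. Case 004 has a deliberately wrong expected value so that the run shows a failure.

```python
def card_value(rank):
    """Counting value of a card rank: ace is 1, face cards are 10."""
    if rank == "A":
        return 1
    if rank in ("J", "Q", "K"):
        return 10
    return int(rank)


def hand_total(ranks):
    """Hypothetical method under test: sum of the counting values."""
    return sum(card_value(r) for r in ranks)


# Each test case is "id:space-separated ranks:expected total".
TEST_CASES = [
    "001:A 2 3:6",
    "002:5 J:15",
    "003:K Q 10:30",
    "004:A A:3",
]


def run(test_cases):
    """Run every case and return (id, result) pairs."""
    results = []
    for line in test_cases:
        case_id, raw_input, raw_expected = line.split(":")
        actual = hand_total(raw_input.split())
        expected = int(raw_expected)
        results.append((case_id, "Pass" if actual == expected else "FAIL"))
    return results


if __name__ == "__main__":
    outcomes = run(TEST_CASES)
    for case_id, result in outcomes:
        print(case_id, result)
    passed = sum(1 for _, r in outcomes if r == "Pass")
    print(f"{passed} passed, {len(outcomes) - passed} failed")
```

In practice the test case data would usually live in a separate text file so the same harness can be rerun as a regression check after each new build.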

    If you are interested in lightweight test automation in a .NET environment, check out my book ".NET Test Automation Recipes: A Problem-Solution Approach", Apress, 2006.

     

    May 09

    Commercial Test Automation Frameworks

    There are four main approaches to software test automation: commercial test frameworks, Open Source test frameworks, lightweight custom test harnesses, and heavyweight custom test harnesses. There are a huge number of commercial test automation frameworks available. Some of the larger and better known companies which sell these often very expensive programs include Mercury, Segue, Compuware, and Rational. Each of these companies sells many different tools. For example, Mercury sells QuickTest for basic unit testing, and WinRunner for Web application testing. Segue sells SilkTest for basic unit testing and SilkPerformer for load testing. Compuware sells QACenter Enterprise Edition for functional testing, and QACenter Performance Edition for performance testing. Rational sells Rational Robot for general purpose client-server testing, and Rational Functional Tester for Windows-based application testing. A once very popular tool which is no longer being sold is Visual Test. All of these products are fairly similar. Click on the image at the bottom of this blog entry to see a screenshot of the old Visual Test.

    I have used products from all these companies. In general, the main advantage of using commercial test automation tools is that they provide you with a coherent management structure (although you often have to buy a separate product). An alternative to using commercial test automation frameworks is simply to write your own custom test harnesses. Compared to custom test harnesses, commercial test frameworks are extremely expensive. Commercial automation frameworks have a very steep learning curve (some even have a proprietary scripting language you must learn, and/or a complex object model to learn on top of a not-so-obvious user interface). Commercial tools are constantly being revised and repackaged, and even the companies themselves are often acquired by other companies. For example, Visual Test was originally developed as an internal Microsoft tool, then marketed by Microsoft as a commercial product. Visual Test was then acquired by the Rational company, and then Rational was acquired by IBM. But by far the biggest disadvantage of commercial automation frameworks is a subjective issue. If you use a commercial tool, you tend to exercise your system under test based on what the test tool can do most easily, rather than testing the SUT based on what needs to be tested.

    Based on my experience, the most successful testing approach is to use both commercial test automation tools (if you can afford them) and lightweight custom test automation. Or put another way, there is no single best approach to software test automation.

    May 06

    Repro Steps

    When software testers find a bug in the system they're testing, they record information about the bug in a bug tracking tool. These bug tracking tools are essentially a database with a user interface. There are many commercial and Open Source bug tracking tools you can obtain. At Microsoft, most testers currently use an internally-developed bug tracking tool called Product Studio. When entering a new bug, based on my experience, I'd say the two most important fields of a bug report are the Title and the steps necessary to reproduce the bug, or Repro Steps for short. Click on the image at the bottom of this blog entry for a screenshot of a dummy bug.

    A good bug Title is obviously important because it's the first field most testers search on to determine if the bug they just found has already been entered into the bug tracking tool. The Repro Steps are hugely important. A bug can't be fixed unless the cause of the problem is found, and the cause can't be found unless the problem can be reproduced. There's nothing more irritating to a developer than to attempt to replicate a system problem, but not have enough information in the bug tracking tool. Writing good repro steps is part art and part science. Suppose you're a tester. If you enter every conceivable bit of information as part of the repro steps, your repro steps would be pages long for each bug. Entering too little information is even worse. As a really, really crude rule of thumb, most of the bugs I record have about 8-12 steps.

    The hard part is figuring out the context of who will be trying to reproduce the bug. By that I mean, a developer who works very closely with you will understand all kinds of information about the system under test that you don't need to place in the repro steps. For example, a closely-aligned developer will know where the drop point is, the build numbering system, and so on. But suppose you are outsourcing the bug fixing process to a third party development team. In a situation like that, you can't make nearly as many assumptions about what the developer knows.

    All repro steps should have Expected Result and Actual Result information. Some bug tracking tools have explicit fields for this information, but if your tool does not, you need to include expected and actual information somewhere. Some of the product groups I worked in use a single Result field instead of separate Expected and Actual fields. The need to tell the developer who is reproducing your bug what is supposed to happen after your repro steps may seem obvious, but it's something new software testers often don't do.

    By the way, the new Visual Studio 2005 Team System edition (VSTS) contains many new features, including a bug tracking component based on Product Studio. Active development of the internal-to-Microsoft Product Studio tool has stopped and eventually most Microsoft product groups will use VSTS. So, you now have the ability to use the same bug tracking system as that used by Microsoft engineers.

    May 02

    ODBC vs. OLE DB

    A frequent question I get from new software test engineers is, "What is the difference between ODBC and OLE DB?" Both are specifications which describe how an application program can access the data in a data store. ODBC stands for Open Database Connectivity. The ODBC call-level interface specification was created by Microsoft in 1992 as a way to standardize program-to-SQL data communication. Before ODBC, application programmers had to use a different set of API calls for every type of database. By creating a standard interface, programmers could write one set of code (for the most part) that would work with any ODBC-compliant database. ODBC was quickly embraced by most major database vendors and became a de facto standard. Notice that this cooperation happened when Microsoft was not a very large company.

    OLE DB originally stood for Object Linking and Embedding for Databases, but now the acronym just means a COM-based interface to a wide range of data sources. OLE DB is sometimes written as OLEDB or OLE-DB. OLE DB came into being in the mid 1990s through an evolution and merging of several Microsoft technologies. The idea of OLE DB is to provide programmers with a consistent interface to many different types of data, including SQL databases, Excel spreadsheets, and so on.

    The best way to understand the relationship between ODBC and OLE DB is by way of a picture. (Click on the image at the bottom of this blog entry to enlarge it so you can see it clearly). Imagine that you are a developer or tester writing a program which needs to access and manipulate some data. If the data is stored in a SQL relational database, you can use ODBC calls. It turns out that working directly with ODBC is a bit awkward. An alternative is to use OLE DB. OLE DB programming tends to be quite a bit easier than ODBC programming, in part because OLE DB operates at a higher level of abstraction. The downside of OLE DB is a slight performance penalty in most cases. Now if you want to get at Excel data, you can also use OLE DB calls to access and manipulate the data. Of course I've left out a lot of details, but this overview should help you understand the difference between ODBC and OLE DB.

    If you enjoyed reading this blog you might enjoy working at Microsoft. My parent company, Volt Information Sciences, is always looking for software engineers. Check out the job listings at http://jobs.volt.com/.

    May 01

    Code Complexity Analysis

    Code complexity analysis is the process of examining your system under test and producing a number which measures how complicated the SUT is. There are many different types of code complexity analysis. One of the most common is called McCabe's Cyclomatic Complexity. Developed in the mid-1970s, it has stood the test of time quite well. Cyclomatic complexity measures the number of linearly independent paths through a program module. For example, consider this artificially-simple module:

    int foo(int n)
    {
      int result;
      if (n < 0)
        result = -1;   // path 1: negative input
      else if (n >= 0 && n <= 9)
        result = 1;    // path 2: single digit
      else
        result = 0;    // path 3: everything else
      return result;
    }

    It is fairly obvious that there are exactly 3 paths through the code. Cyclomatic complexity was designed, in theory, to be independent of computer language. If a code module is represented as a graph structure then the definition of cyclomatic complexity is 

    CC = E - N + 2 

    where E is the number of edges in the graph, and N is the number of nodes in the graph. (Note: there are several variations on this formula.) A graph corresponding to the module above is shown at the bottom of this blog entry. The cyclomatic complexity would be calculated as CC = 8 - 7 + 2 = 3. 
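Since the graph image itself isn't reproduced here, the following Python sketch uses one plausible control-flow graph for the module, with 7 nodes and 8 edges to match the calculation above. The node names are my own labels, and the exact shape of the original figure is an assumption.

```python
# One plausible control-flow graph for foo(); node names are illustrative.
EDGES = [
    ("entry", "if_negative"),
    ("if_negative", "set_minus_one"),   # n < 0 is true
    ("if_negative", "if_digit"),        # n < 0 is false
    ("if_digit", "set_one"),            # 0 <= n <= 9 is true
    ("if_digit", "set_zero"),           # otherwise
    ("set_minus_one", "return"),
    ("set_one", "return"),
    ("set_zero", "return"),
]


def cyclomatic_complexity(edges):
    """CC = E - N + 2 for a single connected control-flow graph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


if __name__ == "__main__":
    print(cyclomatic_complexity(EDGES))
```

With 8 edges and 7 distinct nodes the function returns 3, which agrees with the three paths counted by inspection.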

    It's not practical to directly translate a program module into a graph structure, so cyclomatic complexity is calculated by looking at the branching instructions (typically if-then, case, and so on) in the SUT source code. For modules of realistic size it is not feasible to compute cyclomatic complexity by hand, so there are a large number of commercial and Open Source tools which can do the job for you. You can also write your own complexity metric generation tool fairly easily. Unfortunately, to actually calculate cyclomatic complexity you must take into account the syntax and constructs of a particular programming language. This means that the same pseudocode translated into Visual Basic may not yield the exact same cyclomatic complexity metric as the pseudocode translated into Java. But in practical terms this usually isn't an issue.
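As a rough illustration of the branch-counting approach, here is a Python sketch that parses Python source with the standard `ast` module and reports one plus the number of branching constructs. The list of constructs counted is a simplifying assumption; real tools differ on details such as whether each boolean operator counts as a branch, which is one reason results vary between tools and languages.

```python
import ast

# Constructs treated as decision points in this sketch (an assumed, simplified set).
BRANCH_NODES = (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler)


def estimate_complexity(source):
    """Estimate cyclomatic complexity as (number of decision points) + 1."""
    tree = ast.parse(source)
    decisions = sum(1 for node in ast.walk(tree) if isinstance(node, BRANCH_NODES))
    return decisions + 1


# A Python translation of the foo module; an elif parses as a nested if.
FOO_SOURCE = """
def foo(n):
    if n < 0:
        result = -1
    elif 0 <= n <= 9:
        result = 1
    else:
        result = 0
    return result
"""

if __name__ == "__main__":
    print(estimate_complexity(FOO_SOURCE))
```

The translated module has two `if` decisions, so the estimate is 3, the same value the graph formula gives.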

    So, what is code complexity analysis good for? During the development process, if you monitor complexity, you can identify early those situations where your source code's complexity is rapidly increasing. You can also use code complexity metrics to help estimate how many testing resources you'll need -- more complex code requires a greater test effort.

    Cyclomatic complexity is just one of many code complexity metrics. You can group all complexity metrics into two categories: static and dynamic (or run-time). Static metrics analyze the system under test's source code. Dynamic metrics analyze the SUT's behavior during run time. Cyclomatic complexity is an example of static analysis. One way to group static code complexity metrics is to categorize them into "traditional" (my term) and object-oriented metrics. Many of the original code complexity metrics, including cyclomatic complexity, were created before the use of object oriented languages was common. So, starting in the mid 1980s, new complexity measures were created which are specifically intended for object oriented programming. Now the traditional and OOP complexity measures are by no means exclusive -- the categorization is just a useful way to think about the large number of complexity metrics available to you. In a future blog entry I'll describe alternative static complexity measures and OOP complexity measures.
