James McCaffrey - Blog
June 28

SQL Server 2008 and Testing - HierarchyID

SQL Server 2008, due to be released within a few months, has a ton of new features. I've been looking at RC0 (release candidate 0), trying to determine the implications SQL Server 2008 has for testing. The first step in such an analysis is to understand exactly what the new features of SQL Server 2008 are. One of the interesting new features is the HierarchyID data type. You can use HierarchyID to represent hierarchical data, such as the manager and reports-to relationships in a company. Of course you can represent such relationships without HierarchyID, but HierarchyID provides a de facto standard. I didn't find any good HierarchyID examples (by that I mean examples which expose aspects of HierarchyID that are relevant to testing implications) so I experimented. Consider this example based on one in the documentation:
-- assume we have an existing table of employees
create table tblOrgStructure
(
  orgnode hierarchyid primary key clustered,
  orglevel as orgnode.GetLevel(), -- computed column
  emp_id int unique not null,
  emp_name varchar(50) not null,
  emp_title varchar(50) null
)
go

-- root node
insert into tblOrgStructure
values
(
  hierarchyid::GetRoot(), -- orgnode
  -- orglevel (will be computed)
  111, -- emp_id
  'Allen Anderson', -- name
  'President' -- title
)

-- child of root
insert into tblOrgStructure values
(
  hierarchyid::GetRoot().GetDescendant(null,null),
  222,
  'Bob Baker',
  'Vice President'
)

-- child of non-root
declare @managersOrgnode hierarchyid
declare @newOrgnode hierarchyid

-- 1. use manager's ID to get manager's orgnode
select @managersOrgnode = orgnode
from tblOrgStructure
where emp_id = 222

-- 2. use manager's orgnode to get insert position
select @newOrgnode = MAX(orgnode)
from tblOrgStructure
where orgnode.GetAncestor(1) = @managersOrgnode

-- 3. insert new node
insert into tblOrgStructure values
(
  @managersOrgnode.GetDescendant(@newOrgnode,null),
  333,
  'Chris Collins',
  'Director'
)

select orgnode.ToString(),
  orglevel, emp_id, emp_name, emp_title
from tblOrgStructure

As you can see, the trick is to determine the orgnode, which is a HierarchyID type. So, what does this mean for testing? First, we'll assume that the SQL Server 2008 folks have tested the heck out of HierarchyID and its associated functions such as GetDescendant(). So, by testing HierarchyID, I really mean testing an application program (or perhaps a library module of some sort) which uses a SQL table which has one or more HierarchyID columns. And this means you'd have to use standard hierarchical data testing techniques -- examine empty organization structures, org structures with just one node, and so on.

June 21

Perl, Testing, and Web Programming

Perl used to be one of my favorite languages for performing test automation. Perl is available on most platforms, has been around for a long time, and has extensive libraries that provide all kinds of functionality such as HTTP requests. I don't use Perl very much anymore. Perl has fairly tricky syntax, which makes development time a bit longer than with more modern languages such as C# and PowerShell. Anyway, I was putting together an introduction to Perl class last week and discovered a couple of things about Perl and Web programming that I didn't know before. Namely, you can use Perl (instead of the usual VBScript) to create ASP pages on the server side, and you can use Perl on the client side too. Sort of, anyway. The trick is that you really have to use something called PerlScript, which is a kind of set of extensions to the Perl language. PerlScript comes from ActiveState, whose Perl distribution is the most common version of Perl used on Windows-based systems. Here's an example of PerlScript used to create an ASP page:
<%@ Language="PerlScript" %>
<html>
<body>
<h3>ASP with PerlScript Example</h3>
<p>
<%
  $t = localtime;
  $Response->write("This page created at $t");
%>
</p>
</body>
</html>

Pretty cool. You can also use PerlScript on the client side (instead of the usual JavaScript). For example:
<html>
<head>
<script language="PerlScript">
  sub putMessage {
    $window->document->myForm->Text1->{'value'} = "Hi";
  }
</script>
</head>
<body>
<form name="myForm">
<input type="text" id="Text1" />
<input type="button" onclick="putMessage();" />
</form>
</body>
</html>

Also pretty cool but not terribly useful because the technique requires PerlScript on the client. Anyway, Perl is a very flexible language and it's amazing what you can do with it.
June 14

Parsing XML Files

One of my least favorite software test automation tasks is parsing XML files. For example, a common scenario is to have an XML file which contains test case data (such as test case IDs, inputs, and expected values) and I've got to parse the file. For some reason, although I love coding of all kinds, I just don't get much enjoyment out of parsing XML files. Anyway, when programming in a .NET environment I use five main techniques to parse XML files.

My most common approach, and the technique which is usually the only technique you'll find if you research parsing XML files, is to use the XmlDocument class of the System.Xml namespace. I read the entire XML file into memory and then use a combination of methods and properties including SelectNodes(), Attributes, SelectSingleNode(), InnerXml, InnerText, and Item. It's quite ugly, but that's just the way XML is sometimes.

A second approach is to use the XmlTextReader class. This technique works well when you have very simple XML files that have a consistent structure with very few levels.

A third approach is to use the XmlSerializer class. This technique is sometimes a good approach if you need to perform processing of your XML data once it's been parsed into memory, because you end up with an array of objects.

A fourth XML parsing technique is to read XML data into memory using the XPathDocument class. This approach is quite rare and I use it only when I need to perform a lot of searching through the XML data -- the XPathDocument class is optimized for search with XPath queries.

My fifth XML parsing approach is to use a DataSet object. Here I read my entire XML file into memory as a DataSet object using the ReadXml() method. Once the data is in the DataSet object, I can use DataSet methods and properties such as GetChildRows() and DataRow to extract individual XML data.

There's no real moral to this blog -- parsing XML files simply isn't very much fun.
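To make the first, read-everything-into-memory approach a bit more concrete, here is a minimal sketch. It is written in Python with the standard xml.etree.ElementTree module rather than the .NET XmlDocument class, so treat it as an analogy and not as the .NET code itself. The file layout (a testcases root holding case elements with an id attribute and input and expected children) is an assumption made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical test case data; element and attribute names are assumptions.
XML_TEXT = """
<testcases>
  <case id="001">
    <input>2 3</input>
    <expected>5</expected>
  </case>
  <case id="002">
    <input>10 -4</input>
    <expected>6</expected>
  </case>
</testcases>
"""


def load_test_cases(xml_text):
    """Parse the whole document into memory and return a list of dicts."""
    root = ET.fromstring(xml_text)
    cases = []
    for node in root.findall("case"):         # every <case> child of the root
        cases.append({
            "id": node.get("id"),              # attribute value
            "input": node.findtext("input"),   # text of a child element
            "expected": node.findtext("expected"),
        })
    return cases


for case in load_test_cases(XML_TEXT):
    print(case["id"], case["input"], case["expected"])
```

The select-nodes-then-pull-attributes-and-text pattern is roughly what SelectNodes(), Attributes, and InnerText give you in the XmlDocument approach, just with different names.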
The five techniques I've listed here meet most of my test automation scenarios. Here's an article I wrote some time ago that gives you all the details and a bunch of complete examples: http://www.ddj.com/windows/184416669.

June 06

Software Project Risk Identification

Identifying risks is an absolutely essential activity for all software projects. A risk is an event which is unpredictable and which has negative consequences. You must identify project risks so that you can analyze the likelihood and impact of the risks (see a nice overview at http://www.rand.org/pubs/working_papers/2004/RAND_WR112.pdf), and then plan to prevent the risks (in part, through testing) and mitigate the effects of risks if they do occur.
Technique #1 - At a very high level of granularity, you take each of the four components of any project (time, cost, quality, features) and ask yourself what could happen to make that component go wrong. For example, "In my project, what events would make me go over budget?" Pros: easy, good way to get started. Cons: too high level to catch all risks.
Technique #2 - At a slightly lower level of granularity than Technique #1 above, you can walk through a taxonomy. This is simply a classification of generic software risks. There are many software project risk taxonomies available. One commonly used taxonomy was published by the Software Engineering Institute. It is available in an Appendix in a document at http://www.sei.cmu.edu/pub/documents/93.reports/pdf/tr06.93.pdf. The SEI taxonomy is structured into various categories and sub-categories as a list of 194 questions. Examples include, "[159] Are the development facilities adequate?", and "[3] Are there any TBDs in the specifications?" Pros: relatively easy and mechanical. Cons: can take a long time, too generic to catch risks specific to specialized projects.
Technique #3 - A less generic approach to risk identification is to walk through your software project specification documents and ask what could go wrong for each feature, activity, and person. Pros: quite specific. Cons: only as good as the specification documents.
Technique #4 - If you use traditional project management techniques, you will have a Work Breakdown Structure. For each leaf Work Package activity, you ask yourself what risks are associated with the activity. Pros: can be applied to many levels of work package granularity, leverages an existing WBS. Cons: requires a WBS, often emphasizes activities and deemphasizes resources.
Technique #5 - An unusual approach, but one which I've used quite successfully, is to storyboard. You assume various roles (end user, developer, etc.) and walk through various scenarios, asking yourself what could go wrong at each step. Think about going on a plane trip: drive to airport (bad traffic risk), park at airport (no parking available risk), etc. Pros: effective in practice, very specific. Cons: more art than science.
(Semi) Technique #6 - A theoretical but promising approach is explained in a research paper at http://iag.pg.gda.pl/iag/download/Miler-Gorski_Risk_Identification_Patterns.pdf. The authors present a possible structured way to identify risk by first identifying six project components: activities, artifacts, roles, practices, features and capabilities. Next you run through a list of risk patterns. For example, one specific pattern is: "If Requirements <artifact> loses Clarity <feature> then Development <activity> takes more time than expected." Pros: unknown. Cons: unknown.
June 01

Estimating the Probability your Project will Finish on Time

I think most software developers, testers, and managers should have a basic understanding of estimating the probability that a project will finish on time (or finish behind schedule). The technique is fairly simple. First you break your project down into manageable-sized chunks. At a coarse level of granularity these chunks can be milestones (typically measured in weeks or months), or at a fine level of granularity these chunks can be work packages (typically lasting from 4 to 40 hours) that are derived from a Work Breakdown Structure. Next, for each chunk you estimate how long it will take, using an optimistic guess, a pessimistic guess, and a most likely guess. Of course this is the hard part, and you have to rely on historical data from similar projects, expert judgment, or some other method.

Now for each chunk you compute the duration mean and variance. How you do this depends on which probability distribution you use, but the beta distribution (along with the triangular) is the most common. The mean for a beta distribution is the quantity (optimistic, plus 4 times most likely, plus pessimistic), all divided by 6. The variance for beta is simply the square of (the quantity pessimistic minus optimistic, divided by 6). Now you compute the sum of the means and the sum of the variances. With these you can compute a Z score as (X - M) / sqrt(sum of variances), where X is the duration you're asking about and M is the sum of the means, and use the Normal distribution to compute your probabilities.

This sounds a lot worse than it is. Here's a highly simplified example. You have three chunks, A, B, C. The means are 4.0, 5.0, and 8.0 (arbitrary units, say days) respectively. The variances are 4.0, 9.0, and 36.0 respectively. The sum of the means is 17.0. The sqrt of the sum of the variances is sqrt(4 + 9 + 36) = sqrt(49) = 7.0. You want to know the probability that your project will take between 17.0 days (the mean) and 27.5 days. Z = (27.5 - 17.0) / 7.0 = 10.5 / 7.0 = 1.50. Looking up this value in a Standard Normal Distribution table you get probability = 0.4332 or 43%.

As with any quantitative technique, a.) your result is just a crude estimate, b.) in most cases even a crude estimate is better than no estimate, c.) your final estimate is only as good as your input data, and d.) the most important value from such an analysis comes from setting the problem up, not the final answer.
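The arithmetic above can be sketched in a few lines of Python. This is just an illustration under the same assumptions as the worked example (beta-style estimates, chunk durations summed, Normal approximation for the total). The function names are made up for this sketch, and the 1 / 2.5 / 13 three-point estimate is a hypothetical input chosen only because it reproduces chunk A's mean and variance of 4.0.

```python
from math import sqrt
from statistics import NormalDist


def pert_mean(optimistic, most_likely, pessimistic):
    """Mean of one chunk: (optimistic + 4 * most likely + pessimistic) / 6."""
    return (optimistic + 4 * most_likely + pessimistic) / 6


def pert_variance(optimistic, pessimistic):
    """Variance of one chunk: ((pessimistic - optimistic) / 6) squared."""
    return ((pessimistic - optimistic) / 6) ** 2


def prob_between_mean_and(target, means, variances):
    """Return (Z, probability the total lands between the mean and target)."""
    total_mean = sum(means)
    total_sd = sqrt(sum(variances))
    z = (target - total_mean) / total_sd
    # Area under the standard Normal curve between 0 and z.
    return z, NormalDist().cdf(z) - 0.5


# Hypothetical three-point estimate that yields chunk A's mean and variance.
print(pert_mean(1, 2.5, 13), pert_variance(1, 13))   # 4.0 4.0

# The three chunks A, B, C from the example.
means = [4.0, 5.0, 8.0]
variances = [4.0, 9.0, 36.0]
z, p = prob_between_mean_and(27.5, means, variances)
print(f"Z = {z:.2f}, probability = {p:.4f}")   # Z = 1.50, probability = 0.4332
```

Subtracting 0.5 from the cumulative probability is what a "mean to Z" Standard Normal table does for you. If you want the probability of finishing in 27.5 days or less, use the cumulative value directly (about 0.93 here).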