James McCaffrey's Blog
April 24
Classification, Clustering, and Rule Set Extraction
I've been working on a set of related programming projects over the past couple of weeks. Classification, cluster analysis, and rule set extraction are closely related topics. Suppose you have a set of data points (also called vectors or tuples) of some sort. These data points could be numeric abstractions such as geometric points, like (0, 3, -1), or they might be rows of a SQL database, like (Smith, Stan, $21.33, Developer). Now suppose you have a set of known categories, such as c0 = "likely to vote Democratic", c1 = "likely to vote Republican", and so on. Programmatic classification is the process of assigning each data point to a particular category. Programmatic clustering is similar to classification except that you don't have known categories; instead the data points are grouped together into clusters of similar data points. Both classification and clustering can be supervised or unsupervised. With a supervised approach, a set of preliminary training data points is manually classified or clustered, and then this information is used to classify or cluster additional new data points.
There is a huge body of research on classification and cluster analysis. However, the majority of this research deals with purely numerical data such as (3.0, 5.0, 2.0). There is much less research on categorical data such as (red, small, hot). The main reason for this is that most classification and clustering algorithms rely on some form of difference function. It's not too hard to compute a number that represents the difference between (2.0, 3.0, 4.0) and (1.0, 3.5, 2.7), but it's a harder problem to determine the difference between (red, small, hot) and (blue, large, cold). Anyway, I've found what I believe to be some very cool new ways to perform classification and clustering of categorical data.
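To make the numeric-versus-categorical contrast concrete, here is a minimal sketch in JavaScript. It assumes Euclidean distance as the numeric difference function and a simple mismatch count as a baseline for categorical data. The function names euclideanDistance and mismatchCount are illustrative only, and the mismatch count is a common textbook baseline, not one of the new techniques mentioned above.

```javascript
// Euclidean distance: a common difference function for numeric vectors.
function euclideanDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Mismatch count (assumed baseline): the number of positions whose
// categorical values differ. Every mismatch counts the same, so
// "red" vs. "blue" is treated as no more or less different than
// "red" vs. "orange".
function mismatchCount(a, b) {
  if (a.length !== b.length) {
    throw new Error("tuples must have the same length");
  }
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      count++;
    }
  }
  return count;
}

// The two pairs of data points used in the text.
console.log(euclideanDistance([2.0, 3.0, 4.0], [1.0, 3.5, 2.7])); // about 1.71
console.log(mismatchCount(["red", "small", "hot"], ["blue", "large", "cold"])); // 3
```

The numeric result reflects how far apart the two points are, while the categorical result only says that all three attributes differ. That loss of information is one way to see why the categorical case is the harder problem.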
The topic of rule set extraction then enters the mix: after clustering your data, how can you extract a set of if..then rules that correspond to the clustering result? Again, I'm working on some ideas that really fascinate me.

April 18
Generating All Combinations of a Set of Strings
I was working on some data mining related code last week and ran into an old problem I've bumped up against many times in the past, namely, how to generate all combinations of a set of strings. Mathematical combinatorics was my primary area of study during my undergraduate days and I have my own library of code that I've created over the years. For example, see http://msdn.microsoft.com/en-us/magazine/cc163957.aspx. However, I decided to search the Internet to see what I'd turn up. I was somewhat surprised at (a) the limited amount of information available for such a common problem, and (b) the amount of available information that is just incorrect.
One thing I immediately noticed is that many Internet posts confuse combinations and permutations. A permutation of a set of items is a rearrangement. For example, if a set contains {"Adam", "Bill", "Carl"} then one permutation is {"Bill", "Adam", "Carl"}. There are a total of n! different permutations of a set of size n. Combinations, on the other hand, are subsets of size k of the original set where order does not matter. For example, with the above set, if k = 2, then all three possible combinations are {"Adam", "Bill"}, {"Adam", "Carl"}, and {"Bill", "Carl"}. Anyway, generating all possible combinations (for every k from 1 to 3) would yield these 7 combinations:
(k = 1): {"Adam"}, {"Bill"}, {"Carl"}
(k = 2): {"Adam", "Bill"}, {"Adam", "Carl"}, {"Bill", "Carl"}
(k = 3): {"Adam", "Bill", "Carl"}
This problem turns out to be surprisingly tricky. I'll give some code in a future blog post. Combinatorics, which includes combinations and permutations, is one of the most fundamentally important areas of software testing.

April 13
Cross-Browser Web Application UI Test Automation
Suppose you want to test a Web application through its user interface. Over the past couple of years I've written several articles for Microsoft's MSDN Magazine that demonstrate different ways to do this, but in all cases I assumed you were working in a 100% Microsoft environment and using Internet Explorer. Two of the techniques available to you are to write a C++ or C# language program that calls directly into the IE COM-based API set, or to use JavaScript to call into the IE DOM. The JavaScript approach is often a good choice and works well for Internet Explorer but doesn't always work with other browsers because of differences between browser DOMs. For example, with IE v4.0, to get a reference to a button with ID = "Button1" you can write var btn = document.all["Button1"] but with IE v5.0 and above and Firefox you can write var btn = document.getElementById("Button1"). Although you can write your test automation code to sense which browser is being used and then branch to the correct JavaScript, an alternative is to use the nice jQuery library. The jQuery library is essentially a set of JavaScript wrapper functions that are for the most part browser-independent.

April 05
Software Testing Conferences
I tend to group conferences that are in some way related to software testing into three categories. One category is hard-core academic/scholarly conferences. These are usually attended by college professors and researchers.
The WorldCOMP International Conference on Software Engineering Research and Practice (see http://www.world-academy-of-science.org/worldcomp09/ws/conferences/serp09 ) is an example. The goal of these conferences is primarily to provide a channel for researchers to publish scholarly papers.
A second category of conferences is commercial conferences. These are generally sponsored by a large company such as Microsoft and attended by people who work in the IT industry, often in mid-management roles. Software testing usually has a very small role in these conferences. The Microsoft Management Summit (see http://www.mms-2009.com/ ) is one example. The goal of these conferences is to market and advertise a company's products and services.
A third category of conferences is hard to put a label on but I'll call them industry-organization conferences. These conferences are sponsored by various organizations. The Better Software Conference and Expo (see http://www.sqe.com/BetterSoftwareConf/ ) is a good example. Attendees are often practitioners and individual contributors. The goal of these conferences is, typically, to make money for the sponsoring organization.
I enjoy attending and speaking at conferences. I usually go to conferences in Las Vegas because travel there is relatively cheap and easy. The main benefit from conferences in my opinion is that you get away from the day-to-day work grind and come back with fresh ideas.
It's hard to recommend specific software testing conferences because they are all quite different and target completely different types of people. The best way to evaluate a conference is to get an opinion from someone you know who has attended it in the past. That said, the Pacific Northwest Software Quality Conference (see http://www.pnsqc.org/ ) is often a good one (meaning good value) for testing practitioners.