October 30, 2006

Four Commercials

By Dinker Charak

Four commercials running across India caught my attention.

The first one is a TV commercial which shows a couple of tennis players playing over the rooftops and jumping off the building during play. It ends with a person inside one of the buildings managing to catch an excellent return with a digital camera. A voice-over describes the wonderful features of the camera.

The second one is also a TV commercial and has a couple of people listening to music as they go about their day. A voice-over talks about the ease with which their music is now accessible on the go.

The third one is a print advertisement. It talks about the capabilities of a flashlight, such as its battery life and brightness.

The fourth one I have seen on large billboards. It has two international fashion icons holding up a gold-plated fashion accessory as the must-have bling.

The interesting thing is that none of these commercials are for cameras, portable music players, flashlights or fashion accessories. They are all commercials for mobile phones.

I often buy mobile phones for my parents, uncles, aunts and all the non-tech-savvy people I know. I tried to recall the last time someone asked me to buy a phone that does not drop calls and has good sound. If they mentioned these needs at all, it was probably only after the all-important requirements had been listed.

What they want is to buy a phone that can take videos, that can play Bhajans as ring tones, that has good looks, and is easy to use.

Like my relatives, there is a growing customer base that pays more attention to product differentiators and takes basic functionality for granted, and these customers are the ones driving convergence in the market.

This is consistent with the DSO message that device manufacturers should spend most of their time and effort on creating an application that differentiates them from their competition.

This is all reflected in the commercials for arguably the most popular consumer device here!

Dinker Charak is a senior software engineer at Wind River India and is building applications to quickly diagnose and repair errors in device software. He likes writing, and has authored a collection of fiction and sci-fi short stories.

October 24, 2006

Quality vs. Customer Support

"What is the single most important thing your manufacturer can do to make you completely happy with their service?"
(IMV ServiceTrak Survey 2005)

This was the question that caught my attention at a presentation I attended recently. Apparently the question is multiple-choice, with one answer allowed. A PDF of the executive summary appears to be available from the IMV Web site.

Figure 1.1, the chart of happiness with service, shows that 35-43% of customers wanted faster service and 10-23% wanted better service quality. Apparently 28-39% of the customers are already completely happy with their manufacturers. What interests me is that less than 5% of the respondents wanted to see these companies improve the quality of their equipment. What does this mean? Do faster and better service become more valuable than quality at a certain point?

I think so. Consider a hypothetical scenario where a product fails in some way after a year in service (like my latest digital camcorder). Let's say a competing product from a different vendor fails after 2 years. This suggests that it is a better quality product than the first one. Now, consider this: The first company is able to resolve the problem within a day while the second one takes more than a week. So, how does this change the perception of these companies and products for the customer? In my case, my camcorder wasn't fixed in a day, so I am switching to a different brand.
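The hypothetical above can be put into rough numbers. The sketch below is only an illustration under stated assumptions: each product fails once per cycle, "more than a week" is taken as exactly 7 days (which flatters the second vendor), and the `availability` helper is a name introduced here, not something from the survey.

```python
def availability(days_to_failure, days_to_repair):
    """Fraction of time the product is usable over one failure/repair cycle."""
    return days_to_failure / (days_to_failure + days_to_repair)


# First vendor: fails after 1 year, repaired within a day.
first = availability(days_to_failure=365, days_to_repair=1)

# Second vendor: fails after 2 years, repair assumed to take 7 days.
second = availability(days_to_failure=730, days_to_repair=7)

print(f"first vendor:  {first:.2%}")
print(f"second vendor: {second:.2%}")
```

Under these assumptions the product that fails twice as often is still usable for a larger share of the time (about 99.73% versus about 99.05%), simply because it is repaired so much faster.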

Quality is very important, and it is what engineering teams increasingly invest in. But, without the technical support to back it up, customer satisfaction seems all too elusive.

Many companies I know have been developing tools and processes to deliver better customer support and faster defect resolution. Unfortunately there are not many off-the-shelf products that are designed specifically for diagnosing software problems in deployed devices. Wind River Diagnostics is one such product. I believe we will start seeing more field diagnostics products in the future, as customers are demanding faster and better service.

In summary, given that product quality meets or exceeds customer needs, service and support become big differentiators between products and companies...

October 10, 2006

Reporting Exceptions in QA

I now have a habit of asking for the Exception Detection and Reporting (ED&R) log as a minimum, and the core file if available, whenever QA observes exceptions during testing. If you are not a VxWorks 6.x user, an ED&R log is like a signature of a core dump. It briefly describes the exception by providing the address of the crash, a stack backtrace, register values, and even a disassembly of the surrounding code. A core file lets one use the Wind River Workbench debugger and other host tools, such as the host shell if you prefer the command line, to dump memory and see the tasks as they were at the time of the exception. In many cases this is all I need to pinpoint the root cause and fix the defect.

The alternative used to be rather painful, time-consuming, and much less deterministic. It preempts whatever I am doing so that I can set up my environment to recreate the same crash scenario based on the description of the eyewitness (i.e. the tester). If the bug is obvious enough this method works, of course. But what if I can't reproduce the crash? Was it a faulty setup in the QA lab, or was it some rare race condition that caused the exception? And if I do reproduce the crash, the first thing I'll probably do is look at it using a debugger, which is what the core file gives me to start with.

ED&R has been part of VxWorks since version 6.0. Memory-based core dumps have been supported since Wind River Diagnostics 1.0, with VxWorks 6.2 and later.

Both ED&R and core dump support require a bit of planning ahead of time. If the features were not configured in the VxWorks kernel when the exception happened, it is too late. So, they must be included in the image.

If a Wind River Diagnostics Server is installed, it can automatically upload ED&R logs and core files from all the devices in the QA lab. Once the logs are on the server, all a tester needs to do is reference the device ID in the new defect report. Afterward, any developer can connect to the same server and download all the logs from the device using a Web interface or through a Wind River Diagnostics plug-in to Wind River Workbench.

I am again a little bit biased perhaps. Hopefully others are benefiting from these technologies as well.

October 02, 2006

Fault Injection

In his recent blog post, Maarten Koning writes about fault propagation paths. That is, faults happening in one place in the code propagate to affect other parts of a system, indeed even multiple systems in an interconnected world. I recently had a chance to talk with a group of engineers at a large telecom equipment manufacturer. They told stories of software defects bringing several systems down for extended periods of time, causing outages in their customers' networks. Well, we all have heard stories similar to these.

I will agree with Maarten in saying: "I suspect that a large part of the code in any system is related to managing faults in some way."

In the past, most software developers looked at software defects as failures to provide required functionality. Accordingly, most of the unit and system tests were designed to find as many such failures as possible. Today we must also consider the scenario where software is deliberately attacked to cause a fault. We need to make sure that faults are contained and not propagated.

I think we can enhance the quality of our products in this context by incorporating more fault injection test cases into the system tests. Fault-injection testing is not new, of course. But most of the traditional fault injection tools and methods either use a black-box approach to testing or significantly modify the platform and the execution environment on which the tests are run. Either way, this makes them somewhat ineffective and difficult to use. I think that, to be useful in analyzing fault propagation paths, fault injection must be done in a controlled and deterministic manner. An effective fault injection test should have these characteristics as a minimum:

  1. It must be predictable at a function- and component-level granularity. Grabbing all available memory while the application is running is not predictable in this sense. We have no control over which function will hit this condition first.
  2. It must not change the application and system under test in any significant way. If the application or platform was significantly modified to cause deliberate faults, we are no longer testing the product we will ship.
  3. It must be dynamic and random. Playing back a recorded faulty sequence of packets is not very useful after the first run.
  4. It should be quick to develop and execute. We have millions of lines of code for which new test cases will need to be written.
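
As a rough illustration of the first three characteristics, here is a minimal Python sketch. It is not Sensor Point code and not a Wind River API: the `Device` class stands in for a hypothetical system under test, and `inject_fault` is a hypothetical helper that stubs one named function from the outside, fails it at random using a seeded generator so that a run can be repeated, and restores the original afterward.

```python
import random
from contextlib import contextmanager


class Device:
    """Hypothetical system under test; stands in for real product code."""

    def allocate_buffer(self, size):
        return bytearray(size)

    def read_packet(self):
        # A well-behaved caller contains the fault instead of propagating it.
        try:
            return len(self.allocate_buffer(64))
        except MemoryError:
            return -1


@contextmanager
def inject_fault(target, name, exc_type, probability, seed):
    """Temporarily stub target.<name> so that it fails at random, repeatably."""
    rng = random.Random(seed)        # seeded: random, yet reproducible
    original = getattr(target, name)
    hits = []                        # arguments of every call that was failed

    def stub(*args, **kwargs):
        if rng.random() < probability:
            hits.append(args)
            raise exc_type("injected fault")
        return original(*args, **kwargs)

    setattr(target, name, stub)      # only this one function is affected
    try:
        yield hits
    finally:
        setattr(target, name, original)


device = Device()
with inject_fault(device, "allocate_buffer", MemoryError, 0.5, seed=1) as hits:
    results = [device.read_packet() for _ in range(20)]

print(results.count(-1), "of", len(results), "calls saw an injected fault")
```

The application code itself is unchanged, the fault lands in exactly one function, and changing the seed gives a different but repeatable fault sequence.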

I am probably a little bit biased but I believe Sensor Point technology can be used very effectively as a white-box fault injection tool. Sensor Point technology is a part of the Wind River Diagnostics product.

So, what are Sensor Points? Sensor Points are arbitrarily large fragments of code that are inserted into a running application or device and activated by patching a branch instruction at a specific address in the application's code. This way, newly inserted code is executed whenever the program counter reaches the patched address. When the Sensor Point code returns, the existing code continues to execute from where it was patched. If the Sensor Point code is relatively small, there is very little performance impact on the system; about 200 instructions overhead per Sensor Point, which is well within the bounds of expected scheduling latency.

So how can we apply this technique to inject faults into a function?

Well, in addition to inserting new code into an existing application (or a function, to be more precise), one can also stub a function completely by using Sensor Points. Together with the ability to nest Sensor Points, which makes it possible to enable a Sensor Point only when a particular function is on the stack, this allows faults to be injected into a precise function or component without affecting other functions or applications in the system. Hence, one can measure propagation characteristics of a fault originating at a precise point in the system, perhaps even at a precise point in time.
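
The idea of failing a function only when a particular caller is on the stack can be sketched in plain Python. This is an analogy only, not how Sensor Points are written or deployed: `Flash`, `save_config`, `write_log`, `on_stack`, and `stub_when_called_from` are all hypothetical names introduced for the illustration.

```python
import inspect


def on_stack(function_name):
    """Return True if a function with this name is on the current call stack."""
    frame = inspect.currentframe()
    while frame is not None:
        if frame.f_code.co_name == function_name:
            return True
        frame = frame.f_back
    return False


class Flash:
    """Hypothetical low-level driver shared by several callers."""

    def write(self, data):
        return len(data)


def save_config(flash):
    # This path is expected to contain a write failure.
    try:
        return flash.write(b"cfg")
    except OSError:
        return -1


def write_log(flash):
    return flash.write(b"log")


def stub_when_called_from(target, name, caller_name, exc):
    """Stub target.<name> so it fails only while caller_name is on the stack."""
    original = getattr(target, name)

    def stub(*args, **kwargs):
        if on_stack(caller_name):
            raise exc
        return original(*args, **kwargs)

    setattr(target, name, stub)
    return original  # returned so the caller can restore it later


flash = Flash()
stub_when_called_from(flash, "write", "save_config", OSError("injected fault"))

print(save_config(flash))  # the fault is injected on this path only
print(write_log(flash))    # other callers of write() are unaffected
```

Because the fault is tied to one caller, any error seen elsewhere in the system during the test can be traced back to that single origin.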

I will write more about using Sensor Points in QA testing in the future, and provide some fault-injection Sensor Point examples.

Bulent Kasman

  • Bulent Kasman has over 20 years of experience in developing systems and network management software. Currently, he is the architect of the Diagnostics product line at Wind River, where he is designing applications to quickly diagnose and repair errors in device software.