Software Engineering Posts

July 23, 2008

Wind River Webinar: Q and A

Hello All,

Last week, as some of you know, I was the featured presenter / presentation for a webinar (you may have problems watching it with Firefox...). During the course of the webinar we were asked a number of questions, and we ran out of time...

Here are some of the questions and answers we couldn't get to.

--
Q: What is the biggest advantage of VxWorks over other real-time operating systems?
A: VxWorks is the most widely deployed and widely used commercial realtime OS in the world. Because of this, it is better tested than any other commercial RTOS, so I would say the maturity of VxWorks is perhaps its greatest advantage.
--
Q: Have you ever patched the OS, or only the application software?
A: In most cases, only application code is updated in space, though it is possible to patch or replace even the bootrom code in many of the space robots.
--
Q: Can VxWorks 'replace' itself in case of malfunction?
A: I'm not certain what "replace" means in this context; some of our customers have invented ways to detect bad RAM locations, map around those locations, and load VxWorks into the remaining RAM.
--
Q: How do you ensure that the RTOS does not crash? Even due to a simple NULL-pointer access? Do you have any built-in crash recovery mechanisms in Wind River for space systems?
A: While debugging with newer versions of VxWorks, it is possible to use the MMU to trap accesses to, for instance, the 0-page, to catch accesses through uninitialized pointers, etc. Once code is sufficiently debugged, the MMU may be disabled if desired, or the product may be deployed with the protection enabled. As for crash-recovery systems, most of the teams with systems in flight have built their own health maintenance and monitoring systems and event logging systems. Wind River has learned from this, and our newer OS releases have support for configurable event logs and health monitoring.
--
Q: Do you customize VxWorks for individual customers?
A: We have services experts available to help with everything from the initial installation to implementation of the entire product, including modifying VxWorks for a particular project.
--
Q: Which CPU platforms are supported by VxWorks? How can I get more information about VxWorks and its use?
A: VxWorks is available for many PowerPC, MIPS, ARM, XScale, and other CPU types. Wind River has offices worldwide. Please check at WWW.Windriver.com for office locations.
--
Q: I know you have a trial version of VxWorks, but for a person who wants to learn it, it expires quickly. Do you have a slim version with no expiration?
A: We do not have a "slim" version.
--
Q:  Is there a Webinar that spotlights VxWorks? A Demo?
A: Demos are available for all of our products; please contact your local sales office.
--
Q: How do you update your software on space robots?
A: This is highly dependent on the hardware and hardware capabilities used to implement the robots, so unfortunately our customers must usually invent the right methods.
--
Q: AI and VxWorks: are there any efforts to embed AI support natively in VxWorks?
A: Though AI systems and some AI capabilities have been implemented using VxWorks, I am not aware of any efforts to embed more than rudimentary AI functionality along with VxWorks; Stardust, DS1, and the Mars Exploration Rovers are the best examples I can think of that incorporated any degree of AI.  Wind River is not planning on adding any AI capabilities to VxWorks (or Linux) at this time.
--
Q: How do you know when to time out communication to a Mars rover when there is so much delay and interference?
A: That is left to the folks who implement the radio protocols used by the Deep Space Network to communicate with all the probes/robots/satellites in deep space.  They are experts with communications in deep space.  I expect sometime in the near future that this will all change with Delay Tolerant Networking and the implementation of the InterPlanetary Internet.
--
Q: Is VxWorks migrating from support of ASIC hardware to FPGAs?
A: VxWorks runs on a variety of platforms, including COTS boards, ASICs, and FPGA-based designs.
--
Q: Do all applications use embedded controllers or PCs?
A: Many manufacturers of COTS computer boards for VME, PCI, or cPCI supply VxWorks BSPs for their boards, and Wind River Systems supports BSPs for several COTS boards directly.  I hope this answers the question.
--
Q: Do you see an increase in the percentage of embedded applications that use some form of Linux vs others like pure VxWorks (i.e. not including VxWorks Linux)?
A: To be clear: VxWorks is NOT Linux. The two are not even remotely related to each other; VxWorks pre-dates Linux by... years. Likewise, Linux did not evolve from VxWorks. They are similar in some respects (POSIX, networking support, etc.), but they are not even "kissing cousins". Wind River does have our own versions of Linux available along with VxWorks.
Though I have seen an increased presence of Linux in the embedded arena, and an increased presence in the development and testing phases of even software for space applications, I have not seen Linux promoted to controlling a mission (e.g. the primary flight computer) yet.
--
Q: What problems are unique and interesting to underwater implementations of VxWorks like those faced by MBARI?
A: I wish I had a contact at MBARI for you to ask! As far as I know, most issues unique to the situation would be shared by all submersible vehicles, and MBARI has excellent experience in dealing with submersibles and software for underwater applications. Once the mechanical issues of sealing out the environment are taken care of, whether it's space or deep-sea, the rest becomes implementing software to control your devices, debugging, and deployment.
--
Q: [referring to an earlier question] Additional details on my earlier question on RTOS decision making ... The specific application is for an RTOS decision to be made for new instrumentation used in human spaceflight ... that is where do I start?
A: I would imagine you would need to start with the specifications for your deliverables: what kinds of certifications you will need, if any, and what kind of constraints the computer needs to operate under. For instance, will this be a deep-space project requiring rad-hard hardware, or will rad-tolerant hardware suffice? Will you need FAA or military certification, or none at all? Wind River is happy to discuss the software packages we have and how they may be applied to your project.
--
Q: Which VxWorks versions have flown in the past, and how customized were they (components)?
A: VxWorks 5.2, 5.3.1, 5.3.1 MER Edition, 5.5.1, and I think 6.2 have all flown in space. Other versions have flown in military aircraft, etc. The most customized component of VxWorks, I believe, would be the DosFS file system on the MER rovers. The folks at JPL made some great improvements after the Sol 18 issue.

For the most part, we strive to keep the releases "as close as possible" to the standard releases in order to facilitate technical support and software maintainability. Newer radiation-hardened chips are available that are very similar to commercial parts, like standard PowerPC or SPARC chips, and these newer chips run the same (current) versions of VxWorks as everyone else does. This allows the customer to use our standard technical support for many issues, making experts more available for all issues.
--
Q: There were flash management issues on MER and the Polar Lander. Could you explain the issue?
A: The flash management issue on MER "A" (Spirit) was actually more of a RAM management issue combined with a debug feature, precipitated by a "feature" of the DOS file system. I am not intimately familiar with any problems experienced on Mars Phoenix Lander, but the last I heard, MPL's experts had identified a possible application problem. Given when I heard this, I'd expect they may already have tested and sent up a fix.
--
Q: Has VxWorks already been ported to RAD750 or Leon3FT? Do you support academic R&D with easy-to-access software or anything other than that? Thanks!
A: Yes - the RAD750 BSP is supported by BAE Systems. From Wind River's perspective it's pretty much a "generic" PowerPC 750 and uses standard software and tools. BSPs are available for VxWorks versions 5.x and 6.x; I believe 6.4 is available, and a BSP for 6.6 will be available soon.

Wind River does not directly support VxWorks on Leon, but the manufacturer does have a solution with VxWorks 6.x available (I find this very fascinating and would love to "play" with it sometime).  :)

Wind River does have a University Program; contact your local Wind River sales office for details.
--
Q: How many versions/levels are there of VxWorks, and how do you choose a version of VxWorks for, say, a Mars rover?
A: VxWorks has been around for a while, well over 20 years. When I first saw it, the version was 4.0, and there was one version of VxWorks that ran on top of other companies' kernels. Now there are various versions of VxWorks for specific markets, and platforms available to help tailor VxWorks for specific usage. I would always recommend using the latest version available for your hardware platform that satisfies the needs of your program.


This was the first webinar I've ever presented.  I hope you enjoyed it as much as I did. 

May 03, 2007

What's in a Kernel?

Recently my cohorts Paul Parkinson and Doug Gaff have been making a bit of noise about changes in the software industry - evolution of software, and specialized software for military and commercial avionics, and other applications that have strict time and security requirements. Aside from requiring strict adherence to standards, critical systems software - especially medical or flight related - has an increasing litany of certification inspections it must pass. In the secure-systems markets the same trend is evident. Each of these markets is starting to require time- and space-critical systems.

Let me define time and space critical. 

Time critical is pretty straightforward - anything that has to be done within deadlines is time critical. Sometimes time-critical functions are also periodic in nature, requiring periodic service with deadlines.

Space-critical is a bit more slippery.  This is about RAM, but not just "how much".  I'm talking about space in terms of RAM that other processes are protected from, and RAM that is protected from this process, and even memory shared between processes.

There are a lot of examples of time critical periodic applications.  Any aspect of handling telemetry - course, attitude, velocity, housekeeping (checking fluid levels, etc) - has time critical periodic attributes.   It's easy to think of time-critical tasks.
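To make "periodic service with deadlines" concrete, here is a minimal sketch in Python. The `Job` record, the 100 ms period, the 20 ms deadline, and the sample timings are all made up for illustration; they are not taken from any real flight system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """One activation of a periodic task (hypothetical record layout)."""
    release_ms: int  # when this period's work became ready to run
    finish_ms: int   # when the work actually completed


def missed_deadlines(jobs, deadline_ms):
    """Return the jobs whose response time exceeded the relative deadline."""
    return [job for job in jobs if job.finish_ms - job.release_ms > deadline_ms]


# Hypothetical housekeeping task: released every 100 ms,
# and each activation must finish within 20 ms of its release.
jobs = [Job(0, 12), Job(100, 118), Job(200, 231), Job(300, 320)]
late = missed_deadlines(jobs, deadline_ms=20)
print(late)
```

Only the third activation (31 ms after release) is reported as late; the fourth finishes exactly on its deadline and still counts as on time.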

Space-critical is a little more elusive to both define and decide where to use, especially in embedded and real-time applications. The answer to time-critical is somewhere between having fast enough hardware, efficient enough software, and proper scheduling. Protecting memory areas on a process-by-process basis has excellent benefits, but those benefits usually come at a cost of slower execution. Sure, you never want your stack to be overflowed, your text to be zero-cleared, or your exception table to be the victim of an uninitialized data pointer, but when is the trade-off in execution speed penalties worth the overhead of enabling such protection? The answer has more to do with security than with protection from poorly-debugged code - the separation, multiplexing, encoding, decoding and proper sorting of information streams. There may be situations where every function of your application must have its own data spaces because of the nature of the data being handled. In fact, this kind of design is becoming more common.

It seems like it would take a fantastic amount of processing power to be able to do what's described above - guarantee on-time scheduling to handle deadlines, mix in varying levels of RAM-security, and still provide all the services and infrastructure necessary from an O.S. to build effective applications on top of. Even though processors are being created now with incredible horsepower, it wouldn't seem like they could keep up with much of this kind of work. All that partition scheduling and context swapping would bog down the best of processors if it had more than just a few partitions and contexts to handle!

...or would it?

Come on down to the Regional Developers Conference in Manhattan Beach (Los Angeles) on May 24th, and we'll talk about it.  :-) See you there!

March 01, 2007

Test as you... surprise!

There's an old adage in the world of flight - space flight, or otherwise.

"Test as you fly, fly as you test."

It's pretty short and sweet and straight-forward.  Don't fly what you haven't tested.  Test exactly the ways you expect to fly.

In a big, round, wonderful world, it seems "funny" when we hear about problems related to living here.  Problems like aircraft "flipping over" when they crossed the equator because they were adjusting to negative latitude - the coder hadn't thought far enough ahead to think about crossing the equator.

This is the kind of problem that should be found in test, simulation, or in a thorough validation of design and implementation.  The kind of validation that should be done with "other eyes" - not the eyes of the implementing entity.  It's far better to find these kinds of problems before you're in the air, and lives are depending on everything to be Reliable.

That's why I was surprised to read today about problems with the F-22: apparently, crossing the International Date Line caused the computers to shut down.
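I don't know what actually went wrong in that software, so the following is only an illustration of the kind of boundary a "test as you fly" sweep should cross. It is a minimal Python sketch; the function names and the sample track are hypothetical. It compares a naive longitude difference with one that wraps around at the date line, where longitude jumps from +180 to -180 degrees.

```python
def naive_delta_deg(from_lon, to_lon):
    """Plain subtraction: breaks when the track crosses +/-180 degrees."""
    return to_lon - from_lon


def longitude_delta_deg(from_lon, to_lon):
    """Smallest signed change in longitude, in degrees, in [-180, 180)."""
    return (to_lon - from_lon + 180.0) % 360.0 - 180.0


# Hypothetical eastbound track that crosses the date line in 0.5 degree steps.
track = [179.0, 179.5, -180.0, -179.5, -179.0]
pairs = list(zip(track, track[1:]))

naive = [naive_delta_deg(a, b) for a, b in pairs]
steps = [longitude_delta_deg(a, b) for a, b in pairs]

print(naive)  # one step shows a jump of almost a full circle
print(steps)  # every step is the same small eastward move
```

A test that only exercises longitudes on one side of the line would pass with either function; the difference only shows up when the sweep includes the crossing itself.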

December 01, 2006

Mitigating Risk

The phone rings.  A customer has an application destined for a high-risk environment, and somewhere in test they've found a new unanticipated condition.  A "Tiger Team" is formed - a group of experts who've "been here before", who understand the priority and the risks, and know how serious it is.  Many times a Tiger Team may give the go / no-go, the final answer that either saves a mission, or drops all that work into the dust-bin.  Such a team itself represents a *lot* of work, frustration, and time - investigations may run for days, weeks, or months, as long as they reach their conclusion on-time.  And there's the one facet that can't be changed by any engineering practice:  Time.

Developing space, mission- or life-critical applications is serious business.  The deployment environment may be harsh - and much about it may be unknown.  To make sure missions are successful, the known risks are eliminated as well as possible.  Some of these risks may include high radiation, occlusion by the Sun (Solar Conjunction), extreme differences in temperatures, harsh chemical environment, vibrations, or electrical noise.  Some of this you can plan for with hardware, some of this you can't.  You address the things you know about, that's all you can do.

With software engineering, you're also faced with unknowns.  One of the biggest impacts is the unknown date when you'll finally have the actual hardware the project is going to run on.  Using simulators can help out, a lot, enabling some degree of parallel development when hardware time is rare.  I encourage most of my customers to leverage the simulator wherever possible.  It's a good way to kick-start development, enable more hands on the project, and bring new engineers up to speed.

Once you have your hardware, there  may be some hardware-level debug issues.  Having a hardware debugging interface of some sort - an In Circuit Emulator (ICE), for example - can be invaluable.  As long as one has an LED to blink, one can write polled-output routines to display register settings (etc) to assist in debugging a hardware interaction, but nothing is as good as being able to dump desired address ranges (registers, etc) at will.  An ICE can give you that ability to see inside the hardware.

What do you do if the hardware is just not going to be available?  You can use similar boards, processors, etc, but.. it's not the *same* thing.  The minute you change the foundation, everything above it is going to have a new set of interactions with the foundation, and some of those may induce problems.  When updating hardware, timing, electrical, or even errata related to updated chips or even PAL equations may cause engineering delays - even if it's just a newer version of the same board.  Switching entire boards - or from engineering / test boards to "flight qualified" boards - can complicate the issue.

So.. what can you do when you don't have the real hardware on-hand, to help reduce risks associated with developing for that hardware?  In the past few years, hardware emulation tools have come a long way.  Hardware emulation is like a simulator - vxSim for example - except there's a layer of software that's designed to act *just like* the real hardware would, right down to devices and registers within those devices, etc. 

Leveraging known-good (mature) software designs is another way to limit risk.  Re-using application code that is well understood is one example.   Selecting a base of software that's been certified to adhere to accepted standards is another way.  It's always nice to know that software has been scrutinized by eyes other than the manufacturer's, and been found fit.  And there's no substitute for a rigorous test strategy, or following the oft-learned adage: test as you will fly, fly as you did test.

Safe design practices can't eliminate tiger teams completely - there will still be unknowns, still be conditions that aren't discovered until the last moment.  But by combining proper practices, mature systems, and the proper tools, many of the problems that lead to the need for tiger teams may be addressed, and may be discovered far enough in advance to prevent a project from running out of Time.

November 01, 2006

MS taking some pain out of WinCE?

Microsoft has released WinCE 6.0 - this time as a "shared source" release, with various enhancements.  Some of the big claims-to-fame:  It's the "first commercial hard real-time OS released as a shared-source product".

I'm not sure exactly what this means.  Perhaps they're using their source licensing model - "shared sourcing" - as a qualifier?  Since it's the only RTOS made by Microsoft with sources shared through their license, that makes the "shared" part true.  From that perspective, there are licensing options for source code from most proprietary OS companies, you just can't post it all to the web for anyone to download.

As far as "open" goes, MS is capitalizing on the fear of the GPL - no-one has challenged it far enough yet to know if using a GPL OS would make it so you have to release your *applications* under a GPL umbrella. But ignoring this bit - there are a number of realtime or at least nearly-deterministic kernels available via GPL and Open Source communities (a quick online search yields over a thousand pages with the exact phrase "Open Source RTOS").

Determinism: putting the "Hard" in Real-time.

There are hundreds of open-source OS's - some are even deterministic enough to be "hard realtime". RealTime is basically "fast enough to keep up" - but hard realtime means "measurably deterministic" - not just fast enough to keep up, but measurably and predictably able to keep up with a given load. WinCE 6 makes the claim that it's hard realtime. But... is it really a hard-realtime system? Just claiming you're "RealTime" isn't enough - as this Stanford comparison shows, some systems claiming to be "Real Time" don't quite show determinism, especially when running as a "loaded" system. Part of determinism is being able to give "deterministic" execution - that is, you can be guaranteed that the OS itself won't take longer than a given window of time to react to the world. Execution times for critical system activities that vary over an order of magnitude are not a shining example of "determinism" (see page 4!).
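As a rough illustration of "measurably deterministic", here is a small Python sketch that compares the worst and best observed latency for some system activity. The sample numbers are invented, and the factor-of-ten cutoff is only my reading of the "order of magnitude" remark above, not an industry definition.

```python
def latency_spread(samples_us):
    """Ratio of worst to best observed latency (both in microseconds)."""
    if not samples_us or min(samples_us) <= 0:
        raise ValueError("need at least one positive latency sample")
    return max(samples_us) / min(samples_us)


def looks_deterministic(samples_us, max_ratio=10.0):
    """True if the observed spread stays under max_ratio (an assumed cutoff)."""
    return latency_spread(samples_us) < max_ratio


# Invented interrupt-latency samples, in microseconds.
idle = [11, 12, 11, 13, 12]
loaded = [12, 15, 140, 13, 95]

print(looks_deterministic(idle))    # spread is small
print(looks_deterministic(loaded))  # worst case is over 10x the best case
```

Keep in mind that a measured worst case is only the worst case you happened to observe; a real hard-realtime claim needs a guaranteed bound, not just a good-looking sample.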

I don't know if you've done any research into this part of the topic - but if you do, you'll notice that many comparisons for determinism measure things like context-switch time and interrupt latency (how long it takes the OS to change contexts or handle an interrupt, usually measured under idle and highly-loaded conditions). I've never seen such a document for WinCE - and most of what I've seen was either from MS themselves or from companies hired by MS to run their tests. Though their method is scientific, it requires specific hardware and tools, and their graphs don't give you the complete story for context switching or interrupt latency. Since the method isn't shown against any other RTOS, there's no comparison possible.

WinCE 6 has extensive use of virtual memory spaces, including supporting up to 32,000 processes each with a 2GB addressable virtual address space.  This requires at least changing memory maps, and possibly "swapping" chunks of text and data  between RAM and some fixed storage.  The act of swapping - pushing data to/from physical media - induces indeterminism.
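A quick back-of-the-envelope calculation shows why those numbers imply remapping and possibly swapping. This Python sketch assumes "2GB" means 2 GiB, and the 128 MiB device is a hypothetical example; the total is an upper bound that no real system would approach.

```python
GIB = 2**30
TIB = 2**40
MIB = 2**20

processes = 32_000
per_process_bytes = 2 * GIB  # assumes "2GB" means 2 GiB

# Upper bound on virtual address space if every process used all of its space.
total_virtual_bytes = processes * per_process_bytes
total_tib = total_virtual_bytes / TIB


def oversubscription(physical_ram_bytes):
    """How many times the virtual upper bound exceeds physical RAM."""
    return total_virtual_bytes / physical_ram_bytes


print(total_tib)                    # total virtual space, in TiB
print(oversubscription(128 * MIB))  # versus a hypothetical 128 MiB device
```

Whatever the real usage turns out to be, the virtual total dwarfs the RAM on a typical embedded board, so the OS has to spend time deciding what is actually resident.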

Other than this - there are some interesting developments.

It's being bundled with Visual Studio with an integrated Platform Builder plug-in.  Microsoft says "Under one roof, you have the entire development chain from device to application".  Eventually it might have some sort of data visualization tools, perhaps some kind of system event display, maybe even... hardware-assisted debug... could it be that Microsoft may be joining the DSO trend?

October 12, 2006

Does it have a motor?

Back in the 80s there was a car commercial, where a small crowd of doubters are looking over a neighbor's new car.  Amongst the comments and questions about the options that make a car comfortable comes the quip "Does it have a motor?"

It's true that a car can be a car without a motor, or any of the comfort options, but let's face it - with no motor, it's not worth much.  Same thing for a car in snow country without a heater, or a car in the Mojave without working A/C.  Some of the "comfort options" become necessities depending on the intended use.

The same thing can be said for Operating Systems.  In this blog entry on Applications for Linux, DMarti brings up a very big issue facing Linux today:  What exactly is in your pot of soup?  The big applications producers are starting to take notice of what it means to support Linux.

Linux is a lot like soup - how it turns out is a combination of the ingredients and how you cook them.  It seems like most providers are grabbing whatever freezes they "need", applying a specific set of patches, and some proprietary patches or enhancements, rolling up a release, and dropping it out on FTP with one or two "demo" platforms.  They may offer some open-source applications they've ported to their release, or some of their own packages.  But as DMarti points out - the bigger players are going to want a standardized platform to base their support on, not just the results of some "compliance" tests, and not just one released platform.  And that means a known starting point, free of proprietary changes that may inhibit future growth or portability.

Although software manufacturers like Adobe may want to start off with a big name like Red Hat as a platform, they would be wise to take a conservative approach.  In order to run on a wider audience than just the "preferred platform", the implementation itself should be based on a locked and well-known "version" of Linux, without dependencies on any proprietary extensions.  I think as time goes on, this will be a strategy more and more software manufacturers follow.

Wind River Systems has our "own" release of Linux.  But ours is a little different than others.  We've taken a bit of an idealistic approach to our release.  We start with a known kernel from kernel.org, and a known set of patches and libraries - these are all publicly available.  We apply the patches in a step-by-step method, and provide instructions on exactly what we did.  We provide BSPs to support Linux on specific hardware.  We don't have proprietary extensions built-in to the base offering, we don't have patches or extensions that may or may not ever make it into any other Linux release, we don't have software included that would "gate" a user's ability to comply with a standard or upgrade to future patches from kernel.org (etc).

Any big player would understand how dangerous it is to tie one's product or future to a proprietary standard or system.  History is full of lessons where superior technologies lost out to open standards because the open standards made for increased usability and lower price-points (Betamax, anyone?).   I think there will be a chosen Linux platform upon which the likes of Adobe standardize, but I also think the lessons of the past will guide a choice that can be accessed by anyone, not just a single proprietary release.

October 08, 2006

View from Above.

It's not every day you can watch your work in action. It's even less often that you get to watch your work watching your work in action. :)

Mars' latest arrival takes picture of second oldest running vehicle on the planet.



Considering it's a photo taken from orbit while in motion... isn't it beautifully clear?

MRO is running VxWorks 5.5.1 for PowerPC on a BAE Systems RAD750 Space Flight Processor - Opportunity is running VxWorks 5.3.1 MER edition on a BAE Systems RAD6000 Space Flight Processor. It's sort of a grandchild-takes-grandma's-picture kind of thing.  And given Grandma's heritage, Grandson has a big pair of shoes to fill.

*standing ovation for both MER and MRO teams* They've both done outstanding jobs.

The MER team recently received funding from NASA for one more year of operation.

On October Twenty Fifth, Spirit will have been on Mars for One Thousand Days. On November Fourteenth, Opportunity will reach its Sol-1K. On those anniversaries, the rovers will be "910 days past warranty".

Mike Deliman

  • As an Engineering Specialist, it is Mike Deliman's responsibility to enable customers to achieve success in their endeavors, assist sales groups in evangelizing Wind River's technologies, and bring feedback of customer needs and experiences back into Marketing and Engineering. Mike has over 15 years of experience with VxWorks.
    "Mike's forgotten more about VxWorks than most people will ever know." -J Carlstrom