Analytics and the ‘Internet of Things’

By Gareth Noyes

Analytics has become a major buzzword these days, whether in the realm of connected devices, the Internet of Things, web analytics or big data business analytics. In the context of the Internet of Things, I thought I’d share some observations on different analytics paradigms and use cases.

One common analytics model is what could be termed “store-and-analyze-later,” where massive amounts of data are streamed up to cloud servers and Hadoop clusters for later analysis. The problem with this approach, especially given the ever-increasing amounts of data, is that it doesn’t scale: the amount of data quickly overwhelms our ability to make sense of it. Compare that to a model where intelligence (gained from analytics) is tailored to the use case and system topology, leading to “intelligence where and when you need it” – the notion of multi-tiered intelligence, where edge devices have a configurable amount of autonomy and decision-making authority, and no longer act as “dumb data generators.”
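To make the multi-tiered idea a little more concrete, here is a minimal sketch (in Python, with purely illustrative readings and thresholds) of an edge node that reduces a window of raw sensor data to a compact summary, escalating only the anomalous values upstream instead of streaming everything to the cloud:

```python
from statistics import mean, stdev

def summarize_window(readings, anomaly_threshold=2.0):
    """Reduce a window of raw sensor readings to a compact summary.

    Instead of streaming every reading to the cloud, this hypothetical
    edge node sends only the summary statistics, plus the individual
    readings that look anomalous (more than `anomaly_threshold`
    standard deviations from the window mean).
    """
    mu, sigma = mean(readings), stdev(readings)
    anomalies = [r for r in readings
                 if sigma > 0 and abs(r - mu) / sigma > anomaly_threshold]
    return {
        "count": len(readings),
        "mean": mu,
        "stdev": sigma,
        "anomalies": anomalies,  # raw values worth escalating upstream
    }

# A window of mostly normal readings with one outlier:
window = [20.1, 20.3, 19.9, 20.0, 20.2, 35.0]
summary = summarize_window(window)
# summary["anomalies"] == [35.0] -- only the outlier leaves the device
```

The upstream tier then works with summaries rather than raw firehose data, which is one way an edge device stops being a “dumb data generator.”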

Clearly the right architecture is largely driven by the use case being addressed, and context matters. Take, for example, a predictive modeling scenario, where complex machine learning algorithms on powerful servers crunch through huge amounts of collected operational data to build a predictive failure model of, say, a wind turbine. Once the model is generated “in the cloud,” it can be exported to the turbine control system for more autonomous execution and refinement.
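The cloud-to-edge handoff can be sketched in a few lines. This is not a real turbine model – the readings, the JSON format and the simple “mean plus three sigma” rule all stand in for whatever heavyweight training the cloud side actually does – but it shows the pattern of training centrally and executing at the edge:

```python
import json
from statistics import mean, stdev

# --- "Cloud" side: fit a model from historical operational data ---
# Hypothetical vibration readings (mm/s) from normal turbine operation;
# the numbers are illustrative only.
historical_vibration = [2.1, 2.3, 2.0, 2.2, 2.4, 2.1, 2.3, 2.2]

def fit_threshold_model(samples, sigmas=3.0):
    """A stand-in for heavyweight cloud training: model 'normal'
    operation as mean +/- `sigmas` standard deviations."""
    return {"mean": mean(samples), "stdev": stdev(samples), "sigmas": sigmas}

# Export the fitted model for the turbine control system.
exported = json.dumps(fit_threshold_model(historical_vibration))

# --- "Edge" side: load the exported model and run it autonomously ---
def predict_failure(model_json, reading):
    m = json.loads(model_json)
    return abs(reading - m["mean"]) > m["sigmas"] * m["stdev"]

# predict_failure(exported, 2.2) -> False (normal)
# predict_failure(exported, 4.0) -> True  (flag for maintenance)
```

The edge side needs no connection back to the cloud to make each decision; the cloud is only involved when the model is retrained and re-exported.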

More and more startups are appearing in the analytics space, offering everything from real-time in-memory databases to full analytics platforms. One area of particular interest is streaming analytics engines, the opposite of the store-and-analyze-later scenario described earlier: data is processed and analyzed “on the fly.” This technology generally scales better to a multi-tiered analytics use case.
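A simple illustration of the “on the fly” idea is an online statistics accumulator. The sketch below uses Welford’s algorithm, a well-known technique for computing a running mean and variance; each reading is folded in as it arrives and then discarded, so memory use stays constant no matter how much data streams through:

```python
class StreamingStats:
    """Online mean/variance via Welford's algorithm: process each
    reading as it arrives, keep no history -- the opposite of the
    store-and-analyze-later model."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0

stats = StreamingStats()
for reading in [4.0, 7.0, 13.0, 16.0]:   # readings arrive one at a time
    stats.update(reading)
# stats.mean == 10.0, stats.variance == 30.0
```

Because the state is just three numbers, the same accumulator can run on a small edge device or at an aggregation point, which is why streaming approaches fit a multi-tiered topology well.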

Yet another perspective on analytics, and one that’s becoming increasingly important for the Internet of Things, is what I would term one-way vs. closed-loop analytics. One-way analytics is effectively streaming data up to the cloud for storage and visualization, largely requiring humans to make intelligent interpretations. Where it gets really interesting is the closed-loop use case, where analytics, either in the cloud or at aggregation points in the network, drive changes and control back to the edge device, or devices exchange analytics intelligence with each other.
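The closed-loop case can be sketched very simply. The device names, temperature limit and “throttle” command below are all hypothetical; the point is only that the analytics result flows back down to the edge as a control action instead of stopping at a dashboard:

```python
def close_the_loop(device_temps, limit=80.0):
    """An aggregation point inspects telemetry from several edge
    devices and returns control commands for the ones running hot."""
    commands = {}
    for device_id, temp in device_temps.items():
        if temp > limit:
            commands[device_id] = "throttle"  # push a change back to the edge
    return commands

# Telemetry collected from three hypothetical edge devices:
telemetry = {"pump-1": 72.0, "pump-2": 91.5, "pump-3": 79.9}
commands = close_the_loop(telemetry)
# commands == {"pump-2": "throttle"} -- only the hot device is acted on
```

In a one-way system the same telemetry would simply land in a chart for a human to notice; in the closed-loop version the interpretation and the corrective action happen in the system itself.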

Regardless of the type of analytics paradigm or use case, the one constant is that data, and the ability to make sense of it, is becoming a critical differentiator for the Internet of Things.


For additional information from Wind River, visit us on Facebook.


  1. Tim Skutt

    I completely agree with your final sentence. This differentiation enables significant cost savings for businesses.
    I think, though, that the “store-and-analyze-later” model is at least equally important to the “on-the-fly” model.
    Having worked in the aviation field, I saw the cost benefits of predictive maintenance drive multiple product lines. If, through predictive maintenance, you can schedule maintenance on an aircraft rather than react to a failed subsystem, you can save a lot of money on logistics and ripple effect throughout your fleet.
    Many aircraft subsystems are very complex, and it’s hard to determine some of the failure modes ahead of time, but with “store-and-analyze-later” analytics, once you start to see a pattern of failures develop you can mine the stored data to extract the historical data for the failures and, hopefully, create a model. This model can then be leveraged by “on-the-fly” analytics to predict subsystems that are likely to fail in a given time frame, and schedule maintenance so you don’t have to deal with the cost and repercussions of reacting to a failure later.
    So there’s a lot of value in both models, and the combination of the models gives you some really powerful capabilities.

  2. Gareth Noyes

    Thanks for your comments, Tim. I agree that both models have their uses. In cases where you know very little about your system, or don’t have a refined model of it, large amounts of data that you analyze later can be an ideal way to mine cause-and-effect correlations. However, it isn’t always feasible to do that, and sometimes the overhead of providing metadata for context can bloat data repositories or add temporal overhead to analysis. In those cases, having compute power closer to the edge can be advantageous. Ultimately, these are system design choices that individual applications will have to make, but we want to enable those design choices.
