The term “information age” has been used for decades. Information is data, so by association, are we living in the “data age”? It appears so. Advances in the communication media that move information to so many people so rapidly enable both.
One could argue that not only are we living in the “data age,” but “data” is effectively becoming the new world currency and measure of wealth, as transactions based on cash, checks and even credit cards as we know them are becoming a thing of the past. Today, virtually everything we earn, everything we own and everything we trade is represented by data in one form or another.
In today’s Internet of Things (IoT), Internet of Anything (IoA), Internet of Everything (IoE), or whatever you prefer to call it, the central theme is that anything can be connected to anything, communicate with anything, interoperate, and ultimately do something. As everything gets connected to everything, data is the commodity being transferred between the various IoT components, and attempts to store all the data generated as a result continue to push the boundaries of Big Data.
Big Data, when discussed today, often gets tethered to IoT as if IoT were the only culprit, or the key culprit, creating limitless data. However, Big Data, in the sense of massive volumes of data, has been around since long before IoT.
Big Data can come from any source. A key ingredient in IoT is sensing, so IoT is sensor data “rich,” but other sources in recent history have also generated massive data sets. For the past 50+ years, these data sets have been around us whether we knew it or not, and whether or not we had the analytics tools to derive wisdom from them in the manner that has become so crucial in today’s way of doing business.
Consider NASA. Think about its telescopes, satellites, imagery, and supercomputers that have continuously generated huge data sets 24/7. Think about today’s e-health systems that generate volumes of patient and medical data. Consider YouTube videos and other Internet technologies, such as cloud, that enable and encourage Big Data by offering surprisingly cheap ways to store it. After all, if you can’t store it, no one will generate it.
The point here is that Big Data is already a decades-old “done deal.” So where does IoT fit in? Does IoT make the problem more interesting, less interesting, harder or easier, any of these, none of these?
To partially answer this, we’ll focus here on three aspects that showcase the importance of the question: (1) scalability, (2) pedigree, and (3) heterogeneity.
The minute you mention terms like anything and everything, scalability becomes obvious, since so much is already connected. Everything suggests that which exists now – anything suggests anything created going forward. And when all of these things communicate, you have to think about what they are really doing. They are talking, and data is the language spoken.
Pedigree is slightly more interesting. Pedigree involves issues such as what produced the data, and where is that “what” located? Who owns that “what”? Could someone have tampered with the data? Has the data gone stale? These are simple questions aimed at acquiring some level of trust in the data the ‘things’ generate and feed on. Answers to these simple questions, as well as others, form a multi-dimensional view of pedigree.
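To make those pedigree questions concrete, here is one minimal sketch of how a ‘thing’ might attach pedigree to a reading. Everything in it is an illustrative assumption rather than an established IoT format: the `PedigreeRecord` type, its field names, the helper functions, and the use of a shared secret key with an HMAC tag to detect tampering.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
import hashlib
import hmac


@dataclass(frozen=True)
class PedigreeRecord:
    """Hypothetical record pairing a reading with its pedigree."""
    source_id: str         # what produced the data
    location: str          # where that "what" is located
    owner: str             # who owns that "what"
    produced_at: datetime  # when the data was produced
    payload: bytes         # the data itself
    tag: str               # HMAC-SHA256 over all fields above, hex-encoded


def _message(source_id, location, owner, produced_at, payload):
    # Simplification: fields are joined with "|" and are assumed not to
    # contain that character; a real format would need unambiguous framing.
    header = "|".join([source_id, location, owner, produced_at.isoformat()])
    return header.encode("utf-8") + b"|" + payload


def make_record(key, source_id, location, owner, produced_at, payload):
    """Build a record and tag it using a secret key shared with consumers."""
    msg = _message(source_id, location, owner, produced_at, payload)
    tag = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return PedigreeRecord(source_id, location, owner, produced_at, payload, tag)


def is_untampered(key, record):
    """Could someone have tampered with the data? Recompute and compare the tag."""
    msg = _message(record.source_id, record.location, record.owner,
                   record.produced_at, record.payload)
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.tag)


def is_stale(record, now, max_age):
    """Has the data gone stale? True if it is older than max_age."""
    return now - record.produced_at > max_age


if __name__ == "__main__":
    key = b"demo-shared-secret"  # assumption: distributed out of band
    taken = datetime.now(timezone.utc)
    rec = make_record(key, "thermostat-17", "building-A/floor-2",
                      "facilities", taken, b"21.5C")
    print(is_untampered(key, rec))
    print(is_stale(rec, taken + timedelta(hours=2), timedelta(hours=1)))
```

A consumer that holds the same key could call `is_untampered` and `is_stale` before acting on a reading. The sketch covers only the handful of questions listed above; a fuller view of pedigree would need more dimensions than these fields.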
And lastly, think about the heterogeneity of the ‘things’ that are generating data in many formats and languages, whether natural or digital. That is a unique twist that NASA, for example, likely did not have to contend with years ago: its systems were well-defined because it could bound the ‘things’ in its systems based on overarching specifications. In IoT, that luxury does not exist.
Put simply, a universe built from components called anything, everything, anytime, anywhere, etc. is anything but a bounded specification. We argue that this makes the problem more interesting, but also harder to solve.