In my last blog post, I told you how excited I am about the huge opportunity ahead for us at Hortonworks. In addition to continued innovation in HDP, there is a huge opportunity for the industry being driven by the Internet of Things (IoT). I think it's broader than the Internet of Things, and is really the Internet of Everything and ANYTHING.
Whether it is in moving metal (like cars or jet engines), wearable technology or even simple things like our modern refrigerators, sensors and the data they create are the next big thing. All of this data being created, consumed and analyzed will create new business opportunities for companies across all industries. The application of IoT technologies will enable new business models and customer interactions like the connected car, proactive maintenance of equipment, network optimization, enhanced logistics solutions and more.
As I look out into the next few years, I predict that IoT (or, as I see it, the Internet of Anything; to make the point I'll use IoAT) will be even bigger than the advent of big data. And most industry analysts seem quite excited by the prospect of IoT. Further, I think there are three very new problems to be solved to enable IoAT solutions and applications to proliferate successfully.
1: It's the Grid, dummy!
First, IoAT changes the way we think about traditional data flows. I'm not talking about specific technologies like ETL or Streaming, but really about how data needs to flow in IoAT. It is a new paradigm around 'data in motion'. Traditional data flow generally runs in one direction, from source systems to target systems, with some level of transformation or processing along the way. In the IoAT world, data flows will be bi-directional and point-to-point, meaning that sensors will send data in but will also require data back and, more importantly, may even need to talk to each other.
This paradigm shift in data flow requirements reminds me of the transition that utility companies have made from the days of a one-way power grid (you know, big power plants distributing electrons to consumers over a complex but one-way grid). Enter the smart grid, where consumers may also be generating power and putting it into the grid. It's a new paradigm that required new technology and new thinking.
Adding to the complexity of modern IoAT data flows is the notion that optimizing the data flow grid will be very important. Bandwidth may be limited in some portions of the grid and very expensive in others. Understanding and optimizing for the variability of a diverse bi-directional and point-to-point grid will be key.
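To make the bandwidth point concrete, here is a minimal Python sketch of sending the most valuable data first when a link can only carry so many bytes. Everything in it is an assumption made for illustration: the `Reading` type, the `drain` function, the numeric priorities and the byte budget are hypothetical names, not part of Apache NiFi or any other product.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Reading:
    # Hypothetical record: lower priority number means more valuable data.
    priority: int
    size_bytes: int = field(compare=False)
    payload: str = field(compare=False)


def drain(readings, budget_bytes):
    """Send the most valuable readings that fit the link's byte budget.

    Returns (sent, deferred); deferred readings wait for the next window.
    """
    heap = list(readings)
    heapq.heapify(heap)  # orders by priority only
    sent, deferred = [], []
    while heap:
        reading = heapq.heappop(heap)
        if reading.size_bytes <= budget_bytes:
            budget_bytes -= reading.size_bytes
            sent.append(reading)
        else:
            deferred.append(reading)
    return sent, deferred
```

On a constrained link, an alarm tagged with priority 1 would go out before routine telemetry, and whatever does not fit the budget is held back rather than dropped.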
2: Beware, the Jagged Edge!
Second, an IoAT world means that the perimeter of data processing is outside the data center and can be very jagged. This 'Jagged Edge' creates new opportunity for security, data protection, data governance and provenance. Traditional solutions that reside inside the data center were not constructed to handle this complexity.
It's easy to understand the importance of securing IoAT data, but additional requirements will include ensuring data are transmitted and received correctly and in a timely manner, with full auditability and provenance. Keeping track of even one-way streams of data is a tall order; in the jagged, multi-directional world it is a bigger problem still.
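One way to picture auditability and provenance is a tamper-evident log in which every event records a hash of the event before it. The Python sketch below is only an illustration of that idea under my own assumptions; the function names and record fields are hypothetical, and it is not how Apache NiFi stores provenance.

```python
import hashlib
import json


def _digest(body):
    # Stable hash of an event body (keys sorted so the hash is repeatable).
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_event(chain, actor, action, payload):
    """Append a custody event that is linked to the previous event's hash."""
    body = {
        "actor": actor,
        "action": action,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "prev_hash": chain[-1]["event_hash"] if chain else "",
    }
    chain.append({**body, "event_hash": _digest(body)})
    return chain


def verify(chain):
    """Return True only if no event was altered, removed or reordered."""
    prev_hash = ""
    for event in chain:
        body = {k: v for k, v in event.items() if k != "event_hash"}
        if event["prev_hash"] != prev_hash or _digest(body) != event["event_hash"]:
            return False
        prev_hash = event["event_hash"]
    return True
```

If anyone edits an earlier event after the fact, its hash no longer matches and `verify` returns False, which is the kind of guarantee an auditor wants from data that crossed the jagged edge.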
3: We need a Rosetta Stone!
Today's big data implementations thrive on data variety. From web logs to images to text to traditional spreadsheets, we've come a long way. IoAT will bring even more variety to the data landscape, from proprietary languages to open standards, driven by the vast spectrum of devices, applications, and sensor manufacturers. Being able to reach out to any sensor from any manufacturer, and to integrate data from different sensors created at different times, will define true success at an enterprise level.
It's impossible to think that there will be one single language and standard for this network of smart devices, so lightweight agents that enable data collection and integration are required.
OK, so I've told you what I see as the key issues and opportunities in the new world of IoAT. And don't worry: help is on the way.
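As a rough illustration of what such an agent does, the Python sketch below translates readings from two made-up device formats into one common record. The vendor names, wire formats and field names are all hypothetical, invented for this example rather than taken from any real manufacturer or from Apache NiFi.

```python
import json


def parse_vendor_a(raw):
    # Hypothetical format A: JSON such as {"id": "s1", "tempC": 21.5, "ts": 1440000000}
    data = json.loads(raw)
    return {
        "sensor_id": data["id"],
        "celsius": float(data["tempC"]),
        "timestamp": int(data["ts"]),
    }


def parse_vendor_b(raw):
    # Hypothetical format B: "id;tempF;ts", with temperature in Fahrenheit
    sensor_id, temp_f, ts = raw.split(";")
    return {
        "sensor_id": sensor_id,
        "celsius": round((float(temp_f) - 32) * 5 / 9, 2),
        "timestamp": int(ts),
    }


# Supporting a new device type means registering one more parser.
PARSERS = {"vendor_a": parse_vendor_a, "vendor_b": parse_vendor_b}


def normalize(source, raw):
    """Translate a raw reading from a known source into the common record."""
    return PARSERS[source](raw)
```

Downstream systems then only ever see one shape of record, no matter which device produced the reading or which units it reported in.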
Hortonworks to Acquire Onyara
Onyara is the leading contributor to the Apache NiFi project, which is now a top-level project at the Apache Software Foundation (ASF). Onyara was formed around the technology in Dec 2014. Engineers and technologists at Onyara are key contributors to Apache NiFi, which was developed over the last 8 years.
Previously called "Niagara Files" at the National Security Agency (NSA), the project has been proven at scale in production. NiFi was open sourced as Apache NiFi as part of the NSA Technology Transfer Program in the fall of 2014. Subsequently, it became a top-level Apache project in July 2015.
Apache NiFi is a secure and reliable data flow solution for IoAT. It delivers high quality, real-time and trusted data flows and is proven at scale. The Onyara team shares Hortonworks' vision for 100% community driven open source and is a great extension to the industry's best Hadoop distro: Hortonworks Data Platform. Not only does Apache NiFi complement HDP, it also accelerates the flow of data in motion into HDP for full fidelity analytics.
By combining HDP with the new Hortonworks DataFlow powered by Apache NiFi, our customers will be uniquely positioned to conquer Big Data as well as deliver IoAT solutions that easily manage the "Smart Grid" and Jagged Edges that will pervade their landscapes. Customers will be able to benefit from end-to-end secure data routing with compression.
Hortonworks DataFlow powered by Apache NiFi will enable customers to Collect, Conduct and Curate their IoT (and IoAT) data.
Collect: Aggregate any and all IoAT data from sensors, machines, geolocation, clicks, files, social feeds via a highly secure lightweight agent.
Conduct: Mediate secure point-to-point and bi-directional data flows, deliver data reliably to real-time analytic applications and full fidelity data systems such as HDP, optimize data flows for costly or limited bandwidth, and prioritize the most valuable data.
Curate: Parse, filter, join, transform, fork, and clone data in motion from any source while maintaining a real chain of custody.
These capabilities enable the smart grid, tame the jagged edge and make IoAT data trusted and manageable - now!