Here Comes Hadoop: Open Source Project for Big Data

Date posted: September 30, 2015

As Big Data became an increasingly prominent topic, Hadoop, the open source project for managing huge amounts of data, followed close on its heels.

Hadoop is an open source software framework written in Java for distributed storage and processing of very large data sets on computer clusters built from commodity hardware. All the modules in Hadoop are designed on the fundamental assumption that hardware failures (of individual machines, or racks of machines) are commonplace and should therefore be handled automatically in software by the framework.
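To make the "failures are commonplace" idea concrete, here is a minimal Python sketch of replicated storage. It is a toy in-memory model, not Hadoop's actual API: the `ToyCluster` class, the node names, the block ID and the placement rule are all hypothetical, and the replication factor of 3 is just an assumption chosen for the illustration (it happens to match the HDFS default).

```python
class ToyCluster:
    """Toy in-memory model of replicated block storage (hypothetical, not Hadoop code)."""

    def __init__(self, node_names, replication=3):
        if replication > len(node_names):
            raise ValueError("replication factor exceeds number of nodes")
        self.names = sorted(node_names)
        self.blocks = {name: {} for name in self.names}  # node -> {block_id: data}
        self.alive = set(self.names)
        self.replication = replication

    def put(self, block_id, data):
        # Copy the block to `replication` distinct nodes, chosen deterministically.
        start = sum(block_id.encode("utf-8")) % len(self.names)
        targets = [self.names[(start + i) % len(self.names)]
                   for i in range(self.replication)]
        for name in targets:
            self.blocks[name][block_id] = data
        return targets

    def fail(self, name):
        # Simulate a machine going down; its copies become unreachable.
        self.alive.discard(name)

    def get(self, block_id):
        # Read from any live node that still holds a copy.
        for name in self.names:
            if name in self.alive and block_id in self.blocks[name]:
                return self.blocks[name][block_id]
        raise IOError(f"no live replica of {block_id!r}")


cluster = ToyCluster(["node-a", "node-b", "node-c", "node-d"], replication=3)
replicas = cluster.put("sensor-log-0001", b"temp=21.5")
cluster.fail(replicas[0])
cluster.fail(replicas[1])
print(cluster.get("sensor-log-0001"))  # still readable from the third replica
```

The caller never has to know which machines failed. The read simply falls through to a surviving copy, which is the kind of work the framework is meant to take off the application's hands.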

While the implementation of Hadoop has lagged behind its original hype, a recent article in Fortune indicates that its time may be at hand. The driver is the Internet of Things (IoT).

As more companies collect data from devices such as industrial sensors, mobile phones and cars, companies selling Hadoop-based software can try to swoop in and make sales. During the past couple of quarters, revenue at one of those companies, Hortonworks, has been growing rapidly, more than doubling on a year-over-year basis, partly because the Internet of Things is driving demand for its software, CEO Rob Bearden told Fortune.

“Absolutely, beyond a question of a doubt, it is a functioning, seriously monetizable, fast-growing market,” Bearden said. “…If (companies) can get visibility in real time, and act and react in real time, it completely changes the dynamic of that transaction.”

Other open source technologies (e.g., Apache Kafka and Apache Spark) have also helped to make Hadoop better suited to the IoT. They have made getting data into Hadoop and then processing it much faster than before.

According to Fortune, “For companies trying to profit from Hadoop, the rush to connect devices and analyze their data could not be better.”

Take note. Here comes Hadoop.
