Apache Kudu

February 1st, 2018 (2022-07-04T14:35:39+00:00) | Apache Kudu, Informative, Technologies

Introducing Apache Kudu: Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and [...]

Apache Tez

February 1st, 2018 (2022-07-04T13:45:33+00:00) | Apache Tez, Informative, Technologies

Introduction: The Apache Tez® project aims to build an application framework that allows a complex directed acyclic graph of tasks for processing data. It is currently built atop Apache Hadoop YARN. The 2 main design [...]

Apache Drill

February 1st, 2018 (2022-07-04T14:43:17+00:00) | Apache Drill, Informative, Technologies

Apache Drill: Drill is an Apache open-source SQL query engine for Big Data exploration. Apache Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data [...]

Apache Storm 

February 1st, 2018 (2022-07-04T13:43:51+00:00) | Apache Storm, Informative, Technologies

Overview: A system for processing streaming data in real time. Apache™ Storm adds reliable real-time data processing capabilities to Enterprise Hadoop. Storm on YARN is powerful for scenarios requiring real-time analytics, machine learning and [...]

Apache Hive 

January 31st, 2018 (2022-07-04T13:41:40+00:00) | Apache Hive, Informative, Technologies

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage and queried using SQL syntax. Built on top of Apache Hadoop™, Hive provides the following features: Tools to enable easy [...]

Apache Mahout 

January 31st, 2018 (2022-07-04T13:37:34+00:00) | Apache Mahout, Informative, Technologies

Apache™ Mahout is a library of scalable machine-learning algorithms, implemented on top of Apache Hadoop® and using the MapReduce paradigm. Machine learning is a discipline of artificial intelligence focused on enabling [...]

Apache Pig

January 30th, 2018 (2022-07-04T13:49:15+00:00) | Apache Pig, Informative, Technologies

Apache Pig is a high-level language platform for running queries with Apache Hadoop on very large datasets stored in HDFS. It is similar to the SQL query language but applied [...]

Apache Hadoop

January 29th, 2018 (2022-07-04T14:34:41+00:00) | Apache Hadoop, Informative, Technologies

The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers [...]
