Hadoop MapReduce

February 1st, 2018 | Uncategorized

MapReduce Tutorial: Introduction. In this MapReduce tutorial, I am going to introduce you to MapReduce, one of the core building blocks of processing in the Hadoop framework. Before moving ahead, I would [...]

Apache Oozie 

February 1st, 2018 | Uncategorized

Overview: The blueprint for Enterprise Hadoop includes Apache™ Hadoop’s original data storage and data processing layers and also adds components for services that enterprises must have in a modern data architecture: data integration and [...]

MongoDB

February 1st, 2018 | Uncategorized

MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling. Document Database: A record in MongoDB is a document, which is a data structure composed of field and value [...]

TensorFlow

February 1st, 2018 | Uncategorized

TensorFlow Architecture: We designed TensorFlow for large-scale distributed training and inference, but it is also flexible enough to support experimentation with new machine learning models and system-level optimizations. This document describes the system architecture [...]

Apache Kudu

February 1st, 2018 | Uncategorized

Introducing Apache Kudu: Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and [...]

Redis

February 1st, 2018 | Uncategorized

Overview of Redis Architecture: Redis is an in-memory key-value data store, and the most popular store of its kind. It is used by many large technology companies. Amazon ElastiCache supports [...]

Apache Tez

February 1st, 2018 | Uncategorized

Apache Tez Introduction: The Apache TEZ® project is aimed at building an application framework that allows for a complex directed acyclic graph of tasks for processing data. It is currently built atop Apache Hadoop YARN. The 2 [...]

Apache Drill

February 1st, 2018 | Uncategorized

Apache Drill: Drill is an Apache open-source SQL query engine for Big Data exploration. Apache Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data [...]

Presto

February 1st, 2018 | Uncategorized

What Is Presto? Presto is an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from [...]

Apache Sqoop

February 1st, 2018 | Uncategorized

Before starting with this Apache Sqoop tutorial, let us take a step back. Can you recall the importance of data ingestion, as we discussed in our earlier blog post on Apache Flume? Now, as we [...]
