About Dj Das

So far Dj Das has created 395 blog entries.

Cloudera Impala 

January 31st, 2018 | Uncategorized

Cloudera Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage platform, Impala also uses the same metadata, SQL [...]

Apache Kafka

January 30th, 2018 | Uncategorized

We think of a streaming platform as having three key capabilities: It lets you publish and subscribe to streams of records. In this respect it is similar to a message queue or [...]

Apache Flume

January 30th, 2018 | Uncategorized

What is Apache Flume? Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store. The [...]

Apache Spark 

January 30th, 2018 | Uncategorized

What is Apache Spark? Apache Spark is a fast and general engine for large-scale data processing. Speed: Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. Apache Spark [...]

Apache Pig

January 30th, 2018 | Uncategorized

What is Apache Pig? Apache Pig is a high-level language platform developed to execute queries on huge datasets that are stored in HDFS using Apache Hadoop. It is similar to SQL [...]

Apache Hadoop

January 29th, 2018 | Uncategorized

The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters [...]
