Amazon Athena
What is Amazon Athena? Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL. With a few actions in [...]
Couchbase is the merger of two popular NoSQL technologies: Membase, which adds persistence, replication, and sharding to the high-performance Memcached technology, and CouchDB, which pioneered the document-oriented model based on JSON. Like other [...]
What Is Amazon EMR? Amazon EMR is a managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data. By using these [...]
What is Apache HBase? Apache HBase is a popular and highly efficient column-oriented NoSQL database built on top of the Hadoop Distributed File System that allows performing read/write operations on large datasets in real time [...]
What is Apache Ignite? Apache Ignite is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads, delivering in-memory speeds at petabyte scale. Durable Memory: Ignite's durable memory component treats RAM [...]
MapReduce Tutorial: Introduction In this MapReduce Tutorial blog, I am going to introduce you to MapReduce, which is one of the core building blocks of processing in the Hadoop framework. Before moving ahead, I would [...]
OVERVIEW The blueprint for Enterprise Hadoop includes Apache™ Hadoop’s original data storage and data processing layers and also adds components for services that enterprises must have in a modern data architecture: data integration and [...]
MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling. Document Database: A record in MongoDB is a document, which is a data structure composed of field and value [...]
TensorFlow Architecture We designed TensorFlow for large-scale distributed training and inference, but it is also flexible enough to support experimentation with new machine learning models and system-level optimizations. This document describes the system architecture [...]
Introducing Apache Kudu Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and [...]