Apache HBase

By Dj Das | 2024-07-31T10:17:16+00:00 | February 1st, 2018 | Uncategorized

What is Apache HBase? Apache HBase is a popular and highly efficient column-oriented NoSQL database built on top of the Hadoop Distributed File System that allows read/write operations on large datasets in real time [...]

Apache Ignite

By Dj Das | 2024-07-31T13:09:00+00:00 | February 1st, 2018 | Uncategorized

What is Apache Ignite? Apache Ignite is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads, delivering in-memory speeds at petabyte scale. Durable Memory: Ignite's durable memory component treats RAM [...]

Hadoop MapReduce

By Dj Das | 2022-07-04T14:23:30+00:00 | February 1st, 2018 | Uncategorized

MapReduce Tutorial: Introduction. In this MapReduce Tutorial blog, I am going to introduce you to MapReduce, which is one of the core building blocks of processing in the Hadoop framework. Before moving ahead, I would [...]

Apache Oozie

By Dj Das | 2022-07-04T13:34:33+00:00 | February 1st, 2018 | Uncategorized

OVERVIEW The blueprint for Enterprise Hadoop includes Apache™ Hadoop’s original data storage and data processing layers and also adds components for services that enterprises must have in a modern data architecture: data integration and [...]

MongoDB

By Dj Das | 2022-07-04T13:51:55+00:00 | February 1st, 2018 | Uncategorized

MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling. Document Database A record in MongoDB is a document, which is a data structure composed of field and value [...]

TensorFlow

By Dj Das | 2022-07-04T13:40:18+00:00 | February 1st, 2018 | Uncategorized

TensorFlow Architecture We designed TensorFlow for large-scale distributed training and inference, but it is also flexible enough to support experimentation with new machine learning models and system-level optimizations. This document describes the system architecture [...]

Apache Kudu

By Dj Das | 2022-07-04T14:35:39+00:00 | February 1st, 2018 | Uncategorized

Introducing Apache Kudu Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and [...]

Apache Drill

By Dj Das | 2022-07-04T14:43:17+00:00 | February 1st, 2018 | Uncategorized

Apache Drill: Drill is an Apache open-source SQL query engine for Big Data exploration. Apache Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data [...]

Presto

By Dj Das | 2022-07-04T14:38:41+00:00 | February 1st, 2018 | Uncategorized

WHAT IS PRESTO? Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from [...]

Apache Storm

By Dj Das | 2024-07-31T13:21:13+00:00 | February 1st, 2018 | Uncategorized

Apache Storm Overview: A system for processing streaming data in real time. Apache™ Storm adds reliable real-time data processing capabilities to Enterprise Hadoop. Storm on YARN is powerful for scenarios requiring real-time analytics, machine [...]
