Hive Plays Well with JSON

November 28th, 2018 | Pranab Ghosh

Hive is an abstraction on Hadoop MapReduce. It provides a SQL-like interface for querying HDFS data, which accounts for most of its popularity. In Hive, table structured data [...]

Data Normalization with Spark

November 27th, 2018 | Pranab Ghosh

Data normalization is a required data preparation step for many Machine Learning algorithms. These algorithms are sensitive to the relative values of the feature attributes. Data normalization is the process of bringing all the [...]

Anomaly Detection with Robust Zscore

November 27th, 2018 | Pranab Ghosh

Anomaly detection techniques based on statistical modeling are simple and effective. The Zscore based technique is one of them. Zscore is defined as the absolute difference between [...]

Ruling with Drools Rule Engine

November 22nd, 2018 | Pranab Ghosh

In a project several years ago, I built a rule engine from scratch. In a recent project that needed a rule engine, I decided to take a different route. I decided to give the Drools rule engine [...]

Auto Training and Parameter Tuning for a ScikitLearn based Model for Leads Conversion Prediction

May 29th, 2018 | Analytics, Blogs, Data Sciences, Pranab Ghosh, Predictive Analytics, Predictive Modeling, Python, ScikitLearn

This is a sequel to my last blog on CRM leads conversion prediction using Gradient Boosted Trees as implemented in ScikitLearn. The focus of [...]
