Blogs or Expert Columns

Redefine data analytics with Modern Data Warehouse on Azure

Data is more critical than ever in all aspects of modern business. As more organizations embark on digital transformation and embrace a data-driven culture, a common IT challenge is building the definitive source of truth for trusted data, breaking infrastructure and functional silos and bringing all data types together. Thousands of customers are now using Azure SQL Data Warehouse (SQL DW) to take advantage of the fast, flexible, and secure analytics platform to gain deeper insights and make better decisions. Azure SQL DW has been engineered to deliver lightning-fast query performance in a secure, cost-effective cloud solution. With [...]

Beyond Deep Learning – 3rd Generation Neural Nets

Summary: If Deep Learning is powered by 2nd generation neural nets, what will the 3rd generation look like? What new capabilities does that imply, and when will it get here? By far the fastest expanding frontier of data science is AI, and specifically the rapid advances in Deep Learning. Advances in Deep Learning have been dependent on artificial neural nets and especially Convolutional Neural Nets (CNNs). In fact, our use of the word “deep” in Deep Learning refers to the fact that CNNs have large numbers of hidden layers. Microsoft recently won the annual ImageNet competition with a CNN comprised of 152 layers. [...]

Cybersecurity: Time to Get Serious

We have been on the run from cyberattacks for more than a decade now, and in spite of the $90+ Billion we spent last year (according to Gartner) on cybersecurity technology, things have gotten worse, not better. Those same analysts predict we will be spending more than $113 Billion by 2025. Thanks to tons of investment capital and bright young graduates from great schools, we have lots of sparkling new technology. In fact, at last year’s RSA Conference, we broke all the records with 15 keynote presentations, more than 700 speakers across 500+ sessions and more than 550 companies [...]

NERSC, Intel, Cray Harness the Power of Deep Learning to Better Understand the Universe

A Big Data Center collaboration between computational scientists at Lawrence Berkeley National Laboratory’s (Berkeley Lab) National Energy Research Scientific Computing Center (NERSC) and engineers at Intel and Cray has yielded another first in the quest to apply deep learning to data-intensive science: CosmoFlow, the first large-scale science application to use the TensorFlow framework on a CPU-based high performance computing platform with synchronous training. It is also the first to process three-dimensional (3D) spatial data volumes at this scale, giving scientists an entirely new platform for gaining a deeper understanding of the universe. Cosmological “big data” problems go beyond the simple volume [...]

Putting the Power of Kafka into the Hands of Data Scientists

Over a year ago, my fellow data infrastructure engineers and I broke ground on a total rewrite of our event delivery infrastructure. Our mission was to build a robust, centralized data integration platform tailored to the needs of our Data Scientists. The platform would be fully self-service, so as to maximize the Data Scientists’ autonomy and give them complete control over their event data. Ultimately, we delivered a platform that is revolutionizing the way Data Scientists interact with Stitch Fix’s data. In two parts, this post peeks into [...]

Monte Carlo Tree Search – beginners guide

For quite a long time, a common opinion in the academic world was that a machine achieving human master-level performance in the game of Go was far from realistic. It was considered a ‘holy grail’ of AI – a milestone we were quite far away from reaching within the upcoming decade. Deep Blue had its moment more than 20 years ago, and since then no Go engine had come close to human masters. The opinion about ‘numerical chaos’ in Go became so well established that it was referenced in movies, too. Surprisingly, in March 2016 an algorithm invented by Google DeepMind called AlphaGo defeated the Korean world champion in [...]

Addressing the increasing demand for data scientists and how to get involved

Data science has become a crucial practice for many businesses to increase efficiency and to enhance processes. It has had a profound impact on a variety of areas of society. In fact, data science has been so heavily in demand that it has been ranked the best job in the US for the past three years. However, getting into data science might be tough, initially, especially given the technical requirements of the job, but it certainly proves to be rewarding. For any potential future data scientists, Mashable recently published an article titled [...]

How a year in DevOps changed me as a Developer

About a year ago, we had a project come in which required automating infrastructure setup, configuration management and change management. I happened to be at the right place at the right time and was put in charge of the whole thing. The project was designed around a microservices architecture, and our teams decided on using GitHub as the source code repository, JFrog for storing our build artifacts, a RHEL-based Kubernetes cluster as the deployment platform and Jenkins to drive the whole thing through a continuous integration, testing and deployment pipeline. In this way, developers, QA, [...]

Robots in workplace ‘could create double the jobs they destroy’

The rise of machines, robots and algorithms in the workplace stands to create almost double the number of jobs for the global economy by the middle of the next decade that it puts at risk of being replaced. According to the World Economic Forum (WEF), about 133m jobs globally could be created with the help of rapid technological advances in the workplace over the next decade, compared with 75m that could be displaced. The findings will go [...]

Machine Learning Is A Moneyball Moment For Companies

By now, you have heard of "Moneyball." Maybe you have even read the book by Michael Lewis (The Art of Winning an Unfair Game), seen the movie that starred Brad Pitt, or even done the business case, all of which detail the unorthodox, data-driven management approach of Major League Baseball’s Oakland A’s. Led by General Manager Billy Beane, the Oakland A’s pioneered the use of big data in baseball, identifying overlooked but important factors to evaluate players. A similar opportunity is available today for business leaders. Despite a great deal of lip service [...]

8 Top-funded Facial Recognition Startups

The eyes may be the window to the soul, but the face is the door through which we can enter the future. OK, we’re a bunch of underpaid MBAs, not starving poets, but that was our less prosaic way of introducing the topic of this article: facial recognition technology. Facial recognition technologies are becoming ubiquitous, from unlocking your smartphone to identifying your so-called friends on Facebook. Retailers use facial recognition to spot shoplifters, and doctors are starting to employ the technology to detect certain kinds of diseases. Heck, there’s even a facial recognition platform for cows! There’s no dearth of [...]

Introducing Azure DevOps

Today we are announcing Azure DevOps. Working with our customers and developers around the world, it’s clear DevOps has become increasingly critical to a team’s success. Azure DevOps captures over 15 years of investment and learnings in providing tools to support software development teams. In the last month, over 80,000 internal Microsoft users and thousands of our customers, in teams both small and large, used these services to ship products to you. The services we are announcing today span the breadth of the development lifecycle to help developers ship software faster and with higher quality. They represent the most [...]

Artificial intelligence is going to completely change your life

Just as electricity transformed the way industries functioned in the past century, artificial intelligence — the science of programming cognitive abilities into machines — has the power to substantially change society in the next 100 years. AI is being harnessed to enable such things as home robots, robo-taxis and mental health chatbots to make you feel better. A startup is developing robots with AI that brings them closer to human level intelligence. Already, AI has been embedding itself in daily life — such as powering the brains of digital assistants Siri and Alexa. It lets consumers shop and search [...]

Microsoft and Kymeta bring futuristic connected vehicles onto home turf

REDMOND, Wash. – The vehicles sitting in the parking lot here at the headquarters of Kymeta Corp., a flat-panel antenna startup backed by Microsoft co-founder Bill Gates, are a car fan’s dream. But Microsoft’s Scott Montgomery says you have to look under the hood. And under the roof. “The vehicles themselves are literally just four wheels and an engine to get the platform where we need it to go,” Montgomery, who’s a senior industry solution manager at Redmond-based Microsoft, told GeekWire. “What I [...]

Real-Time Analysis of Popular Uber Locations Using Apache APIs

Let's take an in-depth look at a real-time analysis of popular Uber locations using Apache APIs. According to Gartner, smart cities will be using about 1.39 billion connected cars, IoT sensors, and devices by 2020. The analysis of location and behavior patterns within cities will allow optimization of traffic, better planning decisions, and smarter advertising. For example, the analysis of GPS car data can allow cities to optimize traffic flows based on real-time traffic information. Telecom companies are using mobile phone location data to provide insights by identifying and predicting the location activity trends and patterns of a population in [...]

Automated Text Classification Using Machine Learning

Digitization has changed the way we process and analyze information. There is an exponential increase in online availability of information. Web pages, emails, science journals, e-books, learning content, news and social media are all full of textual data. The idea is to create, analyze and report information fast. This is where automated text classification steps in. Text classification is the assignment of text to predefined categories, and using machine learning to automate the task makes the whole process fast and efficient. Artificial intelligence and machine learning are arguably the most beneficial technologies to have gained momentum [...]
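To make the idea of assigning text to categories concrete, here is a minimal sketch of a bag-of-words naive Bayes classifier in plain Python. It is only an illustration, not the approach of the article above: the function names, the tiny training set and the labels "spam" and "ham" are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set: (text, category) pairs.
EXAMPLES = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status report", "ham"),
]


def train_naive_bayes(examples):
    """Count how often each word appears under each category."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the category with the highest log-probability for the text."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        # Start from the prior probability of the category.
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train_naive_bayes(EXAMPLES)
print(classify("buy cheap money", word_counts, label_counts))
print(classify("monday project meeting", word_counts, label_counts))
```

A real system would train on far more data and would normally use an established library; the sketch only shows the mechanics of learning word statistics per category and picking the most likely one.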

Demystifying IBM Streams

The aim of this blog is to help you build a picture of how one could develop SPL programs in IBM Streams. In this blog, you will learn about: data streaming and the need for real-time decision systems; IBM Streams and its toolkits; and some SPL code. Let us take a deep dive into the ocean of IBM Streams. 50 m, Atlantis 1 Submarine: Streaming Analytics. Streaming analytics (also referred to as stream processing) is the method of processing large volumes of data (streams) in flight. These data streams are uninterrupted flows of very long sequences of data. The power of [...]

Amazon Kinesis Data Streams Adds Enhanced Fan-Out and HTTP/2 for Faster Streaming

A few weeks ago, we launched two significant performance improving features for Amazon Kinesis Data Streams (KDS): enhanced fan-out and an HTTP/2 data retrieval API. Enhanced fan-out allows developers to scale up the number of stream consumers (applications reading data from a stream in real-time) by offering each stream consumer its own read throughput. Meanwhile, the HTTP/2 data retrieval API allows data to be delivered from producers to consumers in 70 milliseconds or better (a 65% improvement) in typical scenarios. These new features enable developers to build faster, more reactive, highly parallel, and latency-sensitive applications on top of Kinesis Data [...]
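The latency figures quoted above can be cross-checked with a little arithmetic. The sketch below assumes that "a 65% improvement" means a 65% reduction in delivery latency; under that assumption, 70 ms implies a previous figure of roughly 200 ms. The function name is hypothetical and nothing here calls an AWS API.

```python
def baseline_latency_ms(new_latency_ms: float, reduction: float) -> float:
    """Latency before an improvement, given the new latency and the
    fractional reduction (0.65 for a 65% reduction)."""
    return new_latency_ms / (1.0 - reduction)


# 70 ms after a 65% reduction implies about 200 ms before it.
print(round(baseline_latency_ms(70, 0.65)))
```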

Azure Event Hubs: The good, the bad and the ugly

I can already hear the question in people’s minds: “Why??? Just go with Kafka.”… And you would be right. It has been around for years and has proven itself as the “big boy on the block”. But let’s take a look at the newcomer. It is being sold as a carefree Kafka competitor, which even surpasses Kafka (according to Microsoft). So let’s have an objective look at “the good, the bad and the ugly” of this solution and see whether what they are selling is really as good as they claim. What is Azure Event Hubs? Well if [...]

Travel Time Optimization With Machine Learning And Genetic Algorithm

What is the relationship between machine learning and optimization? On the one hand, mathematical optimization is used in machine learning during model training, when we are trying to minimize the cost of errors between our model and our data points. On the other hand, what happens when machine learning is used to solve optimization problems? Consider this: a UPS driver with 25 packages has roughly 15 trillion trillion possible routes to choose from (25! ≈ 1.55 × 10^25 orderings of the stops). And if each driver drives just one more mile each day than necessary, the company would be losing $30 million a year. While UPS would have all the data [...]
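The size of the route-planning problem follows from simple counting. Assuming each of the 25 packages is a separate stop and the driver may visit them in any order, the number of possible routes is 25 factorial, about 1.55 × 10^25 (around 15 trillion trillion). The short sketch below computes it; the function name is an assumption for illustration.

```python
import math


def route_count(stops: int) -> int:
    """Number of distinct visiting orders for the given number of stops."""
    return math.factorial(stops)


routes = route_count(25)
print(f"{routes:,}")    # 15,511,210,043,330,985,984,000,000
print(f"{routes:.2e}")  # 1.55e+25
```

Checking every ordering is out of reach at this scale, which is why heuristic and learning-based optimizers are used instead of brute-force enumeration.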

Google releases open source reinforcement learning framework for training AI models

Reinforcement learning, an artificial intelligence (AI) technique that uses rewards (or punishments) to drive agents in the direction of specific goals, trained the systems that defeated Go world champions and mastered Valve’s Dota 2. And it’s a core part of Google subsidiary DeepMind’s deep Q-network (DQN), which can distribute learning across multiple workers in the pursuit of, for example, achieving “superhuman” performance in Atari 2600 games. The trouble is, reinforcement learning frameworks take time to master a goal, tend to be inflexible, and aren’t always stable. That’s why Google is proposing an alternative: an open source reinforcement framework based [...]
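As a rough illustration of how rewards steer an agent toward a goal, here is a minimal sketch of tabular Q-learning on a toy five-cell corridor. It is not the framework described in the article, nor DQN: the environment, the function name and all hyperparameters are assumptions chosen for the example.

```python
import random


def train_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning: the agent starts in cell 0 and receives a
    reward of +1 only when it reaches the rightmost cell."""
    rng = random.Random(seed)
    # q[state][action]; action 0 = step left, action 1 = step right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate toward reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_corridor()
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

After training, the learned values favor stepping right in every non-goal cell, because that is the only direction that ever leads to the reward.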

Pioneering digital transformation in the legal and justice system

Abu Dhabi Global Market (ADGM) Courts embarked on an end-to-end transformation to deliver fully digital courts for the first time in the judicial and legal system. The authority used Azure, Dynamics 365 and Office 365 to build an innovative new legal platform to fully digitize dispute resolution and other legal proceedings. ADGM Courts aims to make the legal system simpler, more efficient and more accessible for clients and legal professionals. Established in 2015, Abu Dhabi Global Market is Abu Dhabi’s international financial center. It has three independent authorities, one of them being ADGM Courts. ADGM Courts is an extremely [...]

Cloudera’s a data warehouse player now

The Impala-based Cloudera Analytic Database is now Cloudera Data Warehouse. And on the PaaS cloud side, it's Altus Data Warehouse. No more euphemisms. Cloudera's in the DW race. Almost seven years ago, in a hotel meeting room in Manhattan, Mike Olson, then Cloudera's CEO, briefed me on the still confidential Cloudera project called Impala. I think Olson knew he was preaching to the converted as he told me how inefficient and insufficient MapReduce-based computing was for the Enterprise. The answer, he said, was Impala, a Hive-compatible database that used Hadoop for storage but completely bypassed MapReduce for compute and processing. A data warehouse [...]

Hortonworks Delivers Improved Operational Insights to Simplify Streaming Architectures

SANTA CLARA, Calif. – Aug. 23, 2018 – Hortonworks, Inc. (NASDAQ: HDP), a leading provider of global data management solutions, today announced it is delivering innovations that enable customers to get operational and streaming insights into data generated at the edge by enterprises. Performance improvements accelerate time to value, enabling businesses to capitalize on real-time market changes and customer sentiments. In addition, operational enhancements allow for clearer insights about data streams, making operations, DevOps and developers more productive. According to Forrester, “Connected solutions enable businesses to optimize processes, enhance offerings and transform their own business models. They generate streams of valuable [...]