Blogs or Expert Columns
Using Presto in our Big Data Platform on AWS
At Netflix, the Big Data Platform team is responsible for building a reliable data analytics platform shared across the whole company. In general, Netflix product decisions are very data driven. So we play a big role in helping different teams gain product and consumer insights from a multi-petabyte scale data warehouse (DW). Their use cases range from analyzing A/B test results to analyzing user streaming experience to training data models for our recommendation algorithms. We shared our overall architecture in a previous blog post. The underpinning of our big data platform is that we leverage AWS S3 for our [...]
Outlier Detection using Apache Spark Solution
Sometimes an outlier is defined with respect to a context. Whether a data point should be labeled as an outlier depends on the associated context. For a bank ATM, transactions that are considered normal between 6 AM and 10 PM may be considered anomalous between 10 PM and 6 AM. In this case, the context is the hour of the day. In this post, we will go through some contextual outlier detection techniques based on statistical modeling of the data. The Spark-based implementation is available in the open source projects on GitHub [...]
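To make the ATM example concrete, here is a minimal sketch of contextual outlier detection in plain Python. It is not the post's Spark implementation: the per-context z-score, the `context_of` day/night split, the threshold of 3.0, and the sample `history` data are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def context_of(hour):
    """Map an hour of day (0-23) to a coarse context label (assumed split)."""
    return "day" if 6 <= hour < 22 else "night"


def fit_context_stats(records):
    """Compute (mean, population std) of amounts per context.

    records: iterable of (hour, amount) pairs.
    """
    by_context = defaultdict(list)
    for hour, amount in records:
        by_context[context_of(hour)].append(amount)
    return {ctx: (mean(vals), pstdev(vals)) for ctx, vals in by_context.items()}


def is_outlier(hour, amount, stats, threshold=3.0):
    """Flag an amount whose z-score within its own context exceeds threshold."""
    mu, sigma = stats[context_of(hour)]
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical transaction history: (hour of day, withdrawal amount).
history = [
    (10, 200), (12, 220), (14, 180), (16, 210), (18, 190),  # daytime
    (23, 20), (1, 25), (2, 15), (3, 22), (4, 18),           # night
]
stats = fit_context_stats(history)
```

With this history, `is_outlier(13, 200, stats)` is false while `is_outlier(2, 200, stats)` is true: the same amount is ordinary in one context and anomalous in the other.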
Microsoft’s AI Roadmap
Digital transformation is in full effect, and giants of the tech industry are investing heavily in new technologies. Due to its nearly limitless potential, artificial intelligence is at the forefront of much of this research, and Microsoft has been making headlines with new technologies, major acquisitions, and innovative ideas. The tech giant has long been moving toward a cloud-based future, and investment in AI is helping solidify its path toward becoming the AI leader in a number of fields. Here are a few technologies Microsoft has invested in recently and the potential impact they’ll have on the company’s future [...]
Harnessing Machine Learning for Anomaly Detections in Web Server Logs
Detecting Anomalies in Web Server Logs with Microsoft Azure Cloud, for FREE and at one-tenth the effort! Every website has web server logs which record the intricate details of site visitors: their browsing behaviors, clicks, actions, etc. Web server logs soon become very large and bloated as they log all this information, one record per line. Within this maze of data lie deep hidden secrets about the website. Secrets like what site visitors are actually doing on the website, how well the server is responding to the requests from site visitors, what are the actions that site [...]
Web Server Logs
Web Server logs are server log files of a web server. A server log is a log file (or several files) automatically created and maintained by a server consisting of a list of activities it performed. A typical example is a web server log which maintains a history of page requests. The W3C maintains a standard format (the Common Log Format) for web server log files, but other proprietary formats exist. More recent entries are typically appended to the end of the file. Information about the request, including client IP address, request date/time, page requested, HTTP code, bytes served, [...]
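As an illustration of the fields listed above, the following sketch parses one line in Common Log Format using Python's standard `re` module. The function name `parse_clf_line`, the field names, and the sample line are assumptions for this example. Real logs often use extended formats that this pattern does not cover.

```python
import re

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" in the size field means no bytes were sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# Hypothetical sample record.
sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
entry = parse_clf_line(sample)
```

Here `entry` holds the client IP address, the request date/time, the request line, the HTTP status code, and the bytes served as separate keys.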
Marmaray: An Open Source Generic Data Ingestion and Dispersal Framework and Library for Apache Hadoop
Connecting users worldwide on our platform all day, every day requires an enormous amount of data management. When you consider the hundreds of operations and data science teams analyzing large sets of anonymous, aggregated data, using a variety of different tools to better understand and maintain the health of our dynamic marketplace, this challenge is even more daunting. Three years ago, Uber adopted the open source Apache Hadoop framework as its data platform, making it possible to manage petabytes of data across computer clusters. However, given our many teams, tools, and data sources, we needed a way to reliably [...]
How Search Engines Use Machine Learning: 9 Things We Know for Sure
When we first started hearing about machine learning in the early 2010s, it seemed scary at first. But once it was explained to us (and we realized how technology is already being used to provide us with solutions), we started to get down to the practical questions: How are search engines using machine learning? How will it affect SEO? Machine learning is essentially using algorithms to calculate trends, value, or other characteristics of specific things based on historical data. Google has even declared itself a machine learning-first company. If you want to learn more about the tactical side of this technology, Eric Enge has a great [...]
Step into A New Era Of Public Service With Smarter Infrastructure
This year at Smart City Expo World Congress, the industry-leading event for urbanization, Microsoft will bring city leaders and solution experts from around the world to demonstrate innovative technologies currently empowering the digital transformation of smart cities. This blog post is the third in a series about how a connected city—powered by our intelligent cloud and intelligent edge platform—will help you step into a new era of public service. Join Microsoft and its partners at SCEWC 2018. By 2025, the UN projects that 68% of the world population will live in cities or urban areas. To keep pace with urbanization, innovative cities use technology like the [...]
Facial Recognition Tech Is Ready for Its Post-Phone Future
One year ago, Craig Federighi opened his eyes, stared into the brand-new iPhone X, and showed the world how he could unlock it with his face. Or, at least, he tried. It took the Apple executive a few attempts and one back-up phone to get the screen to unlock without a fingerprint or a passcode. But then, like magic, he was in. This was Apple’s annual fall hardware show, where the company dangles its newest iPhones before the world and sets the tone for consumer products to come. Executives danced around the stage to show off the iPhone X's seemingly endless [...]
Anheuser-Busch InBev brews up game-changing business solutions with Microsoft Azure
Anheuser-Busch InBev, headquartered in Leuven, Belgium, isn’t just a beverage company, it’s a technology company. From its Beer Garage in Silicon Valley to its Global Analytics Center in Bengaluru, India, the company known as AB InBev is pushing the innovation envelope. The company is using technology to drive commercial and operational growth and increase sustainability by moving its IT operations to the cloud, and it is gaining more significant insights into business operations by breaking down data silos and building a global analytics platform. AB InBev chose Microsoft Azure as the best platform to support these game-changing advances. With [...]
Redefine data analytics with Modern Data Warehouse on Azure
Data is more critical than ever in all aspects of modern business. As more organizations embark on digital transformation and embrace a data-driven culture, a common IT challenge is building the definitive source of truth for trusted data, breaking infrastructure and functional silos and bringing all data types together. Thousands of customers are now using Azure SQL Data Warehouse (SQL DW) to take advantage of the fast, flexible, and secure analytics platform to gain deeper insights and make better decisions. Azure SQL DW has been engineered to deliver lightning-fast query performance in a secure, cost-effective cloud solution. With [...]
Beyond Deep Learning – 3rd Generation Neural Nets
Summary: If Deep Learning is powered by 2nd generation neural nets, what will the 3rd generation look like? What new capabilities does that imply and when will it get here? By far the fastest expanding frontier of data science is AI and specifically the rapid advances in Deep Learning. Advances in Deep Learning have been dependent on artificial neural nets and especially Convolutional Neural Nets (CNNs). In fact our use of the word “deep” in Deep Learning refers to the fact that CNNs have large numbers of hidden layers. Microsoft recently won the annual ImageNet competition with a CNN comprised of 152 layers. [...]
Cybersecurity: Time to Get Serious
We have been on the run from cyberattacks for more than a decade now, and in spite of the $90+ Billion we spent last year (according to Gartner) on cybersecurity technology, things have gotten worse, not better. Those same analysts predict we will be spending more than $113 Billion by 2025. Thanks to tons of investment capital and bright young graduates from great schools, we have lots of sparkling new technology. In fact, at last year’s RSA Conference, we broke all the records with 15 keynote presentations, more than 700 speakers across 500+ sessions and more than 550 companies [...]
NERSC, Intel, Cray Harness the Power of Deep Learning to Better Understand the Universe
A Big Data Center collaboration between computational scientists at Lawrence Berkeley National Laboratory’s (Berkeley Lab) National Energy Research Scientific Computing Center (NERSC) and engineers at Intel and Cray has yielded another first in the quest to apply deep learning to data-intensive science: CosmoFlow, the first large-scale science application to use the TensorFlow framework on a CPU-based high performance computing platform with synchronous training. It is also the first to process three-dimensional (3D) spatial data volumes at this scale, giving scientists an entirely new platform for gaining a deeper understanding of the universe. Cosmological "big data" problems go beyond the simple volume [...]
Putting the Power of Kafka into the Hands of Data Scientists
Over a year ago, my fellow data infrastructure engineers and I broke ground on a total rewrite of our event delivery infrastructure. Our mission was to build a robust, centralized data integration platform tailored to the needs of our Data Scientists. The platform would be fully self-service, so as to maximize the Data Scientists’ autonomy and give them complete control over their event data. Ultimately, we delivered a platform that is revolutionizing the way Data Scientists interact with Stitch Fix’s data. In two parts, this post peeks into [...]
Monte Carlo Tree Search – beginners guide
For quite a long time, a common opinion in the academic world was that a machine achieving human master-level performance in the game of Go was far from realistic. It was considered a ‘holy grail’ of AI, a milestone we were quite far from reaching within the upcoming decade. Deep Blue had its moment more than 20 years ago, and since then no Go engine had come close to human masters. The opinion about ‘numerical chaos’ in Go became so well established that it was referenced in movies, too. Surprisingly, in March 2016 an algorithm invented by Google DeepMind called AlphaGo defeated the Korean world champion in [...]
Addressing the increasing demand for data scientists and how to get involved
Data science has become a crucial practice for many businesses to increase efficiency and to enhance processes. It has had a profound impact on a variety of areas of society. In fact, data science has been so heavily in demand that it has been ranked the best job in the US for the past three years. However, getting into data science might be tough, initially, especially given the technical requirements of the job, but it certainly proves to be rewarding. For any potential future data scientists, Mashable recently published an article titled [...]
How a year in DevOps changed me as a Developer
About a year ago, we had a project come in which required automating infrastructure setup, configuration management and change management. I happened to be at the right place at the right time and was put in charge of the whole thing. The project was designed on the microservices style of architecture and our teams decided on using GitHub as the source code repository, JFrog for storing our build artifacts, a RHEL based Kubernetes cluster as the deployment platform and Jenkins to drive the whole thing through a continuous integration, testing and deployment pipeline. In this way, developers, QA, [...]
Robots in workplace ‘could create double the jobs they destroy’
The World Economic Forum report suggests new technologies have the capacity to both disrupt and create new ways of working. The rise of machines, robots and algorithms in the workplace stands to create almost double the number of jobs for the global economy by the middle of the next decade than it puts at risk of being replaced. According to the World Economic Forum (WEF), about 133m jobs globally could be created with the help of rapid technological advances in the workplace over the next decade, compared with 75m that could be displaced. The findings will go [...]
Machine Learning Is A Moneyball Moment For Companies
By now, you have heard of "Moneyball." Maybe you have even read the book by Michael Lewis (The Art of Winning an Unfair Game), seen the movie that starred Brad Pitt, or even done the business case, all of which detail the unorthodox, data-driven management approach of Major League Baseball’s Oakland A’s. Led by General Manager Billy Beane, the Oakland A’s pioneered the use of big data in baseball, identifying overlooked but important factors to evaluate players. A similar opportunity is available today for business leaders. Despite a great deal of lip service [...]
8 Top-funded Facial Recognition Startups
The eyes may be the window to the soul, but the face is the door through which we can enter the future. OK, we’re a bunch of underpaid MBAs, not starving poets, but that was our less prosaic way of introducing the topic of this article: facial recognition technology. Facial recognition technologies are becoming ubiquitous, from unlocking your smartphone to identifying your so-called friends on Facebook. Retailers use facial recognition to spot shoplifters, and doctors are starting to employ the technology to detect certain kinds of diseases. Heck, there’s even a facial recognition platform for cows! There’s no dearth of [...]
Introducing Azure DevOps
Today we are announcing Azure DevOps. Working with our customers and developers around the world, it’s clear DevOps has become increasingly critical to a team’s success. Azure DevOps captures over 15 years of investment and learnings in providing tools to support software development teams. In the last month, over 80,000 internal Microsoft users and thousands of our customers, in teams both small and large, used these services to ship products to you. The services we are announcing today span the breadth of the development lifecycle to help developers ship software faster and with higher quality. They represent the most [...]
Artificial intelligence is going to completely change your life
Just as electricity transformed the way industries functioned in the past century, artificial intelligence — the science of programming cognitive abilities into machines — has the power to substantially change society in the next 100 years. AI is being harnessed to enable such things as home robots, robo-taxis and mental health chatbots to make you feel better. A startup is developing robots with AI that brings them closer to human level intelligence. Already, AI has been embedding itself in daily life — such as powering the brains of digital assistants Siri and Alexa. It lets consumers shop and search [...]
Microsoft and Kymeta bring futuristic connected vehicles onto home turf
A prototype vehicle designed for disaster response is outfitted with Kymeta’s KyWay flat-panel satellite antenna. (Microsoft / Kymeta Photo) REDMOND, Wash. — The vehicles sitting in the parking lot here at the headquarters of Kymeta Corp., a flat-panel antenna startup backed by Microsoft co-founder Bill Gates, are a car fan’s dream. But Microsoft’s Scott Montgomery says you have to look under the hood. And under the roof. “The vehicles themselves are literally just four wheels and an engine to get the platform where we need it to go,” Montgomery, who’s a senior industry solution manager at Redmond-based Microsoft, told GeekWire. “What I [...]
