MLbase: A Distributed Machine Learning System
Machine learning (ML) and statistical techniques are crucial for transforming Big Data into actionable knowledge. However, the complexity of existing ML algorithms is often overwhelming: many end-users do not understand the trade-offs and challenges of parameterizing and choosing among different learning techniques. Furthermore, existing scalable systems that support ML are typically not accessible to ML developers who lack a strong background in distributed systems and low-level primitives. In this talk, I will provide an overview of MLbase, a system designed to make it easier to (1) use machine learning and (2) implement distributed ML algorithms. I will explain MLbase's declarative approach to ML as well as the high-level operators that will enable ML developers to implement a wide range of ML methods scalably without deep systems knowledge.