Automating the Machine Learning Workflow - AutoML

- June 26, 2017

Motivation:

Using machine learning should not require expert knowledge.
Most machine learning tasks follow the same basic flow.
Finding the best-fit hyperparameters is difficult.
Hand-crafting features is hard.
It is fun.

Automatic Machine Learning in progress:

My motivation for writing a blog post on this topic was Google's new project, AutoML.
Google's AutoML project focuses on deep learning, a technique that involves passing data through layers of neural networks. Creating these layers is complicated, so Google’s idea was to create AI that could do it for them.
There are many other open source projects, like AutoML and auto-sklearn, working towards a similar goal.

Goal:

The goal is to design the perfect machine learning “black box”, capable of performing all model selection and hyperparameter tuning without any human intervention.
AutoML draws on many disciplines of machine learning, prominently including:

Bayesian optimization - a sequential design strategy for global optimization of black-box functions.
Regression models for structured data and big data.
Meta learning - a field of machine learning in which learning algorithms are applied to metadata about machine learning experiments, in order to improve the efficiency of existing learning algorithms.
Transfer learning - storing knowledge gained while solving one problem and applying it to a different but related problem.
Combinatorial optimization.

The basic pipeline of every AutoML framework:

Data Preprocessing

Converting the data to tabular form.
Splitting the data into train, validation and test sets.
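
As a rough illustration of the splitting step, here is a minimal sketch in plain Python. The function name `train_val_test_split` and the 60/20/20 default proportions are my own assumptions for the example, not something a particular AutoML framework prescribes.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows and split them into train, validation and test lists."""
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

rows = list(range(100))
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))  # 60 20 20
```

The validation set is what the tuning steps later in the pipeline score against; the test set should stay untouched until the final evaluation.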

Feature Engineering

Label or one-hot encoders for categorical variables.
TF-IDF or bag-of-words for text variables.
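
To make the two encodings concrete, here is a small sketch in plain Python. The helper names `one_hot` and `tf_idf` are hypothetical. The TF-IDF weighting is the textbook form (term frequency times log(N / document frequency)); libraries often use smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter

def one_hot(values):
    """Map each categorical value to a 0/1 vector, one slot per category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    vectors = []
    for v in values:
        vec = [0] * len(categories)
        vec[index[v]] = 1
        vectors.append(vec)
    return categories, vectors

def tf_idf(docs):
    """Return the vocabulary and one TF-IDF vector per whitespace-tokenized doc."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([
            (counts[t] / len(doc)) * math.log(n_docs / df[t]) if doc else 0.0
            for t in vocab
        ])
    return vocab, vectors

categories, encoded = one_hot(["red", "green", "red"])
print(categories, encoded)  # ['green', 'red'] [[0, 1], [1, 0], [0, 1]]

vocab, weights = tf_idf(["the cat sat", "the dog sat", "the cat ran"])
print(vocab)
print(weights[1])
```

A word that occurs in every document (here "the") gets weight zero, which is the point of the IDF term: it downweights words that do not help tell documents apart.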

Feature Stacking

Combining different feature sets into a single feature matrix.
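
One simple way to read "stacking" here is a horizontal concatenation: each row keeps all of its numeric, encoded and text-derived columns side by side. The sketch below assumes that reading; `stack_features` is a hypothetical helper name.

```python
def stack_features(*blocks):
    """Concatenate feature blocks row by row (a horizontal stack)."""
    n_rows = len(blocks[0])
    if any(len(b) != n_rows for b in blocks):
        raise ValueError("all feature blocks need the same number of rows")
    return [[x for block in blocks for x in block[i]] for i in range(n_rows)]

numeric = [[0.5], [1.5]]          # e.g. a scaled numeric column
categorical = [[1, 0], [0, 1]]    # e.g. a one-hot encoded column
stacked = stack_features(numeric, categorical)
print(stacked)  # [[0.5, 1, 0], [1.5, 0, 1]]
```

With real data the blocks are usually dense arrays or sparse matrices, and functions such as `numpy.hstack` or `scipy.sparse.hstack` do the same job.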

Decomposition

For high-dimensional data, PCA is used.
For text data, SVD is applied after converting the text to a sparse matrix.
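
To show what PCA is doing under the hood, here is a sketch that estimates only the first principal component using power iteration on the covariance matrix. This is a teaching simplification that I am assuming for the example; real implementations (for instance `PCA` and `TruncatedSVD` in scikit-learn) rely on SVD routines and return several components.

```python
import math

def first_principal_component(rows, n_iter=100):
    """Estimate the top principal direction of `rows` by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d  # arbitrary non-zero starting vector
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:  # constant data, or an unlucky starting vector
            break
        vec = [x / norm for x in nxt]
    return means, vec

def project(rows, means, direction):
    """Reduce each row to a single coordinate along `direction`."""
    return [sum((r[j] - means[j]) * direction[j] for j in range(len(means)))
            for r in rows]

# Points lying on the line y = 2x: one direction explains all the variance.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, direction = first_principal_component(rows)
reduced = project(rows, means, direction)
print(direction)
print(reduced)
```

For this toy data the direction comes out proportional to (1, 2), so two columns are compressed into one without losing information.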

Feature Selection

Greedy forward selection
Greedy backward elimination
Using models like LASSO or Random Forest for implicit selection.
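
Greedy forward selection can be sketched as a short loop: start with no features, and at each step add the one feature that improves the score the most, stopping when nothing helps. In the sketch below, `score` is a caller-supplied function; in practice it would be something like a cross-validated model score. The `GAIN` table and `toy_score` are made up purely so the example runs on its own.

```python
def greedy_forward_selection(features, score, max_features=None):
    """Add one feature at a time, keeping the addition that most improves `score`."""
    selected, best_score = [], float("-inf")
    remaining = list(features)
    limit = max_features if max_features is not None else len(remaining)
    while remaining and len(selected) < limit:
        candidate_scores = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidate_scores, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score

# Hypothetical per-feature usefulness, with a small penalty per feature used.
GAIN = {"age": 0.30, "income": 0.25, "zip_code": 0.02, "row_id": 0.0}

def toy_score(subset):
    return sum(GAIN[f] for f in subset) - 0.05 * len(subset)

selected, best = greedy_forward_selection(list(GAIN), toy_score)
print(selected)  # ['age', 'income']
```

Greedy backward elimination is the mirror image: start with all features and repeatedly drop the one whose removal hurts the score least.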

Model Selection and Hyperparameter Tuning

Grid Search
Random Search
Bayesian Search
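
The difference between grid search and random search is easy to see in code. The sketch below uses a made-up `toy_objective` that stands in for a validation score; the parameter names `max_depth` and `learning_rate` are only illustrative. Bayesian search is not shown: it replaces the blind sampling with a surrogate model that proposes the next trial based on earlier results.

```python
import itertools
import random

def grid_search(space, objective):
    """Evaluate every combination in `space` and return the best one."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(space[n] for n in names)):
        params = dict(zip(names, values))
        s = objective(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score

def random_search(space, objective, n_trials=20, seed=0):
    """Evaluate `n_trials` randomly sampled combinations from `space`."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        s = objective(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score

def toy_objective(params):
    # Hypothetical validation score, highest at max_depth=5, learning_rate=0.1.
    return -(params["max_depth"] - 5) ** 2 - 100 * (params["learning_rate"] - 0.1) ** 2

space = {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 1.0]}
grid_best, grid_score = grid_search(space, toy_objective)
rand_best, rand_score = random_search(space, toy_objective, n_trials=5)
print(grid_best, grid_score)
print(rand_best, rand_score)
```

Grid search is exhaustive, so its cost grows multiplicatively with every extra hyperparameter. Random search trades that guarantee for a fixed budget of trials, which is often the better deal when only a few hyperparameters really matter.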

Evaluation of model
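
As an illustration of this last step, here is a sketch that scores binary predictions on held-out data with accuracy, precision and recall. The choice of metrics and the helper name `evaluate_binary` are my assumptions; the right metric depends on the problem.

```python
def evaluate_binary(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = evaluate_binary(y_true, y_pred)
print(metrics)
```

These numbers should be computed on the test set that was held back during preprocessing, never on the data used for tuning.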
