Pythian Blog: Technical Track

An explanation of machine learning algorithms

Machine learning algorithms help identify hidden insights in data without being explicitly programmed where to look. They learn from previous data to produce reliable, repeatable decisions. The iterative aspect of machine learning is important: as models are exposed to new data they adapt, so their behaviour can shift in ways you did not plan for. Keeping models consistent across runs helps produce accurate and consistent results. Machine learning has changed how testing and analytics are done, because algorithms can now apply complicated mathematical calculations to large data sets quickly and repeatedly, producing valuable business insights. In this post, we will go over the different types of algorithms that can be used in machine learning.

Regression

Regression models (both linear and non-linear) are used for predicting a real-valued quantity, such as a salary. If your independent variable is time, then you are forecasting future values. Otherwise, your model is predicting present but unknown values. Regression techniques range from linear regression to support vector regression (SVR) and random forest regression.
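
As a minimal sketch of the simplest case, the snippet below fits a straight line by ordinary least squares using only the Python standard library. The data (years of experience against salary in thousands) and the function name fit_line are made up for illustration; they are assumptions, not figures from a real dataset.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums (the shared 1/n factor cancels out).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data: years of experience vs. salary in thousands.
years = [1, 2, 3, 4, 5]
salary = [40, 45, 50, 55, 60]

slope, intercept = fit_line(years, salary)
predicted = slope * 6 + intercept  # predicted salary for 6 years of experience
print(slope, intercept, predicted)
```

The output is a continuous number rather than a category, which is what makes this a regression problem.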

Classification

Unlike regression, where you predict a continuous number, classification is used to predict a category. There is a wide variety of classification applications, from medicine to marketing. Classification models include linear models such as logistic regression and SVM, and nonlinear ones such as K-NN, kernel SVM and random forests.
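
To make the idea concrete, here is a rough sketch of K-NN, one of the nonlinear models mentioned above, written in plain Python. The training points, the labels "low" and "high", and the function name knn_predict are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labelled data with two well-separated groups.
train = [
    ((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((0.9, 1.1), "low"),
    ((4.0, 4.2), "high"), ((4.1, 3.9), "high"), ((3.8, 4.0), "high"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.0, 4.0)))
```

The prediction is a label, not a number, and it is learned from examples whose labels are already known.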

Clustering

Clustering is similar to classification, but the basis is different: there are no predefined categories to learn from. In clustering, you don’t know what you are looking for and you are trying to identify segments or clusters in your data. When you run clustering algorithms on your dataset, unexpected things can turn up, such as structures, clusters and groupings you might never have thought of otherwise.
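
A small sketch of k-means, one common clustering algorithm (the post does not name a specific one, so this choice is an assumption), on one-dimensional data. The values, the starting centroids and the function name kmeans_1d are hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Very small k-means for 1-D data with caller-supplied starting centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabelled data, e.g. monthly spend per customer.
values = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(values, centroids=[0, 5])
print(centroids, clusters)
```

No labels are supplied anywhere; the two groups emerge from the data alone.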

Association rule learning

People who bought X also bought Y. That is what association rule learning will help you figure out.
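
As an illustration, the sketch below computes two standard measures for a rule "X implies Y": support (how often X and Y appear together) and confidence (how often Y appears when X does). The baskets and the helper names support and confidence are invented for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"butter", "jam"},
]

# How many baskets contain each item, and each unordered pair of items.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    frozenset(pair)
    for basket in baskets
    for pair in combinations(sorted(basket), 2)
)

def support(x, y):
    """Fraction of all baskets that contain both x and y."""
    return pair_counts[frozenset((x, y))] / len(baskets)

def confidence(x, y):
    """Fraction of baskets containing x that also contain y."""
    return pair_counts[frozenset((x, y))] / item_counts[x]

print(support("bread", "butter"), confidence("bread", "butter"))
print(confidence("jam", "butter"))
```

Rules with high support and high confidence are the "people who bought X also bought Y" candidates.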

Reinforcement learning

Reinforcement learning is a branch of machine learning in which a model learns by interacting with its environment. It is sometimes loosely called online learning, because the model keeps updating as new observations arrive. It is used to solve interactive problems where the data observed up to time t is considered in order to decide which action to take at time t + 1. It is also used in artificial intelligence to train machines to perform tasks such as walking. Desired outcomes give the AI a reward, undesired ones a punishment. Machines learn through trial and error.
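
The trial-and-error loop can be sketched with an epsilon-greedy multi-armed bandit, one of the simplest reinforcement learning setups. This is only an illustration under stated assumptions: the reward probabilities, the value of epsilon and the function name run_bandit are all hypothetical.

```python
import random

def run_bandit(reward_probs, rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent choosing among actions with unknown reward rates."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)       # times each action was taken
    estimates = [0.0] * len(reward_probs)  # running mean reward per action
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: take the action that looks best given data so far.
            action = max(range(len(reward_probs)), key=lambda a: estimates[a])
        # The environment answers with a reward (1) or nothing (0).
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return counts, estimates

counts, estimates = run_bandit([0.1, 0.5, 0.9])
print(counts, estimates)
```

At each step the agent uses only what it has observed so far to choose its next action, and over many rounds it should settle on the action with the highest reward rate.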

Natural language processing

Natural language processing (NLP) is the application of machine learning models to text and language. Its focus is teaching machines to understand what is said in the spoken and written word. Whenever you dictate something into your iPhone or Android device that is then converted to text, that’s an NLP algorithm in action. You can also use NLP on a text review to predict whether the review is a good one or a bad one, on an article to predict which category it belongs to, or on a book to predict its genre. And it can go further: you can use NLP to build a machine translator or a speech recognition system, and in that last example, you use classification algorithms to classify language.

Speaking of classification algorithms, most NLP algorithms are classification models, and they include logistic regression, Naive Bayes, CART (a model based on decision trees), maximum entropy (closely related to logistic regression) and Hidden Markov models (models based on Markov processes). A very well-known model in NLP is the Bag-of-Words model. It is used to preprocess the texts to be classified: each text is turned into a vector of word counts before the classification algorithm is fitted on those observations.
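
A minimal sketch of the Bag-of-Words preprocessing step, using only the standard library. The two reviews and the function names tokenize and bag_of_words are made up for illustration; real pipelines usually add steps such as stop-word removal and stemming.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(texts):
    """Return (vocabulary, vectors): one word-count vector per text."""
    tokenized = [tokenize(t) for t in texts]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

# Hypothetical reviews.
reviews = ["Great food, great service", "Bad service"]
vocabulary, vectors = bag_of_words(reviews)
print(vocabulary)
print(vectors)
```

Each text becomes a fixed-length row of numbers, which is the form a classifier such as logistic regression or Naive Bayes expects.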

Deep learning

Deep learning is one of the most exciting and powerful branches of machine learning. Deep learning models can be used for a variety of complex tasks:
  • Artificial neural networks for regression and classification
  • Convolutional neural networks for computer vision
  • Recurrent neural networks for time-series analysis
  • Self-organizing maps for feature extraction
  • Deep Boltzmann machines for recommendation systems
  • Auto-encoders for recommendation systems
Some common everyday applications of deep learning are:
  • Self-driving cars
  • Voice search and voice-activated assistants (Amazon Echo and Google Now)
  • Automatic text generation
  • Automatic image captioning
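
To give a feel for what an artificial neural network computes, here is a tiny forward pass with one hidden layer, written in plain Python. The weights are picked by hand rather than learned, purely as an assumption for illustration, so that the network computes XOR, a function a single linear layer cannot represent. The names dense, relu and forward are hypothetical.

```python
def relu(x):
    """Rectified linear unit: passes positive values, clips negatives to zero."""
    return max(0.0, x)

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: weighted sum plus bias, then activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

# Hand-picked (not trained) weights: two hidden units, one output unit.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]

def forward(x1, x2):
    """Run the inputs through the hidden layer and then the output layer."""
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, activation=relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, forward(a, b))
```

In practice the weights are learned from data, and deep networks stack many such layers.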

Conclusion

I hope this gives you a good overview of the different machine learning types and algorithms. In the next couple of blog posts, I will go into detail on each of the sections and start exploring some algorithms and their use cases. If you need help with your data analytics, data science or machine learning requirements, please feel free to get in touch.  
