~erdo.dev


Machine Learning Key Terminology

2020-05-21

Machine learning is a fascinating field, and it is becoming a powerful and useful technology in today’s world. Even if you do not work on machine learning directly, you are likely to touch some part of it in almost any project. Because the field is so broad, it is hard for people without a background in ML to learn all of the keywords and the theory behind them. That is why I am writing this article: to give those readers some insight into the basic concepts of machine learning. I covered the general picture of AI, Machine Learning, and Deep Learning in my previous article (Introduction to AI, ML, and DL), so this one deals only with Machine Learning. Let’s begin!

Key Terminology

Supervised Learning: Supervised learning is where you have features and corresponding labels, and you use an algorithm to learn the parameters of a function that maps the features to the labels.

Unsupervised Learning: Unsupervised learning is where you have features but no corresponding labels, so the algorithm has to find structure in the data on its own.

Semi-Supervised Learning: Semi-supervised learning is where some of your examples have labels and others do not. Generally, these datasets contain more unlabeled examples than labeled ones.

Regression: A regression model predicts continuous, numerical values. The numerical value could be the price of a house, the outdoor temperature, etc. Regression models fall under supervised learning. Some regression algorithms are Linear and Polynomial Regression, Decision Trees, and Random Forest.
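To make the idea of predicting a continuous value concrete, here is a minimal sketch of linear regression with a single feature, fitted by ordinary least squares. The house sizes and prices are made-up toy numbers, and `fit_line` is a name chosen only for this illustration.

```python
# Toy data (invented for illustration): house size in m^2 -> price in thousands.
sizes = [50.0, 60.0, 80.0, 100.0]
prices = [150.0, 180.0, 240.0, 300.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = w * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


w, b = fit_line(sizes, prices)
# A continuous prediction for a size that was not in the training data.
predicted_price = w * 70.0 + b
```

The output is a number on a continuous scale rather than a category, which is what separates regression from classification.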

Classification: A classification model predicts discrete, categorical values. The categorical value could be dog or cat, disease or no disease, etc. Classification models fall under supervised learning. Some classification algorithms are KNN (K-Nearest Neighbours), Decision Trees, SVM (Support Vector Machines), and Logistic Regression.
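As a rough illustration of classification, the sketch below implements a bare-bones KNN classifier: it looks up the k closest labeled examples and returns the most common label among them. The animal measurements are invented toy data, and `knn_predict` is a hypothetical helper name, not part of any library.

```python
import math
from collections import Counter

# Toy labeled examples (invented): (weight_kg, height_cm) -> animal.
training = [
    ((4.0, 25.0), "cat"),
    ((5.0, 27.0), "cat"),
    ((3.5, 23.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((25.0, 60.0), "dog"),
    ((30.0, 65.0), "dog"),
]


def knn_predict(features, examples, k=3):
    """Return the majority label among the k nearest training examples."""
    # Sort the labeled examples by Euclidean distance to the query point.
    by_distance = sorted(examples, key=lambda ex: math.dist(features, ex[0]))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


small_animal = knn_predict((4.5, 26.0), training)
large_animal = knn_predict((22.0, 58.0), training)
```

Here the prediction is one of a fixed set of categories. In practice the features would usually be scaled first so that no single feature dominates the distance.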

Clustering: A clustering model discovers the inherent groupings in continuous data according to similar features. An example could be the grouping of customers by purchasing behavior. Clustering models fall under unsupervised learning. Some clustering algorithms are K-Means and hierarchical clustering. (PCA is often mentioned alongside them, but it is a dimensionality-reduction technique rather than a clustering algorithm.)
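The customer-grouping idea can be sketched with a very small K-Means on one-dimensional data. The spend values and the starting centroids are assumptions picked for this illustration; real implementations usually choose the starting centroids randomly and work on many features at once.

```python
def k_means_1d(values, centroids, iterations=10):
    """Plain k-means on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(v - centroids[i])
            )
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Toy data (invented): monthly spend of six customers, with no labels attached.
spend = [10.0, 12.0, 11.0, 95.0, 100.0, 105.0]
centroids, clusters = k_means_1d(spend, centroids=[0.0, 50.0])
```

No labels are used anywhere: the two groups (low spenders and high spenders) come out of the data alone, which is what makes this unsupervised.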

Association: An association model discovers rules that describe large portions of categorical data. An example could be: people who buy suits also tend to buy ties. Association models fall under unsupervised learning. Some association algorithms are Apriori and FP-Growth.
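The "suits and ties" rule can be scored with two standard measures, support and confidence. The sketch below only computes those two numbers for one hand-picked rule on invented baskets. It is not an implementation of FP-Growth or any other rule-mining algorithm, which search for such rules efficiently.

```python
# Toy transactions (invented): each set is one customer's basket.
baskets = [
    {"suit", "tie"},
    {"suit", "tie", "shoes"},
    {"suit", "shirt"},
    {"shirt", "shoes"},
    {"suit", "tie", "shirt"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for antecedent -> consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# How often suits and ties appear together, out of all baskets.
rule_support = support({"suit", "tie"}, baskets)
# Among baskets that contain a suit, how often a tie is there too.
rule_confidence = confidence({"suit"}, {"tie"}, baskets)
```

With these baskets, three of the five contain both items and three of the four suit buyers also bought a tie, so the rule has a support of about 0.6 and a confidence of about 0.75.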

Models: A model defines the mapping relationship between features and labels. The model uses the parameters to make a prediction.

Labels: A label is the value or category that we are predicting (usually denoted as 'y'). The label could be the object shown in an image, a future temperature value, or the next word in a text or song lyrics.

Features: A feature is an input variable (usually denoted as 'x') that we feed to the machine learning model. The features could be the pixels of an image, past temperature and humidity values, or the words in a text or song lyrics.

Examples: An example is a set of features together with its corresponding label. An example could be the pixels of an image (features) and the object (label) inside this image.

Parameters: Parameters are the weights and biases that the model learns during training. The learning process tries to find the values that lead to good predictions.

Hyper-Parameters: Hyper-parameters are the settings that you choose and tune yourself, instead of the model learning them, to make the model more predictive. Some of them are the learning rate, the number of epochs, the batch size, and so on.
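The difference between parameters and hyper-parameters is easier to see in a training loop. The following is a minimal sketch of mini-batch gradient descent for a one-feature linear model: `w` and `b` are the parameters that get learned, while the learning rate, epoch count, and batch size are fixed up front. The data and the specific hyper-parameter values are assumptions chosen so that this toy problem converges.

```python
# Hyper-parameters: chosen before training, not learned from the data.
LEARNING_RATE = 0.05
EPOCHS = 500
BATCH_SIZE = 2

# Toy examples (invented), generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def train(xs, ys, learning_rate, epochs, batch_size):
    """Mini-batch gradient descent on mean squared error for y = w * x + b."""
    w, b = 0.0, 0.0  # parameters: updated during training
    for _ in range(epochs):  # one epoch = one full pass over the data
        for start in range(0, len(xs), batch_size):
            batch_x = xs[start:start + batch_size]
            batch_y = ys[start:start + batch_size]
            n = len(batch_x)
            errors = [(w * x + b) - y for x, y in zip(batch_x, batch_y)]
            # Gradients of the batch's mean squared error w.r.t. w and b.
            grad_w = 2 * sum(e * x for e, x in zip(errors, batch_x)) / n
            grad_b = 2 * sum(errors) / n
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


w, b = train(xs, ys, LEARNING_RATE, EPOCHS, BATCH_SIZE)
```

After training, `w` and `b` end up close to 2 and 1, the values used to generate the data. Changing the hyper-parameters changes how quickly (or whether) training gets there, but they are never updated by the loop itself.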

Accuracy: Accuracy is one of the model metrics that tell us how good our model is at making predictions. It is the fraction of predictions that the model got right.
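For a classification model, accuracy is usually computed as the number of correct predictions divided by the total number of predictions. A minimal sketch, with invented labels and a helper named `accuracy` for illustration:

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Toy data (invented): the model gets four of the five examples right.
true_labels = ["cat", "dog", "dog", "cat", "dog"]
predicted = ["cat", "dog", "cat", "cat", "dog"]
score = accuracy(predicted, true_labels)
```

Accuracy alone can be misleading when one class is much more common than the others, which is why it is only one of several model metrics.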

References