BLOG

What is the probability of an event? Classic formula and examples
Definition of probability Probability is a word that we use a lot in our everyday life. Winning the lottery and being struck by lightning are both very improbable. Instead, receiving a…

Introduction to probability theory
What is probability theory? Most phenomena in our lives are partially random. Think about weather forecasting, the stock market, and football game outcomes. These are all things we can’t exactly…

What is bagging in ensemble learning?
What is bagging? Bagging is a parallel ensemble learning technique that trains multiple weak models on different datasets and averages their predictions. The bagging algorithm Problem statement We have a…

Ensemble learning, boosting and bagging
What is ensemble learning? Ensemble learning is a machine-learning approach where we train numerous simple models and aggregate their predictions. These simple models are called weak learners. What is an…

The derivative of a function explained clearly
Suppose we have a graph representing the population of a village as a function of time. Let us take two time instants on the x-axis where the population is equal. Now…

Code random forest from scratch in Python
In this post, I’ll show you how to program a random forest from scratch in Python using ONLY MATH. Why is coding a random forest from scratch useful? When studying…

The complete guide to handling missing values
What are missing values in machine learning? Missing values in a dataset indicate the absence of observations. The danger of missing values Why are missing values a problem for our…

The complete guide to encoding categorical features
What are categorical features – recap In categorical features, measurements can take only a limited, fixed number of values, called “categories”. There are 2 types of categorical features: Why can’t…

What is feature engineering? Definition, techniques and importance
What is feature engineering? Feature engineering is selecting, extracting, and transforming features from raw data to create a new dataset useful for building predictive models. This new dataset is compatible…

What is a confusion matrix?
Types of classification outputs Positive and negative outputs In a classification problem, there are 2 types of categories, positive and negative. Positive categories are labels with a particular characteristic that…