Introduction to Custom Transformers. A Walk-through in Scikit-learn Python

Photo by Arseny Togulev on Unsplash

Data Transformers

Probing the basics of Confusion Matrix, ROC-AUC curve, and Cost Functions for Classification in Machine Learning.

Photo by Safar Safarov on Unsplash

Confusion Matrix

Terminologies of Confusion Matrix

Hands-on Clustering Algorithms: A Walkthrough in Python!

Photo by Kelly Sikkema on Unsplash


How does Clustering differ from Classification?

Probing deep into the fundamental concepts of Machine Learning

Photo by Arseny Togulev on Unsplash

Machine Learning

“Field of study that gives computers the ability to learn without being explicitly programmed”.

Probing deep into cost functions of Regression and its Optimization Techniques: A Walkthrough in Python

Photo by Alexander Mils on Unsplash

Cost Function

Loss function vs. Cost function

  • A function that is defined on a single data instance is called a loss function.
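
To make the distinction concrete, here is a minimal sketch in plain Python. It assumes squared error as the per-instance loss and takes the cost to be the mean of those losses over the whole dataset, which is the usual convention. The function names `squared_error_loss` and `mse_cost` are illustrative, not taken from the article.

```python
def squared_error_loss(y_true, y_pred):
    """Loss: defined on a single data instance (assumed here: squared error)."""
    return (y_true - y_pred) ** 2


def mse_cost(y_true_all, y_pred_all):
    """Cost: the per-instance losses averaged over the whole dataset."""
    if len(y_true_all) != len(y_pred_all):
        raise ValueError("targets and predictions must have the same length")
    if not y_true_all:
        raise ValueError("need at least one data instance")
    losses = [squared_error_loss(t, p) for t, p in zip(y_true_all, y_pred_all)]
    return sum(losses) / len(losses)


# Toy targets and predictions, made up for illustration.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]

per_instance = [squared_error_loss(t, p) for t, p in zip(y_true, y_pred)]
cost = mse_cost(y_true, y_pred)

print(per_instance)  # one loss value per data instance
print(cost)          # a single number summarizing the whole dataset
```

The loss yields one number per instance, while the cost collapses them into the single quantity that an optimizer would minimize.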

Boost your data processing performance with Apache PySpark!

Photo by Kristopher Roller on Unsplash

Apache Spark

Features of Spark

  • Spark is polyglot, which means you can use it from more than one programming language. Spark provides high-level APIs in Java, Python, R, SQL, and Scala. The Python API for Apache Spark is called PySpark.
  • Spark supports multiple data…

Skyrocket your model performance with Artificial Neural Networks. A Walkthrough in Tensorflow!

Photo by Moritz Kindler on Unsplash

Artificial Neural Network (ANN)

Biological neurons vs Artificial neurons

Structure of Biological neurons and their functions

Categorical Feature Encoding: A Walkthrough in Python!

Photo by Maxwell Nelson on Unsplash

Categorical Feature Encoding

  • Ordinal Features
  • Nominal Features

Ordinal Features:

  • Ordinal features are features that have an inherent ordering.
  • E.g., ratings such as Good and Bad.

Nominal Features:

  • Nominal features are features that, unlike ordinal features, have no inherent ordering.
  • E.g., names of persons, gender, or yes/no answers.
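
The difference between the two kinds of features shows up in how they are typically encoded. The sketch below uses plain Python to stay dependency-free: it maps an ordinal feature to integer ranks and one-hot encodes a nominal one. The helper names `ordinal_encode` and `one_hot_encode` and the sample values are illustrative assumptions, not the article's code. In practice, scikit-learn's `OrdinalEncoder` and `OneHotEncoder` cover the same ground.

```python
def ordinal_encode(values, ordered_categories):
    """Map each value to its rank in the given low-to-high ordering."""
    rank = {category: i for i, category in enumerate(ordered_categories)}
    return [rank[v] for v in values]


def one_hot_encode(values):
    """Give each distinct category its own 0/1 column (no ordering implied)."""
    categories = sorted(set(values))  # sorted only to make the column order stable
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Ordinal feature: the order Bad < Good carries meaning, so ranks are fine.
ratings = ["Good", "Bad", "Good"]
encoded_ratings = ordinal_encode(ratings, ordered_categories=["Bad", "Good"])

# Nominal feature: no order, so each category becomes a separate indicator column.
answers = ["yes", "no", "yes"]
answer_categories, encoded_answers = one_hot_encode(answers)

print(encoded_ratings)
print(answer_categories, encoded_answers)
```

Using integer ranks for a nominal feature would suggest an ordering that does not exist, which is why the one-hot form is the usual choice there.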

Need for categorical feature encoding

  • Categorical features must be encoded before…

Facing issues with Overfitting and Low accuracy? Feature Selection comes to the rescue

Photo by Hunter Harritt on Unsplash

Dimensionality Reduction

  • Dimensionality reduction is the process of reducing the set of available features in the dataset.
  • Applying a model directly to the entire set of features may lead to spurious predictions and generalization issues, which in turn make the model unreliable.
  • Dimensionality reduction is applied to prevent these issues.
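
As one very simple illustration of reducing the feature set, the sketch below drops features whose variance does not exceed a threshold, since a constant column carries no information for a model. This is only an assumed example of a filter-style method, not necessarily one the article covers. The function name `drop_low_variance` and the toy data are hypothetical.

```python
from statistics import pvariance


def drop_low_variance(rows, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold.

    Returns the indices of the kept columns and the reduced rows.
    """
    columns = list(zip(*rows))  # transpose: one tuple per feature
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return keep, reduced


# Toy dataset: the middle feature is constant and therefore uninformative.
rows = [
    [1.0, 0.0, 5.0],
    [2.0, 0.0, 5.0],
    [3.0, 0.0, 6.0],
]

kept_columns, reduced_rows = drop_low_variance(rows)
print(kept_columns)
print(reduced_rows)
```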

Need for dimensionality reduction

  • Overfitting occurs when the model memorizes the training data and fails to generalize. It can be caused by flexible models (such as decision trees) as well as by high-dimensional data.
  • An overfitted model cannot be applied to real-world problems due to the problem…

Techniques for handling the Missing data in Machine Learning: A Walkthrough in Python

Photo by Franki Chamaki on Unsplash

Why is handling missing data important?

  • First, missing data reduces the power of statistical methods.
  • Second, missing data can introduce bias into the model.
  • Third, many machine learning packages in Python do not accept missing data; the missing values must be treated first.
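
A minimal sketch of "treating" missing values before modeling, in plain Python. It assumes missing entries appear as `None` or `NaN` and fills them with the mean of the observed values. Mean imputation is just one simple option chosen for illustration, and the helper names `is_missing` and `mean_impute` are hypothetical.

```python
import math


def is_missing(x):
    """Treat None and float NaN as missing."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def mean_impute(column):
    """Replace missing entries with the mean of the observed entries."""
    observed = [x for x in column if not is_missing(x)]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if is_missing(x) else x for x in column]


# Toy column with two missing entries, made up for illustration.
ages = [20.0, None, 30.0, float("nan"), 40.0]

n_missing = sum(is_missing(x) for x in ages)
imputed_ages = mean_impute(ages)

print(n_missing)
print(imputed_ages)
```

Whether mean imputation is appropriate depends on why the data is missing, which is what the missing data mechanisms describe.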

Missing data mechanisms

  • Missing Completely At Random (MCAR): The values are MCAR if the missingness is completely unrelated to both observed…

Srivignesh Rajan

Aspiring Machine Learning Practitioner 👨🏻‍💻 💻
