Accuracy: The Intuitive Metric
Understanding the most common evaluation metric, its formula, and its fatal flaws on imbalanced datasets.
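As a minimal sketch of that flaw (the labels below are made up for illustration, with scikit-learn assumed installed), a classifier that always predicts the majority class still scores 95% accuracy on a 95%-negative dataset:

```python
from sklearn.metrics import accuracy_score

# Hypothetical 95%-negative dataset.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # always predict the majority class

print(accuracy_score(y_true, y_pred))  # 0.95 -- misleadingly high
```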
Combining value-based and policy-based methods for stable and efficient reinforcement learning.
Neural network-based dimensionality reduction: Encoder-Decoder architecture and bottleneck representations.
Discovering clusters of arbitrary shapes and identifying outliers using density-based spatial clustering.
Understanding recursive partitioning, Entropy, Gini Impurity, and how to prevent overfitting in tree-based models.
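For a concrete feel for those impurity measures, here is a small NumPy sketch; the `gini` and `entropy` helpers are illustrative, not library functions:

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over class proportions p_k."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Shannon entropy: -sum(p_k * log2(p_k))."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

print(gini([0, 0, 1, 1]))     # 0.5 (maximally impure binary node)
print(entropy([0, 0, 1, 1]))  # 1.0
print(gini([0, 0, 0, 0]))     # 0.0 (pure node -- nothing left to split)
```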
Scaling Reinforcement Learning with Deep Learning using Experience Replay and Target Networks.
Combining L1 and L2 regularization to balance feature selection with model stability.

Mastering the harmonic mean of Precision and Recall to evaluate models on imbalanced datasets.
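A minimal scikit-learn sketch (with made-up labels) confirming that the F1 score is exactly the harmonic mean of precision and recall:

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

p = precision_score(y_true, y_pred)  # 2/3: TP=2, FP=1
r = recall_score(y_true, y_pred)     # 1/2: TP=2, FN=2
print(f1_score(y_true, y_pred))      # ~0.571
print(2 * p * r / (p + r))           # harmonic mean -- same value
```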
Mastering the techniques used to harmonize feature scales, ensuring faster convergence and better model accuracy.
Techniques for identifying and keeping only the most relevant features using filter, wrapper, and embedded methods.
Probabilistic clustering using Expectation-Maximization and mixtures of Gaussian distributions.
Exploring the power of Sequential Ensemble Learning, Gradient Descent, and popular frameworks like XGBoost and LightGBM.
Understanding Agglomerative clustering, Dendrograms, and linkage criteria.
Mastering robust model evaluation by rotating training and testing sets to maximize data utility.
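A minimal sketch of that rotation, using the built-in Iris dataset as a stand-in; each of the five folds takes one turn as the held-out test set:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv)
print(scores.mean(), scores.std())  # average score and its spread
```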
Grouping data into K clusters by minimizing within-cluster variance.
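As a quick sketch on two synthetic blobs (the data here is made up for illustration), `inertia_` is precisely the within-cluster variance being minimized:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)),
               rng.normal(5, 0.5, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)  # one centroid near (0, 0), one near (5, 5)
print(km.inertia_)          # total within-cluster sum of squared distances
```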
Understanding the proximity-based classification algorithm: distance metrics, choosing K, and the curse of dimensionality.
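A minimal sketch of choosing K by cross-validation on the built-in Iris dataset; very small K tends to overfit, very large K oversmooths:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score a few candidate K values and compare.
for k in (1, 5, 15):
    knn = KNeighborsClassifier(n_neighbors=k)
    print(k, cross_val_score(knn, X, y, cv=5).mean())
```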
Understanding L1 regularization, sparse models, and automated feature selection.
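A minimal sketch on synthetic data (the coefficients and noise level are made up) showing Lasso's automated feature selection: irrelevant coefficients are driven to exactly zero:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features actually matter.
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
print(lasso.coef_)  # last three coefficients are exactly 0.0
```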
The most exhaustive validation technique: training on N-1 samples and testing on a single observation.
Mastering the fundamentals of predicting continuous values using lines, slopes, and intercepts.
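A minimal sketch with synthetic data generated along a known line (slope 2, intercept 1, plus noise), recovering the line with scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2 * X.ravel() + 1 + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)  # slope ~2, intercept ~1
print(model.predict([[4.0]]))            # ~9
```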
How to use Scikit-Learn's built-in datasets, fetchers, and external loaders to prepare data for modeling.
Understanding cross-entropy loss and why it is the gold standard for evaluating probability-based classifiers.
Understanding binary classification, the Sigmoid function, and decision boundaries.
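A minimal sketch on made-up 1-D data whose class flips around x = 5; the `sigmoid` helper is written out to show what squashes scores into probabilities:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    """Maps any real number into a (0, 1) probability."""
    return 1 / (1 + np.exp(-z))

X = np.array([[1], [2], [3], [4], [6], [7], [8], [9]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[5.0]]))  # near 0.5 -- on the decision boundary
print(clf.predict([[9.0]]))        # confidently class 1
```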
How to use trained Scikit-Learn estimators to generate point predictions and probability estimates.
How to choose the right algorithm, split data correctly, and use Cross-Validation to ensure model reliability.
A deep dive into Min-Max scaling, MaxAbs scaling, and Unit Vector normalization for bounded data ranges.
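A minimal sketch of Min-Max scaling on a tiny made-up matrix; each column is rescaled to [0, 1] via (x - min) / (max - min):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

scaler = MinMaxScaler()
print(scaler.fit_transform(X))
# [[0.  0. ]
#  [0.5 0.5]
#  [1.  1. ]]
```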
Optimizing the policy directly: understanding the REINFORCE algorithm, stochastic policies, and the Policy Gradient Theorem.
Learning to model curved relationships by transforming features into higher-degree polynomials.
Understanding Precision, its mathematical foundation, and why it is vital for minimizing False Positives.
Mastering feature extraction, variance preservation, and the math behind Eigenvalues and Eigenvectors.
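As a sketch of variance preservation, here is synthetic 3-D data constructed to live on a 2-D plane (the third column is the sum of the first two), so two principal components capture essentially all of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([base[:, 0], base[:, 1], base[:, 0] + base[:, 1]])

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # ~[0.75, 0.25]; sums to ~1.0
print(X_reduced.shape)                # (200, 2)
```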
Mastering the Bellman Equation, Temporal Difference learning, and the Exploration-Exploitation trade-off.
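A minimal tabular sketch of the temporal-difference update at the heart of Q-learning; the tiny 2-state, 2-action setup and all values are hypothetical:

```python
import numpy as np

n_states, n_actions = 2, 2          # hypothetical toy problem
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9             # learning rate, discount factor

def update(state, action, reward, next_state):
    """Bellman-based TD update: Q <- Q + alpha * (target - Q)."""
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])

# One hypothetical transition: action 1 in state 0 pays reward 1.
update(state=0, action=1, reward=1.0, next_state=1)
print(Q)  # Q[0, 1] has moved from 0.0 toward the reward
```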
Understanding Ensemble Learning, Bagging, and how Random Forests reduce variance to build robust classifiers.
Understanding Recall, its mathematical definition, and why it is critical for minimizing False Negatives.
Understanding the Agent-Environment loop, reward signals, and how AI learns to make optimal decisions in dynamic systems.
Mastering L2 regularization to prevent overfitting and handle multicollinearity in regression models.
Evaluating classifier performance across all thresholds using the Receiver Operating Characteristic and Area Under the Curve.
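A minimal sketch with made-up scores: AUC is the probability that a randomly chosen positive outranks a randomly chosen negative, so it evaluates the scores themselves rather than any single threshold:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]  # predicted probabilities, not labels

print(roc_auc_score(y_true, y_scores))  # 0.75: 3 of 4 pos/neg pairs ranked correctly
```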
How AI learns by predicting missing parts of its own input, powering Large Language Models and Computer Vision.
Combining small amounts of labeled data with large amounts of unlabeled data to improve model accuracy and reduce labeling costs.
A deep dive into supervised learning: regression, classification, and the relationship between features and targets.
Mastering the geometry of classification: margins, hyperplanes, and the Kernel Trick.
The foundation of classification evaluation: True Positives, False Positives, True Negatives, and False Negatives.
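A minimal scikit-learn sketch (with made-up labels) showing how those four counts are laid out; rows are actual classes, columns are predicted classes:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Binary layout:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))
# [[3 1]
#  [1 3]]
```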
Mastering the data partitioning process to ensure unbiased model evaluation.
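A minimal sketch on the built-in Iris dataset; holding out 20% with `stratify` keeps class proportions intact in both partitions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
```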
Discovering patterns in unlabeled data through clustering, association, and dimensionality reduction.
A comprehensive introduction to the Machine Learning Tutorial structure, purpose, and key learning outcomes for CodeHarborHub learners.
Defining Machine Learning, its key characteristics, and how it differs from traditional programming.
Understanding the paradigm shift from traditional programming to data-driven learning.
Understanding the difference between training performance and real-world reliability.