Multinomial Naive Bayes #
This module gives a pure multinomial Naive Bayes classifier over String features and labels,
using Lean's HashMap for the fitted count tables. It is not a tensor-indexed neural model like
most of NN/Spec/Models/*; it is a non-neural baseline that keeps the training and prediction
semantics explicit.
Probabilities are computed in log space (via MathFunctions.log) to avoid underflow.
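The underflow problem can be illustrated with a minimal sketch. It uses Lean's built-in `Float` and `Float.log` as a stand-in for the module's scalar type and `MathFunctions.log`; the name `probs` and the 400 factors of 0.01 are made up for illustration.

```lean
-- Illustration only: plain `Float` arithmetic, not the module's `MathFunctions.log`.

/-- 400 hypothetical per-feature probabilities of 0.01 each. -/
def probs : List Float := List.replicate 400 0.01

-- Direct product: 0.01^400 = 1e-800 is far below the smallest positive `Float`,
-- so the result underflows to zero and all labels would tie.
#eval probs.foldl (fun (acc p : Float) => acc * p) (1.0 : Float)

-- Sum of logs: a finite negative number (about -1842) that can still be compared.
#eval probs.foldl (fun (acc p : Float) => acc + Float.log p) (0.0 : Float)
```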
Ecosystem note:
PyTorch does not provide a Naive Bayes classifier in torch.nn; the closest ecosystem analogue is
scikit-learn’s MultinomialNB.
What "training" means here #
Naive Bayes is a counting model: training is just collecting label and feature counts from the dataset. The API keeps fitting and inference separate, so examples can show exactly where counts are learned and where predictions are made.
The API exposes an explicit fit step that produces a Model, plus:
- predictModel for inference using the fitted counts
- negLogLikelihood as a standard training objective (useful for evaluation/comparison)
Fitted model #
Model stores the counts and some precomputed bookkeeping derived from the dataset.
Nothing here depends on the scalar type α; we only need α when we turn counts into smoothed
probabilities (log-space scores).
Fitted multinomial Naive Bayes model.
This stores raw counts plus a little derived bookkeeping (labels, vocab, totalExamples).
Scoring functions turn these counts into Laplace-smoothed log probabilities on demand.
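- labelCounts : Std.HashMap String ℕ
Number of training examples seen for each label.
- featureCounts : Std.HashMap String (Std.HashMap String ℕ)
Nested feature counts, one count per (label, feature) pair.
- totalCounts : Std.HashMap String ℕ
Total feature count per label (the totalFeatures(label) term in the scoring rule).
- labels
Distinct labels seen in the dataset.
- vocab
Distinct features seen in the dataset.
- totalExamples : ℕ
Number of training examples (N in the scoring rule).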
Fit a naive Bayes model by collecting counts from the dataset.
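As a rough sketch of what fitting by counting can look like: `SketchModel`, `SketchModel.observe` and `sketchFit` below are hypothetical names, not the module's API. The sketch uses `Nat` instead of `ℕ` so it needs no Mathlib, omits the labels and vocab bookkeeping, and assumes the nested map is keyed by label first, then feature.

```lean
import Std.Data.HashMap

/-- Hypothetical stand-in for the module's `Model`; the field layout is an assumption. -/
structure SketchModel where
  labelCounts   : Std.HashMap String Nat := ∅
  -- assumed layout: label ↦ (feature ↦ count)
  featureCounts : Std.HashMap String (Std.HashMap String Nat) := ∅
  -- assumed meaning: label ↦ total number of feature occurrences
  totalCounts   : Std.HashMap String Nat := ∅
  totalExamples : Nat := 0

/-- Add one `(features, label)` example to the counts. -/
def SketchModel.observe (m : SketchModel) (ex : List String × String) : SketchModel :=
  let (feats, label) := ex
  -- Bump the count of every feature occurrence under this label.
  let perLabel := feats.foldl
    (fun (acc : Std.HashMap String Nat) (f : String) => acc.insert f (acc.getD f 0 + 1))
    (m.featureCounts.getD label ∅)
  { labelCounts   := m.labelCounts.insert label (m.labelCounts.getD label 0 + 1),
    featureCounts := m.featureCounts.insert label perLabel,
    totalCounts   := m.totalCounts.insert label (m.totalCounts.getD label 0 + feats.length),
    totalExamples := m.totalExamples + 1 }

/-- "Training" is a single left fold over the dataset, starting from empty counts. -/
def sketchFit (data : List (List String × String)) : SketchModel :=
  data.foldl SketchModel.observe {}

-- Two of the three toy examples are labelled "spam".
#eval (sketchFit [(["cheap", "pills"], "spam"), (["meeting", "notes"], "ham"),
                  (["cheap", "cheap"], "spam")]).labelCounts.getD "spam" 0  -- 2
```

Because fitting is a fold, adding more data only means observing more examples on top of an existing set of counts.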
Scoring and prediction #
We use the standard multinomial NB scoring rule (with Laplace smoothing):
- prior: (count(label) + 1) / (N + nLabels)
- conditional: (count(feature, label) + 1) / (totalFeatures(label) + vocabSize)
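Scores are in log space: the score of a label is its log prior plus the sum of the log conditionals of the observed features. Prediction only needs the relative ordering of these scores, so no normalization is required and the highest-scoring label wins.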
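The scoring rule can be sketched as follows. The helpers `logPrior`, `logCond` and `score` are hypothetical and take the relevant counts as plain arguments rather than a fitted Model; `Float` and `Float.log` again stand in for the module's scalar type and log function, and the toy counts are invented.

```lean
import Std.Data.HashMap

/-- Laplace-smoothed log prior: log ((count(label) + 1) / (N + nLabels)). -/
def logPrior (labelCount n nLabels : Nat) : Float :=
  Float.log ((labelCount + 1).toFloat / (n + nLabels).toFloat)

/-- Laplace-smoothed log conditional for one feature occurrence. -/
def logCond (featCount totalFeatures vocabSize : Nat) : Float :=
  Float.log ((featCount + 1).toFloat / (totalFeatures + vocabSize).toFloat)

/-- Log-space score of one label: log prior plus the summed log conditionals.
    `counts` holds the feature counts for that label; unseen features count as 0. -/
def score (feats : List String) (counts : Std.HashMap String Nat)
    (labelCount totalFeatures n nLabels vocabSize : Nat) : Float :=
  feats.foldl
    (fun (acc : Float) (f : String) => acc + logCond (counts.getD f 0) totalFeatures vocabSize)
    (logPrior labelCount n nLabels)

-- Toy counts: N = 3 examples, 2 labels, vocabulary of 4 features.
def spamCounts : Std.HashMap String Nat := Std.HashMap.ofList [("cheap", 3), ("pills", 1)]
def hamCounts  : Std.HashMap String Nat := Std.HashMap.ofList [("meeting", 1), ("notes", 1)]

def spamScore : Float := score ["cheap"] spamCounts 2 4 3 2 4  -- log (3/5 * 4/8)
def hamScore  : Float := score ["cheap"] hamCounts 1 2 3 2 4   -- log (2/5 * 1/6)

-- Prediction compares the scores; the larger one wins.
#eval if hamScore < spamScore then "spam" else "ham"  -- "spam"
```

Smoothing matters in the second score: "cheap" never occurs under "ham", yet its conditional is 1/6 rather than 0, so the log stays finite.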
Training objective (negative log-likelihood) #
This is the standard objective used to evaluate NB models:
NLL = -Σ_i log P(y_i | x_i)
Even though we don't optimize it with gradients (NB training is closed-form counting), having this objective is useful for:
- checking improvements (smoothing choices, feature engineering)
- comparing NB against other baselines
- unit tests / runtime checks
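One way to compute this objective from log-space scores is sketched below, assuming each score is an unnormalized log joint log P(x, y), so that log P(y | x) is the true label's score minus the log-sum-exp over all labels. The names `logSumExp`, `exampleNLL` and `nllSketch` are hypothetical; the module's negLogLikelihood may be organized differently.

```lean
/-- Numerically stable log (Σ exp s): shift by the maximum score before exponentiating. -/
def logSumExp (first : Float) (rest : List Float) : Float :=
  let m := rest.foldl (fun (a b : Float) => if a < b then b else a) first
  m + Float.log ((first :: rest).foldl (fun (acc x : Float) => acc + Float.exp (x - m)) 0.0)

/-- -log P(y | x) for one example: normalize the true label's score against all labels. -/
def exampleNLL (trueScore : Float) (otherScores : List Float) : Float :=
  logSumExp trueScore otherScores - trueScore

/-- Sum of per-example terms; each entry is (score of true label, scores of the other labels). -/
def nllSketch (examples : List (Float × List Float)) : Float :=
  examples.foldl (fun (acc : Float) (e : Float × List Float) => acc + exampleNLL e.1 e.2) 0.0

-- One example whose true label has joint probability 0.3 and whose only rival has 1/15:
-- the posterior is 0.3 / (0.3 + 1/15) ≈ 0.82, so the result is about 0.2.
#eval nllSketch [(Float.log 0.3, [Float.log (1.0 / 15.0)])]
```

A lower value means the model assigns more probability to the true labels, which is what makes the quantity usable for comparing smoothing choices or baselines.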