The Deployment Unifying Framework

Every basic model a data mining or data analytics course covers: what it outputs, what gets deployed, and how it reaches production

Predictive Analytics & Data Mining
Each entry below lists: Model / Technique, Type, What's the "Model"?, What Gets Deployed?, Deployment Vehicle, How You Evaluate It, When to Retrain
🎯 SUPERVISED LEARNING: Predict a Known Target
Linear Regression
Predict a continuous number
Type: Supervised
The "model": Coefficients (β₀, β₁, β₂, …) that form a prediction equation: ŷ = β₀ + β₁x₁ + β₂x₂ + …
What gets deployed: The coefficient table, literally a list of weights. Multiply each input by its weight, sum the products, add the intercept, done.
Deployment vehicle: Excel SUMPRODUCT; REST API; batch SQL; in-database
How you evaluate it: R², adjusted R², RMSE, MAE, residual plots, p-values
When to retrain: When R² drops on new data, or residual patterns appear, which suggests the relationships have shifted
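For the linear regression entry above, the "multiply and sum" scoring step can be sketched in a few lines. This is a minimal illustration, not a fitted model: the feature names, coefficient values, and the `score_linear` helper are all hypothetical.

```python
# Hypothetical coefficient table, as it might be exported from a fitted model.
INTERCEPT = 50.0
COEFFICIENTS = {"sqft": 0.25, "bedrooms": 5.0}


def score_linear(row, intercept=INTERCEPT, coefficients=COEFFICIENTS):
    """Return y_hat = intercept + sum(weight * input): the SUMPRODUCT pattern."""
    return intercept + sum(weight * row[name] for name, weight in coefficients.items())


prediction = score_linear({"sqft": 1000, "bedrooms": 3})
print(prediction)  # 50 + 0.25*1000 + 5*3 = 315.0
```

The same arithmetic is what an Excel SUMPRODUCT or a SQL expression would compute; only the place where the coefficient table lives changes.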
Logistic Regression
Predict yes/no probability
Type: Supervised
The "model": Coefficients + intercept that output a probability via the sigmoid function: P = 1 / (1 + e^-(β₀ + β₁x₁ + …))
What gets deployed: Coefficient table + chosen threshold. Score = probability. Decision = "if P > threshold, then yes."
Deployment vehicle: Excel formula; real-time API; batch scoring; in-database
How you evaluate it: AUC, accuracy, precision, recall, F1, confusion matrix, lift chart
When to retrain: When AUC drops below the business threshold, when the class distribution shifts (prior shift), or when the relationship between inputs and outcome changes (concept drift)
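The logistic regression entry above deploys two things: the coefficients and the threshold. A minimal sketch of both steps follows; the feature names, weights, threshold, and function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficient table and business threshold.
INTERCEPT = -1.0
COEFFICIENTS = {"late_payments": 0.8, "tenure_years": -0.2}
THRESHOLD = 0.5


def score_logistic(row):
    """Return P = 1 / (1 + e^-(b0 + b1*x1 + ...))."""
    z = INTERCEPT + sum(weight * row[name] for name, weight in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def decide(row, threshold=THRESHOLD):
    """Turn the probability into a yes/no decision."""
    return score_logistic(row) > threshold


customer = {"late_payments": 3, "tenure_years": 2}
print(score_logistic(customer), decide(customer))  # probability is roughly 0.73, so the decision is True
```

Keeping the threshold outside the scoring function reflects the entry's point that the threshold is a separate, chosen part of the deployment and can be changed without refitting.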
Decision Tree
If-then rules for prediction
Type: Supervised
The "model": A set of IF-THEN-ELSE rules that split on feature thresholds, i.e. the full tree structure
What gets deployed: The rule set. Each rule is a nested IF statement. Can be written out as plain English or Excel IF() chains.
Deployment vehicle: Nested IF() formulas; REST API; batch SQL CASE; in-database
How you evaluate it: Accuracy, precision, recall, confusion matrix, feature importance, pruning validation
When to retrain: When accuracy degrades, or when new categories appear that the tree never saw
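To make the "nested IF" idea in the decision tree entry concrete, here is a small hand-written rule set. The features, split thresholds, and outcome labels are hypothetical; a real tree would have its splits exported from training.

```python
def score_tree(row):
    """A hypothetical three-split tree written out as nested IF statements."""
    if row["income"] <= 40000:
        if row["age"] <= 25:
            return "decline"
        return "review"
    if row["debt_ratio"] <= 0.35:
        return "approve"
    return "review"


print(score_tree({"income": 80000, "age": 40, "debt_ratio": 0.2}))  # approve
```

Each branch maps one-to-one onto an Excel IF() or a SQL CASE WHEN clause, which is why a single tree is among the easiest models to deploy without a runtime.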
Random Forest
Ensemble of many trees
Type: Supervised
The "model": Hundreds of decision trees that each vote; the majority wins (for regression, the tree outputs are averaged). The forest = collection of trees
What gets deployed: The full model object (pickle/ONNX). Too complex for Excel. Deploy as an API or use a surrogate decision tree.
Deployment vehicle: REST API; batch scoring; in-database (ONNX). ⚠️ Not practical in native Excel
How you evaluate it: AUC, accuracy, precision, recall, OOB error, feature importance, cross-validation
When to retrain: When OOB error increases, or feature importances shift significantly
Gradient Boosting / XGBoost
Sequential error-correcting trees
Type: Supervised
The "model": A sequence of small trees, each fixing the previous tree's mistakes. Final score = sum of all tree outputs
What gets deployed: The model object (pickle/PMML/ONNX). Like Random Forest, it needs an API or in-database runtime.
Deployment vehicle: REST API; batch scoring; in-database. ⚠️ Not practical in native Excel
How you evaluate it: AUC, log-loss, RMSE, learning curves, SHAP values for explainability
When to retrain: When validation metrics degrade, or when retraining on recent data significantly changes predictions
Neural Network
Layers of weighted connections
Type: Supervised
The "model": Weight matrices + bias vectors for each layer, plus activation functions. A computational graph
What gets deployed: The trained model file (H5, SavedModel, ONNX). Requires a runtime environment to execute.
Deployment vehicle: REST API (TF Serving); batch inference; edge/mobile. ⚠️ Black box: pair with SHAP
How you evaluate it: Accuracy, AUC, loss curves, confusion matrix; same as logistic regression but less interpretable
When to retrain: When performance degrades, or when the data distribution shifts (monitor input distributions)
SVM
Support Vector Machine
Type: Supervised
The "model": Support vectors + hyperplane equation that maximizes the margin between classes
What gets deployed: The support vectors and kernel parameters. For linear SVM: equivalent to a coefficient table.
Deployment vehicle: REST API; batch scoring; Excel (linear only); in-database
How you evaluate it: Accuracy, AUC, precision, recall, margin width, support vector count
When to retrain: When class boundaries shift, or new data points fall in unexpected regions
Naive Bayes
Probability lookup table
Type: Supervised
The "model": Probability tables: P(class) and P(feature | class) for every feature-class combination
What gets deployed: The probability tables. Scoring = multiply prior × likelihoods. Pure lookup, no matrix math.
Deployment vehicle: Excel VLOOKUP + PRODUCT; REST API; batch SQL; in-database
How you evaluate it: Accuracy, precision, recall, F1; especially useful for text classification tasks
When to retrain: When vocabulary shifts (for text) or when the conditional independence assumption breaks down
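The "prior × likelihoods" lookup in the Naive Bayes entry can be sketched as below. The class names, features, and probability values are made up for illustration, and the final normalization step (so the class scores sum to 1) is an addition that the entry does not spell out.

```python
# Hypothetical probability tables: P(class) and P(feature value | class).
PRIORS = {"spam": 0.4, "ham": 0.6}
LIKELIHOODS = {
    "spam": {"has_link": {True: 0.7, False: 0.3}, "all_caps": {True: 0.5, False: 0.5}},
    "ham": {"has_link": {True: 0.2, False: 0.8}, "all_caps": {True: 0.1, False: 0.9}},
}


def score_naive_bayes(row):
    """Multiply the prior by each looked-up likelihood, then normalize."""
    unnormalized = {}
    for label, prior in PRIORS.items():
        p = prior
        for feature, value in row.items():
            p *= LIKELIHOODS[label][feature][value]
        unnormalized[label] = p
    total = sum(unnormalized.values())
    return {label: p / total for label, p in unnormalized.items()}


posterior = score_naive_bayes({"has_link": True, "all_caps": True})
print(max(posterior, key=posterior.get))  # spam
```

With many features, the product of small probabilities can underflow; summing logarithms instead is the usual workaround, at the cost of a slightly less Excel-friendly formula.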
๐Ÿ” UNSUPERVISED LEARNING โ€” Discover Hidden Structure
K-Means Clustering
Find natural customer groups
Type: Unsupervised
The "model": K cluster centroids (the center point of each group in feature space)
What gets deployed: Centroids + scaler parameters. New customer → standardize → calculate distance to each centroid → assign nearest.
Deployment vehicle: Excel SUMPRODUCT + SQRT; batch nightly scoring; REST API; in-database
How you evaluate it: Silhouette score, elbow method, cluster stability, business naming test ("can you name it?")
When to retrain: When cluster sizes shift dramatically, or average distance-to-centroid increases (segment drift)
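The standardize, measure, assign sequence in the K-Means entry looks roughly like this. It is a sketch under stated assumptions: the segment names, feature names, scaler values, and centroid coordinates are all hypothetical, and Euclidean distance is assumed.

```python
import math

# Hypothetical scaler parameters saved from training: feature -> (mean, std).
SCALER = {"recency_days": (30.0, 10.0), "monthly_spend": (100.0, 50.0)}
# Hypothetical centroids, expressed in standardized units.
CENTROIDS = {
    "loyal": {"recency_days": -1.0, "monthly_spend": 1.0},
    "lapsing": {"recency_days": 1.0, "monthly_spend": -1.0},
}


def assign_cluster(row):
    """Standardize the row, then return (nearest segment, distance to it)."""
    z = {name: (row[name] - mean) / std for name, (mean, std) in SCALER.items()}
    distances = {
        segment: math.sqrt(sum((z[name] - center[name]) ** 2 for name in z))
        for segment, center in CENTROIDS.items()
    }
    nearest = min(distances, key=distances.get)
    return nearest, distances[nearest]


print(assign_cluster({"recency_days": 20.0, "monthly_spend": 150.0}))  # ('loyal', 0.0)
```

Returning the distance alongside the label makes it easy to log average distance-to-centroid, which is the drift signal the entry suggests watching.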
Hierarchical Clustering
Tree of nested groups
Type: Unsupervised
The "model": A dendrogram (tree) showing how groups merge; cut at the desired height to get K clusters
What gets deployed: After cutting the tree: same as K-Means, i.e. centroids (group averages) + scaler. The dendrogram is for analysis, not scoring.
Deployment vehicle: Excel (post-cut centroids); batch scoring; in-database. Dendrogram = analysis artifact only
How you evaluate it: Dendrogram visual inspection, cophenetic correlation, silhouette score at chosen cut
When to retrain: When the dendrogram structure changes significantly with new data
Anomaly Detection
Find what doesn't belong
Type: Unsupervised
The "model": A boundary defining "normal"; anything outside it is an anomaly. Examples: One-Class SVM, Isolation Forest
What gets deployed: The boundary model (decision function). New record → score → if score < threshold → flag as anomaly.
Deployment vehicle: Real-time API (fraud); batch scanning; in-database / edge. ⚠️ Requires careful threshold tuning
How you evaluate it: Precision & recall on known anomalies (if available), false positive rate, domain expert review
When to retrain: When "normal" behavior evolves: seasonal patterns, new products, market changes
Market Basket / Association Rules
What items go together?
Type: Unsupervised
The "model": A rules table: "If {chips, salsa} then {beer}" with support, confidence, and lift
What gets deployed: The rules table itself. No scoring function; it's a lookup: "customer has X → recommend Y."
Deployment vehicle: Excel filtered rules table; batch recommendation; recommendation API. Rules ARE the deployment; there is no model object
How you evaluate it: Support, confidence, lift. Lift > 1 = positive association. Business validation: do the rules make sense?
When to retrain: When the product catalog changes, with seasonal shifts, or when lift values decay (products no longer co-purchased)
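Because the association rules entry deploys a lookup rather than a scoring function, the whole "model" fits in a list. The sketch below assumes a hypothetical rules table and a `recommend` helper; the support, confidence, and lift numbers are invented, and ranking matches by lift is one reasonable choice rather than the only one.

```python
# Hypothetical rules table: antecedent items -> consequent item, with metrics.
RULES = [
    {"antecedent": frozenset({"chips", "salsa"}), "consequent": "beer",
     "support": 0.04, "confidence": 0.60, "lift": 2.4},
    {"antecedent": frozenset({"chips"}), "consequent": "soda",
     "support": 0.10, "confidence": 0.35, "lift": 1.3},
    {"antecedent": frozenset({"bread"}), "consequent": "butter",
     "support": 0.08, "confidence": 0.50, "lift": 0.9},
]


def recommend(basket, rules=RULES, min_lift=1.0):
    """Return consequents of rules whose antecedent is in the basket, best lift first."""
    items = set(basket)
    matches = [
        rule for rule in rules
        if rule["antecedent"] <= items          # every antecedent item is in the basket
        and rule["consequent"] not in items     # don't recommend what they already have
        and rule["lift"] > min_lift             # keep only positive associations
    ]
    matches.sort(key=lambda rule: rule["lift"], reverse=True)
    recommendations = []
    for rule in matches:
        if rule["consequent"] not in recommendations:
            recommendations.append(rule["consequent"])
    return recommendations


print(recommend(["chips", "salsa"]))  # ['beer', 'soda']
```

The bread rule never fires here because its lift is below 1, which mirrors the entry's "Lift > 1 = positive association" filter.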
💬 TEXT ANALYTICS: Turn Words into Numbers, Then Actions
Sentiment Analysis
VADER / TextBlob / Custom
Type: Varies
The "model": A sentiment lexicon (word → score dictionary) + scoring rules for negation, intensity, punctuation
What gets deployed: The lexicon + rules engine. New text → tokenize → look up each word → apply rules → output polarity score.
Deployment vehicle: Batch CSV scoring; real-time API; Excel VBA (simplified); in-database text functions
How you evaluate it: Accuracy vs. hand-labeled sample, precision/recall on pos/neg classes, domain coverage check
When to retrain: When domain language evolves (new slang), or when accuracy on fresh hand-labeled data drops
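The tokenize, look up, apply rules pipeline in the sentiment entry can be illustrated with a toy lexicon. This is deliberately not VADER or TextBlob: the four-word lexicon, the negation list, and the single "flip the next word" rule are simplified assumptions that only show the shape of the artifact.

```python
import re

# Hypothetical lexicon (word -> score) and a minimal negation rule.
LEXICON = {"great": 2.0, "good": 1.0, "bad": -1.0, "terrible": -2.0}
NEGATIONS = {"not", "never", "no"}


def polarity(text):
    """Sum lexicon scores; a negation word flips the sign of the next token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        value = LEXICON.get(token, 0.0)   # unknown words contribute nothing
        score += -value if negate else value
        negate = False
    return score


print(polarity("The food was great, not bad."))  # 2.0 + 1.0 = 3.0
```

Words missing from the lexicon score zero, which is why the entry's "domain coverage check" matters: new slang silently drops out of the total.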
Topic Extraction / Text Clustering
TF-IDF + Clustering / LDA
Type: Unsupervised
The "model": A vocabulary + topic-word distributions (which words define each topic) from TF-IDF or LDA
What gets deployed: The vocabulary (word → index mapping) + topic centroids. New document → vectorize with same vocab → assign to nearest topic.
Deployment vehicle: Batch categorization; API for ticket routing; Excel word-match lookup. Vocabulary must match training exactly
How you evaluate it: Topic coherence score, human interpretability ("can you name each topic?"), classification accuracy if labels exist
When to retrain: When new topics emerge, vocabulary drifts, or topic assignments no longer match domain expert expectations
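The "vectorize with the same vocab, assign to nearest topic" step from the topic extraction entry might look like the sketch below. The vocabulary, topic names, and centroid values are hypothetical. For brevity it uses raw term counts and cosine similarity; a real TF-IDF deployment would also need to ship the IDF weights learned in training.

```python
import math
from collections import Counter

# Hypothetical training vocabulary (word -> column index) and topic centroids.
VOCAB = {"refund": 0, "invoice": 1, "password": 2, "login": 3}
TOPIC_CENTROIDS = {
    "billing": [0.7, 0.7, 0.0, 0.0],
    "account_access": [0.0, 0.0, 0.7, 0.7],
}


def vectorize(text):
    """Count only words in the training vocabulary; everything else is ignored."""
    counts = Counter(tok for tok in text.lower().split() if tok in VOCAB)
    vector = [0.0] * len(VOCAB)
    for token, n in counts.items():
        vector[VOCAB[token]] = float(n)
    return vector


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_topic(text):
    """Return the most similar topic, or None if no vocabulary word appears."""
    vector = vectorize(text)
    if not any(vector):
        return None
    return max(TOPIC_CENTROIDS, key=lambda topic: cosine(vector, TOPIC_CENTROIDS[topic]))


print(assign_topic("cannot login after password reset"))  # account_access
```

The `None` branch shows what "vocabulary must match training exactly" means in practice: a document with no known words cannot be placed at all.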
📈 TIME SERIES: Predict What Happens Next
Exponential Smoothing
Trend + Seasonality forecasting
Type: Supervised
The "model": Smoothing parameters (α, β, γ) for level, trend, and seasonal components. State-space model
What gets deployed: The smoothing parameters + last known state. Forecast = apply parameters to generate future points + confidence intervals.
Deployment vehicle: Excel FORECAST.ETS; batch monthly forecasts; streaming updates; in-database
How you evaluate it: MAE, RMSE, MAPE, forecast vs. actuals plot, prediction interval coverage
When to retrain: Every forecast cycle (monthly/quarterly) with new actuals; time series models are inherently refresh-heavy
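"Parameters + last known state" in the exponential smoothing entry can be shown with Holt's linear method, which keeps only level and trend. This is a simplified sketch: the seasonal component (γ) and the confidence intervals are left out, and the parameter values, starting state, and function names are hypothetical.

```python
# Hypothetical deployed artifact: smoothing parameters plus the last known state.
PARAMS = {"alpha": 0.5, "beta": 0.5}
STATE = {"level": 100.0, "trend": 2.0}


def forecast(state, horizon):
    """Project forward from the last state: level + h * trend for h = 1..horizon."""
    return [state["level"] + h * state["trend"] for h in range(1, horizon + 1)]


def update(state, actual, params=PARAMS):
    """Fold one new actual into the state using Holt's level and trend equations."""
    prev_level, prev_trend = state["level"], state["trend"]
    level = params["alpha"] * actual + (1 - params["alpha"]) * (prev_level + prev_trend)
    trend = params["beta"] * (level - prev_level) + (1 - params["beta"]) * prev_trend
    return {"level": level, "trend": trend}


print(forecast(STATE, 3))    # [102.0, 104.0, 106.0]
print(update(STATE, 106.0))  # {'level': 104.0, 'trend': 3.0}
```

The `update` step is what makes these models refresh-heavy: each cycle's actuals change the state, so the stored artifact has to be rewritten even when the parameters stay fixed.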
ARIMA / Prophet
Advanced forecasting
Type: Supervised
The "model": AR/MA coefficients (ARIMA) or trend changepoints + Fourier seasonality (Prophet). Parametric time model
What gets deployed: The fitted model parameters. Generate forecasts for horizon N with confidence bands.
Deployment vehicle: Batch forecasting; API for demand planning; in-database. Excel: limited to ETS; ARIMA needs Python/R
How you evaluate it: MAE, RMSE, MAPE, AIC/BIC for model selection, residual autocorrelation (Ljung-Box test)
When to retrain: With every new data cycle: refit on the latest actuals and compare forecast accuracy to the prior version
An often-quoted figure is that 87% of ML projects never reach production. The table above shows exactly what needs to happen for the other 13%. The "What Gets Deployed?" column is where most projects die: teams build a model but never extract the deployable artifact. Whether it's coefficients, centroids, rules, or a lexicon, someone has to carry that artifact into the system where decisions get made.

The Pattern Across All Models

Supervised models deploy a scoring function: feed in new inputs, get a prediction back. Unsupervised models deploy output artifacts (centroids, rules tables, lexicons) that become the inputs to downstream business logic. Time series models deploy parameters + state: they generate forecasts forward from the last known data point. In every case, the deployment question is the same: "What artifact do I extract from training, and where does it live so decisions happen automatically?"