Healthcare Anomaly Detection - Technical Deep Dive
DATA 110 - UNC Chapel Hill
2026-03-18T20:53:23Z

Anomaly Detection in Patient Vital Signs
Technical Deep Dive: Models, Features, and Evaluation Metrics
For: Data Science / Engineering Team | DATA 110 Project
School of Data Science and Society • UNC Chapel Hill

Dataset Overview

Feature Set (8 Vital Signs)

Feature              Unit          Normal Range
Heart Rate           bpm           60 – 100
Systolic BP          mmHg          90 – 140
Diastolic BP         mmHg          60 – 90
Oxygen Saturation    %             95 – 100
Temperature          °F            97.0 – 99.5
Respiratory Rate     breaths/min   12 – 20
White Blood Cells    K/μL          4.5 – 11.0
Glucose              mg/dL         70 – 140

Class Distribution
n = 500 | Imbalanced dataset (12% anomaly rate) | class_weight='balanced' used to handle imbalance | 70/30 train-test split with stratification
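
The effect of class_weight='balanced' at a 12% anomaly rate can be worked out by hand. The sketch below applies the formula scikit-learn documents for that setting, n_samples / (n_classes * class_count). The 60/440 class counts are an assumption derived from the stated 12% of n = 500, not figures read from the dataset, and balanced_class_weights is a hypothetical helper name.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Assumed labels: 12% of 500 rows are anomalies (1), the rest are normal (0).
labels = [1] * 60 + [0] * 440
weights = balanced_class_weights(labels)
print(weights)
```

Under these assumed counts, each anomaly row counts about 7.3 times as much as a normal row during training (440 / 60).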

Methodology: Three Approaches

Random Forest (Supervised)
n_estimators=100
max_depth=10
class_weight='balanced'
How it works: Ensemble of 100 decision trees. Each tree votes on whether a patient is normal or anomalous. Majority rules.
+ Highest accuracy; interpretable feature importance
− Requires labeled training data

Isolation Forest (Unsupervised)
n_estimators=100
contamination=0.12
How it works: Randomly partitions feature space. Anomalies are isolated in fewer splits (shorter path length).
+ No labels needed; scales well
− Can't identify anomaly type

DBSCAN (Unsupervised)
eps=2.5
min_samples=10
How it works: Finds dense clusters in feature space. Points not belonging to any cluster are flagged as anomalies (noise).
+ Discovers natural groupings; flexible shapes
− Sensitive to eps/min_samples tuning
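
The "isolated in fewer splits" idea behind Isolation Forest can be illustrated without any library. The sketch below is a simplified stand-in, not scikit-learn's implementation: it skips subsampling and the conversion of path length into an anomaly score, and only counts random axis-aligned splits until a point is alone. The data are synthetic (heart rate, temperature) pairs invented for illustration, and the function names are hypothetical.

```python
import random

def path_length(point, data, rng, depth=0, max_depth=12):
    """Count random axis-aligned splits until `point` is alone (or depth is capped)."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:  # no split possible on this feature
        return depth
    split = rng.uniform(lo, hi)
    # Keep only the side of the split that contains `point`.
    if point[dim] < split:
        side = [row for row in data if row[dim] < split]
    else:
        side = [row for row in data if row[dim] >= split]
    return path_length(point, side, rng, depth + 1, max_depth)

def mean_path_length(point, data, n_trees=100, seed=0):
    """Average the path length over many random partitionings."""
    rng = random.Random(seed)
    return sum(path_length(point, data, rng) for _ in range(n_trees)) / n_trees

# Synthetic (heart rate, temperature) rows: made-up values for illustration only.
gen = random.Random(42)
normal = [(gen.gauss(75, 8), gen.gauss(98.6, 0.4)) for _ in range(200)]
typical = (75.0, 98.6)
outlier = (150.0, 103.0)
data = normal + [typical, outlier]

typical_depth = mean_path_length(typical, data)
outlier_depth = mean_path_length(outlier, data)
print(f"typical: {typical_depth:.1f} splits, outlier: {outlier_depth:.1f} splits")
```

The outlier sits far from the dense region, so a random cut is likely to separate it early, while the typical point needs many more cuts.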

Model Evaluation Metrics

Metric Definitions
Accuracy: Correct predictions / total predictions
Precision: Of the cases flagged as anomalies, how many truly are? (Avoid false alarms)
Recall: Of all true anomalies, how many did we catch? (Don't miss critical cases)
F1 Score: Harmonic mean of Precision and Recall, a balanced measure

⚠ In healthcare, RECALL is critical: missing an anomaly (False Negative) can cost a life. Prioritize recall over precision.
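
The four definitions above reduce to a few lines of arithmetic on confusion-matrix counts. The sketch below uses hypothetical labels (3 anomalies in 25 rows, matching a 12% rate) to show why accuracy alone misleads on imbalanced data: a model that never flags anything still scores 88% accuracy with zero recall. The function name and the example predictions are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 3 anomalies (1) among 25 patients.
y_true = [1, 1, 1] + [0] * 22

always_normal = classification_metrics(y_true, [0] * 25)
detector = classification_metrics(y_true, [1, 1, 0, 1] + [0] * 21)

print("always normal:", always_normal)
print("detector:     ", detector)
```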

Feature Importance Analysis

Interpretation
The top 3 predictors account for 57% of the model's total feature importance.
Heart Rate is the strongest signal: elevated in sepsis and cardiac events, depressed in hypothermia.
Temperature and WBC are classic infection indicators, which explains their high ranking.
Blood pressure features rank lower because they vary across anomaly types, which reduces their individual discriminative power.

Data Preprocessing Pipeline

1. Load & Inspect
    import pandas as pd
    df = pd.read_excel('data.xlsx')
    df.info()  # Check types, nulls
500 rows × 13 cols, no missing values

2. Feature Selection
    features = ['Heart_Rate_bpm', 'Systolic_BP_mmHg', ...]  # 8 vitals
Excluded: Patient_ID, Timestamp, Gender (categorical)

3. Train/Test Split
    from sklearn.model_selection import train_test_split
    # train_test_split returns four arrays when given both X and y
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, stratify=y)
Stratified split preserves 12% anomaly ratio in both sets

4. StandardScaler
    from sklearn.preprocessing import StandardScaler
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)  # learn mean/std from train
    X_test = scaler.transform(X_test)        # reuse train statistics
Fit on train only: prevents data leakage
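
Steps 3 and 4 can be checked with a library-free sketch: a stratified 70/30 split followed by standardization whose mean and standard deviation come from the training rows only. This is an illustration of the idea, not the scikit-learn code path. The 440/60 class counts are assumed from the stated 12% of 500, the heart-rate values are synthetic, and the helper names are hypothetical.

```python
import random
from statistics import mean, pstdev

def stratified_split(labels, test_size=0.30, seed=0):
    """Return (train_idx, test_idx), splitting each class in the same proportion."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_size)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx

def fit_scaler(column):
    """Learn mean and population standard deviation from the given values."""
    return mean(column), pstdev(column)

def transform(column, mu, sigma):
    """Standardize values with previously learned statistics."""
    return [(x - mu) / sigma for x in column]

# Assumed data: 440 normal rows (0), 60 anomalies (1); heart rates are synthetic.
gen = random.Random(1)
labels = [0] * 440 + [1] * 60
heart_rate = [gen.gauss(75, 8) if y == 0 else gen.gauss(120, 15) for y in labels]

train_idx, test_idx = stratified_split(labels)
mu, sigma = fit_scaler([heart_rate[i] for i in train_idx])  # train rows only
train_scaled = transform([heart_rate[i] for i in train_idx], mu, sigma)
test_scaled = transform([heart_rate[i] for i in test_idx], mu, sigma)  # no refit

test_anomaly_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), round(test_anomaly_rate, 2))
```

Fitting the statistics on all rows instead would let test-set values influence the scaling applied during training, which is the leakage the last step guards against.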

Technical Lessons & Next Steps

Lessons Learned
• class_weight='balanced' is essential for imbalanced anomaly datasets
• StandardScaler must be fit on training data only to prevent data leakage
• DBSCAN eps parameter requires experimentation: too small = everything is noise
• Isolation Forest contamination should approximate true anomaly rate
• Supervised models outperform unsupervised when quality labels exist

Potential Improvements
• Add temporal features (vital sign trends over time, rate of change)
• Try XGBoost or neural networks for higher accuracy
• Implement SMOTE oversampling as alternative to class_weight
• Build a real-time streaming pipeline (Kafka → model → alert)
• Autoencoder for unsupervised detection: learns normal pattern, flags reconstruction errors
• Cross-validation (k=5) for more robust metric estimates
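
The DBSCAN lesson about eps can be reproduced with a small library-free sketch. It counts the points DBSCAN would label as noise, meaning points that are neither core points nor within eps of one, for several eps values. The 2-D synthetic points and the helper name are assumptions made for illustration; the deck's data has 8 features, so the useful eps range there will differ.

```python
import math
import random

def dbscan_noise_count(points, eps, min_samples):
    """Count points that are not core and not within eps of any core point."""
    n = len(points)
    # Each point counts as its own neighbor, as in the usual DBSCAN definition.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_samples for nb in neighbors]
    return sum(
        1 for i in range(n)
        if not core[i] and not any(core[j] for j in neighbors[i])
    )

# Synthetic standardized points: one dense blob plus three far-away points.
gen = random.Random(7)
points = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(150)]
points += [(8.0, 8.0), (-9.0, 7.0), (10.0, -8.0)]

noise_tiny = dbscan_noise_count(points, eps=0.01, min_samples=10)
noise_large = dbscan_noise_count(points, eps=2.5, min_samples=10)
print(f"eps=0.01: {noise_tiny} noise points, eps=2.5: {noise_large} noise points")
```

With a tiny eps no point has 10 neighbors, so no core points exist and every point is reported as noise. A larger eps lets the blob form a cluster and leaves only isolated points flagged.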

Questions?
Let's Dive Into the Code.
Jupyter Notebook + Excel Dataset Available for Hands-On Exploration
DATA 110 • School of Data Science and Society • UNC Chapel Hill