Notebook 01
01 Scope And Research Questions
Frames the dissertation around tabular baseline evaluation, image-model development, and synthetic multimodal experimentation under explicit non-clinical constraints.
Purpose: Define the study scope, central research questions, and the boundaries around exploratory multimodal claims.
The Wisconsin branch is used as the tabular baseline for comparison.
The BreaKHis build is the main image contribution.
Synthetic fusion is positioned as exploratory methodology, not clinical evidence.
Notebook 02
02 BreaKHis Dataset Exploration
Audits class balance, magnification coverage, image size variation, and sample appearance before training the image branch.
Purpose: Establish an empirical understanding of the BreaKHis binary dataset and surface practical preprocessing constraints.
The dataset spans multiple magnifications and heterogeneous staining appearances.
Image dimensions and colour distributions vary enough to justify careful normalization.
Exploration outputs anchor later preprocessing and augmentation decisions.
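The class-balance and magnification-coverage audit described above can be sketched as a simple tally. This is a minimal illustration and not the notebook's own code: the `records` layout and the helper names `audit` and `missing_cells` are assumptions, and the rows are placeholders rather than real BreaKHis counts.

```python
from collections import Counter

# Placeholder metadata, one (label, magnification) tuple per image.
# These rows are illustrative only and are not real BreaKHis counts.
records = [
    ("benign", 40), ("benign", 100),
    ("malignant", 40), ("malignant", 100),
    ("malignant", 200), ("malignant", 400),
]

def audit(records):
    """Tally images per class and per (magnification, class) cell."""
    class_counts = Counter(label for label, _ in records)
    cell_counts = Counter((mag, label) for label, mag in records)
    return class_counts, cell_counts

def missing_cells(cell_counts, magnifications, labels):
    """List (magnification, label) combinations that have no images."""
    return [(m, l) for m in magnifications for l in labels
            if cell_counts[(m, l)] == 0]

class_counts, cell_counts = audit(records)
gaps = missing_cells(cell_counts, [40, 100, 200, 400], ["benign", "malignant"])
```

Empty cells in `gaps` would flag magnification and class combinations that need attention before any stratified evaluation.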
Notebook 03
03 Split Audit And Patient Leakage
Demonstrates leakage in the naive image-level split and motivates the patient-level evaluation protocol.
Purpose: Prove why patient-level separation is necessary before reporting any image-model result.
Naive image-level splitting leaks patient information across train and test.
The patient-level split materially changes the credibility of downstream metrics.
Leakage auditing becomes a first-class part of the dissertation narrative.
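The contrast between the two protocols can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the notebook's implementation: records are hypothetical `(image_id, patient_id)` tuples, the naive split here simply sends every fifth image to test, and all function names are invented for the example.

```python
import random
from collections import defaultdict

def patient_level_split(records, test_fraction=0.2, seed=0):
    """Split so that every patient's images land in exactly one set."""
    by_patient = defaultdict(list)
    for image_id, patient_id in records:
        by_patient[patient_id].append(image_id)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_ids, train_ids = patients[:n_test], patients[n_test:]
    train = [(img, p) for p in train_ids for img in by_patient[p]]
    test = [(img, p) for p in test_ids for img in by_patient[p]]
    return train, test

def naive_image_split(records, every=5):
    """Image-level split that ignores patients: each `every`-th image is test."""
    test = records[::every]
    train = [r for i, r in enumerate(records) if i % every != 0]
    return train, test

def leaked_patients(train, test):
    """Patients present in both sets; an empty set means no leakage."""
    return {p for _, p in train} & {p for _, p in test}

# Hypothetical data: 5 patients with 4 images each.
records = [(f"img_{p}_{i}", f"patient_{p}") for p in range(5) for i in range(4)]

naive_leak = leaked_patients(*naive_image_split(records))
safe_leak = leaked_patients(*patient_level_split(records))
```

On this toy data the naive split shares four patients between train and test, while the patient-level split shares none. Running `leaked_patients` as a check after every split is one way to make the audit routine.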
Notebook 04
04 Preprocessing And Dataloaders
Builds reproducible transforms, loaders, and normalization choices around the patient-level split.
Purpose: Translate the audited dataset into a stable image-processing pipeline ready for model development.
BreaKHis-specific normalization is tracked explicitly rather than assumed.
Augmentation is controlled and lightweight instead of visually extreme.
The preprocessing pipeline is structured for reuse in later inference.
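One way to track normalization explicitly is to compute per-channel statistics from the training images only and store them next to the model for reuse at inference. The sketch below uses plain Python lists to stay self-contained; a real pipeline would more likely use a tensor library. The data layout and the names `channel_stats` and `normalize` are assumptions for illustration.

```python
import json
import math

def channel_stats(images):
    """Per-channel mean and std over all pixels of the given images.

    `images` is a list of images; each image is a list of (r, g, b)
    tuples scaled to [0, 1]. Pass training images only, so that no
    validation or test information enters the statistics.
    """
    sums, squares, n = [0.0] * 3, [0.0] * 3, 0
    for image in images:
        for pixel in image:
            for c in range(3):
                sums[c] += pixel[c]
                squares[c] += pixel[c] ** 2
            n += 1
    mean = [s / n for s in sums]
    std = [math.sqrt(max(q / n - m * m, 0.0)) for q, m in zip(squares, mean)]
    return {"mean": mean, "std": std}

def normalize(pixel, stats, eps=1e-8):
    """Standardize one (r, g, b) pixel with the stored statistics."""
    return tuple((pixel[c] - stats["mean"][c]) / (stats["std"][c] + eps)
                 for c in range(3))

# Toy training set: one image with a black and a white pixel.
train_images = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]
stats = channel_stats(train_images)
stats_json = json.dumps(stats)  # serialized record for later inference
```

Serializing the statistics keeps training and inference on the same values instead of relying on defaults borrowed from another dataset.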
Notebook 05
05 Model Development
Compares image-branch development runs and saves the patient-level ResNet18 checkpoint used by the app.
Purpose: Identify the most credible image model configuration for transfer-ready inference.
The final patient-level model is selected from the development sequence.
Best-checkpoint selection is based on validation behaviour under the patient-level split.
The saved clean checkpoint becomes the app-facing image artifact.
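Best-checkpoint selection on validation behaviour can be sketched as picking the epoch with the highest validation score. The source does not name the metric, so `val_auc` is an assumption, as are the `history` structure and the placeholder values, which are not results from the dissertation.

```python
def select_best(history, metric="val_auc"):
    """Return the record with the highest validation metric.

    `max` keeps the first maximum it sees, so with records ordered by
    epoch a tie resolves to the earliest epoch.
    """
    return max(history, key=lambda record: record[metric])

# Placeholder training log; the numbers are illustrative only.
history = [
    {"epoch": 1, "val_auc": 0.80},
    {"epoch": 2, "val_auc": 0.85},
    {"epoch": 3, "val_auc": 0.85},
]
best = select_best(history)
```

Selecting on validation data alone keeps the held-out test patients untouched until the final evaluation.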
Notebook 06
06 Evaluation And Error Analysis
Evaluates the patient-level image model with ROC, calibration, confusion, magnification, and failure analysis outputs.
Purpose: Produce the test-set evidence used throughout the written dissertation and web application.
Patient-level performance remains credible under the leakage-safe split.
Calibration and error analysis are surfaced alongside accuracy and ROC rather than hidden.
Magnification-specific behaviour is examined instead of assuming uniform performance.
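Two of these checks are easy to sketch: a binned calibration error and accuracy broken down by magnification. The version of expected calibration error below is one common binary variant (bin by predicted positive-class probability, compare mean prediction with observed positive rate); whether the notebook uses this exact definition is an assumption, and the function names are hypothetical.

```python
from collections import defaultdict

def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between predicted and observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in last bin
        bins[index].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_prob = sum(p for p, _ in bucket) / len(bucket)
        positive_rate = sum(y for _, y in bucket) / len(bucket)
        ece += len(bucket) / total * abs(mean_prob - positive_rate)
    return ece

def accuracy_by_group(preds, labels, groups):
    """Accuracy computed separately for each group, e.g. magnification."""
    correct, count = defaultdict(int), defaultdict(int)
    for pred, label, group in zip(preds, labels, groups):
        count[group] += 1
        correct[group] += int(pred == label)
    return {group: correct[group] / count[group] for group in count}

# Toy inputs, not real model outputs.
per_mag = accuracy_by_group([1, 0, 1, 1], [1, 0, 0, 1], [40, 40, 100, 100])
ece_perfect = expected_calibration_error([0.0, 1.0], [0, 1])
ece_overconfident = expected_calibration_error([1.0, 1.0], [1, 0])
```

A per-magnification table of this kind makes it visible when an aggregate score hides a weak magnification.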
Notebook 07
07 Wisconsin Review And Integration
Reviews the Wisconsin branch, documents its transfer contract, and aligns it with the image-side app integration.
Purpose: Prepare the tabular baseline for safe reuse without altering the original notebook or artifacts.
The Wisconsin branch's input contract is made explicit for downstream app integration.
The tabular branch is treated as a baseline and comparison anchor.
Notebook 08
08 Synthetic Pairing Design
Constructs the synthetic pairing logic used to test fusion strategies across independent unimodal datasets.
Purpose: Define how same-label and random pairing experiments are constructed while preserving the project’s non-clinical framing.
Pairings are explicitly synthetic and should never be interpreted as patient-level multimodal truth.
Same-label and random strategies are both retained for comparison.
The pairing process is auditable and reproducible across seeds.
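The two pairing strategies can be sketched as follows. This is an illustration under assumptions, not the notebook's code: each dataset is a list of hypothetical `(sample_id, label)` tuples, and `make_pairs` is an invented name. The pairs link unrelated samples and carry no patient-level meaning.

```python
import random

def make_pairs(tabular, images, strategy="same_label", seed=0):
    """Pair each tabular sample with one image sample.

    Returns (tabular_id, image_id, tabular_label, image_label) tuples.
    A fixed seed makes the pairing reproducible and auditable.
    """
    rng = random.Random(seed)
    pairs = []
    if strategy == "same_label":
        by_label = {}
        for image_id, label in images:
            by_label.setdefault(label, []).append(image_id)
        for tabular_id, label in tabular:
            pairs.append((tabular_id, rng.choice(by_label[label]), label, label))
    elif strategy == "random":
        for tabular_id, tabular_label in tabular:
            image_id, image_label = rng.choice(images)
            pairs.append((tabular_id, image_id, tabular_label, image_label))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return pairs

# Hypothetical sample identifiers and labels.
tabular = [("t0", "benign"), ("t1", "malignant"), ("t2", "malignant")]
images = [("i0", "benign"), ("i1", "malignant"), ("i2", "malignant")]

same_label_pairs = make_pairs(tabular, images, "same_label", seed=7)
random_pairs = make_pairs(tabular, images, "random", seed=7)
```

Random pairing serves as a control: if a fusion model scores as well on random pairs as on same-label pairs, the apparent gain is unlikely to come from any cross-modal signal.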
Notebook 09
09 Fusion Experiments
Runs exploratory early- and late-fusion experiments on synthetic pairings and benchmarks them against unimodal baselines.
Purpose: Evaluate whether synthetic pairings can still support useful fusion-method comparisons under data scarcity.
Fusion outputs are reported as exploratory only.
Repeated-seed evaluation surfaces stability rather than relying on a single run.
The experiments are retained for comparison while keeping the claim boundary explicit.
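As a rough sketch of the ideas involved, early fusion combines features before a single model sees them, late fusion combines the probabilities of two unimodal models, and repeated-seed evaluation reports spread as well as the mean. The weighted average and all names below are assumptions chosen for illustration; the source does not specify the fusion formulas used.

```python
import statistics

def early_fusion_features(tabular_features, image_embedding):
    """Early fusion: concatenate both feature vectors into one input."""
    return list(tabular_features) + list(image_embedding)

def late_fusion(p_tabular, p_image, weight=0.5):
    """Late fusion: weighted average of two malignancy probabilities."""
    return weight * p_tabular + (1.0 - weight) * p_image

def summarize_over_seeds(scores):
    """Mean and sample standard deviation of one metric across seeds."""
    spread = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return {"mean": statistics.mean(scores), "std": spread}

# Toy values, not experimental results.
fused_input = early_fusion_features([0.1, 0.2], [0.3, 0.4, 0.5])
fused_prob = late_fusion(0.8, 0.4)
summary = summarize_over_seeds([0.5, 0.5, 0.5])
```

Reporting the standard deviation next to the mean shows whether a difference between fusion strategies exceeds seed-to-seed variation.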
Notebook 10
10 Model Comparison And Joint Analysis
Brings the tabular, image, and synthetic-fusion branches into one comparison space for the final dissertation analysis.
Purpose: Create the cross-model evidence tables and figures used to explain the strengths and limits of each branch.
The tabular branch remains the strongest benchmark numerically.
The image branch provides the main new contribution.
Synthetic fusion comparisons are kept visible but carefully caveated.
Notebook 11
11 Results Synthesis And Defense Pack
Packages the core dissertation claims, defense figures, and final synthesis outputs for written delivery and presentation.
Purpose: Provide the final narrative layer that turns the research workflow into a defendable dissertation artifact set.
The final pack distills the project into a concise claim set.
Figures and tables are curated for communication rather than exploration alone.
The web experience can reuse this notebook as its top-level narrative anchor.
Notebook 12
12 Demo Preset Generation
Generates the traceable preset manifest used by the web app for tabular, image, and synthetic-fusion demo cases.
Purpose: Create auditable demo inputs without hard-coding preset data directly inside the application.
Tabular presets reuse the existing BreaScope AI profile values and validate them against Wisconsin feature ranges.
Image presets cover each binary BreaKHis label and magnification using real held-out examples.
Fusion presets are explicitly synthetic story cases with recorded probability construction.
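Validating a tabular preset against feature ranges might look like the sketch below. The feature names, the ranges, and the `validate_preset` helper are hypothetical placeholders; real ranges would be derived from the Wisconsin training data rather than typed in by hand.

```python
def validate_preset(preset, feature_ranges):
    """Return a list of problems; an empty list means the preset passes."""
    errors = []
    for name, (low, high) in feature_ranges.items():
        if name not in preset:
            errors.append(f"missing feature: {name}")
        elif not low <= preset[name] <= high:
            errors.append(f"{name}={preset[name]} outside [{low}, {high}]")
    for name in preset:
        if name not in feature_ranges:
            errors.append(f"unexpected feature: {name}")
    return errors

# Placeholder ranges for illustration; not the real Wisconsin ranges.
feature_ranges = {"radius_mean": (5.0, 30.0), "texture_mean": (5.0, 40.0)}

good_preset = {"radius_mean": 14.0, "texture_mean": 19.0}
bad_preset = {"radius_mean": 99.0, "extra": 1.0}

good_errors = validate_preset(good_preset, feature_ranges)
bad_errors = validate_preset(bad_preset, feature_ranges)
```

Collecting every problem instead of stopping at the first makes a rejected preset easier to audit in one pass.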