The Architecture of Discovery: Multi-Objective Diffusion for Spatial Proteomics
The "Where" Problem in Biology
Spatial proteomics, here via Imaging Mass Cytometry (IMC), is revolutionary because it tells us where proteins sit within a tissue, not just how abundant they are. But the data is incredibly scarce. My goal was simple: use Generative AI to hallucinate realistic, high-fidelity biological tissues to augment these small datasets.
The path to getting there, however, was anything but linear. Here is the story of how we evolved the architecture.
Phase 1: Naive Pixel Diffusion
My first attempt was a standard diffusion model operating in pixel space, sampled with DDIM. I trained it directly on the raw protein channels.
- Result: Disappointing. The images were noisy, lacked clear cellular boundaries, and often looked like textured noise rather than biological tissue.
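For reference, the core of this phase is the deterministic (eta = 0) DDIM update. A minimal numpy sketch, where `eps_pred` stands in for the U-Net's noise prediction and the two `alpha_bar` values are cumulative noise-schedule terms:

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    # recover the model's current estimate of the clean image x0
    x0_hat = (x_t - np.sqrt(1 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # deterministic DDIM update: re-noise x0_hat to the previous timestep
    return np.sqrt(alpha_bar_prev) * x0_hat + np.sqrt(1 - alpha_bar_prev) * eps_pred
```

Iterating this from t = T down to t = 0 with the trained noise predictor yields a sample.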
Phase 2: Architectural Overhaul
I realized standard U-Nets weren't capturing the complex, multi-scale features of tissue. I modified the architecture—adding deeper residual blocks and enhanced attention mechanisms—while still operating in pixel space.
- Result: Significant improvement in visual quality. The textures looked "biological," but the computational cost was massive, and training was slow.
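As a rough illustration of the attention ingredient, a single-head self-attention residual block over flattened spatial positions might look like this. A numpy sketch, not the actual architecture; the weight matrices `Wq`, `Wk`, `Wv` are placeholders for learned parameters:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    # x: (N, C) — N flattened spatial positions, C channels; single head
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    # row-wise softmax over attention scores
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

def attn_res_block(x, Wq, Wk, Wv):
    # residual connection: attention output is added back onto the input,
    # letting distant tissue regions inform each other without losing locality
    return x + self_attention(x, Wq, Wk, Wv)
```

Attention at low-resolution feature maps is one reason pixel-space cost balloons: the score matrix is quadratic in the number of spatial positions.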
Phase 3: The Latent Trade-off (Speed vs. Detail)
To generate high-resolution images (512x512) efficiently, I moved to a Latent Diffusion Model (LDM). By compressing images into a latent space via a VAE, we could train much faster.
- The Trade-off: While training became efficient, we noticed a slight dip in fine-grained detail compared to the pixel-space models. It was a classic engineering compromise: trading a bit of fidelity for the ability to scale up.
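The compression behind this trade-off is easy to see with a toy stand-in for the VAE: an 8-fold average-pool "encoder" and nearest-neighbour "decoder". A real LDM learns both mappings, but the detail loss shows up the same way:

```python
import numpy as np

def encode(img, f=8):
    # toy "VAE encoder": f-fold average pooling (a real LDM learns this mapping)
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def decode(z, f=8):
    # toy "decoder": nearest-neighbour upsampling back to pixel space
    return np.repeat(np.repeat(z, f, axis=0), f, axis=1)

img = np.random.rand(512, 512)             # one 512x512 protein channel
z = encode(img)                            # (64, 64): 64x fewer values to diffuse
recon = decode(z)
detail_loss = np.mean((img - recon) ** 2)  # > 0: fine-grained detail is gone
```

Diffusion then runs on `z`, which is why training speeds up; the nonzero `detail_loss` is the fidelity traded away.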
Phase 4: Structural Adherence via Mesmer
A major issue remained: the model would sometimes generate proteins in biologically impossible locations. To fix this, I experimented with Structural Conditioning. I used Mesmer (a deep-learning cell segmentation model) to extract masks of nuclei and cell membranes, then conditioned the diffusion process on these masks.
- Result: The model started respecting cellular boundaries. Nuclei proteins stayed in the nucleus; membrane proteins stayed on the membrane.
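Mechanically, the conditioning can be as simple as stacking the Mesmer masks onto the denoiser's input as extra channels. A minimal sketch; the function name and channel layout are illustrative, not the exact implementation:

```python
import numpy as np

def condition_on_masks(x_t, nuc_mask, mem_mask):
    # x_t: (C, H, W) noisy protein channels at diffusion step t
    # nuc_mask, mem_mask: (H, W) binary masks from segmentation
    # The denoiser sees the masks unchanged at every step, so it can learn
    # to keep nuclear signal inside nuclei and membrane signal on membranes.
    return np.concatenate([x_t, nuc_mask[None], mem_mask[None]], axis=0)

x_t = np.random.randn(3, 64, 64)
nuc = (np.random.rand(64, 64) > 0.5).astype(np.float64)
mem = 1.0 - nuc
inp = condition_on_masks(x_t, nuc, mem)   # (5, 64, 64)
```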
The Breakthrough: Multi-Objective Training
Even with structural guidance, the model sometimes failed to capture the correlations between different proteins (e.g., if Protein A is high, Protein B should also be high).
The key breakthrough came when I designed a Multi-Objective Training strategy. I trained the model to perform two distinct tasks simultaneously:
- Unconditional Generation: "Dream up a random tissue sample from scratch."
- Protein-to-Protein Translation: "Given this specific protein channel (e.g., DNA), generate the missing channel (e.g., Ki-67) for the same tissue."
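A sketch of how the two objectives can be interleaved during training: each step randomly picks either the unconditional task or a translation pair drawn from the same tissue. The names and mixing probability are illustrative, not the actual training code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_training_task(channels, p_translate=0.5):
    # channels: dict of marker name -> (H, W) image for one tissue sample
    names = list(channels)
    if rng.random() < p_translate:
        # protein-to-protein translation: see one channel, predict another
        src, tgt = rng.choice(names, size=2, replace=False)
        return {"task": "translate", "cond": channels[src], "target": channels[tgt]}
    # unconditional generation: no conditioning signal at all
    tgt = rng.choice(names)
    return {"task": "uncond", "cond": None, "target": channels[tgt]}
```

The denoiser then trains on whichever (conditioning, target) pair the sampler returns, so a single set of weights learns both behaviours.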
Why this changed everything
By forcing the model to solve the translation task, it had to learn the underlying biological relationships between proteins. It couldn't just memorize textures; it had to understand that Protein A implies Protein B.
This significantly improved the medical plausibility of the samples. It also unlocked a powerful real-world application: Missing Channel Imputation. We can now take incomplete biological experiments and plausibly "fill in" the missing protein markers, saving researchers time and money.
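In use, imputation is just the translation task applied at inference time: condition on a measured channel and sample the missing one. A hypothetical sketch, where `translate` stands in for the trained diffusion sampler:

```python
import numpy as np

def impute_missing_channels(measured, wanted, translate):
    # measured: dict of marker name -> (H, W) channels the experiment captured
    # wanted:   marker names missing from the panel
    # translate: trained translation sampler — here any callable taking
    #            (conditioning image, target marker name) -> (H, W) image
    cond = next(iter(measured.values()))  # e.g. the DNA channel
    return {name: translate(cond, name) for name in wanted}

# toy stand-in for the trained sampler, just to show the call pattern
fake_sampler = lambda cond, name: np.zeros_like(cond)
panel = impute_missing_channels({"DNA": np.ones((8, 8))}, ["Ki-67"], fake_sampler)
```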