tech
Jun 16, 2025

CHEESE Search: A Billion-Scale Vector Molecular Search

This is my billion-scale vector engine for 3D shape & electrostatic molecular similarity. It turns ligand-based screening into nearest-neighbour lookups, so you can *search, rank, and visualise* huge chemical spaces in seconds.
Visualisation of chemical spaces using CHEESE embeddings.
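
To make the "nearest-neighbour lookup" idea concrete, here is a minimal sketch in plain Python. It assumes every molecule has already been mapped to a fixed-length embedding. The toy vectors, the molecule names, and the `cosine` and `top_k` helpers are all hypothetical, and cosine similarity is an assumed metric rather than necessarily the one CHEESE uses.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, library, k=2):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = ((name, cosine(query, vec)) for name, vec in library.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Made-up embeddings for illustration; real ones would come from a trained model.
library = {
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.9, 0.1, 0.0],
    "mol_c": [0.0, 1.0, 0.0],
    "mol_d": [0.0, 0.0, 1.0],
}
query = [0.8, 0.3, 0.0]

hits = top_k(query, library, k=2)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

Screening then amounts to embedding the query ligand once and ranking the library by similarity. A brute-force scan like this is only practical for small libraries; at billions of vectors an index or approximate search is normally needed.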

What I built:

  • Architected and trained the large AI models and led the R&D behind CHEESE.
  • Implemented a vector database from scratch (no FAISS/Pinecone) that now indexes 30B+ vectors and costs essentially nothing to run.
  • Designed and shipped the Python FastAPI backend as well as the Vite + React UI.
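
The post does not describe how the vector database is laid out, so the following is not CHEESE's design. It is a generic sketch of one way a from-scratch store can stay cheap: keep vectors in a flat binary file and stream them from disk, holding only the current best hits in memory. The file name, `DIM`, `write_vectors`, and `scan_top_k` are hypothetical, and a plain dot product is an assumed scoring function.

```python
import heapq
import os
import struct
import tempfile

DIM = 4                              # toy dimensionality, chosen for illustration
RECORD = struct.Struct(f"<{DIM}f")   # one little-endian float32 vector per record


def write_vectors(path, vectors):
    """Write a flat file in which record i starts at byte offset i * RECORD.size."""
    with open(path, "wb") as fh:
        for vec in vectors:
            fh.write(RECORD.pack(*vec))


def scan_top_k(path, query, k=3, chunk_records=2):
    """Stream the file in chunks, keeping only the k best dot products in memory."""
    heap = []  # min-heap of (score, row_id); heap[0] is the weakest kept hit
    row_id = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(RECORD.size * chunk_records)
            if not chunk:
                break
            for vec in RECORD.iter_unpack(chunk):
                score = sum(q * v for q, v in zip(query, vec))
                if len(heap) < k:
                    heapq.heappush(heap, (score, row_id))
                elif score > heap[0][0]:
                    heapq.heapreplace(heap, (score, row_id))
                row_id += 1
    return sorted(heap, reverse=True)  # best first


vectors = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
]
query = [1.0, 0.0, 0.0, 0.0]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.f32")
    write_vectors(path, vectors)
    hits = scan_top_k(path, query, k=2)

print(hits)
```

Memory use here depends on `k` and the chunk size, not on the number of stored vectors, which is why a design along these lines can run on cheap storage. The trade-off is a full scan per query, so a real system at this scale would add some form of partitioning or approximate search on top.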

👉 Try the demo: cheese.deepmedchem.com
📄 Read the paper: CHEESE: 3D Shape and Electrostatic Virtual Screening in a Vector Space

For a bit of the backstory on my frontend journey, see My Flight from Flask to React, via Streamlit and Dash.


Check out this ACS Spring presentation about it: