
Bridging Benchling and AWS: How GenAI and MCP Accelerate Scientific Workflows

Watch now: Kevin Moore & Vasisht Tadigotla: Accelerating Cross-Assay Correlation with GenAI
Transform Hours of Tedious Manual Data Analysis Into Minutes with Connected Scientific Data and AI

Like many modern biotech companies, Sail Biomedicines conducts multiple types of experiments on its translatable circular RNA technology, often using the same samples. But answering simple experimental questions can take hours of work since results and metadata are spread across multiple systems. In Sail's case, two of their most important systems are Benchling and AWS.

  • Benchling: The system of record for bench scientists. It tracks samples, assays, and experimental context, and holds low-throughput experiment results such as ELISAs, PCR, and flow cytometry.
  • AWS: The home for computational biology, where Nextflow pipelines generate terabytes of RNA-Seq and NGS data.

Even simple questions require data from both of these systems. For example, a scientist might want to know: "Which circular RNA construct(s) generated the highest protein expression, and is there a correlation between RNA expression and protein expression?"

They typically have to ask a colleague on the data science team, who then must spend hours:

  • Finding the relevant datasets in Benchling and AWS
  • Pulling expression data from Nextflow results in Amazon S3
  • Matching the metadata with results in the Benchling warehouse
  • Pulling the data into a Jupyter notebook
  • Generating a set of plots
  • Writing a report to share the results with the team

This delay costs teams more than just time. When scientists are stuck waiting to answer “did my experiment work and what should I do next?”, innovation slows down.

How AI and Model Context Protocol (MCP) Enable Faster Scientific Data Analysis

At the 2025 Nextflow Summit, Vasisht Tadigotla from Sail Biomedicines presented a new, AI-first approach that replaces these time-consuming, manual steps with a single AI prompt. By connecting Anthropic's Claude Sonnet 4.5 to MCP servers from Benchling and Quilt.bio (the scientific data management system for AWS), bench scientists can get answers to their cross-assay questions in minutes instead of hours, and loop in the computational biology team only when deeper analysis is needed.

Giving LLMs the Tools They Need for Scientific Data Analysis

Today's LLMs can reason about scientific questions, but their context windows are much smaller than biotech datasets. To help scientists analyze data across systems, a single AI session must search and query data from multiple sources, so it relies on tools for:

  • Searching and filtering massive datasets.
  • Querying structured tables (SQL over columnar data).
  • Joining data across disparate systems (e.g., Bench metadata ↔ Pipeline outputs).
  • Executing complex computations.
  • Producing reproducible artifacts (versioned datasets, plots).
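To illustrate the querying and joining capabilities in the list above, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for a real columnar query engine, and every table name, column name, and value is hypothetical. The point it tries to show is that the join and aggregation run next to the data, so only a small summary needs to reach the model's context window.

```python
import sqlite3

# Illustrative stand-in: an in-memory SQLite database plays the role of the
# source systems. All table names, columns, and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bench_metadata (sample_id TEXT PRIMARY KEY, construct TEXT);
    CREATE TABLE pipeline_expression (sample_id TEXT, gene TEXT, tpm REAL);
""")
conn.executemany(
    "INSERT INTO bench_metadata VALUES (?, ?)",
    [("S1", "construct_A"), ("S2", "construct_A"), ("S3", "construct_B")],
)
conn.executemany(
    "INSERT INTO pipeline_expression VALUES (?, ?, ?)",
    [
        ("S1", "GENE1", 10.0),
        ("S2", "GENE1", 14.0),
        ("S3", "GENE1", 30.0),
        ("S4", "GENE1", 99.0),  # no bench metadata, so it drops out of the join
    ],
)


def mean_expression_by_construct(connection, gene):
    """Join bench metadata to pipeline output and aggregate inside the database."""
    query = """
        SELECT m.construct, AVG(e.tpm) AS mean_tpm, COUNT(*) AS n_samples
        FROM bench_metadata AS m
        JOIN pipeline_expression AS e ON e.sample_id = m.sample_id
        WHERE e.gene = ?
        GROUP BY m.construct
        ORDER BY m.construct
    """
    return connection.execute(query, (gene,)).fetchall()


# Only this small summary, not the raw rows, would be returned to the model.
summary = mean_expression_by_construct(conn, "GENE1")
for construct, mean_tpm, n_samples in summary:
    print(construct, mean_tpm, n_samples)
```

In a real deployment the query would run against the source systems themselves, with their own access controls, rather than against a local copy.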

That’s where MCP comes in. MCP is a standard that allows large language models (LLMs) to interact with external tools in a structured, auditable way. Instead of each system having its own isolated AI, MCP allows one AI session to securely access multiple systems through standardized tool interfaces. Crucially, the model does not need to ingest all the data: data stays in its source system, and access control is enforced by that system, which makes MCP servers well-suited to highly regulated, large-scale scientific environments.

For companies like Sail that manage large-scale scientific data in AWS, AI requires an extra layer to transform raw S3 objects into analysis-ready data. Quilt provides this layer. With Quilt, users can store, organize, and retrieve terabyte-scale data within Amazon S3 and attach it to Benchling Notebooks through an integration. This keeps data securely managed, easily accessible, and fully labeled, which streamlines workflows in the lab.

MCP Tools Needed to Bridge Benchling and AWS
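At the protocol level, an MCP server describes each tool with a name, a description, and a JSON Schema for its inputs, and a client invokes it with a JSON-RPC 2.0 `tools/call` request. The sketch below shows that shape in Python. The tool `search_packages` and its parameters are hypothetical, invented for illustration; they are not the actual tools exposed by the Benchling or Quilt MCP servers.

```python
import json

# Hypothetical tool definition. The field names (name, description, inputSchema)
# follow the MCP specification; the tool itself is invented for illustration.
search_tool = {
    "name": "search_packages",
    "description": "Search data packages by keyword.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer"},
        },
        "required": ["query"],
    },
}


def build_tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request after a basic required-argument check."""
    required = tool["inputSchema"].get("required", [])
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }


request = build_tool_call(1, search_tool, {"query": "circular RNA RNA-seq", "limit": 5})
print(json.dumps(request, indent=2))
```

Because every call is a structured message like this one, each request the model makes can be logged and reviewed, and the server decides what data the caller is allowed to see.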

A Concrete Example: RNA–Protein Correlation at Sail

Sail scientists wanted to understand how RNA expression from sequencing pipelines in S3 correlated with protein abundance measured in assays in Benchling.

Using Anthropic's Claude Sonnet 4.5 connected to MCP servers from Quilt and Benchling, Sail scientists ran a single prompt that:

  1. Located relevant Quilt packages in S3
  2. Queried RNA-seq expression tables at scale
  3. Pulled matching protein assay results from Benchling
  4. Aligned samples and gene identifiers
  5. Generated comparative visualizations
  6. Packaged the analysis as a new Quilt package

Total time to analysis: ~10 minutes

No file wrangling.
No hand-written SQL.
No hand-off to a data science team.

With GenAI and MCP bridging their Benchling and AWS systems, Sail’s bench scientists can now:

  • Perform meaningful analyses themselves
  • Quickly see computational results in their experimental context
  • Iterate faster on hypotheses, accelerating experimentation

Data and platform teams can:

  • Avoid bespoke integrations
  • Preserve AWS-native governance
  • Focus on advanced analysis rather than routine requests

MCP enables AI systems to accelerate scientific workflows that cut across system boundaries. Data analysis that normally requires hours to days of tedious, manual work across bench scientists and computational teams takes minutes with this breakthrough approach. To learn more, watch the video:

Watch now: Kevin Moore & Vasisht Tadigotla: Accelerating Cross-Assay Correlation with GenAI

 
