Introducing Tacit Labs

by Nicole Fitzgerald & Anne Marie Droste

Biology offers greater potential to improve human life than any other scientific domain. It's also the most complex — demanding measurement, integration, and reasoning across multiple scales, over hundreds of variables interacting within systems that we still only partially understand.

Drug development is the ultimate test of our ability to intervene on those systems and one of the highest leverage activities humans can undertake. The discovery of new mechanisms and the design of new drugs have given us protection from parasites and pathogens, cured many once deadly diseases, and given us greater autonomy over our personal well-being than at any point in history. Today, however, the completion of this test (the clinical trial) takes us around 10 years and costs approximately $1B. Coupled with the fact that programs that enter the clinic still fail at a rate of 90%, the cost of verification is exceedingly high.
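To make the cost of verification concrete, here is a back-of-the-envelope sketch using only the figures quoted above. It rests on simplifying assumptions that are ours, not measured facts: every program that enters the clinic is treated as an independent attempt, and every attempt is charged the full ~$1B. Failed programs usually stop before incurring the full cost, so read the result as a rough upper bound. The function names are hypothetical.

```python
# Figures quoted in the text; how they are combined here is an assumption.
COST_PER_PROGRAM_USD = 1_000_000_000  # ~$1B per clinical program
FAILURE_RATE = 0.90                   # share of clinical programs that fail


def expected_programs_per_success(failure_rate: float) -> float:
    """Mean number of independent attempts until the first success.

    This is the mean of a geometric distribution: 1 / success_rate.
    """
    success_rate = 1.0 - failure_rate
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return 1.0 / success_rate


def expected_cost_per_success(cost_per_program: float, failure_rate: float) -> float:
    """Expected spend per successful program if every attempt costs the same."""
    return cost_per_program * expected_programs_per_success(failure_rate)


programs = expected_programs_per_success(FAILURE_RATE)
cost = expected_cost_per_success(COST_PER_PROGRAM_USD, FAILURE_RATE)
print(f"~{programs:.0f} programs and ~${cost / 1e9:.0f}B per success")
```

Under these assumptions, a 90% failure rate multiplies the price of a single verified success by roughly ten.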

AI has progressed most rapidly in domains where verification is fast and abundant. In code, a proposed solution can be checked by a compiler, executed against tests, benchmarked for performance, or evaluated in a sandbox. In math, formal systems like Lean can turn proofs into machine-checkable objects, where each step must satisfy the rules of the system. Biology does not yet have an equivalent verification loop. We founded Tacit to build it.

The lack of fast, dense feedback for large-scale agentic systems is the greatest bottleneck in training and meaningfully deploying models in this domain. We are starting with long-horizon evaluations across the drug discovery and development pipeline to understand exactly when and where agentic systems fail.

Biotech companies are inefficient search algorithms and biological experiments are expensive verifiers

Making a drug is a uniquely difficult process to model; in fact, it is one of the longest time-horizon tasks that we regularly undertake.

Drug discovery is a search through a space of dependent choices: target, modality, molecule, assay, biomarker, patient population, and development path. Each decision constrains the next, and a single wrong assumption can invalidate the chain. The structure of today's biotech companies is no match for the scale of this search space or its combinatorial complexity. At inception, the company itself is already a commitment: to a thesis, a team, a platform, a set of tools, and a narrow region of biology. Exploring another branch often means assembling new expertise, running new experiments, finding a different set of vendors, and raising more capital. Current biotechs are slow, process-heavy, sequential search algorithms, where the cost of changing course is starting over from the beginning.

The feedback loop is just as difficult: there is no simple test to tell you if a drug will work. Drug development relies on a sequence of increasingly slow and expensive proxies, each giving only a partial signal about what matters. A molecule can bind cleanly and fail in cells. It can work in cells and fail in animals. It can work in animals and fail in humans. It can look safe in early studies and fail when tested in the right patients, at the right dose, over the right time horizon. The strongest verifier is testing a drug in a human being, which for a number of reasons (among them safety, ethics, and cost) is reserved for the final stage of the process. Until then, teams are forced to make expensive decisions from weak and often conflicting signals.

Agents parallelize search and biochemical models compress experimental feedback

We now have increasingly powerful foundation models for modeling biology at the molecular, cellular, and tissue levels. These models compress large datasets of experimental outcomes into in silico signals: how molecules bind, fold, clear, distribute, and fail; how cells, pathways, tissues, and patients respond under different conditions.

These in silico signals are increasingly being looked to as cheaper, faster, and soon more reliable stand-ins for wet lab experimentation. This allows agentic systems to quickly and cheaply garner signal against learned approximations of biology and chemistry, backtrack if necessary, and orient towards the most promising routes before the strongest, most expensive verifiers are invoked.
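The pattern described here (score cheaply, backtrack, and only then pay for the strong verifier) can be sketched as a depth-first search with pruning. This is a minimal illustration, not a description of any real system: the stage options, the toy scores, the `cheap_score` and `expensive_verify` functions, and the threshold are all hypothetical stand-ins.

```python
from typing import Callable, Optional, Sequence, Tuple

Path = Tuple[str, ...]


def search(
    stages: Sequence[Sequence[str]],
    cheap_score: Callable[[Path], float],
    expensive_verify: Callable[[Path], bool],
    threshold: float,
) -> Tuple[Optional[Path], int]:
    """Depth-first search over a chain of dependent choices.

    Partial paths are ranked and pruned with the cheap scorer; the expensive
    verifier is only called on complete paths. Returns the first verified
    path (or None) and the number of expensive calls made.
    """
    expensive_calls = 0

    def extend(path: Path) -> Optional[Path]:
        nonlocal expensive_calls
        if len(path) == len(stages):
            expensive_calls += 1
            return path if expensive_verify(path) else None
        # Try the most promising options first.
        options = sorted(
            stages[len(path)],
            key=lambda choice: cheap_score(path + (choice,)),
            reverse=True,
        )
        for choice in options:
            candidate = path + (choice,)
            if cheap_score(candidate) < threshold:
                break  # every remaining option scores lower: backtrack
            found = extend(candidate)
            if found is not None:
                return found
        return None

    return extend(()), expensive_calls


# Hypothetical toy problem: 2 targets x 2 modalities x 3 molecules = 12 paths.
STAGES = [
    ("target_A", "target_B"),
    ("antibody", "small_molecule"),
    ("mol_1", "mol_2", "mol_3"),
]
TOY_SCORES = {
    "target_A": 0.9, "target_B": 0.6,
    "antibody": 0.4, "small_molecule": 0.8,
    "mol_1": 0.7, "mol_2": 0.9, "mol_3": 0.2,
}


def cheap_score(path: Path) -> float:
    """Stand-in for an in silico model: a path is as good as its weakest link."""
    return min(TOY_SCORES[choice] for choice in path)


def expensive_verify(path: Path) -> bool:
    """Stand-in for a wet-lab or clinical result the cheap model cannot see."""
    return path[0] == "target_B" and path[2] == "mol_2"


best, calls = search(STAGES, cheap_score, expensive_verify, threshold=0.5)
print(best, calls)
```

In this toy setup the cheap scorer is wrong about the most attractive target, so the search backtracks to the second one, yet it still reaches a verified path after far fewer expensive calls than exhaustively testing all 12 combinations.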

While these models serve as increasingly strong intermediate verifiers, they hold little information as to how the final outcome of the program will play out. Even the cell-based and animal-based experiments they aim to replace have poor predictive power over whether a drug will perform well in human biology.

The full impact of these advances will be unlocked when we can optimize the drug development process as a set of interconnected decisions that correlate with human outcomes. Dense, intermediate rewards must correlate with sparse, long-horizon rewards, allowing agentic systems to efficiently learn end-to-end.
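One simple way to check whether a dense intermediate signal tracks a sparse final outcome is to correlate the two across completed programs. The sketch below computes a Pearson correlation between a proxy score and a binary outcome. The data are invented purely for illustration, and a real analysis would need far more programs and a more careful treatment of selection effects.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one entry per completed program.
proxy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]  # dense, intermediate reward
human_outcomes = [1, 1, 0, 1, 0, 0]            # sparse reward: 1 = worked in humans

r = pearson(proxy_scores, human_outcomes)
print(f"proxy vs. outcome correlation: {r:.2f}")
```

A proxy whose correlation with the final outcome is near zero gives an agent plenty of feedback but little of it is useful, which is why the intermediate rewards have to be calibrated against human outcomes.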

The verification stack for biology requires evaluation across every link of the decision chain

Drug development will not be transformed by isolated capability gains. It will be transformed by composing atomic tasks into a decision loop that makes stronger, better-informed choices along the chain of decisions that leads to a drug.

Tacit is starting by building realistic, long-horizon evaluations that test whether AI systems can make linked decisions across drug discovery and development. These evaluations combine human reasoning and task design with well-calibrated tools (biological foundation models, raw biological readouts, and easy-to-access lab infrastructure).

How we do this:

- We build functional, long-horizon evaluations that represent the full trajectory of discovering and developing a drug. We bring these simulated evaluations as close to reality as possible, ensuring that they are multimodal, information-dense, and augmented with real-world experimental data and contextual artifacts.
- These evaluations allow us to understand model performance, exposing where agentic systems are weakest and where corresponding experimental and human data is most needed, which we in turn supply.
- As models and agents become more capable on real-world research tasks, we will build out the infrastructure that enables them to operate safely and effectively in real-world research environments.
- We can continue to run the evaluation loop on different problems and parts of the stack to expose new gaps and weaknesses, making the end-to-end stack more capable and the verification loop faster, cheaper and closer to reality.

In short, we build evaluations that leverage verification methods from across the ecosystem and stitch them together into increasingly long and realistic drug development loops. Over time, these loops converge to the inference-time infrastructure required to enable autonomous drug discovery at scale.

Towards one million zero-person biotechs

In the limit, these forces will compress drug development to a function of compute.

The shape of the company that will bring the next generation of drugs to market will look radically different as a result: model-centric, massively parallelized, and increasingly autonomous. We are building towards a future in which a single person with a laptop and a credit card will be able to specify a disease area, target, or therapeutic hypothesis and then spin up a team of agents to explore and deliver a human-ready molecule.

Today, starting a biotech company is a large commitment to a narrow thesis, and which problems get worked on is a function of how many teams and how much capital can be assembled around them. Resources concentrate on a small number of these problems, and most never get a shot.

In the future, the number of biotechs will no longer be constrained by human capital, whether organizational capacity or scientific expertise. Thousands of new biotechs will pursue millions of therapeutic hypotheses in parallel, including rare diseases and indications that are not popular or perceived as economically viable today.

The shift from scientific scarcity to abundance is sure to change the practice of medicine and our understanding of disease, from personalized medicines and combination therapies to functional consumer products and over-the-counter compounds.

To get there, our scientific infrastructure will need to be radically reconfigured to be used by models and agents for massively parallel discovery and development. This needs to start with evaluations and tooling that capture context and verify results across the end-to-end decision chain.