October 2022 marks an important milestone in the evolution of regulatory document automation: we have submitted a grant application seeking support for the development of Overdrive, an adaptive, artificial intelligence-driven platform designed to transform the way clinical trial protocols are created.
The proposal represents the culmination of nearly a decade of thinking about how software can reduce the growing administrative burden associated with pharmaceutical development. While an earlier initiative focused on automating Clinical Study Reports (CSRs) [1], the new proposal shifts attention further upstream in the development pathway to the clinical trial protocol itself, the document that ultimately determines the success or failure of a study.
The timing could hardly be more appropriate. The pharmaceutical industry is emerging from the COVID-19 pandemic while simultaneously confronting rising development costs, increasing protocol complexity, shortages of experienced clinical scientists, and mounting regulatory expectations [2][3][4]. Clinical development remains a lengthy and expensive endeavour, with average development timelines frequently extending beyond a decade and development costs continuing to rise across both large and emerging biotechnology companies [2][5].
Within this challenging environment, the clinical protocol has arguably become one of the most strategically important documents in drug development.
Protocol Design as a Critical Bottleneck
Every clinical trial begins with a protocol. It defines the scientific rationale, objectives, methodology, eligibility criteria, study procedures, statistical analyses, safety monitoring plans and operational requirements that guide trial execution [6].
The quality of this document directly influences regulatory approval, site activation, patient recruitment, protocol compliance, data integrity and study completion. Unfortunately, protocol development has become increasingly difficult.
Over the past two decades, clinical trial protocols have grown substantially more complex. Analyses conducted by the Tufts Center for the Study of Drug Development have demonstrated significant increases in study procedures, eligibility criteria, endpoints and operational requirements across multiple therapeutic areas [7][8]. Modern protocols frequently require coordination across multiple countries, healthcare systems, disciplines and regulatory environments, creating unprecedented challenges for sponsors and investigators.
This increasing complexity has consequences. Poorly designed protocols often lead to delays, amendments and avoidable costs. Protocol amendments are particularly problematic because they frequently occur after trial initiation, requiring regulatory resubmissions, retraining of study personnel and modifications to operational procedures [8]. Industry analyses suggest that amendments can add months to development timelines while substantially increasing overall programme costs [9][10].
Recruitment challenges further compound the problem. Clinical trials routinely struggle to identify and retain suitable participants, with overly complex designs frequently contributing to recruitment failure and participant withdrawal [11]. Consequently, protocols are often amended to relax the eligibility criteria.
For an industry already under pressure to improve productivity, protocol development has become an increasingly important target for innovation.
Building Upon Earlier Ideas
The origins of Overdrive can be traced back almost ten years.
In 2013, we submitted a grant application exploring the possibility of automating Clinical Study Report production [1]. That project recognised that regulatory documents are highly structured information products. Much of the information required to produce them already exists within electronic systems, yet considerable manual effort remains necessary to transform data into submission-ready documents. Although our 2013 proposal was not funded, the underlying concept remained compelling.
What has changed since then is the technological landscape.
The past decade has witnessed remarkable advances in cloud computing, machine learning, natural language processing (NLP), digital health infrastructure and large-scale biomedical data repositories. Public databases now contain hundreds of thousands of clinical trial records. Sophisticated NLP algorithms can now identify relationships within complex text corpora and extract meaningful information from unstructured documents [12][13][14][15]. Our own ambitions, which were modest when they rested on our experience with the machine learning company Cardionetics, have expanded accordingly.
These developments have created opportunities that simply did not exist when the original CSR automation concept was proposed. Rather than focusing on reporting completed studies, Overdrive seeks to improve studies before they begin.
Introducing Overdrive
Overdrive is envisioned as an adaptive, intelligence-based software platform capable of assisting users throughout the protocol design process. The proposed system combines machine learning, natural language processing, predictive analytics and regulatory intelligence within a single integrated environment.
The platform would draw upon multiple information sources, including historical protocols, regulatory guidance documents, standard operating procedures and publicly accessible trial repositories containing hundreds of thousands of protocol records. Each of these sources could be tailored to individual organisations, templates and therapy areas.
Unlike conventional protocol-authoring systems that rely primarily on static templates, Overdrive is intended to learn continuously from accumulated experience and user interactions. The system would analyse previous study designs, identify relevant patterns and assist users in generating protocol content tailored to specific therapeutic areas, development phases and regulatory jurisdictions.
Our objective is (again) ambitious: to reduce protocol development timelines from the current 6 to 8 weeks to as little as 1 to 2 weeks. Importantly, like its previously proposed predecessor [1], Overdrive is not intended to replace protocol authors or clinical scientists. Instead, it seeks to augment human expertise by automating repetitive activities, improving consistency and providing evidence-based decision support.
Harnessing Artificial Intelligence for Protocol Development
Artificial intelligence has already demonstrated substantial potential across healthcare and biomedical research. Natural language processing systems have successfully extracted information from electronic health records, scientific publications and clinical documentation [12][13][14][15][16]. Machine learning approaches are increasingly being applied to patient stratification, predictive modelling and drug development decision making [17][18]. Overdrive aims to apply these same principles to protocol development.
Using NLP techniques, the platform would analyse protocols from successfully concluded historical studies and identify relevant design features. Machine learning algorithms would then support automated pre-population of protocol sections, recommend variable selection and generate draft text based on accumulated knowledge.
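To make the idea of pre-population concrete, the following is a minimal sketch rather than a description of how Overdrive itself would work. It assumes a tiny, invented library of historical protocols (`HISTORICAL_PROTOCOLS`), uses simple bag-of-words cosine similarity in place of the far richer NLP a real system would need, and reuses the primary endpoint of the closest match as a draft suggestion. All names, records and the function `suggest_primary_endpoint` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical miniature library of historical protocols. Real inputs would be
# full protocol documents, not one-line summaries.
HISTORICAL_PROTOCOLS = [
    {"id": "P-001",
     "summary": "phase 2 randomised double-blind hypertension study in adults",
     "primary_endpoint": "Change in systolic blood pressure at week 12"},
    {"id": "P-002",
     "summary": "phase 3 open-label oncology study of overall survival",
     "primary_endpoint": "Overall survival at 24 months"},
    {"id": "P-003",
     "summary": "phase 1 healthy volunteer pharmacokinetic study",
     "primary_endpoint": "Maximum plasma concentration after single dose"},
]


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def suggest_primary_endpoint(brief, library=HISTORICAL_PROTOCOLS):
    """Return a draft endpoint copied from the most similar historical protocol."""
    query = Counter(tokenize(brief))
    scored = [
        (cosine_similarity(query, Counter(tokenize(p["summary"]))), p)
        for p in library
    ]
    score, best = max(scored, key=lambda pair: pair[0])
    return {
        "source_protocol": best["id"],
        "similarity": round(score, 2),
        "draft_primary_endpoint": best["primary_endpoint"],
    }


suggestion = suggest_primary_endpoint("randomised phase 2 study in hypertension")
print(suggestion["source_protocol"], "->", suggestion["draft_primary_endpoint"])
```

The suggestion is only a starting point: the draft text and its source protocol are returned together so that a protocol author can see where the proposal came from and accept, edit or reject it.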
The proposal also includes predictive modelling capabilities designed to help users understand the implications of design decisions before a study begins. These tools could potentially evaluate feasibility, patient recruitment risks, operational complexity, budget implications and quality concerns. In effect, Overdrive seeks to transform protocol development from a largely experience-driven activity into a more data-informed and analytically supported process.
Embedding Regulatory Intelligence
Clinical protocols are scientific documents, but they are also constrained by regulatory requirements.
Global clinical research is governed by increasingly sophisticated regulatory frameworks, including the International Council for Harmonisation’s Good Clinical Practice guidelines and evolving requirements from agencies such as the FDA, EMA and MHRA [6][19][20]. Maintaining compliance across these frameworks can be challenging, particularly for smaller organisations and emerging biotechnology companies.
One of Overdrive’s most innovative features is its proposed ability to embed regulatory requirements directly into the authoring process. The system would incorporate discipline-specific language, region-specific requirements and evolving regulatory expectations into protocol generation workflows. Rather than relying solely on downstream review, compliance considerations would be integrated from the earliest stages of document development.
Such an approach could markedly reduce review cycles, improve consistency and minimise the likelihood of regulatory deficiencies being identified in a protocol.
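As an illustration of what checking compliance during authoring might look like, here is a minimal sketch under stated assumptions. The rule set `REQUIRED_SECTIONS`, the region label `region_a` and the function `missing_sections` are all hypothetical; the core section names simply echo the protocol elements listed earlier in this article and are not a statement of what any regulator actually requires.

```python
# Hypothetical rule set: section names and the regional addition are
# placeholders for illustration, not actual regulatory requirements.
REQUIRED_SECTIONS = {
    "core": [
        "objectives",
        "eligibility criteria",
        "statistical analysis",
        "safety monitoring",
    ],
    "region_a": ["data protection statement"],
}


def missing_sections(draft, regions):
    """List required sections that are absent or empty in a draft protocol.

    `draft` maps section titles to their text; `regions` names the
    hypothetical regional rule sets to apply on top of the core rules.
    """
    required = list(REQUIRED_SECTIONS["core"])
    for region in regions:
        required.extend(REQUIRED_SECTIONS.get(region, []))
    # A section counts as present only if it contains some non-blank text.
    present = {
        title.strip().lower() for title, text in draft.items() if text.strip()
    }
    return [section for section in required if section not in present]


draft = {
    "Objectives": "To evaluate the efficacy of the investigational product.",
    "Eligibility criteria": "",
    "Statistical analysis": "Analysis of covariance on the primary endpoint.",
}
for section in missing_sections(draft, ["region_a"]):
    print("Missing or empty:", section)
```

Running a check of this kind each time a draft is saved, rather than once at final review, is the sense in which compliance would move upstream in the workflow.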
Potential Benefits for Industry and Patients
If successful, the implications extend far beyond document creation. Faster protocol development could accelerate initiation of clinical programmes, reduce resource demands and decrease reliance on increasingly scarce specialist expertise.
Improved protocol quality could reduce amendment rates and improve operational efficiency. Studies have consistently shown that protocol amendments are among the most expensive and disruptive events in clinical development [9][10]. Preventing avoidable amendments offers substantial opportunities for cost reduction and productivity gains. The proposal estimates that even modest reductions in protocol development timelines could generate significant programme-level savings.
Most importantly, faster and more efficient clinical development has the potential to accelerate patient access to innovative therapies. Every avoidable delay in development postpones the availability of potentially life-saving treatments [2][4]. The recent COVID-19 pandemic provided a vivid reminder of the societal importance of developing time-efficient clinical research systems.
Beyond the Protocol
While protocol generation represents the immediate focus, Overdrive reflects a much broader vision. Clinical protocols sit at the beginning of a chain of regulatory and scientific documents. Digitised information generated during protocol development ultimately feeds eCRF design, statistical analysis plans, investigator brochures, clinical study reports, regulatory submissions and scientific publications. Establishing a structured, intelligent foundation at the protocol stage creates opportunities to automate additional components of the development pathway.
In many respects, Overdrive represents the next logical evolution of the document automation concepts first explored by Niche Science & Technology (NST) nearly a decade ago [1]. The difference is that today’s technologies may finally be capable of delivering that vision.
Looking Ahead
The submission of our Innovate UK grant application marks the beginning of what could become a transformative project for clinical development. The challenges facing pharmaceutical research are unlikely to diminish. Clinical trials will continue to grow in complexity, regulatory expectations will continue to evolve, and pressure to reduce development timelines will remain intense.
Against this backdrop, technologies capable of improving how scientific knowledge is generated, organised and communicated are becoming increasingly important.
Overdrive seeks to address these challenges through a novel combination of artificial intelligence, machine learning, natural language processing and regulatory expertise. The greatest benefits are likely to come when Overdrive is used by experienced operators who can moderate discussions between representatives of the disciplines involved in protocol development and so facilitate efficient delivery. If successful, Overdrive (or an equivalent) could reduce protocol development from a labour-intensive activity measured in months to an intelligent, data-driven process measured in days.
More importantly, it could help redefine how clinical development knowledge is captured and applied across the pharmaceutical industry.
Nine years after first exploring automated regulatory writing [1], Niche Science & Technology has returned with a significantly more ambitious vision. The next chapter may not simply be about writing documents faster; it may be about designing better clinical trials altogether.
References
1. Hardman TC. (2013). Can Software Write a Clinical Study Report? A Vision for the Future of Regulatory Documentation.
2. DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: New estimates of R&D costs. J Health Econ. 2016;47:20-33.
3. Paul SM, Mytelka DS, Dunwiddie CT, Persinger CC, Munos BH, Lindborg SR, et al. How to improve R&D productivity. Nat Rev Drug Discov. 2010;9(3):203-214.
4. Wong CH, Siah KW, Lo AW. Estimation of clinical trial success rates and related parameters. Biostatistics. 2019;20(2):273-286.
5. Wouters OJ, McKee M, Luyten J. Estimated research and development investment needed to bring a new medicine to market. JAMA. 2020;323(9):844-853.
6. International Council for Harmonisation. ICH E6(R2) Good Clinical Practice Guideline. 2016.
7. Getz KA, Campo RA. Trial watch: Trends in clinical trial design complexity. Nat Rev Drug Discov. 2017;16(5):307.
8. Getz KA, Campo RA, Kaitin KI. Variability in protocol design complexity by phase and therapeutic area. Drug Inf J. 2011;45:413-420.
9. Getz KA, Wenger J, Campo RA, Seguine ES, Kaitin KI. Assessing the impact of protocol design changes on clinical trial performance. Am J Ther. 2008;15(5):450-457.
10. Getz KA. Protocol design trends and their effect on clinical trial performance. RAJ Pharma. 2015;14:30-36.
11. Sully BGO, Julious SA, Nicholl J. A reinvestigation of recruitment to randomised controlled trials. Trials. 2013;14:166.
12. Friedman C, Shagina L, Lussier Y, Hripcsak G. Automated encoding of clinical documents. J Am Med Inform Assoc. 2004;11(5):392-402.
13. Jensen PB, Jensen LJ, Brunak S. Mining electronic health records and biomedical literature. Nat Rev Genet. 2012;13(6):395-405.
14. Savova GK, Masanz JJ, Ogren PV, Zheng J, Sohn S, Kipper-Schuler KC, et al. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES). J Am Med Inform Assoc. 2010;17(5):507-513.
15. Spasic I, Nenadic G. Clinical text data in machine learning. Int J Med Inform. 2020;141:104179.
16. Wang Y, Wang L, Rastegar-Mojarad M, et al. Clinical information extraction applications. J Biomed Inform. 2018;77:34-49.
17. Marshall SF, Burghaus R, Cosson V, et al. Good practices in model-informed drug discovery and development. CPT Pharmacometrics Syst Pharmacol. 2016;5:93-122.
18. Wang DD, Zhang S, Zhao H, Men AY, Parivar K. Fixed and model-based approaches in drug development. CPT Pharmacometrics Syst Pharmacol. 2019;8:152-160.
19. International Council for Harmonisation. ICH E8(R1) General Considerations for Clinical Studies. 2021.
20. U.S. Food and Drug Administration. Guidance for Industry: E6 Good Clinical Practice. FDA; pre-2022 guidance documents.