Implementing systems thinking and data science in the training of the regenerative medicine workforce

The demand for a data-literate workforce creates a need for synergy between stakeholders across sectors (e.g., industry, academia, and government) of the RMAT enterprise. In academic settings, curricula that prepare data scientists for work in regenerative medicine might include topics in regulatory affairs, clinical development, and manufacturing. These topics will equip students to understand the unique context of regenerative medicine and to appreciate systems thinking. Academia can also bolster the RMAT ecosystem by increasing the knowledge base of educators and providing resources for students interested in nonacademic careers. Synergy arises when sectors work together on key areas of interest and provide both formal and informal educational opportunities that bridge data science and regenerative medicine. The organizations mentioned below are examples of strong multisector activities that can play an effective role in developing the RMAT workforce.

Regulatory affairs

Workforce development in data-dependent regulatory issues warrants special attention. Because RMAT products are highly variable and customizable, the regulatory landscape continually evolves, and the regulatory sector must keep pace with advancements. To facilitate crosstalk and create opportunities to use data science for regulatory decision-making, the regulatory affairs workforce needs to understand data science fundamentals, and the data science workforce needs a basic understanding of regulatory issues. Because the product and application space is evolving rapidly, now is a key time to train a data science workforce that can navigate its regulatory environment.

Regulatory training can begin through academic courses and degree programs [36]. The U.S. Food and Drug Administration (FDA) interfaces with several academic institutions within the Centers of Excellence in Regulatory Science and Innovation (CERSI) program to help train students in regulatory science [37]. However, regulatory instruction could also be incorporated into existing courses. Reallocating modest portions of curricula across multiple academic stages could better prepare students to interact with regulatory guidance and to prepare documents for regulatory agencies. Statistics courses could train students to analyze large datasets while introducing regulatory concepts such as critical quality attributes (CQAs), critical process parameters (CPPs), normal operating range (NOR), and proven acceptable range (PAR). Lab-based courses could teach students to prepare process descriptions in accordance with FDA guidance documents and to describe theoretical process-characterization strategies based on risk-assessment exercises conducted in class. Advanced courses could encourage students to discuss the applicability of data science approaches (e.g., real-world data, digital transformation, AI/ML) to regulatory decision-making by answering questions such as:

  • How can real-world data address regulatory filing requirements from clinical, preclinical, and manufacturing perspectives?

  • How can data science tools help establish the safety and efficacy of a regenerative medicine product?
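
As one hedged illustration of the statistics exercise described above, the following Python sketch classifies readings of a single critical process parameter against its normal operating range (NOR) and proven acceptable range (PAR). The class name, the parameter, and all numeric limits are hypothetical and chosen only for illustration; real ranges come from process-characterization studies. The sketch also assumes that the NOR lies inside the PAR.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterRange:
    """Hypothetical limits for one critical process parameter (CPP)."""
    name: str
    nor_low: float   # normal operating range, lower limit
    nor_high: float  # normal operating range, upper limit
    par_low: float   # proven acceptable range, lower limit
    par_high: float  # proven acceptable range, upper limit

    def classify(self, value: float) -> str:
        """Label a reading relative to the NOR and the wider PAR."""
        if self.nor_low <= value <= self.nor_high:
            return "within NOR"
        if self.par_low <= value <= self.par_high:
            return "within PAR"
        return "outside PAR"


# Illustrative limits only; not taken from any guidance document.
culture_temp = ParameterRange(
    name="culture temperature (deg C)",
    nor_low=36.5, nor_high=37.5,
    par_low=36.0, par_high=38.0,
)

readings = [37.0, 37.4, 37.8, 38.3, 36.2]
labels = [culture_temp.classify(r) for r in readings]
summary = Counter(labels)

for reading, label in zip(readings, labels):
    print(f"{reading:.1f} -> {label}")
print(dict(summary))
```

In a classroom setting, the same classification could be applied to a larger dataset so that students see how often a process drifts outside its NOR while remaining inside its PAR.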

Clinical and translational science

Discoveries in stem cell biology and associated technologies often occur in basic science laboratories, but the promise of regenerative medicine is realized in the clinic. Academia can prepare students interested in clinically oriented roles by emphasizing data science and theoretical analysis in dedicated programs. Examples include the Institute for Clinical and Translational Research (ICTR) at the University of Wisconsin–Madison, which offers minors in Clinical Investigation for PhD students, and the Georgia Clinical & Translational Science Alliance (CTSA), an NIH-funded program across Georgia-based universities that offers a Master of Science in Clinical Research and a certificate program in translational research for PhD trainees. Such programs encourage students to consider the human impact of scientific discoveries and promote translational research.

Data-related challenges, ranging from data acquisition and harmonization to patient privacy, can impede the translation of regenerative medicine discoveries to the clinic. Small sample sizes often limit research in the rare disease space, and difficulties integrating data across research groups and datasets can hamper progress. To contend with these issues, several institutions have established their own data-sharing capabilities, including Johns Hopkins, the Mayo Clinic [38], CMaT [39], and others. CMaT, for example, works to standardize methods across its eight-university ecosystem and partners with companies to record data in a unified format via batch recording software. Data can be stored in the cloud for collaborators across partner organizations to access and analyze.
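
To make the idea of a unified record format concrete, the following Python sketch harmonizes batch records from two sites that use different field names and units. It is an assumption-laden illustration, not a description of CMaT's software or schema: the site names, field names, and unit conversion are all hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical mapping from each site's field names to a shared schema.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "site_a": {
        "batch": "batch_id",
        "viab_pct": "viability_percent",
        "cells": "cell_count",
    },
    "site_b": {
        "BatchID": "batch_id",
        "viability_fraction": "viability_percent",
        "total_cells": "cell_count",
    },
}

# Hypothetical unit conversions, keyed by (site, source field).
CONVERTERS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("site_b", "viability_fraction"): lambda v: v * 100.0,  # fraction -> percent
}


def harmonize(site: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one site-specific batch record into the shared schema."""
    mapping = FIELD_MAPS[site]
    unified: Dict[str, Any] = {"site": site}
    for source_field, value in record.items():
        if source_field not in mapping:
            continue  # fields outside the shared schema are dropped
        convert = CONVERTERS.get((site, source_field), lambda v: v)
        unified[mapping[source_field]] = convert(value)
    return unified


raw_records = [
    ("site_a", {"batch": "A-001", "viab_pct": 90.0, "cells": 2000000}),
    ("site_b", {"BatchID": "B-007", "viability_fraction": 0.75,
                "total_cells": 1500000, "operator_note": "n/a"}),
]

unified_records = [harmonize(site, rec) for site, rec in raw_records]
for rec in unified_records:
    print(rec)
```

Once records share field names and units, collaborators can pool and compare them directly, which matters most when each site contributes only a few batches.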

Another initiative, the National Center for Data to Health (CD2H), was started by nine medical research organizations and facilitates data sharing and collaboration across the community of health informatics researchers [34]. The National Heart, Lung, and Blood Institute (NHLBI), including the NIH-wide Regenerative Medicine Innovation Project (RMIP) [40], also established a data-sharing platform for NIH-funded projects, the BioData Catalyst, which is intended to serve as a central data repository for open sharing among eligible researchers [20]. Moreover, a workforce that can manage, interpret, and deploy data science efficiently and securely will allow the RMAT industry to capitalize on potential benefits, both nationally and globally [41].

Manufacturing

As the RMAT industry grows and new, increasingly complex products enter the pipeline, highly trained and appropriately certified workers will be in demand to manufacture high-quality products at scale, with low batch failure rates and maximal reproducibility, all while ensuring efficacy and patient safety and maintaining strict regulatory standards. Biomanufacturing is increasingly digital and therefore requires proficiency in both quantitative methods and biology-based lab techniques. Moreover, a potential future model is decentralized, remote manufacturing in hospital settings. To support this and other distributed manufacturing approaches, the workforce would benefit from understanding remote access, digital networks, and AI.

Academia can leverage guidance documents designed for industry to better prepare students for roles in manufacturing regenerative medicines [42]. When designing curricula, academic entities could integrate discussions with industry and clinical manufacturers to better understand the skillsets needed for biomanufacturing. Manufacturing sciences is available as a separate discipline at only a few academic centers; more often, manufacturing is taught through mechanical or chemical engineering departments. Collaboration between experts in manufacturing sciences and other specialized domains (e.g., cell therapy, biofabrication) would help ensure that curricula and training materials reflect domain knowledge in the context of broader manufacturing principles. This collaborative, transdisciplinary effort across engineering, cell biology, clinical translation, and industrial manufacturing will be critical to preparing the next generation of the RMAT workforce.

Beyond educational efforts, academic scientists can better coordinate with industry by considering manufacturing guidelines, such as Good Manufacturing Practice (GMP) regulations, in their research. To facilitate translation from bench to bedside and increase uptake of innovations in the clinic, academic laboratories and researchers could aim to develop new devices, tools, software, and technologies that are GMP-compatible, follow Quality-by-Design principles, incorporate standardized analytical tools and measurements, and account for regulatory constraints.

Finally, key technology hubs and public-private partnerships play an important role in integrating data knowledge into workforce development. Organizations focused on RMAT manufacturing include the aforementioned CMaT, the Marcus Center for Therapeutic Cell Characterization and Manufacturing (MC3M), the National Institute for Innovation in Manufacturing Biopharmaceuticals (NIIMBL), BioFabUSA by the Advanced Regenerative Manufacturing Institute (ARMI), the Catapult Network in the UK, and the Centre for Commercialization of Regenerative Medicine (CCRM) in Canada. Examples of important steps these organizations have taken to build a dynamic workforce include:

  • Collaborative course module development for both technical and ethics/regulatory competencies.

  • Collaborations with 2-year college systems for hands-on training and curriculum development.

  • NSF-funded Future Manufacturing Network (FMNet) Consortium for “Building a Network to create the Workforce Foundation, Actionable Roadmap, and Infrastructure Design to Integrate Data Science, AI, and Predictive Analytics throughout Biomanufacturing”.

Notably, most of these efforts are in the early stages and need investment to scale up nationally or internationally. Community-based, distributed workforce training programs that build industry-identified skillsets and incorporate robust certification could significantly advance the successful use of large-scale data in regenerative medicine.