Initiative Updates: Project Data Sphere


July 25, 2022
FDA Renews Public Private Partnership With PDS
Project Data Sphere

The US Food and Drug Administration (FDA) has reaffirmed a Public-Private Partnership (PPP) with Project Data Sphere (PDS). Scientific engagement with the FDA has been foundational to PDS and the PPP dates back at least 5 years.

Through our seven-year FDA-PDS symposium series, we have identified a number of areas at the frontier of analytics and regulatory science that PDS has developed into multi-institutional research programs.

In addition to our data sharing platform and collaborative analytics workspace, hosted by SAS, PDS has run five different research programs as a result of member engagement following these FDA-PDS workshops.

  • The Prostate DREAM challenge – a crowd-sourced challenge to use emerging analytics approaches to identify biomarkers to aid in the management of patients with castrate-resistant metastatic prostate cancer.
  • The External Control Arm program – a series of methods to construct and use existing patient performance data on standard-of-care therapy to supplement the control arm of a clinical trial, streamline evidence generation, and honor patient contributions to science.
  • The Rare Tumor Registries program – a focused effort in Merkel Cell Carcinoma to demonstrate how high quality, longitudinal clinical data can transform patient care and inform the development of the next generation of medicines.
  • The Immune-Related Adverse Events (irAE) program – a crowd-sourced challenge to establish clinical definitions of dozens of tissue-/organ-specific adverse events that occur in a small fraction of patients receiving immuno-oncology treatments. With these definitions in hand for a number of tissues and organs, we are now embarking on several additional projects (biomarkers, genomics, and digital health) that can improve patient management practices to keep patients on life-saving therapies.
  • The autoRECIST program – a consortium-supported radiomics project where state-of-the-art machine learning techniques are being applied to medical images to ascertain whether a patient with advanced cancer has responded to therapy. This program has reported substantial progress on quantifying liver metastases and has pilot work demonstrating that these approaches can quantify metastatic lesions in lung and lymph nodes as well.

In the new iteration of the PPP, the FDA and PDS will be focusing on our data sharing platform, the irAE program, and autoRECIST. This focus highlights the necessity for continued work in these areas to establish widely adopted best practices. Our other research programs are undergoing lifecycle review with the Life Sciences Council (PDS’ scientific advisory board) and will continue in some fashion, though without the formal engagement with FDA.

Program lifecycle management also allows us to more thoroughly evaluate new research proposals and bring them before our membership. Our current portfolio of research proposals includes: digital pathology, pediatric oncology, cellular therapy, and developing tools to support the community of scholars across our partner HBCUs. This research portfolio is in review by the Life Sciences Council and potential funders.

May 02, 2022
Scientific Publications Reflect Progress by MCC Registry Program
Project Data Sphere

The Merkel Cell Carcinoma (MCC) Patient Registry initiative organized by Project Data Sphere is off to a fast start in 2022, with four manuscripts accepted for publication. A key contributor to the program, Dr. David Miller from Mass General Brigham, and his team have submitted two new manuscripts. Miller is co-chair of PDS' Merkel Cell Carcinoma Task Force.

The MCC Patient Registry is a national multi-institutional collaborative effort to record outcomes and events in MCC patients. MCC is a rare skin cancer, and this registry will trailblaze new methodologies that continue to enable investigators to derive insights about patient care from the real-world outcome data.

Dr. Sophia Shalhout, a data scientist from Mass General Brigham, presented her work on “Real world assessment of ipi-nivo in anti-PD-(L)1 refractory Merkel cell carcinoma” at the MCC Multi-center Interest Group meeting on March 25. Multiple members of the MCC Task Force -- Miller, Shalhout, and Dr. Kenneth Tsai, Pathology Research Vice Chair at Moffitt Cancer Center -- have been invited to present on important aspects of Merkel Cell Carcinoma (including the registry work) at the 2nd International Symposium on Merkel Cell Carcinoma (April 25 and 26).

We have learned that patient populations and clinical management decisions are more varied across sites than initially expected. This finding suggests that more patients will be required for certain subgroup analyses. We are evaluating options to increase patient accrual.

For more information, or to get involved, please contact Ravi Komandur, PhD, MCC Patient Registry Program Director.

May 02, 2022
FDA Grand Rounds Features Images and Algorithms Program
Project Data Sphere

Project Data Sphere’s Images & Algorithms (I&A) Program presented preliminary results from its autoRECIST project at the FDA Research Grand Rounds on Feb. 25. Nearly 200 attendees participated in the webinar, which generated a lively discussion.

The goal of the autoRECIST project is to develop deep-learning algorithms to reduce the time and cost and improve the performance of imaging in clinical trials, shortening the time from discovery to implementation, improving the accuracy of the reviews, and, ultimately, improving patient lives.

In oncology clinical trials, tumor burden and response to therapy are assessed using a set of complex quantitative and qualitative criteria called Response Evaluation Criteria in Solid Tumors (RECIST). To perform a RECIST assessment, a radiologist reads computed tomography (CT) images in Digital Imaging and Communications in Medicine (DICOM) format, identifies measurable lesions, selects up to two target lesions per organ (and, under RECIST 1.1, up to five per patient), and records the longest diameter of each target lesion. The same images and measurements are then evaluated by an independent radiologist. An average of 30% discordance in radiological interpretation has been reported between readers.
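The manual bookkeeping in a RECIST read can be illustrated with a minimal Python sketch. It is not the autoRECIST algorithm or any PDS code. It assumes the RECIST 1.1 caps of two target lesions per organ and five per patient, and it simplifies selection to "largest first", whereas a radiologist also weighs how reproducibly a lesion can be measured. The `Lesion` class, the function names, and the example measurements are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_PER_ORGAN = 2  # assumed cap on target lesions per organ
MAX_TOTAL = 5      # assumed RECIST 1.1 cap on target lesions per patient


@dataclass(frozen=True)
class Lesion:
    """Hypothetical record for one measurable lesion."""
    organ: str
    longest_diameter_mm: float


def select_target_lesions(lesions):
    """Pick lesions largest-first while honoring both caps (a simplification)."""
    per_organ = defaultdict(int)
    targets = []
    ordered = sorted(lesions, key=lambda l: l.longest_diameter_mm, reverse=True)
    for lesion in ordered:
        if len(targets) == MAX_TOTAL:
            break
        if per_organ[lesion.organ] < MAX_PER_ORGAN:
            targets.append(lesion)
            per_organ[lesion.organ] += 1
    return targets


def sum_of_diameters(targets):
    """Sum the longest diameters of the selected target lesions, in mm."""
    return sum(l.longest_diameter_mm for l in targets)


# Made-up baseline scan: three liver lesions, one lung, one lymph node.
baseline = [
    Lesion("liver", 42.0),
    Lesion("liver", 30.0),
    Lesion("liver", 25.0),
    Lesion("lung", 18.0),
    Lesion("lymph node", 16.0),
]

targets = select_target_lesions(baseline)
total = sum_of_diameters(targets)
print([(l.organ, l.longest_diameter_mm) for l in targets], total)
```

In this made-up example the third liver lesion is excluded by the per-organ cap. Two readers who choose different target lesions, or measure the same lesion differently, will report different sums, which is one source of the reader discordance described above.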

Dr. Asba (AT) Tasneem, PDS Executive Director of the program, provided an overview of the work and discussed a four-year roadmap for the program. The two Principal Investigators at Columbia University Medical Center -- Dr. Binsheng Zhao, Director, Computational Image Analysis Lab, Department of Radiology; and Dr. Larry Schwartz, Chairman, Department of Radiology -- presented results from developing Liver Artificial Intelligence (AI) – the foundational AI which detects and segments liver lesions.

In the next four years, the autoRECIST project will develop deep learning algorithms to 1) calculate RECIST assessment based on volumetrics measurements of all lesions (autoVOL); and 2) automate the current RECIST 1.1 (autoRECIST).

For more information on the autoRECIST project, please contact Asba (AT) Tasneem, PhD, Executive Director, Images and Algorithms Program.

April 26, 2022
Data Platform continuously improving
Project Data Sphere: In the News

Researchers who have not logged on to the Project Data Sphere data platform recently are invited to check out the improvements we’ve been making in response to user feedback. Data accessed via the platform has contributed to at least 138 peer-reviewed publications and to 15 publications associated with research programs.

There are new resources on the platform, including an instructional video published by Cytel, a statistical software developer and contract research organization, about synthetic controls planning and best practices (leveraging data from the platform).

Navigation has been improved on the data access page. A new search feature includes keyword highlighting and expanded search fields. Data is also better organized within cancer areas and tumor types, and National Cancer Institute studies are more visible thanks to the addition of study sponsors to the navigation bar.

The data repository has been expanded to accommodate real world data (RWD), with the first open access RWD contribution coming from Asociación Colaboración Cochrane Iberoamericana (ACCIb). Also, SAS Life Science Analytics Framework (LSAF) has published and shared a survival analysis across a set of pancreatic cancer trials (Clovis, Celgene, Sanofi, EMD Serono).

More information on the data platform is available from Holly Smith, MBA, PMP, Director, Data Sharing Products & Platform.


July 08, 2021
Project Data Sphere

The American Society of Clinical Oncology (ASCO) Annual Meeting (June 4-8), one of the largest scientific gatherings in cancer research, will feature several abstracts related to Project Data Sphere® research programs on External Control Arms and Immune-related Adverse Events.

External Control Arms

Of the 4,600 posters at ASCO this year, an emerging trend (more than 350 posters) is the use of real world data (RWD) and real world evidence (RWE). A landscape analysis by the FDA’s Donna Rivera and Paul Kluetz describes some of the more than 140 scenarios where RWD was presented in regulatory submissions over the past decade (Abstract #18787).

One prominent use of RWD is as an external control or comparator arm to support claims of effectiveness where overall survival and response rate are used as primary outcomes. The primary conclusion from their analysis is the need to establish “metrics for robust data characterization and outcome validation” so that “RWD can be appropriately evaluated and provide the rigor necessary to be considered adequate RWE.”

Project Data Sphere’s External Control Arm program directly focuses on these two issues. Instead of using clinical data extracted from electronic health records, the PDS program is building external control populations from completed clinical trials and purpose-built registries. These data assets have quality components built in with clinical assessments and data entry performed under stringent protocols. We believe that this well-defined data environment will more rapidly clarify how external data assets should be developed and qualified for regulatory decision making.

Well-constructed, high-quality external control arms can inform decision making throughout clinical development, beginning well before regulatory engagement. They can also be incorporated into complex innovative trial designs to support a smaller control arm through hybrid approaches, accelerate interim analyses, and help optimize patient allocation in adaptive platform trials.

PDS collaborators from Dana-Farber are presenting some of the initial findings from the GBM-INSIGhT trial, an adaptive platform trial in newly diagnosed glioblastoma.

Abstract #2006:

Abstract #2014:

Thirteen additional abstracts at the 2021 ASCO Annual Meeting demonstrate various uses of external control arms.

In addition, patient-level data from completed clinical trials made available through the portal was used in at least three posters. These studies applied deep learning to these patient-level data assets to identify candidate predictors of outcome, phenotypes of super-responders to specific therapies, and illustrate the risk of introducing bias through patient censoring.

Abstract #1549:

Abstract #1548:

Abstract #e13543:

Immune-Related Adverse Events (irAEs)

As more cancer patients are treated with immunotherapy, benefits have been observed as well as an increase in patients experiencing neurotoxicity. There is an urgent need to understand how and why these neurologic irAEs occur, and how to best manage them.

Comprehensive knowledge of neurologic irAEs is limited and treatment guidelines are based on consensus, not evidence.

PDS irAE Task Force Co-Chairs from Massachusetts General Hospital -- Dr. Amanda Guidon and Dr. Kerry Reynolds -- as well as PDS are among the authors of an abstract on consensus definitions that were achieved for seven core disorders. The authors believe the definitions now can be used broadly across clinical and research settings.

Abstract #2647:

July 08, 2021
Project Data Sphere

Project Data Sphere’s open access data platform has achieved a major milestone: Boosting the impact of oncology research by contributing to publication of more than 100 peer-reviewed articles that impact research and clinical practice.

The platform aggregates patient-level trial data from biopharmaceutical companies, academic medical centers, and government organizations and makes it available with free access to SAS™ analytic tools.

“Secondary analysis of patient-level data is valuable,” said Bill Louv, President of PDS. “Analyzing the combined datasets increases statistical power to learn more about standard-of-care outcomes, rare adverse events, treatment effects in patient subsets, and reproducibility of results. That’s not possible with single datasets.”

Louv said less than 1 percent of data is reused in this manner despite the existence of multiple sites offering inventories, metadata, and controlled access to the majority of clinical trials sponsored by the pharmaceutical industry.

In the six years since it launched, the PDS platform has grown from nine datasets to more than 150 and has been accessed by more than 3,000 scientists making more than 26,000 downloads of information. The data covers more than 160,000 patient lives in clinical trials studying prostate, breast, colorectal, non-small cell lung, Merkel cell, and pancreatic cancers.

New data types are being added and they include real world data, images, and genomics information.  There also is curated data, including some from RTI International that is enhanced with social, economic, and health care-related data from the national Medical Expenditure Panel Survey (MEPS). At least two publications have focused on this analytically enhanced data that enable researchers to assess the impact of socioeconomic factors on cancer survival and related outcomes.

The PDS platform is home to the world’s largest curated prostate cancer dataset, which yielded more than 25 publications in peer-reviewed journals. The Prostate Cancer DREAM Challenge, which generated the comprehensive dataset, attracted 50 international teams to help predict survival for prostate cancer patients and treatment discontinuation for those treated with docetaxel. The Challenge produced a model that accurately predicts patient outcomes that could lead to improved clinical trial design and treatment options.

Three years ago, the platform amassed enough data to build a portfolio of research programs. Those programs focus on innovative trial designs, automating lesion annotation, streamlining adverse event reporting and management of patients on immunotherapy, and multi-institutional registry-building and common data models.

The steering committee of each research program includes an observer from the FDA Oncology Center of Excellence.  This activity is governed by a Public-Private Partnership between FDA and PDS.  The senior sponsor of the PPP is Paul Kluetz, Associate Director of the FDA Oncology Center of Excellence.

FDA and PDS also have worked together to convene experts from academia, industry, and government to address the latest issues in oncology research. The 2021 event will be held Sept. 23 and will focus on defining a pathway to greater reuse of trial data.

April 06, 2021
Data Sharing Platform Gets Major Updates
Project Data Sphere

Project Data Sphere kicked off 2021 with exciting news: We have completely revamped our Data Sharing Platform. It’s the largest update and enhancement since PDS originally launched more than 6 years ago. 

What does this mean for you? 

  • It’s easier to find data and gain perspective around what is included within each study or contribution. 
  • Along with improved navigation, we have integrated content from so you have visibility to additional metadata.  
  • If you are interested in finding any data curations or what we consider linked data, such as AHRQ's Medical Expenditure Panel Survey data, it’s incredibly easy to find through new filter options.
  • You will also notice that the process for contributing data is greatly improved. 

Thanks go to our active user community (now over 3,000 registered) for making this platform what it is today. We are listening and striving to maximize data reuse, advance oncology research, and serve our patient population. 

Feedback has been gratifying. A user from the University of North Carolina with a newly accepted publication said, “Thank you for building this amazing platform so we could use the data to tackle real problems.”

The trust and engagement by data-sharing partners remain a critical piece of the platform. A big shoutout to Merck KGaA (EMD Serono), Eli Lilly, and G1 Therapeutics for their recent contributions across Glioblastoma, Pancreatic Cancer, Small Cell Lung Cancer, and Non-Small Cell Lung Cancer. 

Every data contribution counts, and Project Data Sphere is committed to advocating on behalf of all patients participating in clinical trials by driving greater reuse of their individual data.  Our laser focus for 2021 is ‘Data Reuse’ through stronger positioning of our sharing platform and analytics space, active research programs, and by establishing key partnerships to drive progress. 

We have been working hard to plan an upcoming virtual symposium on this topic, so mark your calendar for September 23. Join us to hear how effective reuse of clinical trial data can lead to faster approvals, and to gain insights into how you can contribute to this critical ecosystem.

April 06, 2021
SAS Invites Registration for Global Forum 2021
Project Data Sphere

Registration is open and free for the annual SAS Global Forum, May 18-20, which brings together the brightest minds in analytics. The analytics community comes together to exchange ideas in a one-of-a-kind conference – where learning, training, networking and inspiring sessions converge.

The Global Forum 2021 is complimentary to all attendees. Each participant in the virtual event will have access to general sessions, breakout sessions, the exhibit hall, training sessions and networking opportunities all at no cost.

Speakers include SAS executives, among them Jim Goodnight, Co-Founder and Chief Executive Officer, as well as one of the Project Data Sphere™ Directors. Other speakers have ties to the TED podcast WorkLife and NASA, consult on artificial intelligence, or are well-known entertainers.

There will be more than 150 sessions, real-world stories from frontline leaders, meetings with technology and industry experts, live demos in the Innovation Hub, and access to world-class training at no cost.

Registration is online.

SAS is a member of the CEO Roundtable on Cancer and a longtime Gold Standard accredited employer.


December 15, 2020
PDS-RTI Work Supporting Health Disparity Research is Focus of Manuscript
Project Data Sphere

A manuscript published in December 2020 in Contemporary Clinical Trials Communications highlights work by Project Data Sphere and RTI International (RTI) to improve access to clinical trial data supporting research on health care disparities.

“Enhancing the Analytic Utility of Clinical Trial Data to Inform Health Disparities Research” describes how Steven B. Cohen and his team at RTI are augmenting selected PDS patient-level cancer phase III clinical datasets by linking the social, economic, and health-related characteristics of like cancer survivors from nationally representative health and health care-related survey data from the Medical Expenditure Panel Survey (MEPS).

MEPS, sponsored by the Agency for Healthcare Research and Quality (AHRQ), is the nation's primary source of nationally representative, comprehensive, person-level data on health care use, insurance coverage, and expenses.

“Clinical trials, for example, are used to identify safe and effective treatments for all those with cancer but are often conducted among younger, healthier, and less racially diverse patients than the population at large,” the article notes. “As a result, there is an increasing interest in diversifying clinical trial patients to ensure that resultant treatments are suited for those who are disproportionately affected in the first place.”

Data providers are required to de-identify patient-level data before submitting it to PDS. That means removing social and demographic content that could otherwise be used to study underserved populations and factors that contribute to health inequities.

With support from the Robert Wood Johnson Foundation, PDS and RTI International are working to address that gap. The enhanced data will help researchers explore the influence of healthcare access, socioeconomic factors, and health behaviors on the patient-level representativeness and outcomes data.

October 07, 2020
FDA-PDS Symposium Considers Rare Cancer Registries
Project Data Sphere

Experts from the U.S. Food and Drug Administration (FDA), industry, and academia convened on October 7 for a virtual symposium on Rare Cancer Registries. Dr. Julia Beaver, Chief of Medical Oncology at the FDA said the meeting focus was: “To address critical questions in the field of rare cancer registries, with an ultimate goal to drive improvements in patient treatment, bringing safe and effective drugs to patients with rare malignancies in the most efficient and expeditious manner.”

The speakers shared how registry data have been used to advance research and improve clinical care for rare cancers within their own disciplines, discussed best practices for registry construction and data application, addressed how to integrate diverse types of data to make rare cancer registry data even more valuable, and strategized how best to support data-sharing and generalizability. Throughout the presentations and discussions, collaboration, transparency, and long-term planning emerged as fundamental to the most effective use of this powerful research tool.

July 14, 2020
October FDA-PDS Symposium IX will focus on registries
Project Data Sphere

Are registries the key to advancing treatment of rare cancers?  We hope to find answers to that question and to identify obstacles blocking the promise of registries and solutions to overcome those challenges in the FDA-PDS Symposium IX on Oct. 7.

This symposium will be held online for the first time but the goal is the same as in past years: elevate the conversation about this research area, which is ripe for fresh attention and energy from academia, industry and FDA. The event will run from noon until about 3 pm EDT.

We are honored and excited to welcome keynote speaker Dr. David Fajgenbaum, a groundbreaking physician-scientist, disease hunter, speaker, and author of the national bestselling memoir, Chasing My Cure: A Doctor's Race to Turn Hope Into Action.

Fajgenbaum nearly died five times battling Castleman disease. To try to save his own life, he developed and led an innovative approach to research through the Castleman Disease Collaborative Network (CDCN) and discovered a possible treatment that has put him into an extended remission.

We are eager to hear the unique perspectives and insights of Dr. Fajgenbaum. This is a session that’s sure to inspire.

The symposium brings together multiple stakeholders representing most aspects of the rare disease challenge: practice in the clinic, pharmaceutical research, policy and regulatory science, and patient advocacy to brainstorm the operational dynamics between the groups.

Mark your calendar.  More details and registration information will be coming via email soon.

April 29, 2020
Updated Project Data Sphere platform, website to debut
Project Data Sphere

More than 2,400 researchers are using the Project Data Sphere® platform to answer important questions about cancer. Many of them have offered feedback on how we can make the platform even easier to use.  We are delighted to announce that we expect an updated platform to be available early in 2Q 2020.

A new PDS website will be launched at the same time, offering more information on our programs, introductions to new program managers and easier ways to communicate with us.

The website will be a place that you can register for important events like the PDS Symposium IX on rare cancer registries that we are planning for Oct. 7, 2020, in the Washington DC area. 

This annual symposium series has consistently elevated the conversation about research areas in cancer clinical trials that are ripe for fresh attention, often deriving major new thrusts from academia, industry, and the FDA.

Rare cancer registries hold potential to solve some of the most difficult challenges to advancing treatment of these rare cancers, which account for more than 25% of the cancers reported worldwide.

The symposium attempts to bring together multiple stakeholders representing all sides of this challenge: Practice in the clinic, Pharmaceutical research, Policy and regulatory science, and Patient advocacy to brainstorm the operational dynamics between the groups.

Registries can impact these four areas in the following ways:

  • Refine clinical decision making and more effectively manage patient therapy.
  • Assist pharma in designing comparator arms efficiently, thus accelerating the drug development process.
  • Inform regulators who must decide whether to approve new medicines and how to use them in rare cancers.
  • Provide relief for the suffering patients with better treatment strategies.

Symposium IX will consider how registries for rare tumors might address these existing shortcomings and drive improvements in patient care. More information will be coming soon.


April 17, 2019
FDA-PDS Symposium VII
Project Data Sphere: FDA Symposium

More than 90 experts in the oncology community were convened on April 17, 2019, in Bethesda, Md., to focus on immune-related adverse events (irAEs) that can occur with the use of combination checkpoint blockade therapies. Co-sponsored by Project Data Sphere, LLC (PDS) and the U.S. Food and Drug Administration (FDA), this symposium focused on ways to strengthen irAE and toxicity reporting and data collection so as to better understand which patients are more likely to suffer these often-devastating adverse events. As attendees learned from a patient advocate who spoke, it is not unheard of for a patient who succumbs to an irAE to be tumor-free at the time of death. 

Thought leaders organized presentations and panels on the clinical presentation of irAEs and toxicity during combination therapy, preliminary findings from immune checkpoint inhibitor (ICI) clinical trial data submitted to the FDA and post-marketing surveillance data, and current initiatives to accelerate knowledge with biorepositories and patient registries. The Symposium was distinguished by the FDA presentations of selected clinical trial data on ICIs and the post-market surveillance (FAERS) data on adverse events associated with ICIs; neither of these rich FDA data sources had ever been presented.

January 24, 2019
PDS Highlighted in In Vivo Feature
Project Data Sphere: In the News

Now that Project Data Sphere has established itself as a world-class data-sharing platform, the group's leadership is planning for its next phase of the struggle against cancer. In this In Vivo piece, PDS President Bill Louv talks about how the organization has "raised the bar on what is possible to achieve in cancer research."


August 08, 2018
FDA-PDS Symposium VI
Project Data Sphere

More than 60 oncologists, data scientists, pharma/biotech industry leaders, patient advocates, health science research investigators, and regulators representing myriad esteemed institutions convened in Bethesda, Md., on August 8 for Project Data Sphere, LLC’s (PDS) sixth symposium co-sponsored with the U.S. Food and Drug Administration (FDA). With the goal of accelerating new options for small cell lung cancer (SCLC) patients in a stagnant treatment landscape, PDS is collaborating with the FDA on the development of an external control arm for SCLC clinical trials. This would enable patients to be enrolled directly into a trial’s investigational new drug arm, which ultimately would reduce the number of patients needed per trial as well as the cost and time of discovery for new treatment options. The daylong event featured a combination of individual and panel presentations and lively discussions