OpenTrials is a collaborative and open database for all available structured data and documents on all clinical trials, threaded together by individual trial. With a versatile and expandable data schema, it is initially designed to host and match the following documents and data for each trial: registry entries; links, abstracts, or texts of academic journal papers; portions of regulatory documents describing individual trials; structured data on methods and results extracted by systematic reviewers or other researchers; clinical study reports; and additional documents such as blank consent forms, blank case report forms, and protocols. The intention is to create an open, freely re-usable index of all such information and to increase discoverability, facilitate research, identify inconsistent data, enable audits on the availability and completeness of this information, support advocacy for better data, and drive up standards around open data in evidence-based medicine. The project has phase I funding. This will allow us to create a practical data schema and populate the database initially through web-scraping, basic record linkage techniques, crowd-sourced curation around selected drug areas, and import of existing sources of structured data and documents. It will also allow us to create user-friendly web interfaces to the data and conduct user engagement workshops to optimise the database and interface designs. Where other projects have set out to manually and perfectly curate a narrow range of information on a smaller number of trials, we aim to use a broader range of techniques and attempt to match a very large quantity of information on all trials. We are currently seeking feedback and additional sources of structured data.


Trials are used to inform decision making, but there are several ongoing problems with information management on clinical trials, including publication bias, selective outcome reporting, lack of information on methodological flaws, and duplication of effort for search and extraction of data, which have a negative impact on patient care. Randomised trials are used to detect differences between treatments because they are less vulnerable to confounding, and because biases can be minimised within the trial itself. The broader structural problems external to each individual trial result in additional biases, which can exaggerate or attenuate the apparent benefits of treatments.


We are building an open, free database and web service to identify, aggregate, store, match, index and share all available documents and data on all clinical trials. We are keen to receive feedback on the current methods, design, and data schema; feature requests; offers or suggestions of further data sources; and collaborations or methods to expand or improve the specification. Progress can be viewed at OpenTrials.net, where the service will be hosted.
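
As an illustration only, the idea of threading documents together by individual trial can be sketched as a simple grouping on a shared registry identifier. This is a minimal sketch under stated assumptions, not the OpenTrials schema or matching method: the record fields `source` and `text`, the function names, and the sample identifiers are all hypothetical, and only ClinicalTrials.gov-style identifiers ("NCT" followed by eight digits) are recognised.

```python
import re
from collections import defaultdict

# ClinicalTrials.gov identifiers have the form "NCT" followed by eight digits.
# Other registries use other formats and are ignored in this sketch.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b", re.IGNORECASE)


def find_trial_ids(text):
    """Return the set of upper-cased NCT identifiers mentioned in the text."""
    return {match.upper() for match in NCT_PATTERN.findall(text)}


def thread_by_trial(documents):
    """Group document sources under every trial identifier they mention.

    `documents` is a list of dicts with hypothetical keys "source" and "text".
    """
    threads = defaultdict(list)
    for doc in documents:
        for trial_id in find_trial_ids(doc["text"]):
            threads[trial_id].append(doc["source"])
    return dict(threads)


# Made-up sample records; the identifiers are placeholders, not real trials.
documents = [
    {"source": "registry entry", "text": "Identifier: NCT00000001"},
    {"source": "journal abstract", "text": "Trial registration nct00000001."},
    {"source": "regulatory document", "text": "Study NCT00000002 enrolled adults."},
]
threads = thread_by_trial(documents)
```

Exact identifier matching of this kind only links documents that cite a registry number; documents that omit it would need the broader record linkage and curation approaches described above.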