Opening up Pillbox and the data processing code

Pillbox is one of the largest free databases of prescription and over-the-counter drug information and images, combining data from pharmaceutical companies, the Food and Drug Administration, the National Institutes of Health, and the Department of Veterans Affairs.

Pillbox for Developers is a resource for getting open access to the data processing code, understanding the methodology, and contributing to the project.

Understand the Process

Pillbox’s primary data source (FDA drug labels) is complex and does not organize information by individual pill. A Python data process downloads the source data and produces an easy-to-use, “pill-focused” dataset. This improved process is a first step toward giving developers more flexibility to understand the methodology and access the data.

The scripts can be run on your local machine once Python and the necessary requirements are installed. Follow the setup guide and the steps for running the scripts to start processing on your own machine.

Download the Source Data

Source data is available from DailyMed in XML format. DailyMed is a service of the National Library of Medicine (NLM).

Data Processing with Python

Python scripts download the DailyMed XML files and process them into a JSON API and CSV.
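
As a rough illustration of that flow, the sketch below parses a tiny inline XML document and writes one row per pill to CSV and JSON. The element names (`product`, `name`, `color`, `shape`), the function names, and the sample values are made up for illustration. Real DailyMed labels follow a much richer schema, and the actual Pillbox scripts may be organized quite differently.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a drug label file.
SAMPLE_XML = """
<label>
  <product><name>Example Tablet A</name><color>WHITE</color><shape>ROUND</shape></product>
  <product><name>Example Capsule B</name><color>BLUE</color><shape>CAPSULE</shape></product>
</label>
"""

# Illustrative output columns, not the real Pillbox field list.
FIELDS = ["name", "color", "shape"]


def extract_pills(xml_text):
    """Return one dict per <product> element (one row per pill)."""
    root = ET.fromstring(xml_text.strip())
    pills = []
    for product in root.iter("product"):
        # findtext returns the default when a child element is missing.
        pills.append({field: product.findtext(field, default="") for field in FIELDS})
    return pills


def to_csv(pills):
    """Serialize the pill rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(pills)
    return buffer.getvalue()


def to_json(pills):
    """Serialize the pill rows as a JSON array."""
    return json.dumps(pills, indent=2)


if __name__ == "__main__":
    pills = extract_pills(SAMPLE_XML)
    print(to_csv(pills))
    print(to_json(pills))
```

Keeping extraction separate from serialization means the same list of pill rows can feed both output formats.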

Contribute Code Back

Pillbox for Developers uses GitHub so that anyone can join the process and help further the development of the project. You can contribute by flagging errors or by writing code to improve the process.

This space will be used to track development and enable open access to the data processing code. Future development will include improved data format outputs and greater flexibility in data access. Access the code at github.com/HHS/pillbox-data-process.

Open Issues

Use the open issues queue to file bugs or to help tackle existing issues.

Read Documentation

Find documentation in the repository.

File a Pull Request

You can add features or improve existing code by starting work on a new branch. After you've completed your update, submit a pull request for Pillbox to review.

Access the Data and Images

Pillbox data is currently provided in two forms: raw CSV and an API. Download the data, or register for an API key to query it.

Data and Images Download

Download the raw data in CSV format
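
Once downloaded, the CSV can be explored with Python's standard library. The sketch below filters rows by a column value. The column names (`name`, `color`, `shape`), the `filter_rows` helper, and the sample rows are hypothetical, so check the header row of the actual download and adjust accordingly.

```python
import csv
import io

# Hypothetical sample; the column names here are illustrative only.
SAMPLE_CSV = """name,color,shape
Example Tablet A,WHITE,ROUND
Example Capsule B,BLUE,CAPSULE
Example Tablet C,WHITE,OVAL
"""


def filter_rows(csv_file, column, value):
    """Return rows whose `column` equals `value`, ignoring case."""
    reader = csv.DictReader(csv_file)
    wanted = value.lower()
    # `or ""` guards against a missing column or a short row.
    return [row for row in reader if (row.get(column) or "").lower() == wanted]


if __name__ == "__main__":
    # For a real file, pass open("your_download.csv", newline="") instead.
    for row in filter_rows(io.StringIO(SAMPLE_CSV), "color", "white"):
        print(row["name"], row["shape"])
```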

Access the API

Get access to the API to search the data with custom queries