NIH Project Data, 2019-2024
This repository contains links to NIH project data and some sample code written in R to help you parse it.
NIH Project Data
The raw data in .CSV format is rather large, so it is stored here: https://drive.google.com/drive/folders/1iE3hYTTO7IXaBadpOJT9wL1VmBLJ3Wpc?usp=sharing
The original data can be found on the NIH website at the following URL: https://reporter.nih.gov/exporter/projects
The data dictionary, defining each column in the CSV, is available here: https://report.nih.gov/exporter-data-dictionary
So that you can read it locally, I've also copied the data dictionary into this repo as the text file named NIH_RePORTER_Project_Data_Dictionary.
Setting up
If you want to try out the R code, you'll need to download the individual .CSV files of RePORTER data, either from the Google Drive link above or from the NIH website. Each .CSV file represents one year of data. I recommend making a folder structure like this:
Top level folder: NIH_Data
Folder within that top level folder: data
In that data folder, place your downloaded .CSV files. The code in the script combines them into a single dataframe, which you can then filter and visualize as charts.
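As a rough illustration of the combining step, here is a minimal base R sketch. It is not the code from NIH_Grants.R: the function name `combine_reporter_csvs` is hypothetical, and it assumes every yearly file has the same column names and lives in the NIH_Data/data folder described above.

```r
# Hypothetical helper: read every .csv file in `data_dir` and stack the
# yearly files into one data frame.
# Assumption: all files share the same column names.
combine_reporter_csvs <- function(data_dir = file.path("NIH_Data", "data")) {
  csv_files <- list.files(data_dir, pattern = "\\.csv$",
                          full.names = TRUE, ignore.case = TRUE)
  if (length(csv_files) == 0) {
    stop("No .csv files found in ", data_dir)
  }

  yearly <- lapply(csv_files, function(path) {
    # Read every column as text so column types cannot disagree between years
    read.csv(path, stringsAsFactors = FALSE, colClasses = "character")
  })

  combined <- do.call(rbind, yearly)
  rownames(combined) <- NULL
  combined
}
```

Calling `projects <- combine_reporter_csvs()` from the folder that contains NIH_Data should then give one dataframe with a row per project record. Because every column is read as text, numeric columns need converting with `as.numeric()` before you do arithmetic on them.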
R sample code for parsing this data and making simple plots
The sample code contained here will help you do some basic data cleanup, like combining the yearly .CSV files of RePORTER data into a single dataframe, and splitting date columns into separate year, month, and day columns to make them easier to work with.
I recommend running this code line by line in RStudio, which is what I did as I wrote it and explored this dataset.
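The date cleanup described above could look something like the following sketch. The helper name `split_date_column` is hypothetical, and the default date format (MM/DD/YYYY text) is an assumption; check a few rows of your own files and adjust `date_format` if they differ.

```r
# Hypothetical helper: add <col>_YEAR, <col>_MONTH and <col>_DAY columns
# next to an existing date column.
# Assumption: dates are stored as text in MM/DD/YYYY form.
split_date_column <- function(df, col, date_format = "%m/%d/%Y") {
  # Values that are empty or do not match the format become NA
  parsed <- as.Date(df[[col]], format = date_format)

  df[[paste0(col, "_YEAR")]]  <- as.integer(format(parsed, "%Y"))
  df[[paste0(col, "_MONTH")]] <- as.integer(format(parsed, "%m"))
  df[[paste0(col, "_DAY")]]   <- as.integer(format(parsed, "%d"))
  df
}
```

For example, `projects <- split_date_column(projects, "PROJECT_START")` would add PROJECT_START_YEAR, PROJECT_START_MONTH, and PROJECT_START_DAY, assuming your dataframe has a PROJECT_START column. The data dictionary lists the actual date column names.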
What do the charts look like?
I have included a few example plots in this repository (example_plot_1.png, example_plot_2.png, and example_plot_3.png), showing how the charts generated by this code look for a specific state.
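To give a sense of the filter-then-plot workflow, here is a small base R sketch that totals funding per fiscal year for one state and draws a bar chart. It does not reproduce the example plots: the function names are hypothetical, and ORG_STATE, FY, and TOTAL_COST are assumed column names that you should confirm against the data dictionary.

```r
# Hypothetical helper: total funding per fiscal year for one state.
# Assumption: the data has ORG_STATE, FY and TOTAL_COST columns.
funding_by_year <- function(df, state) {
  one_state <- df[!is.na(df$ORG_STATE) & df$ORG_STATE == state, ]
  if (nrow(one_state) == 0) {
    stop("No rows found for state ", state)
  }
  one_state$TOTAL_COST <- as.numeric(one_state$TOTAL_COST)

  # Rows with a missing cost are dropped by aggregate() before summing
  totals <- aggregate(TOTAL_COST ~ FY, data = one_state, FUN = sum)
  totals[order(totals$FY), ]
}

# Hypothetical helper: bar chart of the yearly totals, in millions of dollars.
plot_funding_by_year <- function(totals, state) {
  barplot(totals$TOTAL_COST / 1e6,
          names.arg = totals$FY,
          xlab = "Fiscal year",
          ylab = "Total cost (millions of dollars)",
          main = paste("NIH project funding,", state))
}
```

With a combined dataframe called `projects`, you could run `md <- funding_by_year(projects, "MD")` and then `plot_funding_by_year(md, "MD")`, swapping in any state abbreviation that appears in the data.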