adding setup details

Lisa Williams 2025-02-01 23:33:17 -05:00
commit 76a2446c72

@@ -12,6 +12,15 @@ The data dictionary, defining each column in the CSV, is available here: https:/
For convenience, and so you can read it locally, I've copied the data dictionary into this repo as a text file named NIH_RePORTER_Project_Data_Dictionary.
## Setting up
If you want to try out the R code, you'll need to download the individual .CSV files of RePORTER data, either from the link above or from the NIH website. Each .CSV file represents one year of data. I recommend making a folder structure like this:
Top level folder: NIH_Data
Folder within that top level folder: data
Place your downloaded .CSV files in that data folder. The script combines them into a single dataframe, which you can then filter and visualize as charts.
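As a rough illustration of that combining step, here is a minimal base R sketch. It is not the script from this repo: the function name `combine_reporter_csvs`, the added `source_file` column, and the choice to read every column as text are all assumptions made for the example. It also assumes that column sets may differ between years, and pads any missing columns with `NA`.

```r
# Hypothetical helper (not from the repo's script): read every .csv file in
# `data_dir` and stack them into one dataframe.
combine_reporter_csvs <- function(data_dir = file.path("NIH_Data", "data")) {
  csv_paths <- list.files(data_dir, pattern = "\\.csv$",
                          full.names = TRUE, ignore.case = TRUE)
  if (length(csv_paths) == 0) {
    stop("No .csv files found in ", data_dir)
  }

  # Read every column as character so types cannot clash between years.
  yearly <- lapply(csv_paths, function(path) {
    df <- read.csv(path, colClasses = "character",
                   stringsAsFactors = FALSE, check.names = FALSE)
    df$source_file <- rep(basename(path), nrow(df))  # remember the origin file
    df
  })

  # Assumption: some years may lack columns that others have.
  # Pad the missing ones with NA so every dataframe has the same columns.
  all_cols <- unique(unlist(lapply(yearly, names)))
  yearly <- lapply(yearly, function(df) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- rep(NA_character_, nrow(df))
    }
    df[, all_cols, drop = FALSE]
  })

  do.call(rbind, yearly)
}

# Example usage, assuming the folder layout described above:
# dir.create(file.path("NIH_Data", "data"), recursive = TRUE, showWarnings = FALSE)
# projects <- combine_reporter_csvs(file.path("NIH_Data", "data"))
```

Because everything is read as text, numeric columns need converting (for example with `as.numeric()`) before you do arithmetic on them.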
## R sample code for parsing this data and making simple plots
The sample code contained here will help you do some basic data cleanup, like combining the yearly .CSV files of RePORTER data into a single dataframe, and separating date columns into year, month, and day columns to make them easier to work with.
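The date cleanup might look something like the sketch below. The helper name `split_date_column` is made up for this example, `PROJECT_START` is used only as an illustrative column name, and the `MM/DD/YYYY` date format is an assumption; check your downloaded files and the data dictionary, and adjust `fmt` if yours differ.

```r
# Hypothetical helper: add <col>_year, <col>_month, and <col>_day columns
# parsed from a text date column. Values that do not parse become NA.
split_date_column <- function(df, col, fmt = "%m/%d/%Y") {
  stopifnot(col %in% names(df))
  parsed <- as.Date(df[[col]], format = fmt)
  df[[paste0(col, "_year")]]  <- as.integer(format(parsed, "%Y"))
  df[[paste0(col, "_month")]] <- as.integer(format(parsed, "%m"))
  df[[paste0(col, "_day")]]   <- as.integer(format(parsed, "%d"))
  df
}

# Small made-up example; the dates are illustrative, not real RePORTER rows.
example <- data.frame(PROJECT_START = c("09/30/2019", "", "7/1/2020"),
                      stringsAsFactors = FALSE)
example <- split_date_column(example, "PROJECT_START")
```

Keeping the original column and adding new ones next to it means you can always check the parsed values against the raw text.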