This folder contains the code to replicate the analysis in the story. It is divided into three files, in the preferred order of execution:
- `1-TCPD.R`: Takes in the `TCPD.csv` file and runs the name similarity analysis. It writes intermediate `State-Year.csv` files to the `data/intermediate-data` folder, which are then joined into a final CSV in the `data` folder.
- `2-data-analysis.R`: Takes in the `State-Year.csv` file and runs the analysis. The script is not perfectly structured, but the code is commented and should be easy to follow. JSONs are not written for every step.
- `3-name-transliterations.py`: (Not used in the story.) Goes through the rows of the `similar-candidates.csv` file and transliterates candidate names from the original language to English using the Gemini API. It is included for completeness and for possible future use.
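
This README does not say which similarity measure `1-TCPD.R` uses. As a rough illustration of what a name similarity check can look like, the sketch below scores pairs of names with `difflib.SequenceMatcher` from Python's standard library. The function names, the 0.8 threshold, and the sample names are all assumptions made for this example; they are not taken from the R script or from `TCPD.csv`.

```python
from difflib import SequenceMatcher
from itertools import combinations


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and extra whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def similar_pairs(names, threshold=0.8):
    """Return (name_a, name_b, score) for each pair at or above the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    pairs = []
    for a, b in combinations(names, 2):
        score = name_similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, round(score, 3)))
    return pairs


# Hypothetical candidates contesting the same seat (not real data).
candidates = ["Ramesh Kumar", "Ramesh Kumaar", "Sunita Devi"]
result = similar_pairs(candidates)
```

In this sketch only the two near-identical spellings are flagged as a pair. A real analysis would likely compare names only within the same constituency and election year, and might prefer a phonetic or edit-distance measure suited to transliterated names.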