Adapted from Software Carpentry
This example data analysis project counts how often each word occurs in four novels, then presents the 10 most frequent words in each book in a report.
Clone this repo and, using the command line, navigate to the root of the project.
Run the following command to create the conda environment:
conda-lock install --name da-pipeline-sh conda-lock.yml
Activate the conda environment:
conda activate da-pipeline-sh
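Before running the steps below, it can help to confirm that the tools this project relies on (Python and Quarto) are visible on the `PATH` once the environment is active. The following is a minimal sketch; `check_tool` is a hypothetical helper name, not something provided by the project.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Tools listed as dependencies of this project.
check_tool python
check_tool quarto
```

If either tool is reported as missing, re-check that the conda environment was created and activated.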
Count the words:
python scripts/ \
--input_file=data/isles.txt \
python scripts/ \
--input_file=data/abyss.txt \
python scripts/ \
--input_file=data/last.txt \
python scripts/ \
--input_file=data/sierra.txt \
Create the plots:
python scripts/ \
--input_file=results/isles.dat \
python scripts/ \
--input_file=results/abyss.dat \
python scripts/ \
--input_file=results/last.dat \
python scripts/ \
--input_file=results/sierra.dat \
Render the report:
quarto render report/count_report.qmd
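The count and plot steps above repeat one command per book, changing only the book name, so a shell loop can express that repetition once. The sketch below only prints the input path for each step rather than running anything, because the script names and remaining options are not shown here. `count_input` and `plot_input` are hypothetical helper names introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: prints per-book input paths instead of running the
# analysis scripts, whose names and remaining flags are not shown above.
set -eu

# Hypothetical helper: input path for the word-count step (data/<book>.txt).
count_input() { printf 'data/%s.txt\n' "$1"; }

# Hypothetical helper: input path for the plotting step (results/<book>.dat).
plot_input() { printf 'results/%s.dat\n' "$1"; }

# Step 1: one word-count invocation per book.
for book in isles abyss last sierra; do
    echo "count --input_file=$(count_input "$book")"
done

# Step 2: one plot invocation per book, reading the count results.
for book in isles abyss last sierra; do
    echo "plot --input_file=$(plot_input "$book")"
done
```

In a real pipeline script, each `echo` line would be replaced by the corresponding `python` command, and the `quarto render` step would follow the second loop.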
Your task is to add a data analysis pipeline using a shell/bash script! It should accomplish the same task as outlined in the file when you type:
- Quarto
- Python & Python libraries: