This repo includes a Python script that interacts with a SQL database on Azure Databricks. The project also uses GitHub Actions for continuous integration, automating environment setup, testing, code formatting, and linting.
- Extract (E): Retrieves a dataset in CSV format from a specified URL.
- Transform (T): Cleans, filters, and enriches the extracted data, preparing it for analysis.
- Load (L): Loads the transformed data into a SQLite database table using Python's sqlite3 module.
- Query (Q): Writes and executes SQL queries on the SQLite database to analyze and extract insights from the data.
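
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: the column names (`flavor`, `calories`, `sugar_g`), the table name `ice_cream`, the sample rows, and every function name are assumptions made for the example. The inline CSV string stands in for the file that the real extract step downloads from a URL, and an in-memory database is used so the sketch leaves no files behind.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for the downloaded CSV file.
# The real dataset's columns and values may differ.
SAMPLE_CSV = """flavor,calories,sugar_g
Vanilla,145,15
Chocolate,160,
Mint Chip,170,18
"""


def extract(text):
    """Parse CSV text into a list of dict rows keyed by the header."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with any empty field and convert numeric columns."""
    cleaned = []
    for row in rows:
        if not all(row.values()):
            continue  # skip incomplete rows
        cleaned.append(
            (row["flavor"].strip(), int(row["calories"]), float(row["sugar_g"]))
        )
    return cleaned


def load(records, conn):
    """Recreate the (assumed) ice_cream table and insert the records."""
    conn.execute("DROP TABLE IF EXISTS ice_cream")
    conn.execute(
        "CREATE TABLE ice_cream (flavor TEXT, calories INTEGER, sugar_g REAL)"
    )
    # Parameterized inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO ice_cream VALUES (?, ?, ?)", records)
    conn.commit()


def query(conn):
    """Return flavors ordered from most to fewest calories."""
    cursor = conn.execute(
        "SELECT flavor, calories FROM ice_cream ORDER BY calories DESC"
    )
    return cursor.fetchall()


def run_pipeline():
    """Run extract, transform, load, and query against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        load(transform(extract(SAMPLE_CSV)), conn)
        return query(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    for flavor, calories in run_pipeline():
        print(flavor, calories)
```

Keeping each stage as its own function mirrors the separate `make extract`, `make transform_load`, and `make query` targets, so each stage can be run and tested on its own.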
Dataset: Baskin Robbins Ice-Cream
To run the project, use the Makefile targets below:
- To install the required Python packages: make install
- To check code style: make lint
- To run tests: make test
- To format the code: make format
- To extract data: make extract
- To transform and load data: make transform_load
- To query data: make query