Provide sample data for contributors external to CSH wishing to contribute #320

Open
3 tasks
MoralCode opened this issue May 21, 2023 · 4 comments
Labels
dev improvement Something that could be fixed to improve the development flow

Comments

@MoralCode
Contributor

MoralCode commented May 21, 2023

Since ScheduleMaker is mainly powered by daily SIS data dumps from RIT ITS (which are presumably somewhat private), the ability to maintain ScheduleMaker is limited to CSH members who are trusted with access to this data or to the databases created from it (presumably using the tools dir of this repo).

While the S3 dependency (seemingly for storage of generated schedule images, based on the code) seems relatively easy to substitute with another S3 installation, and a new database can be created pretty easily, populating that database is much harder, if not impossible, to do independently of CSH.

Since ScheduleMaker has had no commits in over a year and has at least one relatively major bug that hasn't been addressed in that time (#310), it seems useful to allow the rest of the RIT open source community to run local dev versions of this code to develop their own fixes and improvements.

I propose this be done by:

  • providing documentation or a script that makes it easy to initialize a new, empty database using the current schema (without needing to run every single time-consuming migration)
  • providing one or more complete sample data dumps (including all of the dump files, either made-up with fake/sample data, or using a real ITS-provided data dump from some number of years ago) to allow contributors to populate their empty databases with realistic-enough data to allow schedulemaker to work and populate data locally.
  • if ScheduleMaker is going to continue to go unmaintained, providing access to the real, production data dumps from ITS, or to the ScheduleMaker DB that is generated from them, so that someone who wants to maintain a fork of the app (whether inside or outside CSH) can continue providing this service to students
@MoralCode
Contributor Author

MoralCode commented May 22, 2023

For anyone following in these footsteps, here is a line of sample data that I had lying around (I forget where from). It is from just one of the many different dump files available. This sample appears to be from the dump file for classes, but most of the other files likely use a similar format.

This particular class appears to have been offered in Spring 2019 (semester code 2185) but no longer seems to be available. I was able to use the code in the tools directory, specifically this section, to map each of the pipe-delimited fields to column names that are a little more descriptive. These names are copy-pasted into a CSV-like header line above the sample data, but I suspect this header row is not present in the data from ITS.

Here is the header row that I made:

crse_id | crse_offer_nbr | strm | session_code | class_section | subject | catalog_nbr | descr | topic | class_nbr | ssr_component | units | enrl_stat | class_stat | class_type | schedule_print | enrl_cap | enrl_tot | institution | acad_org | acad_group | acad_career | instruction_mode | course_descrlong |

and the sample data itself:

202083| 1|2185|1|01|MGMT|  90|Student Accelerator| |57640|SEM| 12.00|O|A|E|Y|  50|  17|RIT01|MGMT|SCB|UGRD|P|This series of non-credit workshops and lectures provides students with the tools needed for successful completion of Saunders College of Business programs. Students will develop and practice essential skills, including critical thinking, how to analyze a problem, oral and written communications, working in a team environment, and ethics.   Students become familiar with value creation management strategies and tools.|
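
For anyone who wants to experiment with this format, here is a minimal Python sketch that splits such a line into named fields. It assumes the header names above are correct and in order, that every record ends with a trailing pipe, and that no field contains a literal `|`. None of that has been checked against the real ITS files. `CLASS_FIELDS` and `parse_class_line` are names made up for this example and are not part of the ScheduleMaker tools.

```python
# Field names copied from the hand-made header row above. Assumption: the
# real ITS dump has no header, so this order is inferred, not confirmed.
CLASS_FIELDS = [
    "crse_id", "crse_offer_nbr", "strm", "session_code", "class_section",
    "subject", "catalog_nbr", "descr", "topic", "class_nbr", "ssr_component",
    "units", "enrl_stat", "class_stat", "class_type", "schedule_print",
    "enrl_cap", "enrl_tot", "institution", "acad_org", "acad_group",
    "acad_career", "instruction_mode", "course_descrlong",
]


def parse_class_line(line):
    """Split one pipe-delimited record into a dict keyed by field name."""
    parts = line.rstrip("\n").split("|")
    # The sample record ends with a trailing pipe, which produces one empty
    # element at the end of the split; drop it if present.
    if parts and parts[-1] == "":
        parts.pop()
    if len(parts) != len(CLASS_FIELDS):
        raise ValueError(
            f"expected {len(CLASS_FIELDS)} fields, got {len(parts)}"
        )
    # Numeric fields are space-padded in the sample, so strip whitespace.
    return {name: value.strip() for name, value in zip(CLASS_FIELDS, parts)}


# The sample record from above, with the long description cut to its
# first sentence to keep the example short.
sample = (
    "202083| 1|2185|1|01|MGMT|  90|Student Accelerator| |57640|SEM| 12.00|"
    "O|A|E|Y|  50|  17|RIT01|MGMT|SCB|UGRD|P|This series of non-credit "
    "workshops and lectures provides students with the tools needed for "
    "successful completion of Saunders College of Business programs.|"
)
record = parse_class_line(sample)
print(record["subject"], record["catalog_nbr"], record["descr"])
```

The length check is there so that a record with a different number of fields (for example, from one of the other dump files) fails loudly instead of being silently misaligned.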

Hope this is useful to anyone wishing to create a longer sample course dump file using generated or real course data to aid future schedulemaker contributors
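
As a rough starting point for such a file, the following Python sketch writes made-up class records in the same pipe-delimited layout. Everything in it is an assumption: the subjects, titles, and numbers are invented, the column order is copied from the hand-made header above, the single-letter codes are simply reused from the one sample record, and the real ITS files may pad fields differently. `make_fake_class`, `format_class_line`, `write_fake_dump`, and the output filename are hypothetical and not part of the ScheduleMaker tools.

```python
import random

# Column order copied from the hand-made header row (unconfirmed assumption).
CLASS_FIELDS = [
    "crse_id", "crse_offer_nbr", "strm", "session_code", "class_section",
    "subject", "catalog_nbr", "descr", "topic", "class_nbr", "ssr_component",
    "units", "enrl_stat", "class_stat", "class_type", "schedule_print",
    "enrl_cap", "enrl_tot", "institution", "acad_org", "acad_group",
    "acad_career", "instruction_mode", "course_descrlong",
]


def make_fake_class(rng, index, strm="2185"):
    """Build one invented class record; no value here is real course data."""
    subject = rng.choice(["FAKE", "TEST", "DEMO"])
    enrl_cap = rng.randint(10, 60)
    return {
        "crse_id": str(900000 + index),
        "crse_offer_nbr": "1",
        "strm": strm,
        "session_code": "1",
        "class_section": "01",
        "subject": subject,
        "catalog_nbr": str(rng.randint(100, 799)),
        "descr": f"Sample Course {index}",
        "topic": "",
        "class_nbr": str(10000 + index),
        "ssr_component": "SEM",
        "units": "3.00",
        "enrl_stat": "O",
        "class_stat": "A",
        "class_type": "E",
        "schedule_print": "Y",
        "enrl_cap": str(enrl_cap),
        "enrl_tot": str(rng.randint(0, enrl_cap)),
        "institution": "RIT01",
        "acad_org": subject,
        "acad_group": "FAKE",
        "acad_career": "UGRD",
        "instruction_mode": "P",
        "course_descrlong": f"Placeholder description for sample course {index}.",
    }


def format_class_line(record):
    """Join a record's values with pipes, ending with a trailing pipe."""
    values = [record[name] for name in CLASS_FIELDS]
    if any("|" in value for value in values):
        raise ValueError("field values must not contain '|'")
    return "|".join(values) + "|"


def write_fake_dump(path, count, seed=0):
    """Write `count` fake records to `path`, one per line."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with open(path, "w", encoding="utf-8") as handle:
        for index in range(count):
            handle.write(format_class_line(make_fake_class(rng, index)) + "\n")


if __name__ == "__main__":
    write_fake_dump("fake_classes.dat", 100)
```

Whether the existing import code accepts unpadded numeric fields like these is untested; if it does not, the values would need to be right-aligned the way they are in the sample record.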

@MoralCode
Contributor Author

This may be part of #56

@jabbate19 jabbate19 added the dev improvement Something that could be fixed to improve the development flow label Nov 27, 2023
@jabbate19
Contributor

#327 could be used to do this. Will research during the rewrite.

@MoralCode
Contributor Author

Even if this goes through, I'd probably caution against intentionally giving up CSH's private, grandfathered data feed from the registrar. I bet it'll be helpful for validating the data from the API, or potentially for getting data faster or without as many rate limits, even if it can't be made available to any student with an RIT login.


2 participants