The purpose of this issue is to create a feature engineering script that may be run repeatedly on the occupancy permit dataset, as new entries or files (by year) are added.
This issue was originally posted in the dc_doh_hackathon repository, which can be found here: issue_10
Start with the Occupancy Permit data in the /Data Sets/Occupancy Permits/ folder in Dropbox.
Write a script that uses this data to produce a feature data table for the number of new occupancy permits issued in the last 4 weeks.
You can find the data format and examples on the Feature Dataset Format tab in this document.
Input:
CSV files with data for each given year
Output:
A script that produces a CSV file in the following format:
One row for each combination of occupancy permit type, week, year, and census block
The dataset should include the following columns:
feature_id: The ID for the feature, in this case "occupancy_permits_issued_last_4_weeks"
feature_type: Occupancy permit type, found in the EVENTTYPESCODEDESC column of the source data
feature_subtype: Left blank
year: The ISO-8601 year of the feature value
week: The ISO-8601 week number of the feature value
census_block_2010: The 2010 Census Block of the feature value
value: The value of the feature, i.e. the number of new occupancy permits of the specified type issued in the given census block during the previous 4 weeks, counting from the year and week above
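A minimal pandas sketch of the aggregation, assuming the permit records have already been loaded and assigned a `census_block_2010` value (see the command-line sketch further down). The `issue_date` column name is a placeholder for whatever the issuance-date field is called in the source files, and whether "the previous 4 weeks" includes the current week is left to the implementer; the rolling window below includes it:

```python
import pandas as pd


def count_last_4_weeks(permits: pd.DataFrame) -> pd.DataFrame:
    """Count permits issued in the trailing 4-week window for each
    permit type / census block / ISO week combination.

    Assumes `permits` has columns:
      EVENTTYPESCODEDESC  - permit type (from the source data)
      census_block_2010   - 2010 census block, assigned upstream
      issue_date          - hypothetical parsed datetime of issuance
    """
    iso = permits["issue_date"].dt.isocalendar()
    permits = permits.assign(year=iso["year"].astype(int),
                             week=iso["week"].astype(int))

    # Permits issued per type / block / ISO week.
    weekly = (
        permits.groupby(["EVENTTYPESCODEDESC", "census_block_2010", "year", "week"])
        .size()
        .rename("weekly_count")
        .reset_index()
        .sort_values(["EVENTTYPESCODEDESC", "census_block_2010", "year", "week"])
    )

    # Trailing 4-week sum within each type/block group.  Note: a rolling
    # window over observed rows skips weeks with zero permits; a fuller
    # solution would reindex onto a complete weekly calendar first.
    weekly["value"] = (
        weekly.groupby(["EVENTTYPESCODEDESC", "census_block_2010"])["weekly_count"]
        .transform(lambda s: s.rolling(4, min_periods=1).sum())
    )

    weekly["feature_id"] = "occupancy_permits_issued_last_4_weeks"
    weekly["feature_subtype"] = ""
    return weekly.rename(columns={"EVENTTYPESCODEDESC": "feature_type"})[
        ["feature_id", "feature_type", "feature_subtype",
         "year", "week", "census_block_2010", "value"]
    ]
```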
The final script must be runnable from the command line, taking three arguments (a sketch of the argument handling follows this list):
A folder with the occupancy permit data files (the script should concatenate and merge the files in the directory as appropriate)
The shapefile for census blocks
The output CSV filename
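A hedged sketch of the command-line wrapper, assuming the permit files carry LONGITUDE/LATITUDE columns (adjust to the actual field names) and that the shapefile's block identifier column is GEOID10, as in Census TIGER files; it reuses the count_last_4_weeks helper sketched above. The `predicate=` keyword requires geopandas 0.10+; older versions use `op=`.

```python
import argparse
import glob
import os

import geopandas as gpd
import pandas as pd


def main():
    parser = argparse.ArgumentParser(
        description="Extract occupancy-permit features.")
    parser.add_argument("input_dir",
                        help="Folder containing the yearly occupancy permit CSVs")
    parser.add_argument("census_blocks_shapefile",
                        help="Path to the 2010 census block shapefile")
    parser.add_argument("output_csv",
                        help="Path for the output feature CSV")
    args = parser.parse_args()

    # Concatenate every CSV in the input folder into one frame.
    files = sorted(glob.glob(os.path.join(args.input_dir, "*.csv")))
    permits = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

    # Assign each permit to a 2010 census block with a spatial join.
    blocks = gpd.read_file(args.census_blocks_shapefile)
    points = gpd.GeoDataFrame(
        permits,
        geometry=gpd.points_from_xy(permits["LONGITUDE"], permits["LATITUDE"]),
        crs="EPSG:4326",
    ).to_crs(blocks.crs)
    permits = gpd.sjoin(points, blocks, how="left", predicate="within")

    # The block identifier column name depends on the shapefile.
    permits = permits.rename(columns={"GEOID10": "census_block_2010"})

    features = count_last_4_weeks(permits)
    features.to_csv(args.output_csv, index=False)


if __name__ == "__main__":
    main()
```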
Please also provide a README.md that describes the script and how to run it.
For the command-line handling, you can model your solution on the files here or here.
Place all of your files in the codefordc/the-rat-hack repository under a new scripts/feature_engineering/extract_occupancy_permit_features/ folder
**Hints:**
The solution to Hackathon issue_3 may provide some helpful inspiration for the data cleaning steps.