This is the main web page for Sections 102 and 104 of the course DS 644: Introduction to Big Data taught by Dr. William DeMeo in the spring semester of 2023 at NJIT.
This page will be updated throughout the semester. Students are expected to visit this page periodically.
Contents
- What is this web page?
- What is the course about?
- When and where are the lectures?
- Who is the instructor?
- Am I qualified to take this course?
- Where is the course schedule?
- Where is the Canvas page for my section?
- Where is information about homework?
- When and where are the exams?
- What textbooks are required?
- Do I have to attend all the lectures?
- What is expected of an NJIT student?
- Can I use electronic devices during lecture?
- What is the grading policy?
- Where can I find the homework?
- Can I turn in assignments after the deadline?
- Is it possible to make up a missed exam?
- Should I ask questions? (Slack)
- Is tutoring available?
- Can I email the instructor?
- What if I have a disability?
This course provides an in-depth coverage of various topics in big data from data generation, storage, management, transfer, to analytics, with focus on the state-of-the-art technologies, tools, architectures, and systems that constitute big-data computing solutions in high-performance networks. Real-life big-data applications and workflows in various domains (particularly in the sciences) are introduced as use cases to illustrate the development, deployment, and execution of a wide spectrum of emerging big-data solutions.
Ref. NJIT Graduate Computing Sciences Course Catalog
- §102 Monday 18:00--20:50 in CKB Room 124
- §104 Wednesday 18:00--20:50 in CKB Room 124
Instructor. Dr. William DeMeo
Email. williamdemeo at gmail
Office location. GITC 4201A
Office hours. Wednesdays 15:00--17:00
Students should have had some prior computer programming experience. Students who lack such preparation must check with the instructor at the beginning of the semester to discuss whether and how they can stay and succeed in the class.
The schedule for this course will evolve over the course of the semester, so students should check it often. It is found in the schedule directory of this repository: github.com/williamdemeo/ds644-spring2023/schedule
Information about homework is in the homework directory: github.com/williamdemeo/ds644-spring2023/homework.
All exams take place in the same classroom as the lecture, CKB Room 124.
- MIDTERM EXAM DATE
  - § 102 (Mon) midterm exam 27 March 2023 18:00--20:00 in CKB Room 124
  - § 104 (Wed) midterm exam 29 March 2023 18:00--20:00 in CKB Room 124
- FINAL EXAM DATE (cumulative)
  - § 102 (Mon) final exam 8 May 2023 18:00--20:00 in CKB Room 124
  - § 104 (Wed) final exam 10 May 2023 18:00--20:00 in CKB Room 124
In accordance with university policy, the final exam must be taken by all students at the scheduled time in the usual classroom. Do not make plans which would have you depart campus before the scheduled final exam date.
Students are expected to have access to the following textbooks (electronic or hard copy---either is fine).
- Learning Spark: lightning-fast data analytics (free from Databricks)
  Edition. 2nd
  Authors. Jules S. Damji, Brooke Wenig, Tathagata Das, Denny Lee
  Year. 2020
  Publisher. O'Reilly
  ISBN. 9781492050049

  If you want a hard copy, you can buy Learning Spark from Amazon (often at a discounted price ≈$35).
- Functional Programming in Scala
  Edition. 2nd
  Authors. Michael Pilquist, Rúnar Bjarnason, and Paul Chiusano
  Release date. Early 2023 (estimated)
  ISBN. 9781617299582

  Attention! At the time of this writing, the 2nd edition of FP in Scala (covering Scala 3) is available via the "Manning Early Access Program" (MEAP) from the Manning web site. You should not buy this book from Amazon just yet, as you may end up with the 1st edition (covering Scala 2).
The source code that accompanies FP in Scala is available in this zip archive or from the FP in Scala GitHub repository.
These are not required but might be helpful.
- Online courses and videos
  - Coursera offers many courses covering Scala, Spark, and Big Data (search for "scala", search for "spark" or search for "big data" at the Coursera website), but we especially recommend these two:
    - Functional Programming Principles in Scala, by Martin Odersky.
    - Big Data Analysis with Scala and Spark, by Heather Miller.
  - Udemy offers many courses covering Scala, Spark, and Big Data as well.
- Books
  In addition to the primary book for the course, students may find the following books helpful.
  - The Scala 3 Book: an informal introduction to the Scala language.
  - The Science of Functional Programming: A tutorial, with examples in Scala, by Sergei Winitzki (2022).
- Miscellaneous
  - Volume, velocity, and variety: Understanding the three V's of big data, David Gewirtz, 2018.
  - Immutability Changes Everything, Pat Helland, 2015.
Short Answer: Yes!
Long Answer: Students are expected to attend all classes. A grade penalty will be assessed if a student has an excessive number of absences (whether excused or unexcused). Specifically, students are permitted at most five absences in total (though taking them is strongly discouraged), and must email the instructor whenever they miss a class. Each additional absence, and any absence not reported to the instructor, may result in the deduction of points from the final grade.
Occasionally attendance will be recorded at the start of class.
Important. If you plan to leave before class is over, or arrive more than five minutes after class has begun, the correct procedure is to inform the instructor in advance. It is impolite and disruptive to other students to arrive late or leave early. It is also very distracting to the instructor and other students to begin packing up belongings before the lecture has ended.
Students must abide by the university's code of conduct. In particular, cheating is unacceptable and will not be tolerated. Violations of this policy will be dealt with in a manner consistent with university regulations, which range from a warning to expulsion from the university.
Use of electronic devices in lecture for purposes unrelated to the course is not allowed.
Laptop computers may be used in the classroom for work related to this course.
Mobile phone use during the lecture, whether speaking or not, is not permitted, even if it is merely to send or receive text messages. If you are using your phone during the lecture, you may be asked to leave the classroom.
As a general rule, students should silence and refrain from using electronic devices, such as phones, ipods, microwave ovens, etc. in class. There is one exception to this rule: a laptop or tablet may be used for working on something related to the course material.
Tiktoking, Tweeting, Tindering, Facebooking, Instagramming, Snapchatting, Whatsapping, YouTubing, and gaming are strictly prohibited during the lecture. Such activities make it hard for the student to concentrate on the lecture and can also be very distracting to other students and the instructor. Students who violate this policy may be asked to leave the classroom.
The breakdown of the final course grade is as follows:
- Homework: 20 points total
- Projects: 30 points total
- Midterm exam: 25 points
- Final exam: 25 points
At the end of the semester, letter grades will be assigned roughly according to the following table. However, the scale may be shifted (or "curved") depending on overall student performance.
- A: 91--100
- B+: 86--90
- B: 81--85
- C+: 76--80
- C: 71--75
- D+: 66--70
- D: 60--65
- F: 0--59
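As a rough illustration of how the point breakdown and the letter-grade table combine, here is a minimal Python sketch. The function names and the input format (a percentage score for each component) are hypothetical and are not part of any course tooling. It assumes the uncurved scale above, and it assumes that a fractional total falling between two bands (for example 90.5) is placed in the lower band; this page does not say how such cases are actually handled.

```python
# Point weights from the grade breakdown above (they sum to 100).
WEIGHTS = {"homework": 20, "projects": 30, "midterm": 25, "final": 25}

# Lower bound of each letter-grade band from the table above, highest first.
CUTOFFS = [(91, "A"), (86, "B+"), (81, "B"), (76, "C+"),
           (71, "C"), (66, "D+"), (60, "D")]


def course_total(percent_scores):
    """Combine per-component percentages (0-100) into a total out of 100 points."""
    return sum(WEIGHTS[name] * percent_scores[name] / 100 for name in WEIGHTS)


def letter_grade(total):
    """Look up the letter grade for a total, using the uncurved scale.

    Assumption: a total between two bands falls into the lower band.
    """
    for cutoff, letter in CUTOFFS:
        if total >= cutoff:
            return letter
    return "F"


# Hypothetical student: 90% homework, 80% projects, 70% midterm, 90% final.
example = {"homework": 90, "projects": 80, "midterm": 70, "final": 90}
total = course_total(example)      # 18 + 24 + 17.5 + 22.5 = 82.0
print(total, letter_grade(total))  # prints: 82.0 B
```

Because the scale may be curved, the cutoffs in this sketch should be read as approximate, not as a guarantee of the final letter grade.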
The online homework assignments will account for 20% of the course grade and will be due roughly twice per month, unless otherwise indicated. See the tentative course schedule.
Students are strongly encouraged to start the homework early, so that they have time to get help from the instructor or a tutor if needed.
No late homework will be accepted for any reason. If a student fails to submit homework before the deadline, a score of 0 will be recorded for the assignment in question.
Generally speaking, there are no make-up exams. However, if a student misses an exam for one of the legitimate reasons listed below, and if that student contacts the instructor well before the day of the exam, then it might be possible for that student to take an alternative version of the exam before the scheduled exam time.
To request such an alternative exam, the student must provide documented evidence of one of the following:
- Documented medical excuse - student's own medical emergency.
- Extracurricular activity sponsored and authorized by the university.
- Armed forces deployment (military duty).
- Officially mandated court appearances, including jury duty.
- A conflict with another exam. (In such cases the exam with the fewest students is the one that must be rescheduled.)
If a student misses an exam due to some unforeseen circumstance, that student must contact the instructor within one class meeting after the missed test and provide an explanation. If the excuse is accepted, the missed test score may be replaced by 80% of the student's final exam score. For example, if the final exam score is 90%, then the student would receive a 72% for the missed test (0.80*0.90 = 0.72).
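The replacement rule in the example above can be written as a short Python sketch. The function name is hypothetical, and the sketch assumes both scores are expressed as percentages; it only restates the arithmetic and says nothing about whether an excuse will be accepted.

```python
# Percentage of the final exam score that replaces an excused missed test.
REPLACEMENT_PERCENT = 80


def replacement_score(final_exam_percent):
    """Return the percentage recorded for an excused missed test."""
    return final_exam_percent * REPLACEMENT_PERCENT / 100


# A final exam score of 90% yields 72% for the missed test.
print(replacement_score(90))  # prints: 72.0
```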
Students are strongly encouraged to ask many questions! When in doubt, ask!
- In Lecture. The best time and place to ask a question is during lecture or during office hours.
- On Slack. Another good place to ask a question is the Slack discussion forum for the course. Rather than emailing questions to the teaching staff, students are encouraged to post questions on Slack.
- By email. Students may ask questions by emailing the instructor or by sending a message via Canvas. However, the response time will likely be slower than if the question is asked on Slack.
Important Remarks about electronic correspondence
- Do not assume the instructor knows to which class/section a question or comment pertains. Students should include their section number in all correspondence (§102 for Mondays; §104 for Wednesdays).
- To ask a question that isn't private and doesn't contain answers to homework problems, students are encouraged to post their questions in a Slack channel instead of sending a direct message (DM) to the instructor. This will save the instructor from having to answer the same question more than once.
"YWCC and its ACM student chapter have partnered to create an online tutoring program available to undergraduate and graduate students looking for tutoring assistance."
See computing.njit.edu/tutoring for more information.
The undergraduate tutoring schedule is available at computing.njit.edu/undergraduate-tutoring-1
Students may email the instructor directly, though the response time will likely be much slower than if the question is asked on Slack.
If emailing the instructor, students are urged to use an informative subject field. If an email message does not at least mention the class name and section number, responses to that email may be delayed.
If you believe that you have a disability that qualifies under the Americans with Disabilities Act and Section 504 of the Rehabilitation Act and requires accommodations, you should contact the Office of Accessibility Resources and Services for information on appropriate policies and procedures. The next step is to talk to the instructor who will be happy to assist with accommodations, but will not provide them retroactively (so file the appropriate requests and paperwork well before the first exam!).
Students must have their paperwork in order and should contact the instructor early in the semester in order to have their learning needs appropriately met.