What data engineering skills do you want to learn from this course?
Course Content
Course Overview
- VM setup, Python crash course
SQL
- Basic SQL
- Data models and relational SQL
- Many-to-many relationships in SQL
NoSQL
- NoSQL databases
- Access web data
Different Types of Data
- Spatial data
- Text data
- Image and time-series data
- Big data
Attendance
Attendance is mandatory
UF policy for excused absences applies (must notify instructor in writing before class when possible)
Each unexcused absence results in a 1.5% deduction from the final grade
More than 3 unexcused absences result in failure of the course
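A minimal sketch of the attendance arithmetic above. It assumes the 1.5% deduction means 1.5 percentage points of the final grade per unexcused absence; the function name and constants are illustrative only, not an official grading tool.

```python
from typing import Optional

PENALTY_PER_ABSENCE = 1.5  # assumed: percentage points per unexcused absence
MAX_UNEXCUSED = 3          # more than this many unexcused absences means failure


def attendance_deduction(unexcused: int) -> Optional[float]:
    """Return the deduction in percentage points, or None if the course is failed."""
    if unexcused < 0:
        raise ValueError("number of absences cannot be negative")
    if unexcused > MAX_UNEXCUSED:
        return None  # failure, regardless of other grades
    return PENALTY_PER_ABSENCE * unexcused


if __name__ == "__main__":
    for n in range(5):
        print(n, attendance_deduction(n))
```

Under this reading, 3 unexcused absences cost 4.5 points and a 4th results in failure.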
Homework
6 homework assignments
- each counts for 10% of the final grade
- only the 5 highest grades count towards the final grade (the lowest is dropped)
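The drop-the-lowest rule can be sketched as follows. This is an illustration only: it assumes each homework is scored from 0 to 100 and that a missing assignment counts as 0; the function name and example scores are hypothetical.

```python
def homework_contribution(scores, weight=10.0, keep=5):
    """Return the homework share of the final grade, in percentage points.

    scores: one score per homework on an assumed 0-100 scale.
    weight: percentage points each counted homework is worth (10% each).
    keep:   how many of the highest scores are counted (highest 5).
    """
    best = sorted(scores, reverse=True)[:keep]  # keep the highest scores, drop the rest
    return weight * sum(best) / 100


if __name__ == "__main__":
    example = [90, 80, 100, 0, 95, 85]  # hypothetical scores; one assignment missed
    print(homework_contribution(example))
```

With these example scores the missed assignment is the one dropped, so the five counted homeworks contribute 45 of a possible 50 points.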
Often simple programming exercises
Requirements:
- turn in each assignment no later than 11:59 pm on the day it is due
- late assignments will NOT be accepted
- no handwritten assignments
- DO NOT copy others' work
Course Project
Requirements:
- must include at least 1 non-traditional data source (i.e. spatial data, text data, image data, time-series data)
- use of semi-structured and unstructured data is encouraged
- use of web data accessed via API or scraping is encouraged
Some examples (from last year):
- Side Effects and Adverse Reactions to Painkillers: Analysis with FDA Adverse Event Reporting System
- Utilizing nontraditional data sources for near real-time estimation of Zika virus case trends during the 2016 Florida USA Zika outbreak
- Twitter Mining for Cocaine Use
- Medical Marijuana Laws and Change of Number of Tweets towards Marijuana: A Time Series Analysis Using Data from Twitter
You can work individually or as a team
If you choose to work as a team:
- each team can have up to 2 members
- clearly delineate roles and responsibilities of each team member
Project due dates:
- Feb 5, 2018: form a project team
- Mar 12, 2018: midterm presentation and project proposal
- Apr 16, 2018: final presentation
- Apr 23, 2018: final project report
Midterm
Project proposal:
- Abstract: up to 1 page
- Project description: up to 5 pages
~ Specific Aims/Objectives
~ Background and Significance
~ Approach/Research Design
~ Timeline
- Citations: no page limit, use the Vancouver style
- Formatting: single column, single spacing; Arial or Times New Roman font; font size no smaller than 11 point (table and figure labels may use 10 point); margins of at least 0.5 inch
Proposal presentations:
- up to 15 slides
- presentation of up to 15 minutes, with 5 minutes of Q&A
- send the slides to the instructor at least 3 days in advance
Final
Final report: up to 10 pages (including references)
- Abstract: no more than 250 words summarizing the project
- Introduction: a short background and objective(s) of the study
- Methods: design, setting, dataset, approaches, and main outcome measurements
- Results: key findings
- Discussion: key conclusions with direct reference to the implications of the methods and/or results
- References: please follow the Vancouver style
Final presentations:
- up to 15 slides
- presentation of up to 15 minutes, with 5 minutes of Q&A
- send the slides to the instructor at least 3 days in advance
Note: analyses are required, but the project should focus more on data access and data engineering.