Kaggle Paris Meetup

  • Meetup #11
  • 3 Proposals :
  1. Data Science Bowl 2017 (image, large dataset)
  2. The Nature Conservancy Fisheries Monitoring (NCFM)
  3. DengAI: Predicting Disease Spread

Data Science Bowl 2017

  • Image Classification  (voxel) 

  • $1,000,000 · 1,116 teams · 2 months to go (2 months to go until merger deadline)

 

 

The Nature Conservancy Fisheries Monitoring (NCFM)

DengAI: Predicting Disease Spread

Open until Dec 17. For the glory.

The goal is to predict the total number of dengue cases for two cities:

  • Iquitos (Peru)
  • San Juan (Puerto Rico)

The training set mainly consists of quantitative data:

  • 24 columns
  • 1456 rows (936 for San Juan, 520 for Iquitos)
  • 20 years of data ranging from 1990 to 2010
  • 1 "city" column
  • 3 date columns (year, week of year, week start date)
  • 4 columns about satellite vegetation (one per pixel: NE/NW/SE/SW)

 

DengAI: Predicting Disease Spread

  • 6 columns about precipitation and humidity
  • 10 columns about temperature
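Since the training set mixes rows from both cities, a natural first step is to split it by city and keep each series in chronological order. The sketch below shows one way to do that with pandas. The column names (`city`, `year`, `weekofyear`, `total_cases`), the city codes (`sj`, `iq`) and the sample values are assumptions made for illustration; they are not given in these slides.

```python
import pandas as pd


def split_by_city(df: pd.DataFrame, city_col: str = "city") -> dict:
    """Return one sub-frame per city, sorted chronologically.

    Assumes the frame has 'year' and 'weekofyear' columns (hypothetical names).
    """
    return {
        city: part.sort_values(["year", "weekofyear"]).reset_index(drop=True)
        for city, part in df.groupby(city_col)
    }


# Tiny made-up sample; the real training set has 1456 rows and 24 columns.
sample = pd.DataFrame({
    "city": ["sj", "iq", "sj", "iq", "sj"],
    "year": [1990, 2000, 1990, 2000, 1990],
    "weekofyear": [19, 27, 18, 26, 20],
    "total_cases": [5, 0, 4, 0, 4],  # invented values
})

per_city = split_by_city(sample)
row_counts = {city: len(part) for city, part in per_city.items()}
```

On the real data, `row_counts` would be expected to match the 936 / 520 split listed above.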

 

 

 

 


DengAI: Predicting Disease Spread

General information about the problem at hand :

  • Dengue is a disease caused by a virus of the genus Flavivirus
  • This virus is spread mainly by one specific type of mosquito (Aedes aegypti)
  • This specific mosquito:
    • has to be a female in order to spread the virus
    • has to bite someone with dengue fever in order to carry the virus
    • has to incubate the virus for about 8 days on average
    • then has to bite someone else (it is mostly active in daylight)
    • does not travel more than 100 meters from its home
    • can live two to three weeks
  • Flavivirus is a genus of viruses that also includes Zika and yellow fever; Aedes is also a known vector for chikungunya
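The incubation time and mosquito lifespan listed above suggest that the weather of previous weeks may matter as much as the weather of the current week. One possible way to expose this to a model is to add lagged copies of the weather columns, computed separately for each city. This is only a sketch of that idea, not a method taken from the slides; the column names (`city`, `week`, `temp`) and the values are hypothetical.

```python
import pandas as pd


def add_lags(df, columns, lags=(1, 2, 3), city_col="city"):
    """Add '<col>_lag<k>' columns holding the value from k rows earlier.

    Rows are assumed to be sorted chronologically within each city.
    Shifting is done per city so one city's values never leak into the other.
    """
    out = df.copy()
    for col in columns:
        for lag in lags:
            out[f"{col}_lag{lag}"] = out.groupby(city_col)[col].shift(lag)
    return out


# Made-up weekly temperatures for two cities.
weekly = pd.DataFrame({
    "city": ["sj", "sj", "sj", "iq", "iq", "iq"],
    "week": [1, 2, 3, 1, 2, 3],
    "temp": [25.0, 26.0, 27.0, 24.0, 24.5, 25.0],
})

lagged = add_lags(weekly, ["temp"], lags=(1, 2))
```

The first rows of each city have no earlier week to look at, so their lag columns are left as missing values.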

DengAI: Predicting Disease Spread

Accurately predicting the number of upcoming cases can add to the existing set of responses to this global situation.

 

KaggleParisMeetup-11

By bruno16
