your self data

(for non-programmers)

Oct. 30, 2013

Creative Commons License

Who are you?

  • I’m a PhD candidate at SNU and a Data Scientist at Team POPONG.
  • My interests lie in the usability, openness, and freedom of data.
  • Friends call me a dreamer, or a romantic to a fault, but I enjoy living my life as such.
  • In my free time, I surf the web or chuckle at xkcd.
  • I love continuous learning, and I pursue community action.

Our objective


A lot more out there...


The BIG Picture

1. Defining your problem

2. Searching the solution space

3. Locating your data sources

4. Crunching the data

5. Convincing others

1. Defining your problem


What are you interested in?

  • Is there a problem you've been wanting to solve? 
  • If there isn't, it may be easier to start by exploring resources.


What is the scope of your problem?
  • What are the boundaries? 
  • What is it that you're not going to solve? 
  • What are the assumptions? Are they realistic? 
  • What are the constraints?

What are the challenges?
  • Do you have enough resources to solve your problem?
  • Time, ability, enthusiasm, ...

What happens when you solve this problem?
  • How does it benefit your life?


    Define your problem 
    in one sentence.

    2. Searching the solution space

    What are the causes of the problem?
    • Why did this problem occur in the first place?
    • Is there a major cause, or do some work together?
    • This is where you brainstorm with your teammates.

    How have others approached your problem?
    • Is your problem something new?  (Probably not.)
    • Where is the main research community located? (the "domain")
    • What jargon do they use? (literature survey)
    • Google Scholar at your service.


    Choose your own approach.
    • Specify your hypothesis.
    • Try to solve one problem at a time. (divide and conquer)
    • The more time you invest, the better the performance.
    • What ingredients (data) do you need?


    Now, try defining your solution in one sentence.

    3. Locating your data sources

    So, where can you get the data?

    Data is practically everywhere.
    (Though the data you really need is never there.)

    4. Crunching the data

    Get to know your data
    (a.k.a. "Data Exploration")


    Spotfire, Tableau, ...
    Google Trends, Google graphs, Google Fusion Tables
    PHP, Python, HTML, CSS, JavaScript



    What's the type of your data?
    Numerical? Textual? Graphical? Spatial? Temporal? ...

    What are the dimensions?
    • Variables & records (i.e., the columns & rows)
    • How many variables? Type of variables? (e.g., Nominal, ordinal, interval, ratio)
    • How many records?
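The questions above can be answered with a few lines of Python. What follows is only a minimal sketch: the sample table, its column names, and the helper names `describe` and `is_number` are all made up for illustration, and the "numerical vs. textual" guess is a crude assumption, not a substitute for looking at the data yourself.

```python
import csv
import io

# Hypothetical sample data, invented purely for illustration.
SAMPLE_CSV = """name,party,age,terms
Kim,A,52,2
Lee,B,47,1
Park,A,61,3
Choi,C,39,1
"""


def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def describe(csv_text):
    """Count records, list variables, and guess a rough type per variable."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    variables = list(rows[0].keys()) if rows else []
    types = {}
    for var in variables:
        values = [row[var] for row in rows]
        # Crude guess: a column is "numerical" only if every value parses as a number.
        types[var] = "numerical" if all(is_number(v) for v in values) else "textual"
    return len(rows), variables, types


n_records, variables, types = describe(SAMPLE_CSV)
print("records (rows):", n_records)
print("variables (columns):", variables)
print("guessed types:", types)
```

Note that this guess cannot tell nominal, ordinal, interval, and ratio variables apart. A column of numbers may still be nominal (e.g., ID codes), so that judgment stays with you.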


    Some very easy and useful tools

    Publishing on the Web


    5. Convincing others

    Reporting & Presenting the results
    • Your conclusions (the performance) are important, but your reasoning counts too.
    • What were your options? Why did or didn't you choose an option?
    • What were your assumptions? How valid are they?
    • What did you consider as variables, and what did you fix?
    • What did you consider but not implement? (Future work)

    Galleries you can check out


    Some more useful tools

      Some tips

      Plan ahead.
      You have deadlines. Make milestones, and keep them.
      Even if your results don't satisfy your standards, get over it.

      Keep the development cycle short.
      First make something that runs.
      Then enhance the performance. Then add more components.

      Work as a team.
      Find out what your teammates do best.
      Know what they want.

      Documentation on-the-fly helps.
      It really does.


      Korea National Assembly, Now

      Visualization of the seating of the 18th National Assembly of Korea.

      Now let's try one

      Made with