About the instructor

Course logistics

  • Course objective: Understanding data mining algorithms
  • Course website
  • Office hours
    • Right after every class
    • You are welcome to ask any kind of question
    • You are encouraged to book ahead; otherwise your meeting may have to be deferred to another time
  • Grading
    • Assignments (30%): Three graded assignments for you to submit online
    • Final exam (30%): In-class exam covering the whole semester
    • Term project (40%): Group work solving a real-world problem
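
As a rough illustration of how the three weights above combine, the sketch below assumes each component is scored on a 0-100 scale. The function name `course_total` and the sample scores are hypothetical and are not part of the official grading procedure.

```python
# Weights taken from the grading list above.
# Assumption: every component score is on a 0-100 scale.
WEIGHTS = {"assignments": 0.30, "final_exam": 0.30, "term_project": 0.40}


def course_total(scores):
    """Combine per-component scores (0-100) into a weighted course total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"assignments": 90, "final_exam": 70, "term_project": 85}
print(round(course_total(example), 1))
```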

Assignments (30%)

  • Three graded take-home assignments for you to submit online
  • There is no mid-term exam; the assignments serve as review instead
  • Assignments consist of a quiz and a programming task
    • Frankly, the quiz is not for assessment but for you to review and/or preview class materials
    • Programming tasks will cover what you have studied during class. You will have done most of the work in class already. What you're going to do at home is to wrap up your work and document it.
    • Assignments will open early and can be submitted at any time before the due date
  • Note the submission date
    • The assignments will remain open after the due date, but you won't get any credit for solving them

Final exam (30%)

  • In-class exam covering the whole semester
  • Consists of an easy 80% and a relatively hard 20%
    • If you have fully understood the contents of the assignments, the easy 80% should not be a problem

Term Project (40%)

Proposal (10/40)

  • Individual work
  • Submit one data mining project idea, at most one page long (Due: 2015-03-18 23:59)
    • Late penalty: 100% off after the due date (no late submissions accepted)
  • Suggested contents
    1. Background: What question do you have? Why is this problem important?
    2. Formulation: Convert your problem into a data mining problem. What are the inputs and outputs? What algorithms are you going to use? (e.g., classification, clustering, regression, text mining)
    3. Data acquisition: How are you going to obtain the data?
    4. Expected results
    5. Expected (business) implications
  • Evaluation will be based on the following criteria
    1. Significance of the idea (5 points)
    2. How concretely the idea is developed (5 points)
  • Finalize your team and team leader on e-class (Due: 2015-03-18 23:59)
  • Choose a topic (Due: 2015-03-29 23:59)
    • Among your teammates' proposals, select one topic and conduct a project as a group
    • You can also choose a topic from Kaggle

Progress presentation (15/40)

  • Submit presentation slides (Due: 2015-05-07 23:59)
    • You should have completed roughly 70-80% of the whole project by this date
    • Submit the slides on e-class
    • No page limits
    • Late penalty: 50% off (next day), 100% off (after the presentation)
  • Suggested contents: Your proposal + the following
    1. (Optional) If you have changed the subject, what was the reason?
    2. Data Exploration
    3. What approach you chose to address the questions raised in your proposal
    4. What results you achieved
    5. What further questions arose and what you plan to do next
    6. Tricks and tips you want to share with the class
  • Presentation day (2015-05-08)
    • You will present your project progress in front of the class (Max. 10 min)
    • Peer assessment
      • You will also be grading your peers' work on presentation day
      • You will be given three votes
      • You can give one vote to each of three teams, or give all three votes to one team
      • You will also be reviewing contributions of your own teammates
    • Be brief yet clear
    • Record the feedback
      • One can make the slides
      • Another can prepare for the presentation
      • And another can do the presentation
      • Yet another can take feedback notes

Final report (15/40)

  • Present what your group has done throughout the whole project in more than 10 pages (Due: 2015-06-14 23:59)
    • Late penalty: 50% off (next day), 100% off (after that)
  • You will also be grading your teammates, based on their contributions
  • Extra credit will be given to those who submit to and/or rank in an open competition (e.g., Kaggle)
  • Suggested contents: Your progress report + the following
    • Enhanced results
    • (Business) implications
    • Future work

More advice on your projects

  1. The best topics are the topics you are actually interested in
    • You should be able to "dogfood" your own analysis
  2. Don't be afraid to shift the project's direction
    • However, shifting too much will give you less time for real work -- balance!
  3. Feel free to use project results in your graduation project or paper
    • Kill two birds with one stone!
    • These projects have the potential to become part of your portfolio
    • This may be a plus when you apply for a job or for grad school

Asking questions

Never hesitate to ask questions

  • Private questions: [email protected]
    • Personal questions and/or requests
    • Assignment submissions that involve private information
  • Public questions: Everything else you want to ask goes to e-class
    • Use any language of your choice (e.g., English, Korean, Java, ...)
    • Asking good questions
      • Provide as much detail as you can
      • However, be "brief" and "clear"
      • For programming questions, explicitly list the versions of the software you are using (including packages and OS)
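
One possible way to collect this version information is sketched below, assuming you are working in Python 3.8 or later. The helper name `environment_report` and the package names in the example call are placeholders, not course requirements; replace them with whatever your code actually imports.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Return a short plain-text summary of OS, Python, and package versions."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Python: {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder package names; list the ones your code depends on.
    print(environment_report(["numpy", "pandas", "scikit-learn"]))
```

Pasting the printed report at the top of a question keeps it brief while still giving the details needed to reproduce the problem.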