
About

Table of contents

  1. Teams
  2. Project Templates
  3. Timelines (To-Be-Decided)
  4. Broad Topics:
  5. Example Datasets
  6. Sample Project Reports

Teams

Link to Spreadsheet

Project Templates

Please choose one of the following templates to submit the final project report:

Timelines (To-Be-Decided)

  • Project Proposal, an Abstract of the Project (A Class Presentation): October 13, 2022 (Thursday)
    • Expectations for the Project Proposal: Each team should update the Excel spreadsheet with its project title and prepare a 10-minute presentation. The presentation should clearly convey five main points: (1) Motivation (real-world examples that show the importance of the project), (2) Problem Statement, (3) Why Machine Learning Is Appropriate, (4) Potential Datasets, and (5) Problem Modeling (similar to Question 1 in HTCA-1).
  • Project Update (A Class Presentation): November 17, 2022 (Thursday)
  • Project Report Review: Tuesdays and Thursdays after Thanksgiving break (during office hours)
  • Project Report Submission: December 15, 2022 (Thursday), firm deadline
    • Teams that have not submitted the report will not present on Tuesday.
  • Project Presentation : December 20, 2022 (Tuesday) (10:30 AM to 12:30 PM) (All Groups)
  • Template for Final Project Presentation

Broad Topics:

Introduction to Machine Learning, Learning Problem, Generative Modeling, Linear Regression and Probability Estimation, Optimization and Geometry, Support Vector Machines and Kernel Methods, Trees and Forests, Probabilistic Graphical Models, Learning Theory, Representation Learning (basic projects in Deep Learning), and Interpretable and Explainable Machine Learning.

Example Datasets

  • https://pslcdatashop.web.cmu.edu/KDDCup/downloads.jsp
  • http://archive.ics.uci.edu/ml/index.php
  • ML Collective
  • NBA statistics data: This download contains 2004-2005 NBA and ABA stats for:
    • Player regular season stats
    • Player regular season career totals
    • Player playoff stats
    • Player playoff career totals
    • Player all-star game stats
    • Team regular season stats
    • Complete draft history
    • coaches_season.txt - NBA coaching records by season
    • coaches_career.txt - NBA career coaching records

Currently all of the regular season. Project ideas:

  • Outlier detection on the players: find out who the outstanding players are.
  • Predict the game outcome.
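
As a starting point for the outlier-detection idea above, here is a minimal sketch that flags players whose statistic lies far from the mean, using a plain z-score. The function name `zscore_outliers`, the threshold of 2.0, and the points-per-game values are all hypothetical choices for illustration; they are not taken from the NBA download, and a real project would likely combine several statistics and try more robust methods.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return the indices of values whose |z-score| exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical points-per-game values (illustration only, not real NBA data).
points_per_game = [8.1, 9.4, 10.2, 7.8, 9.9, 8.7, 10.5, 9.0, 8.4, 30.1]
players = [f"player_{i}" for i in range(len(points_per_game))]

outstanding = [players[i] for i in zscore_outliers(points_per_game)]
print(outstanding)
```

A single very large value inflates both the mean and the standard deviation, so a median-based score may be worth comparing against this baseline.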

  • CMU World Wide Knowledge Base (Web->KB) project:
    • Project ideas:

    (a) Learning classifiers to predict the type of webpage from the text.

    (b) Can you improve accuracy by exploiting correlations between pages that point to each other using graphical models?

  • Email Annotation: The datasets provided below are sets of emails. The goal is to identify which parts of each email refer to a person's name. This task is an example of the general problem area of Machine Learning and Information Extraction.
    • Project Ideas: Model the task as a Sequential Labeling problem, where each email is a sequence of tokens, and each token can have either a label of “person-name” or “not-a-person-name”.
    • Sample Project Report
  • Machine Learning for Good
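
For the email annotation idea above, the sequential-labeling setup can be sketched as follows: an email becomes a sequence of tokens, each token carries one of the two labels, and each position is described by features that include its neighbors. The tokenizer, the feature names, and the example email are assumptions made for illustration; the labels here are assigned by hand, and a real project would read them from the dataset's annotations and feed the features to a sequence model of the team's choice.

```python
import re

PERSON, OTHER = "person-name", "not-a-person-name"


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def token_features(tokens, i):
    """Describe token i using the token itself and its immediate neighbors."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
    }


# Hypothetical email text with hand-assigned labels (illustration only).
email = "Hi Alice, please forward this to Bob."
tokens = tokenize(email)
labels = [PERSON if t in {"Alice", "Bob"} else OTHER for t in tokens]

for i, (tok, label) in enumerate(zip(tokens, labels)):
    print(tok, label, token_features(tokens, i))
```

Note that "Hi" is capitalized but is not a person name, which is one reason to include context features rather than relying on capitalization alone.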

Sample Project Reports

Reports 1 and 2 do not use the correct template. They are included as examples of how a project report should be written and what type of content it should contain.