For your final assignment in this course you will work on a project. The goal of the project is to develop interactive notebooks that analyze and visualize data you have chosen, and to answer questions about your data. You will acquire the data, design your visualization, implement it using Python tools, and evaluate the results.

Project Team

You will work closely with other classmates in a 2-3 person project team. You can come up with your own teams and use our discussion forum to find prospective team members. If you can’t find a partner we will team you up randomly. We recognize that individual schedules and other constraints might limit your ability to work in a team. If this is the case, ask us for permission to work alone. In general, we do not anticipate that the grades for each group member will be different. However, we reserve the right to assign different grades to each group member based on peer assessments (see below) and other indicators.

Project Steps

There are a few steps you have to complete for your final project. It is critical to note that no extensions will be given for any of these dates for any reason. For due dates see the schedule. Late days may not be used. Projects submitted after the final due date will not be graded. These steps are:

  • Announce your project team and project title using this Google form. Each team only needs to submit one form.
  • Project proposals
  • Milestone 1, a functional project prototype
  • Project review with the staff
  • Final project submission & peer evaluations. Use this form for the peer evaluation.
  • Project presentations.

Expectations

Here is what we expect from your project:

  • It should be unique. You should not copy ideas of an existing notebook.
  • The dataset should be non-trivial and interesting. For example, the avalanche data we have been using in homework meets these criteria. It is a relatively large dataset, requires some processing, and enables us to ask many different questions. In contrast, the penguins, cars, or iris datasets we sometimes use in lecture to illustrate concepts don’t meet that bar. We recommend staying away from standard Kaggle datasets.
  • You should develop an Exploratory Notebook, where you do the data wrangling and look at views of the data to make judgements about what to do next. The documentation here should be mostly technical.
  • Then, you should develop an Explanatory Notebook. Here you want to tell a story about the data. You should load processed data and only focus on the visual display. Your notebook should be well narrated and highlight insights you identified. All figures should have excellent captions and labeling. You should annotate relevant findings directly in the plots.
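As a rough illustration of what "annotate relevant findings directly in the plots" can look like, here is a minimal sketch using pandas and matplotlib. The helper name `plot_with_annotation`, the column names, and the values are all made up for this example; in your explanatory notebook you would load the processed data your exploratory notebook saved instead. Any other plotting library works just as well.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_with_annotation(df, x, y, title):
    """Line chart with labeled axes and the maximum value annotated in the plot."""
    fig, ax = plt.subplots(figsize=(7, 4))
    ax.plot(df[x], df[y], marker="o")
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.set_title(title)
    ax.margins(y=0.25)  # leave room above the line for the annotation

    # Point at the finding inside the plot instead of describing it only in text.
    peak = df.loc[df[y].idxmax()]
    ax.annotate(
        f"peak: {peak[y]}",
        xy=(peak[x], peak[y]),
        xytext=(0, 15),
        textcoords="offset points",
        ha="center",
        arrowprops={"arrowstyle": "->"},
    )
    return fig, ax


# Made-up placeholder values, only here to make the sketch runnable.
# In practice: processed = pd.read_csv("processed.csv")  (hypothetical file name)
processed = pd.DataFrame(
    {"year": [2019, 2020, 2021, 2022], "incidents": [12, 30, 18, 21]}
)
fig, ax = plot_with_annotation(processed, "year", "incidents", "Placeholder title")
```

Which finding deserves an annotation depends on your analysis questions; the maximum is used here only because it is easy to compute.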

Proposal

The proposal document should address the following points. Use these points as headers in your document.

  • Basic Info. The project title, your names, e-mail addresses, UIDs, a link to the project repository.
  • Background and Motivation. Discuss your motivations and reasons for choosing this project, especially any background or research interests that may have influenced your decision.
  • Data. From where and how are you collecting your data? If appropriate, provide a link to your data sources.
  • Data Processing. Do you expect to do substantial data cleanup? What quantities do you plan to derive from your data? How will data processing be implemented?
  • Analysis Questions. Provide the primary questions you are trying to answer with your visualization. What would you like to learn and accomplish? List the benefits.
  • Visualization Design. How will you display your data? Provide some general ideas that you have for the visualization design: which visualization do you want to use for which aspect. Discuss three alternative prototype designs for your visualizations. Create one final design that incorporates the best of your three designs. Describe your designs and justify your choices of visual encodings. Describe how your visualizations address the analysis questions.
  • Must-Have Features. List the features without which you would consider your project to be a failure.
  • Optional Features. List the features which you consider to be nice to have, but not critical.
  • Project Schedule. Make sure that you plan your work so that you can avoid a big rush right before the final project deadline, and delegate different modules and responsibilities among your team members. Write this in terms of weekly deadlines.

This proposal is the first part of your process book. As a ballpark number: your proposal should contain about 3-4 pages of text, plus 3-4 pages of sketches.

Based on your proposals we will assign a staff member to your team who will guide you through the rest of the project. You will schedule a project review meeting with that staff member. Make sure all of your team members are present at the meeting.

The proposal will be submitted to Canvas.

Project Milestone

For your Milestone we expect you to hand in a good draft of your exploratory notebook and possibly a sketch for your explanatory notebook. For your Milestone you should have completed your data acquisition, or at least have a significant sample of your data. You must have your data structures in place. For example, if you plan to collect 1000 data records, but only have 200, that’s fine. If you are missing one of two datasets you want to use, you will lose points, since you must have the whole structure in place.

If you are uncertain about the scope, please contact your project TA.

Since Jupyter Notebooks are not great for collaborative work, we recommend Google Colab instead. If you choose Google Colab, either download your work and submit it as a Jupyter Notebook (together with all the data), or submit a PDF with a link to your notebook. We must be able to access the notebook.

You can use any library available, but provide a Readme with instructions on which non-standard libraries we need to install to run your notebook.
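One way to collect the install instructions is to print the installed version of each library your notebook uses and paste the result into your Readme. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper name `pinned_requirements` and the package list are placeholders, not a required format. Note that it expects distribution names as used by `pip install` (for example `scikit-learn`, not `sklearn`).

```python
from importlib import metadata


def pinned_requirements(packages):
    """Return 'name==version' lines for installed packages; flag missing ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name}: not installed in this environment")
    return lines


# Hypothetical package list; replace it with the libraries your notebook imports.
print("\n".join(pinned_requirements(["pandas", "altair", "geopandas"])))
```

The printed lines that contain a version can be installed with `pip install` followed by the `name==version` string.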

Final Project Submission

For your final project you must hand in the following items:

  • Exploratory Notebook: The notebook where you document all your experiments and explain why you are or are not pursuing a design for your explanatory notebook.
  • Explanatory Notebook: This notebook should read like a newspaper article describing your dataset, your visualizations, and your insights. It should use interactivity and contain your custom chart.

Again, either submit a PDF with a Colab link, or a Jupyter Notebook with all the data.

Project Presentation

Prepare a five-minute presentation about your notebook, for one of the two presentation slots during the last two classes. All team members must participate in the presentation. You can use slides, present from your notebook, or combine the two, but you should focus on the results and your visualizations, not on code.

Peer Assessment

It is important to provide positive feedback to people who truly worked hard for the good of the team and to also make suggestions to those you perceived not to be working as effectively on team tasks. We ask you to provide an honest assessment of the contributions of the members of your team, including yourself. The feedback you provide should reflect your judgment of each team member’s:

  • Preparation – were they prepared during team meetings?
  • Contribution – did they contribute productively to the team discussion and work?
  • Respect for others’ ideas – did they encourage others to contribute their ideas?
  • Flexibility – were they flexible when disagreements occurred?

Your teammates’ assessments of your contributions and the accuracy of your self-assessment will be considered as part of your overall project score.

Submission Instructions

Submission will be handled through Canvas. Make sure to include all the data so that we can run your notebooks. Alternatively, you can submit a PDF to Canvas that links to a public Google Colab notebook.

Grading Criteria

  • Solution – Is your notebook effective in answering your intended questions? Was it designed following visualization principles?
  • Ambition / Scope – Is your notebook doing interesting analysis of an ambitious dataset and does it have good scope?
  • Implementation – What is the quality of your implementation? Is it appropriately polished, robust, and reliable?
  • Presentation – How are you presenting your project in class?

Your individual project score may also be influenced by your peer evaluations.