IN-STK5000 - Autumn 2018

Grading scheme

There are 2 projects (formally take-home exams), split into 3 parts each. Each part takes 2-4 hours and is partly done in a tutorial session.

Each question is weighted equally in each home exam, so that by correctly answering the elementary parts of each question, students can be guaranteed a passing grade. Each exam counts for 40% of the score. A 15-min seminar is also given by the students. This counts for 20% of the final score.

Criteria for full marks in each part of the exam are the following.

Documenting the work in a way that enables reproduction.
Technical correctness of their analysis.
Demonstrating that they have understood the assumptions underlying their analysis.
Addressing issues of reproducibility in research.
Addressing ethical questions where applicable.
Consulting additional resources beyond the source material with proper citations.

The following marking guidelines describe what one would expect from students attaining each grade.

A

Submission of a detailed report from which one can definitely reconstruct their work without referring to their code. There should be no ambiguities in the described methodology. Well-documented code where design decisions are explained.
Extensive analysis and discussion. Technical correctness of their analysis. Nearly error-free implementation.
The report should detail what models are used and what the assumptions are behind them. The conclusions of the report should include appropriate caveats. When the problem includes simple decision making, the optimality metric should be well-defined and justified. Similarly, well-defined optimality criteria should be given for the experiment design, when necessary. The design should be (to some degree of approximation, depending on problem complexity) optimal according to these criteria.
Appropriate methods to measure reproducibility. Use of cross-validation or hold-out sets to measure performance. Use of an unbiased methodology for algorithm, model or parameter selection. Appropriate reporting of a confidence level (e.g. using bootstrapping) in their analytical results. Relevant assumptions are mentioned when required.
When dealing with data relating to humans, privacy and/or fairness should be addressed. A formal definition of privacy and/or fairness should be selected, and the resulting policy should be examined.
The report contains some independent thinking, or includes additional resources beyond the source material with proper citations. The students go out of their way to research material and implement methods not discussed in the course.
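
As an illustration of the grade A criterion on reporting a confidence level via bootstrapping, the following is a minimal sketch of a percentile bootstrap interval for hold-out accuracy. The function name `bootstrap_ci`, the number of resamples and the example data (80 correct predictions out of 100) are assumptions chosen for illustration, not part of the course material.

```python
import random

def bootstrap_ci(correct, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean accuracy.

    `correct` is a list of 0/1 flags, one per hold-out example.
    """
    rng = random.Random(seed)
    n = len(correct)
    means = []
    for _ in range(n_resamples):
        # Resample the hold-out examples with replacement.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    alpha = (1.0 - level) / 2.0
    lo = means[int(alpha * n_resamples)]
    hi = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return lo, hi

# Hypothetical hold-out results: 80 correct predictions out of 100.
correct = [1] * 80 + [0] * 20
low, high = bootstrap_ci(correct)
print(f"accuracy = {sum(correct) / len(correct):.2f}, "
      f"{95}% interval = [{low:.2f}, {high:.2f}]")
```

The percentile method is only one of several bootstrap intervals; a report would typically state which one was used and why.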

B

Submission of a report from which one can plausibly reconstruct their work without referring to their code. There should be no major ambiguities in the described methodology.
Technical correctness of their analysis, with a good discussion. Possibly minor errors in the implementation.
The report should detail what models are used, as well as the optimality criteria, including for the experiment design. The conclusions of the report must contain appropriate caveats.
Use of cross-validation or hold-out sets to measure performance. Use of an unbiased methodology for algorithm, model or parameter selection.
When dealing with data relating to humans, privacy and/or fairness should be addressed. While an analysis of this issue may not be performed, there is a substantial discussion of the issue that clearly shows understanding by the student.
The report contains some independent thinking, or the students mention other methods beyond the source material, with proper citations, but do not further investigate them.
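
One common way to meet the criterion of an unbiased methodology for model or parameter selection is to keep selection and final evaluation on separate data. The sketch below assumes a toy 1-D k-nearest-neighbour classifier and synthetic data with 10% label noise; the helper names, split sizes and candidate values of k are all hypothetical.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D features)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > k else 0

def accuracy(train, data, k):
    """Fraction of points in `data` classified correctly using `train`."""
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)

rng = random.Random(0)

# Synthetic data (an assumption for illustration):
# the label is 1 when x > 0, flipped with probability 0.1.
points = []
for _ in range(300):
    x = rng.uniform(-1, 1)
    y = int(x > 0)
    if rng.random() < 0.1:
        y = 1 - y
    points.append((x, y))

# Three disjoint subsets: fit, select, report.
train, valid, test = points[:150], points[150:225], points[225:]

# Choose k using the validation set only.
candidates = [1, 3, 5, 9, 15]
best_k = max(candidates, key=lambda k: accuracy(train, valid, k))

# The test set is used exactly once, after selection is finished.
test_acc = accuracy(train, test, best_k)
print(f"selected k = {best_k}, test accuracy = {test_acc:.2f}")
```

Nested cross-validation follows the same principle when the data set is too small for a fixed three-way split.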

C

Submission of a report from which one can partially reconstruct most of their work without referring to their code. There might be some ambiguities in parts of the described methodology.
Technical correctness of their analysis, with an adequate discussion. Some errors in a part of the implementation.
The report should detail what models are used, as well as the optimality criteria and the choice of experiment design. Analysis caveats are not included.
Either use of cross-validation or hold-out sets to measure performance, or use of an unbiased methodology for algorithm, model or parameter selection - but in a possibly inconsistent manner.
When dealing with data relating to humans, privacy and/or fairness are addressed superficially.
There is little mention of methods beyond the source material or independent thinking.

D

Submission of a report from which one can partially reconstruct most of their work without referring to their code. There might be serious ambiguities in parts of the described methodology.
Technical correctness of their analysis with limited discussion. Possibly major errors in a part of the implementation.
The report should detail what models are used, as well as the optimality criteria. Analysis caveats are not included.
Either use of cross-validation or hold-out sets to measure performance, or use of an unbiased methodology for algorithm, model or parameter selection - but in a possibly inconsistent manner.
When dealing with data relating to humans, privacy and/or fairness are addressed superficially or not at all.
There is little mention of methods beyond the source material or independent thinking.

E

Submission of a report from which one can obtain a high-level idea of their work without referring to their code. There might be serious ambiguities in all of the described methodology.
Technical correctness of their analysis with very little discussion. Possibly major errors in only a part of the implementation.
The report might mention what models are used or the optimality criteria, but not in sufficient detail and caveats are not mentioned.
Use of cross-validation or hold-out sets to simultaneously measure performance and optimise hyperparameters, but possibly in a way that introduces some bias.
When dealing with data relating to humans, privacy and/or fairness are addressed superficially or not at all.
There is no mention of methods beyond the source material or independent thinking.
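
The bias mentioned above, from measuring performance on the same data used for selection, can be seen in a small simulation. The sketch below is an illustrative assumption: the labels are coin flips and every "model" is a list of random guesses, so any score above 50% on the selection set comes from selection alone.

```python
import random

rng = random.Random(1)
n_points, n_models = 100, 50

# Labels are pure coin flips, so no model can truly beat 50% accuracy.
selection_labels = [rng.randint(0, 1) for _ in range(n_points)]
fresh_labels = [rng.randint(0, 1) for _ in range(n_points)]

def score(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Each hypothetical "model" is a pair of random guesses:
# one list for the selection set, one for the fresh set.
models = [
    ([rng.randint(0, 1) for _ in range(n_points)],
     [rng.randint(0, 1) for _ in range(n_points)])
    for _ in range(n_models)
]

# Pick the model that looks best on the selection set ...
best = max(models, key=lambda m: score(m[0], selection_labels))

# ... and compare its score there with its score on untouched data.
selected_score = score(best[0], selection_labels)
fresh_score = score(best[1], fresh_labels)
print(f"score on selection set: {selected_score:.2f}")
print(f"score on fresh data:    {fresh_score:.2f}")
```

The score on the selection set is the maximum over many noisy estimates and so tends to exceed the true 50%, while the score on fresh data does not carry that optimism.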

F

The report does not adequately explain their work.
There is very little discussion and major parts of the analysis are technically incorrect, or there are errors in the implementation.
The models used might be mentioned, but not any other details.
There is no effort to ensure reproducibility or robustness.
When applicable: Privacy and fairness are not mentioned.
There is no mention of methods beyond the source material or independent thinking.

Published Nov. 20, 2018 11:47 AM - Last modified Nov. 20, 2018 11:47 AM