AMLH 2024 Assignment Instruction

License: CC 4.0 BY-SA

AMLH 2024 Assignment Instruction (Full)

Deadline for submission: Wednesday, 10th July 2024, 17:00 GMT

• Length: The summative assessment covers the outcome of the entire module and culminates in a written paper of 2000-2500 words (excluding figures and references) and the submission of an IPython notebook.

• Python packages: tensorflow, pytorch, keras, scikit-learn, etc.

• Each student must select one dataset from the provided list. The maximum number of students for any topic/dataset is 23, allocated on a first come, first served basis. (Last year, all students were able to select their favourite topics.)

Dataset options for the Assignment:

A. CT 3D Volume Segmentation

B. Glucose prediction for T1D

C. Identify phenotypes from clinical notes

The assignment should follow this overall outline (the requirements of each dataset are provided in a separate section):

1. Introduction

Please describe the background, motivation and importance of the data in light of related literature. Show a sound interpretation of the medical problems presented in the data. Outline the selected dataset (including features and class labels) and provide descriptive statistics of the contained variables. Visualise the data or feature space in a plot if possible and explore the underlying characteristics.

[15 marks] Words: 400
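
As an illustration of the descriptive statistics asked for in the Introduction, the following is a minimal sketch using only the Python standard library. The records, the feature names (`age`, `glucose`) and the class labels are hypothetical toy values, not taken from any of the provided datasets; in a notebook a dataframe library would usually do the same job.

```python
from collections import Counter
from statistics import fmean, median, stdev

# Hypothetical toy records; replace with rows loaded from the chosen dataset.
records = [
    {"age": 54, "glucose": 6.1, "label": "positive"},
    {"age": 61, "glucose": 7.4, "label": "positive"},
    {"age": 47, "glucose": 5.2, "label": "negative"},
    {"age": 39, "glucose": 4.9, "label": "negative"},
    {"age": 58, "glucose": 5.6, "label": "negative"},
]


def summarise(values):
    """Return basic descriptive statistics for one numeric feature."""
    return {
        "n": len(values),
        "mean": fmean(values),
        "median": median(values),
        "std": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Per-feature summary and class balance.
summary = {name: summarise([r[name] for r in records]) for name in ("age", "glucose")}
class_counts = Counter(r["label"] for r in records)

for name, stats in summary.items():
    print(name, stats)
print(dict(class_counts))
```

Reporting the class balance alongside the feature summaries helps justify later choices such as up- or down-sampling.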

2. Methodology

2.1 Preprocessing

Please describe in detail how the chosen dataset was preprocessed, including data cleaning, imputation, normalization, augmentation, up- or down-sampling, feature engineering, etc.

[15 marks] Words: 450
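
Two of the preprocessing steps listed above, imputation and normalization, can be sketched as follows. This is a minimal standard-library example under stated assumptions: missing values are encoded as `None`, mean imputation and z-score scaling are used purely for illustration, and the toy columns (glucose, insulin dose) are hypothetical. The statistics are learned from the training rows only, so no information from the test rows leaks into the fitted parameters.

```python
from statistics import fmean, pstdev


def fit_scaler(train_rows):
    """Learn per-column mean and standard deviation, ignoring missing values (None)."""
    n_cols = len(train_rows[0])
    means, stds = [], []
    for j in range(n_cols):
        observed = [row[j] for row in train_rows if row[j] is not None]
        means.append(fmean(observed))
        # Fall back to 1.0 for constant columns to avoid division by zero.
        stds.append(pstdev(observed) or 1.0)
    return means, stds


def transform(rows, means, stds):
    """Impute missing values with the training mean, then z-score each column."""
    return [
        [((means[j] if v is None else v) - means[j]) / stds[j] for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy data: columns are (glucose, insulin dose).
train = [[5.0, 2.0], [7.0, None], [9.0, 4.0]]
test = [[None, 3.0]]

means, stds = fit_scaler(train)          # fitted on training data only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)  # reuses the training statistics
```

A missing value imputed with the training mean maps to exactly 0 after scaling, which is one reason to report how many values were imputed per feature.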

2.2 Algorithm design and implementation

Select two AI models from the course, or two neural networks with very different structures. Give a high-level description of both algorithms, including their rationale or model structure. Describe and demonstrate for both algorithms:

a) How to generate training and testing data in a format that suits the AI model

b) Optimization of hyper-parameters

c) Model evaluation based on the outcomes, including widely used criteria that were mentioned in the course

Demonstrate your solution with an attached IPython notebook. Ensure reproducibility and transparency.

[35 marks] Words: 650
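
Steps a) and b) above can be sketched as a seeded data split followed by a small grid search on a validation set. This is an illustrative standard-library example only: the 1-D k-nearest-neighbour classifier, the toy data and the grid of `k` values are assumptions chosen to keep the sketch short, not models or settings prescribed by the assignment. Libraries such as scikit-learn provide equivalent utilities for real models.

```python
import random


def split(data, test_fraction=0.25, seed=0):
    """Shuffle with a fixed seed (for reproducibility) and split into (train, test)."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (single numeric feature)."""
    neighbours = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > len(neighbours) else 0


def accuracy(train, eval_set, k):
    """Fraction of eval_set examples classified correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in eval_set)
    return correct / len(eval_set)


# Hypothetical toy data: (feature, binary label) pairs.
data = [(float(i), 0) for i in range(20)] + [(float(i), 1) for i in range(20, 40)]

# a) Hold out a test set first, then carve a validation set out of the remainder.
train_val, test = split(data, seed=0)
train, val = split(train_val, seed=1)

# b) Grid search: pick the hyper-parameter that scores best on the validation set.
grid = [1, 3, 5]
best_k = max(grid, key=lambda k: accuracy(train, val, k))

# c) Report performance once, on the untouched test set.
test_accuracy = accuracy(train, test, best_k)
print(best_k, test_accuracy)
```

Fixing the random seeds and keeping the test set out of the hyper-parameter search are two simple ways to support the reproducibility and transparency requirement.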

3. Results

Present the optimized hyper-parameters and reasonable evaluation criteria such as a confusion matrix, precision, recall, RMSE, MARD, F1, AUC and a ROC plot, etc. Provide an analysis of both algorithms with different parameters and give a textual description of the results. Please choose appropriate metrics carefully.

[25 marks] Words: 600
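
Several of the criteria named above can be computed directly from predictions. The sketch below uses only the standard library and assumes binary labels encoded as 0/1 for the classification metrics; MARD is taken here as the mean absolute relative difference with respect to the reference values, expressed in percent. All numbers are hypothetical toy values.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (0.0 when undefined)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def rmse(reference, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference))


def mard(reference, predicted):
    """Mean absolute relative difference in percent (reference values must be non-zero)."""
    return 100.0 * sum(abs(p - r) / r for r, p in zip(reference, predicted)) / len(reference)


# Hypothetical classification outputs.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred))

# Hypothetical regression outputs (e.g. glucose readings).
reference = [100.0, 150.0, 200.0]
predicted = [110.0, 135.0, 200.0]
print(rmse(reference, predicted), mard(reference, predicted))
```

Classification metrics such as precision, recall and F1 suit the segmentation and phenotyping tasks, while RMSE and MARD apply to continuous targets such as glucose prediction.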

4. Discussion and Conclusion

Compare and discuss your findings (the results of the two algorithms). If possible, compare them with other scientific publications that used the same medical dataset. Discuss current limitations, how you would improve your methodology, and future work.

[10 marks] Words: 400

5. References

6. Appendix

Attach a reproducible IPython (Jupyter) notebook.

Total: 100 marks

Plagiarism or collusion is not allowed. A student may fail the module outright if the assignment or the code notebook fails the plagiarism check.

Markers will look for the following sections in the assignment:

• Sound understanding of the provided dataset and appropriate pre-processing to obtain a dataset that is suitable for the machine learning model

• Appropriate selection and learning/training of two algorithms (one of them can be seen as the baseline) to address the target problems in medical imaging, time series or NLP

• Evaluation of the performance and meaningful discussion

• Other requirements that have been asked in the associated dataset instruction

• The layout, presentation, references of the paper/report

If you have any questions regarding the dataset or coursework, please post them in the forum on the Moodle page, or email Kevin/Ken/Honghan directly.

Provisional marks and feedback will be released in Autumn 2024.

CT 3D Volume Segmentation

Data:

1. Introduction:

In this assignment, we will use a subset of the Pancreas data from the Medical Segmentation Decathlon. Your task will be to segment the 3D volumes and
