EE5434 learning model

Posted by 洛雨听花 on 2024-12-05

EE5434 final project

The data were made available on Nov. 5 (see the Kaggle website).

Report and source code due: 11:59 PM, Dec. 6.

Full mark: 100 pts.

During the process, you can keep trying new machine learning models to boost the learning accuracy.

You are encouraged to form groups of two with your classmates so that the team can combine its members' expertise. If you prefer to do this project by yourself, you can get 5 bonus points.

Submission format: The report should be in PDF format. The source code should be in a notebook file (.ipynb), and you should also save it as an HTML file (.html). Thus, there are three files you need to upload to Canvas. Remember that you must not copy anyone else's code; doing so can lead to failure of this course.

Files and naming rules: If your team has two members, start the file name with G2; otherwise, start it with G1. For example, if you have a teammate and the team members are Jackie Lee and Xuantian Chan, name the files G2-Lee-Chan.xxx. 5 pts will be deducted if the naming rule is not followed. In your report, please clearly list the group members.

How do we grade your report? We will consider the following factors.

  1. You will receive 30% (the basic grade) if you correctly apply two learning models to our classification problem. The accuracy should be much better than random guessing. Your report should be written in generally correct English, be easy to follow, and include a clear explanation of your implementation details and a basic analysis of the results.
  2. Factors in grading:
     a. Applied/implemented and compared at least 2 different models. You show good sense in choosing appropriate models (such as some NLP-related models).
     b. For each model, a clear explanation of the feature encoding methods, model structure, etc. Carefully tuned multiple sets of parameters or feature engineering methods. Provided evidence of multiple methods to boost the performance. Considered performance metrics beyond accuracy (such as confusion matrix, recall, ROC, etc.). Carefully compared the performance of different methods/models/parameter sets. Presented the results using the most insightful means, such as tables and figures.
     c. Well-written reports that are easy to follow/read.
     d. Final ranking on Kaggle.

For each factor, the possible ratings are unsatisfactory (1), acceptable (2), satisfactory (3), good (4), and excellent (5). The sum over the factors determines the grade. For example, student A got 4 good and 1 acceptable for a to e. Then, A's total score is 4*4+2=16. The full mark for a to e is 25. So, A's percentage is 64%.

Note that if the final performance is very close (e.g. 0.65 vs 0.66), the corresponding submissions belong to the same group in the ranking.

Factors that can increase your grade:
  1. You used a new learning model or feature engineering method that was not taught in class. This requires some reading and a clear explanation of why you think this model fits this problem.
  2. Your model's performance is much better than others' because of a new or optimized method.
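The grading factors above mention metrics beyond accuracy, such as the confusion matrix and recall. As a minimal sketch of what those quantities are, the following computes them in plain Python. The labels and predictions are hypothetical toy values, not project data, and the helper names `confusion_matrix` and `per_class_report` are made up for this illustration; in practice a library such as scikit-learn offers equivalent functions.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a nested dict m[true_label][predicted_label] = count."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def per_class_report(y_true, y_pred, labels):
    """Precision, recall and F1 for each class, derived from the confusion matrix."""
    m = confusion_matrix(y_true, y_pred, labels)
    report = {}
    for c in labels:
        tp = m[c][c]
        fn = sum(m[c][p] for p in labels) - tp  # true class c, predicted as something else
        fp = sum(m[t][c] for t in labels) - tp  # predicted c, true class is something else
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical labels and predictions, for illustration only.
y_true = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "pos", "neg", "neg", "pos", "neg", "pos"]
labels = ["neg", "pos"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
matrix = confusion_matrix(y_true, y_pred, labels)
report = per_class_report(y_true, y_pred, labels)

print("accuracy:", accuracy)
for true_label in labels:
    print(true_label, matrix[true_label])
for label in labels:
    print(label, report[label])
```

Reporting per-class recall next to accuracy shows whether a model is failing on one class in particular, which a single accuracy number can hide when classes are imbalanced.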

The format of the report

  1. There is no page limit for the report. If you don't have much to report, keep it simple. Also, minimize language issues by proofreading.
  2. To make our grading more standard, please use the following sections:
     a. Abstract. Summarize the report (what you did, what methods you used, and the conclusions) in fewer than 300 words.
     b. Data properties (exploratory data analysis). You should describe your understanding/analysis of the data properties.
     c. Methods/models. In this section, you should describe your implemented models. Provide key parameters. For example, what are the features? If you use kNN, what is k and how did you compute the distance? If you use an ANN, what is the architecture, etc.? You should separate the high-level description of the models from the tuning of hyper-parameters.
     d. Experimental results. In this section, compare and summarize the results using appropriate tables/figures. Simply copying screenshots is acceptable but will certainly lead to a low mark. Instead, you should *summarize* your results. You can also compare the performance of your model under different hyperparameters.
     e. Conclusion and discussion. Discuss why your models perform well or poorly.
     f. Future work. Discuss what you could do if more time were given.
  3. For each model you tried, provide the code of the model with the best performance. In your report, you can detail the performance of this model with different parameters.
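The Methods/models section above asks, for kNN, what k is and how the distance is computed. The sketch below shows where those two choices appear in a plain-Python kNN classifier and how k could be compared on a validation set. It assumes Euclidean distance and majority voting; the 2-D points, labels, and function names are hypothetical and only illustrate the idea.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote, the label seen first among the nearest neighbours wins.
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors and labels, for illustration only.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]
val_X = [(1, 1), (4, 4)]
val_y = ["a", "b"]

# Compare several values of k on the validation set.
results = {}
for k in (1, 3, 5):
    correct = sum(
        knn_predict(train_X, train_y, x, k=k) == y for x, y in zip(val_X, val_y)
    )
    results[k] = correct / len(val_y)

for k, acc in results.items():
    print(f"k={k}: validation accuracy={acc:.2f}")
```

A per-k table like the one printed here is the kind of summary the Experimental results section asks for, as opposed to raw copied output.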

The code

The code should include:

  1. Preprocessing of the data
  2. Construction of the model
  3. Training
  4. Validation
  5. Testing
  6. Any other code that is necessary
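As a rough illustration of how the steps listed above might be laid out in a notebook, here is a minimal end-to-end sketch in plain Python. It assumes a text classification task and uses a small multinomial Naive Bayes model with add-alpha smoothing; the handout does not prescribe this model, and the sentences, labels, and names (`preprocess`, `NaiveBayes`, `accuracy`) are hypothetical stand-ins for the real Kaggle data and your own code.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Preprocessing: lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Model construction: multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        """Training: count tokens per class and how often each class occurs."""
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        """Return the class with the highest log-posterior score."""
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.token_counts[label].values())
            score = math.log(count / n_docs)
            for t in tokens:
                if t in self.vocab:  # tokens never seen in training are ignored
                    num = self.token_counts[label][t] + self.alpha
                    score += math.log(num / (total + self.alpha * len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, docs, labels):
    return sum(model.predict(d) == y for d, y in zip(docs, labels)) / len(labels)


# Hypothetical toy data, for illustration only.
train = [("great movie loved it", "pos"), ("wonderful acting great plot", "pos"),
         ("terrible movie hated it", "neg"), ("boring plot awful acting", "neg")]
val = [("loved the great acting", "pos"), ("awful boring movie", "neg")]
test = [("wonderful plot", "pos"), ("hated it terrible", "neg")]


def split(pairs):
    """Apply preprocessing and separate documents from labels."""
    return [preprocess(text) for text, _ in pairs], [label for _, label in pairs]


train_X, train_y = split(train)
val_X, val_y = split(val)
test_X, test_y = split(test)

# Validation: choose the smoothing strength using held-out data only.
val_scores = {a: accuracy(NaiveBayes(a).fit(train_X, train_y), val_X, val_y)
              for a in (0.1, 1.0)}
best_alpha = max(val_scores, key=val_scores.get)

# Testing: evaluate the chosen model once on data not used for tuning.
test_acc = accuracy(NaiveBayes(best_alpha).fit(train_X, train_y), test_X, test_y)
print("validation scores:", val_scores)
print("best alpha:", best_alpha, "test accuracy:", test_acc)
```

Keeping hyperparameter selection on the validation split and touching the test split only once keeps the reported test score from being inflated by tuning.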
