CS 839: FOUNDATION MODELS

Posted by n58h2r on 2024-10-03

Original URL: https://www.cnblogs.com/comp9021/p/18445454

HOMEWORK 1

Instructions: Read the two problems below. Type up your results and include your plots in LaTeX. Submit your answers in two weeks (i.e., Oct. 3, 2024, end of day). You will need a machine for this assignment, but a laptop (even without a GPU) should still work. You may also need an OpenAI account to use ChatGPT, but a free account should work.

NanoGPT Experiments.

We will experiment with a few aspects of GPT training. While this normally requires significant resources, we will use a mini-implementation that can be made to run (at the character level) on any laptop. If you have a GPU on your machine (or access to one), even better, but no resources are strictly required.

1. Clone Karpathy’s nanoGPT repo (https://github.com/karpathy/nanoGPT). We will use this repo for all the experiments in this problem. Read and get acquainted with the README.

2. Setup and Reproduction. Run the Shakespeare character-level GPT model. Start by running the data preparation code, then do a basic run with the default settings. Note that the command line differs depending on whether or not you have a GPU. After completing training, produce samples. In your answer, include the first two lines you’ve generated.

3. Hyperparameter Experimentation. Modify the number of layers and heads, but do not take more than 10 minutes per run. What is the lowest loss you can obtain? What settings produce it on your machine?

4. Evaluation Metrics. Implement a specific and a general evaluation metric. You can pick any that you would like, but with the following goals: Your specific metric is meant to capture how close your generated data distribution is to the training distribution. Your general metric need not necessarily do this and should be applicable without comparing against the training dataset. Explain your choices and report your metrics on the settings above.

5. Dataset. Obtain your favorite text dataset. This might be collected writing by a single author (but not Shakespeare!), text in a different language, or whatever you would prefer. Scrape and format this data. Train nanoGPT on your new data. Vary the number of characters in your dataset. Draw a plot of the number of training characters versus your metrics from the previous part. How much data do you need to produce a reasonable score according to your metrics?

6. Fine-tuning. Fine-tune the trained Shakespeare model on the dataset you built above. How much data and training do you need to go from Shakespearean output to something that resembles your dataset?
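
For the evaluation metrics in step 4, one possible pair of metrics is sketched below. This is only an illustration, not a required choice: the function names, the use of character bigrams, and the choice of Jensen-Shannon divergence (specific metric) and distinct-n (general metric) are all assumptions made for the example.

```python
import math
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def js_divergence(generated: str, reference: str, n: int = 2) -> float:
    """Specific metric (assumed choice): Jensen-Shannon divergence, base 2,
    between the character n-gram distributions of two texts.
    The value lies in [0, 1]; lower means the distributions are closer."""
    p_counts = ngram_counts(generated, n)
    q_counts = ngram_counts(reference, n)
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    if p_total == 0 or q_total == 0:
        raise ValueError("both texts must contain at least n characters")
    divergence = 0.0
    for gram in p_counts.keys() | q_counts.keys():
        p = p_counts[gram] / p_total  # Counter returns 0 for unseen n-grams
        q = q_counts[gram] / q_total
        m = 0.5 * (p + q)
        if p > 0:
            divergence += 0.5 * p * math.log2(p / m)
        if q > 0:
            divergence += 0.5 * q * math.log2(q / m)
    return divergence


def distinct_n(text: str, n: int = 2) -> float:
    """General metric (assumed choice): fraction of word n-grams that are
    unique. Values near 0 indicate heavy repetition. No reference text needed."""
    words = text.split()
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not grams:
        return 0.0
    return len(set(grams)) / len(grams)
```

For the plot in step 5, both functions could be evaluated on samples from each model trained on a different number of characters, with the training text passed as `reference` to `js_divergence`.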

Prompting. We will attempt to see how ChatGPT can cope with challenging questions.

1. Zero-shot vs. Few-shot. Find an example of a prompt that ChatGPT cannot answer in a zero-shot manner, but can with a few-shot approach.

2. Ensembling and Majority Vote. Use a zero-shot question and vary the temperature parameter to obtain multiple samples. How many samples are required before majority vote recovers the correct answer?

3. Rot13. In this problem our goal is to use Rot13 encoding and ‘teach’ ChatGPT how to apply it. You can use rot13.com to quickly encode and decode. Also read about it at https://en.wikipedia.org/wiki/ROT13. Our goal is to ask questions like

What is the capital of France?, but encoded with Rot13, i.e.,

Jung vf gur pncvgny bs Senapr?

– What do you obtain if you ask a question like this zero-shot? Note: you may need to decode back.

– What do you obtain with a few-shot variant?

– Provide the model with additional instructions. What can you obtain?

– Find a strategy to ultimately produce the correct answer to an encoded geographic (or other) question like this one.
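
As a local alternative to rot13.com, the encoding and decoding in the Rot13 problem, together with the majority vote from problem 2, can be sketched in Python. This is a minimal sketch under stated assumptions: the helper names are made up for this example, the answer normalization is one arbitrary choice, and the replies in `sampled_replies` are hand-written placeholders, not actual ChatGPT output.

```python
import codecs
from collections import Counter
from typing import List, Tuple


def rot13(text: str) -> str:
    """Apply Rot13. The cipher is its own inverse, so the same call
    both encodes and decodes. Non-letters are left unchanged."""
    return codecs.encode(text, "rot_13")


def majority_vote(answers: List[str]) -> Tuple[str, float]:
    """Return the most common answer (after light normalization) and its
    share of the votes. Ties go to the answer that appeared first."""
    if not answers:
        raise ValueError("need at least one answer")
    normalized = [a.strip().lower().rstrip(".") for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner, count / len(normalized)


question = "What is the capital of France?"
encoded = rot13(question)  # the string to paste into the prompt

# Hypothetical replies, written by hand for illustration only.
sampled_replies = ["Cnevf", "Cnevf.", "Ylba", "Cnevf", "Znefrvyyr"]
decoded = [rot13(reply) for reply in sampled_replies]
answer, share = majority_vote(decoded)

print(encoded)
print(answer, share)
```

Decoding each reply before voting means that samples differing only in case or trailing punctuation are counted as the same answer.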
