Merge pull request #629 from panpan0000/main

add rm dataset explanation
hoshi-hiyouga 2023-08-22 13:41:44 +08:00 committed by GitHub
commit 4da719c830
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 25 additions and 0 deletions

@@ -16,3 +16,15 @@ If you are using a custom dataset, please provide your dataset definition in the
```
where the `prompt` and `response` columns should contain non-empty values. The `query` column will be concatenated with the `prompt` column and used as input for the model. The `history` column should contain a list where each element is a string tuple representing a query-response pair.
For a reward modeling (rm) dataset, the first n entries of the `output` column are the chosen examples and the last n entries are the rejected examples, for example:
```json
{
  "instruction": "Question?",
  "input": "",
  "output": [
    "chosen answer",
    "rejected answer"
  ]
}
```
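As a minimal sketch of the convention described above, the following Python snippet splits the `output` list of one record into its chosen and rejected halves. The helper name `split_rm_outputs` is hypothetical and not part of this project's code; it also assumes that `output` always has an even, non-zero number of entries.

```python
from typing import Any, Dict, List, Tuple


def split_rm_outputs(example: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Split a reward-modeling record into (chosen, rejected) answers.

    Hypothetical helper: assumes the first half of `output` holds the
    chosen answers and the second half holds the rejected answers.
    """
    outputs = example["output"]
    if len(outputs) == 0 or len(outputs) % 2 != 0:
        raise ValueError("`output` must contain an even, non-zero number of answers")
    n = len(outputs) // 2
    return outputs[:n], outputs[n:]


example = {
    "instruction": "Question?",
    "input": "",
    "output": ["chosen answer", "rejected answer"],
}

chosen, rejected = split_rm_outputs(example)
print(chosen, rejected)
```

With the record shown above, n is 1, so each half contains a single answer.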

@@ -16,3 +16,16 @@
```
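where the `prompt` and `response` columns should be non-empty strings. The content of the `query` column will be concatenated with the `prompt` column and used as the model input. The `history` column should be a list in which each element is a pair of strings representing a user query and the model's response, respectively.
For a reward model (rm) dataset, the first N outputs are the `chosen` data and the last N outputs are the `rejected` data, for example: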
```json
{
  "instruction": "Question?",
  "input": "",
  "output": [
    "chosen answer",
    "rejected answer"
  ]
}
```