Looking at the Evolution of Large Models from ChatGPT (Simplified Version)

Mondo Technology | Updated on 2024-01-26

What I am sharing today is [Looking at the Evolution of Large Models from ChatGPT - Simplified Version]. Report producer: Pengcheng Laboratory.


ChatGPT's development process, as described here, is a conclusion reached in reverse: the sources of its abilities are inferred from the observed behavior of each released model.

Language generation ability, basic world knowledge, and in-context learning all come from pre-training (davinci); the capacity to store large amounts of knowledge comes from the 175 billion parameters.
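To make the in-context learning point concrete, here is a minimal sketch of few-shot prompting: the pre-trained base model is never updated, and the task is specified purely through examples placed in the prompt. The example texts and labels are illustrative assumptions, not material from the report.

```python
# Minimal sketch of in-context (few-shot) learning: the task is conveyed only
# through labeled examples in the prompt; no weights are updated. The completion
# step is not shown; any client wrapping a pre-trained base model (e.g. the
# original davinci) could be asked to continue the prompt.

few_shot_examples = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I want those two hours of my life back.", "negative"),
]

def build_few_shot_prompt(query: str) -> str:
    """Concatenate labeled examples and the new query into a single prompt."""
    lines = []
    for text, label in few_shot_examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

if __name__ == "__main__":
    prompt = build_few_shot_prompt("Solid acting, but the plot drags badly.")
    print(prompt)  # the base model would simply be asked to continue this text
```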

The ability to follow instructions and generalize to new tasks comes from scaling up the number of instructions used in instruction tuning (davinci-instruct-beta). The ability to perform complex reasoning likely comes from training on code (code-davinci-002).
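As an illustration of what an instruction-tuning example might look like, the sketch below packs an instruction, an input, and a target response into one training sequence and masks the loss so it applies only to the response tokens. The field names and the stand-in tokenizer are assumptions for illustration, not the report's or OpenAI's actual pipeline.

```python
# Sketch of packing one instruction-tuning record for supervised fine-tuning.
# Only the response tokens contribute to the cross-entropy loss; the prompt
# tokens are masked out with a conventional "ignore" label id.

from dataclasses import dataclass

@dataclass
class InstructionExample:
    instruction: str
    input: str
    response: str

IGNORE_INDEX = -100  # conventional "no loss" label id in many training frameworks

def pack_example(ex: InstructionExample, tokenize) -> dict:
    prompt = f"Instruction: {ex.instruction}\nInput: {ex.input}\nResponse: "
    prompt_ids = tokenize(prompt)
    response_ids = tokenize(ex.response)
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}

if __name__ == "__main__":
    toy_tokenize = lambda s: [ord(c) for c in s]  # stand-in tokenizer for the sketch
    ex = InstructionExample("Translate to French.", "Good morning", "Bonjour")
    packed = pack_example(ex, toy_tokenize)
    print(len(packed["input_ids"]), len(packed["labels"]))
```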

The ability to generate neutral, objective, safe, and informative answers comes from alignment with humans. Specifically:

With the supervised learning version, the resulting model is text-davinci-002.

With the reinforcement learning version (RLHF), the resulting model is text-davinci-003.

Whether alignment is done with supervised learning or with RLHF, the model cannot exceed code-davinci-002 on many tasks; this performance degradation caused by alignment is called the alignment tax.
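The sketch below contrasts the two alignment routes at a schematic level: supervised fine-tuning imitates human-written answers directly, while RLHF samples answers and updates the policy against a learned reward model. PolicyModel, RewardModel, and ppo_step are hypothetical stand-ins, not OpenAI's actual training code; a real RLHF update would also include a KL penalty keeping the policy close to the pre-RLHF model.

```python
# Schematic contrast of the two alignment routes. All classes and update steps
# are stand-ins chosen for illustration.

class PolicyModel:
    def generate(self, prompt: str) -> str:
        return "draft answer to: " + prompt          # stand-in for sampling
    def supervised_step(self, prompt: str, target: str) -> None:
        pass                                         # plain next-token loss on the target
    def ppo_step(self, prompt: str, answer: str, reward: float) -> None:
        pass                                         # reinforcement-learning policy update

class RewardModel:
    def score(self, prompt: str, answer: str) -> float:
        return 0.0                                   # trained from human preference rankings

def supervised_finetune(model: PolicyModel, demonstrations):
    """Route 1 (supervised learning, text-davinci-002 style): imitate human answers."""
    for prompt, human_answer in demonstrations:
        model.supervised_step(prompt, human_answer)
    return model

def rlhf(model: PolicyModel, reward_model: RewardModel, prompts):
    """Route 2 (RLHF, text-davinci-003 style): optimize answers against a learned reward."""
    for prompt in prompts:
        answer = model.generate(prompt)
        reward = reward_model.score(prompt, answer)
        model.ppo_step(prompt, answer, reward)
    return model
```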

ChatGPT's conversational ability also comes from RLHF; specifically, it sacrifices some in-context learning ability in exchange for the following (see the sketch after this list):

Modeling the conversation history.

Increasing the amount of information in its answers within the dialogue.

Declining questions that fall outside the scope of the model's knowledge.
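A minimal sketch of what "modeling the conversation history" can mean in practice: each new user turn is answered with the whole dialogue so far packed into the prompt, so the relevant context comes from the running conversation rather than from few-shot task examples. The role labels and formatting are assumptions for illustration.

```python
# Sketch of building a dialogue prompt from the running conversation history.
# Role labels ("User"/"Assistant") and the layout are illustrative assumptions.

from typing import List, Tuple

def build_dialogue_prompt(history: List[Tuple[str, str]], user_message: str) -> str:
    """Pack previous turns plus the new user message into one prompt."""
    turns = [f"{role}: {text}" for role, text in history]
    turns.append(f"User: {user_message}")
    turns.append("Assistant:")
    return "\n".join(turns)

if __name__ == "__main__":
    history = [
        ("User", "What is RLHF?"),
        ("Assistant", "Reinforcement learning from human feedback."),
    ]
    print(build_dialogue_prompt(history, "How does it differ from supervised fine-tuning?"))
```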

(This article is for informational purposes only and does not constitute investment advice. For full details, please refer to the original report.)

