2022-12-28
OpenAI’s ChatGPT chatbot is now
available for public testing. It runs on GPT-3.5, an improved version of GPT-3,
which was released in 2020.
ChatGPT’s capabilities focus on
holding conversations, answering questions, and admitting its own mistakes. In a
demonstration, ChatGPT found a bug in a sample of code.
Mira Murati, CTO of OpenAI, said
that ChatGPT differs from earlier models: the program can admit when it
does not know something or when it has answered something incorrectly.
ChatGPT is built with a training method called Reinforcement Learning from Human Feedback (RLHF). In the first step, humans train the AI to hold conversations. Next, humans rank the resulting conversations by quality, and higher-quality conversations receive higher rewards; the model that assigns these rewards is called the reward model. In the final step, researchers fine-tune the AI using the Proximal Policy Optimization (PPO) technique.
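The ranking and PPO steps can be illustrated with a minimal Python sketch. It is not OpenAI's implementation: the article does not state the exact losses used, so the pairwise ranking loss and the clipped PPO objective below are common textbook formulations, and the function names and the `clip_eps` default are assumptions chosen for illustration.

```python
import math


def pairwise_ranking_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Assumed reward-model loss: -log(sigmoid(r_preferred - r_rejected)).

    The loss shrinks as the reward model scores the human-preferred
    response further above the rejected one.
    """
    diff = reward_preferred - reward_rejected
    # Numerically stable form of log(1 + exp(-diff)).
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def ppo_clipped_objective(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate for a single sample.

    ratio     = new_policy_prob / old_policy_prob for the chosen action
    advantage = how much better the action was than expected
    clip_eps  = hypothetical clipping range (0.2 is a common default)
    """
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    # Taking the minimum removes the incentive to move the policy too far.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # A correctly ordered pair gives a lower loss than a misordered one.
    print(pairwise_ranking_loss(2.0, 0.0) < pairwise_ranking_loss(0.0, 2.0))
    # A large policy update is capped at (1 + clip_eps) * advantage.
    print(ppo_clipped_objective(ratio=1.5, advantage=1.0))
```

In this sketch the reward model would be trained to minimize the ranking loss over human-ranked pairs, and the chatbot would then be updated to maximize the clipped objective, with advantages derived from the reward model's scores.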
The entire training process ran on Microsoft Azure supercomputing infrastructure, as Microsoft had previously invested in OpenAI.
OpenAI researchers acknowledge that ChatGPT still has limitations. It may give answers that sound convincing but are not based on facts or logic. OpenAI is still developing a new model, GPT-4, and has not yet announced a launch date for it.