OpenAI is concentrating on cleaning up its dataset and removing examples where the model has shown a preference for falsehood. It is also refining a technique known as reinforcement learning from human feedback (RLHF), in which human ratings of the model's responses are used to fine-tune its behavior. OpenAI has also been flagging prompts that produced unwanted content, so that the model can be retrained to avoid repeating those generations.
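At the heart of RLHF is a reward model trained on human preference judgments. As a rough illustration only, here is a minimal Python sketch of that reward-modeling step, using made-up feature vectors in place of real language-model outputs; the pairwise logistic preference loss is the standard formulation, while every name and number here is illustrative rather than drawn from OpenAI's actual pipeline:

```python
# Minimal sketch of RLHF reward modeling on toy data.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical data: each pair holds stand-in embeddings of a response a
# human preferred ("chosen") and one they rejected, for the same prompt.
DIM = 16
chosen = torch.randn(64, DIM) + 0.5
rejected = torch.randn(64, DIM) - 0.5

# The reward model maps a response representation to a scalar score.
reward_model = nn.Sequential(nn.Linear(DIM, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

for step in range(200):
    r_chosen = reward_model(chosen)
    r_rejected = reward_model(rejected)
    # Push chosen responses to outscore rejected ones:
    # loss = -log sigmoid(r_chosen - r_rejected)
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the full pipeline, the trained reward model's scores then guide the language model itself, for instance by reranking candidate responses or serving as the reward signal for a policy-gradient fine-tuning step.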
The company has stated that it will begin gathering more public feedback to shape its models, including through surveys and citizens' assemblies convened to discuss which content should be prohibited outright. With its “consensus project,” OpenAI researchers are also measuring how much people agree or disagree about the content the AI model generates on various topics. Finally, OpenAI believes it may be possible to train AI models to represent a range of perspectives and worldviews.
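Quantifying that kind of agreement is a standard annotation problem. The metrics the consensus project actually uses have not been published; as one plausible example, the short sketch below computes Fleiss' kappa, a common measure of chance-corrected agreement among multiple raters, over invented data:

```python
# Inter-rater agreement via Fleiss' kappa on hypothetical judgments.
import numpy as np

# Rows: model outputs being judged; columns: verdict categories
# ("acceptable", "borderline", "prohibited"). Each cell counts how many
# of the 5 hypothetical raters chose that verdict for that output.
counts = np.array([
    [5, 0, 0],
    [3, 2, 0],
    [0, 1, 4],
    [2, 2, 1],
    [4, 1, 0],
])

n_items, n_raters = counts.shape[0], counts[0].sum()

# Per-item agreement: fraction of rater pairs that agree on that item.
p_i = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
p_bar = p_i.mean()

# Expected agreement by chance, from overall category proportions.
p_j = counts.sum(axis=0) / (n_items * n_raters)
p_e = np.sum(p_j**2)

kappa = (p_bar - p_e) / (1 - p_e)
print(f"Fleiss' kappa: {kappa:.3f}")  # ~0 = chance level, 1 = perfect agreement
```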