Now that the benefits of the various AI platforms are widely known, the companies behind them are working to optimize how they are used. AI is a resource-intensive technology, and reducing that cost is an ongoing effort.
One of the leaders in this sector is OpenAI, the company behind the artificial intelligence most of you know well: ChatGPT. Many of these platforms already let us automatically generate text, as well as images, videos, and programming code.
But in many cases, and this directly affects ChatGPT, latency is a major problem. It is especially noticeable in scenarios such as code suggestions and editing long documents, where delays can significantly degrade the overall user experience. Hence, some of these large companies have set to work to solve it.
You have probably experienced, on more than one occasion, the frustration of waiting longer than expected for an AI to produce the results you want; most users would prefer to get that content instantly. Keep in mind that current large language model APIs regenerate the entire requested output from scratch, even when most of it is already known, which causes considerable latency for users.
Hence, OpenAI is currently trying to solve this problem with a new feature for developers.
This is Predicted Outputs, the function to accelerate AI
Specifically, we are referring to the function called Predicted Outputs, which the technology giant is introducing for the models behind its popular ChatGPT. It is a feature designed for cases where most of the output is known in advance.
This covers tasks as common in artificial intelligence as editing documents or iterating on source code, all of which can be significantly sped up with this function. Predicted Outputs skips over content that is already known instead of regenerating it, making each iteration much faster and returning results in much less time.
In practice, developers can significantly reduce latency by passing the existing content to the model as a prediction, allowing the full output to be regenerated much more quickly with this new tool.
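To give a rough idea of how this looks for developers, here is a minimal sketch based on OpenAI's documented Chat Completions API, where the current version of a file is supplied through the prediction parameter; the file contents and the rename instruction are placeholder examples:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder: the file we already have; most of it will be unchanged.
existing_code = """class User:
    first_name: str
    last_name: str
    username: str
"""

completion = client.chat.completions.create(
    model="gpt-4o",  # Predicted Outputs only supports GPT-4o and GPT-4o-mini
    messages=[
        {
            "role": "user",
            "content": "Rename the 'username' attribute to 'email'. "
                       "Respond only with the full updated code.",
        },
        {"role": "user", "content": existing_code},
    ],
    # The expected output: since most of the file stays the same, the model
    # can skip ahead through the parts that match the prediction.
    prediction={"type": "content", "content": existing_code},
)

print(completion.choices[0].message.content)
```

The more of the predicted content the final answer actually reuses, the larger the latency savings, since only the changed portions need to be generated token by token.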
It is worth mentioning that OpenAI tested this feature with external partners, with very positive results in terms of performance. To give you an idea, in benchmark tests by Microsoft's GitHub team, Copilot Workspace workloads ran up to 5.8 times faster.
That kind of speedup translates into a noticeably better experience for users. Of course, Predicted Outputs comes with some limitations for developers; for example, it only supports the GPT-4o and GPT-4o-mini series of language models.
Even so, the potential benefits of this new feature in making large language models faster and more efficient are substantial.