OpenAI launches 5.4 and improves GDPVal to 83%
The new version of ChatGPT brings a new round of sophistication to how it can be used in an organisational context, including an improved ability to create and edit spreadsheets, presentations, and documents.
Highlights:
🏅 Investment Banking Analyst spreadsheet tasks increasing from a GDPVal of 68.4% to 87.3%
🏅 Investment Banking Analyst presentations being chosen 68% more often due to "stronger aesthetics, greater visual variety, and more effective use of image generation".
🏅 A score of 91% on the BigLaw Bench evaluation. Compared with other models, it is better at structuring complex transactional analysis, maintaining accuracy across lengthy contracts, and delivering the high level of detail legal practitioners require.
It also works better with tools and at agentic search.
The safeguards introduced with 5.3 have been strengthened further with an expanded cyber safety stack, including monitoring systems, trusted access controls, and asynchronous blocking of higher-risk requests.
This is another big step in demonstrating how models can take on complex reasoning tasks with ease.
SOURCE
More on 5.4: https://openai.com/index/introducing-gpt-5-4/
More on GDPVal and 5.2: https://www.linkedin.com/feed/update/urn:li:activity:7405567803639898112
BESCI AI OPINION
Why does this matter?
The more competent models become at managing the GDPval tasks, the harder it is for organisations to ignore them.
These tasks are the spelling bee of the Gen AI world.
The greatest value for Gen AI companies lies in the redistribution of wealth in corporations: big subscriptions and token usage, managed as a thick pipe, rather than hobbyist consumer accounts, which need constant recruitment to offset switching.