Non Cult Crypto News

OpenAI’s new updates enhance voice and vision capabilities of AI

OpenAI has released a series of updates aimed at improving its AI models with advanced voice and vision features for real-time conversations and better image recognition.

Artificial intelligence developer OpenAI entered October with several updates to its models, aimed at helping them hold better conversations and improve image recognition.

On Oct. 1, OpenAI unveiled four updates that introduce new tools designed to make it easier for developers to build on its AI models.

It speaks!

One major update is the Realtime API, which allows developers to create AI-generated voice applications using a single prompt.  

The tool, available for testing, supports low-latency, multimodal experiences by streaming audio inputs and outputs, enabling natural conversations similar to ChatGPT’s Advanced Voice Mode. 

Previously, developers had to “stitch together” multiple models to create these experiences. Audio input would typically need to be fully uploaded and processed before receiving a response, which meant higher latency for real-time applications like speech-to-speech conversations. 
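The latency difference can be illustrated with a toy calculation. The sketch below is not OpenAI's implementation; the function names, chunk counts and per-chunk timings are all hypothetical, and it assumes that in the streaming case the server keeps pace with the speaker so only the final audio chunk is still pending when the speaker stops.

```python
def batch_delay_ms(n_chunks: int, upload_ms: int, process_ms: int) -> int:
    """Delay after the speaker stops when the whole clip is uploaded, then processed."""
    # Every chunk is still waiting to be sent and processed once speech ends.
    return n_chunks * upload_ms + n_chunks * process_ms


def streaming_delay_ms(upload_ms: int, process_ms: int) -> int:
    """Delay after the speaker stops when chunks are sent and processed as they arrive."""
    # Assumption: earlier chunks were handled while the user was still talking,
    # so only the last chunk remains to be uploaded and processed.
    return upload_ms + process_ms


# Hypothetical numbers: 50 audio chunks, 20 ms to upload and 30 ms to process each.
batch = batch_delay_ms(n_chunks=50, upload_ms=20, process_ms=30)
streaming = streaming_delay_ms(upload_ms=20, process_ms=30)
print(f"upload-then-process: {batch} ms, streaming: {streaming} ms")
```

Under these made-up figures the wait after the speaker finishes falls from 2,500 ms to 50 ms, which is the kind of gap that separates a stilted exchange from a natural-sounding one.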

With the Realtime API’s streaming capability, developers can now enable immediate, natural interactions, much like voice assistants. The API runs on GPT-4o, released in May 2024, which can reason across audio, vision and text in real time.

AI can see clearly now

Another update is a fine-tuning tool for developers that lets them improve AI responses generated from image and text inputs.

Fine-tuning with images gives the model a better capacity to understand images, which in turn enhances visual search and object detection, according to the developer. The process includes feedback from humans, who provide examples of good and bad responses.

In addition to its voice and vision updates, OpenAI also rolled out “model distillation,” which allows smaller models to learn from larger ones, and “prompt caching,” which reduces development costs and time by reusing already processed text.
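The idea behind reusing already processed text can be sketched in a few lines. This is a simplified illustration, not OpenAI's caching mechanism: the `PrefixCache` class is hypothetical, and word counts stand in for the tokens a real model would process.

```python
class PrefixCache:
    """Toy model of prompt caching: a repeated prompt prefix is processed only once."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.processed = 0  # words that had to be processed from scratch
        self.reused = 0     # words served from the cache

    def run(self, prefix: str, suffix: str) -> None:
        prefix_words = len(prefix.split())
        if prefix in self._seen:
            # The shared prefix was handled on an earlier request, so reuse it.
            self.reused += prefix_words
        else:
            self._seen.add(prefix)
            self.processed += prefix_words
        # The part of the prompt that changes is always processed.
        self.processed += len(suffix.split())


cache = PrefixCache()
system_prompt = "You are a helpful support assistant."  # 6 words, shared by both requests
cache.run(system_prompt, "Where is my order?")          # 4 new words
cache.run(system_prompt, "How do I reset my password?")  # 6 new words
print(cache.processed, cache.reused)  # prints "16 6"
```

The second request skips the six-word shared prefix, and in an application that sends the same long instructions with every call, that skipped work is where the cost and time savings would come from.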

The advanced capabilities of its models are a key selling point, as a major chunk of revenue for OpenAI comes from businesses building their own applications on top of OpenAI’s technology. 

According to Reuters, OpenAI projects its revenue to rise to $11.6 billion in 2025, up from an estimated $3.7 billion in 2024.

This article first appeared at Cointelegraph.com News

Written by Outside Source
