
Sesame, the startup behind the viral virtual assistant Maya, releases its base AI model

AI company Sesame has released the base model that powers Maya, its impressively realistic voice assistant.

The model, which is 1 billion parameters in size (“parameters” refers to individual components of the model), is available under an Apache 2.0 license, which means that it can be used commercially with few restrictions. Called CSM-1B, the model generates “RVQ audio codes” from text and audio inputs, according to Sesame’s description on the AI dev platform Hugging Face.

RVQ refers to “residual vector quantization,” a technique for encoding audio into discrete tokens called codes. RVQ is used in a number of recent AI audio technologies, including Google’s SoundStream and Meta’s Encodec.
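To make the idea concrete, here is a minimal pure-Python sketch of residual vector quantization. It is only a toy illustration, not Sesame's, Google's, or Meta's implementation: the two hand-written codebooks, the two-dimensional "frame," and the function names are all assumptions made for the example. Each stage picks the nearest entry in its codebook, subtracts it, and passes the leftover residual to the next stage, so one frame becomes a short list of integer codes.

```python
# Toy residual vector quantization (RVQ). All values are illustrative assumptions.

def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    def dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes the previous stage's residual."""
    residual = list(vec)
    codes = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        codes.append(idx)
        # Subtract the chosen entry so the next stage only sees what is left over.
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return codes

def rvq_decode(codes, codebooks):
    """Reconstruct the vector by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out

# Hypothetical codebooks: a coarse first stage and a finer second stage.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]],
]

frame = [1.2, 0.9]                      # stand-in for one frame of audio features
codes = rvq_encode(frame, CODEBOOKS)    # discrete tokens, one per stage
approx = rvq_decode(codes, CODEBOOKS)   # approximate reconstruction

print(codes)   # [1, 1]
print(approx)  # [1.25, 1.0]
```

Real audio codecs learn their codebooks from data and use many more stages and dimensions; the sketch only shows why a stack of small codebooks can describe a signal with a handful of integers.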

CSM-1B uses a model from Meta’s Llama family as its backbone, paired with an audio “decoder” component. A fine-tuned variant of CSM powers Maya, Sesame says.

“The model open-sourced here is a base generation model,” Sesame writes in CSM-1B’s Hugging Face and GitHub repositories. “It is capable of producing a variety of voices, but it has not been fine-tuned on any specific voice […] The model has some capacity for non-English languages due to data contamination in the training data, but it likely won’t do well.”

It is unclear what data Sesame used to train CSM-1B. The company did not say.

It is worth noting that the model has no real safeguards to speak of. Sesame relies on an honor system, merely urging developers and users not to use the model to mimic a person’s voice without their consent, create misleading content such as fake news, or engage in “harmful” or “malicious” activities.


I tried the demo on Hugging Face, and cloning my voice took less than a minute. From there, it was easy to generate speech to my heart’s desire, including on controversial topics such as the election and Russian propaganda.

Consumer Reports recently warned that many popular AI-powered voice cloning tools on the market don’t have “meaningful” safeguards to prevent fraud or abuse.

Sesame, co-founded by Oculus co-creator Brendan Iribe, went viral in late February for its assistant technology, which comes close to clearing uncanny valley territory. Maya and Sesame’s other assistant, Miles, take breaths and speak with disfluencies, and can be interrupted while speaking, much like OpenAI’s Voice Mode.

Sesame has raised an undisclosed amount of capital from Andreessen Horowitz, Spark Capital and Matrix Partners. In addition to building voice assistant technology, the company says it is prototyping AI glasses “designed to be worn all day” that will be equipped with its custom models.
