PANORA


OpenAI Unveils Text-to-Voice Model with Human-like Speech Capability

Published: 30 March 2024 at 01:54

Artificial Intelligence

OpenAI introduced Voice Engine, a text-to-speech model that can replicate individual voices with just 15 seconds of audio, but decided against a wide release due to deepfake risks. The company is engaging with stakeholders and testing the tool with select developers for applications like voice restoration and podcast translation, while emphasizing consent and disclosure policies. OpenAI aims to address AI risks by urging banks to phase out voice authentication and advocating for public education on AI deception.

DEEP DIVE


OpenAI Refines Voice Cloning Technology Responsibly


OpenAI has introduced Voice Engine, which generates a synthetic copy of a voice from a 15-second sample. The company says it is deploying the tool cautiously and has not disclosed a public availability date. The model behind Voice Engine has been used in ChatGPT and Spotify and was trained on a mix of licensed and publicly available data, a sensitive topic in the AI industry.

OpenAI's AI Video Generator Sora Sparks Controversy in Hollywood


OpenAI's Sora, a highly realistic AI model, is being promoted to filmmakers and studios in Hollywood, despite concerns about training data transparency and potential impact on the film industry. The company faces copyright infringement lawsuits related to its AI models. Hollywood unions are advocating for limits on AI use in writers' rooms. Testimonials from testers praise Sora's ability to create videos from text prompts, but critics raise ethical concerns about data sourcing. OpenAI's promotional efforts and interactions with Hollywood raise skepticism and fears about job displacement in the industry.

Anthropic unveils Claude 3, surpassing GPT-4 and Gemini Ultra in benchmark tests


Anthropic, a leading artificial intelligence startup, unveiled its Claude 3 series of AI models today, designed to meet the diverse needs of enterprise customers with a balance of intelligence, speed and cost efficiency. The lineup includes three models: Opus, Sonnet and the upcoming Haiku. The star of the lineup is Opus, which Anthropic claims is more capable than any other openly available AI system on the market, even outperforming leading models from rivals OpenAI and Google.

Elon Musk sues OpenAI, accuses ChatGPT maker of abandoning its original mission


Elon Musk is suing OpenAI, the artificial intelligence powerhouse he helped found and now competes with, alleging it has strayed from its original nonprofit mission of researching AI for the benefit of humanity. In a lawsuit filed Thursday in San Francisco Superior Court against OpenAI, CEO Sam Altman and co-founder Greg Brockman, Musk asked the court to block OpenAI from using its products, such as the popular large-language model ChatGPT, for financial benefit, including in its multibillion-dollar partnership with Microsoft. Musk wants a court order requiring OpenAI to follow its long-standing practice of making AI research and technology developed at OpenAI available to the public rather than keeping it proprietary.

Microsoft, OpenAI plan $100 billion data-center project for AI supercomputer named Stargate


Microsoft and OpenAI are collaborating on a data-center project estimated at $100 billion to build the AI supercomputer named Stargate by 2028. The project includes five phases, with Stargate being the fifth, and is part of OpenAI's upcoming major AI upgrade expected by early next year. The U.S.-based supercomputer will be the largest in a series of installations planned over the next six years, with Microsoft responsible for financing and progressing through the phases.

Character.ai (Wikipedia)


Character.ai (also known as c.ai or Character AI) is a neural language model chatbot service that can generate human-like text responses and participate in contextual conversation. Built by Noam Shazeer and Daniel De Freitas, former developers of Google's LaMDA, the beta model was made available to the public in September 2022. Users can create "characters", craft their "personalities", set specific parameters, and then publish them to the community for others to chat with. Many characters are based on fictional media sources or celebrities, while others are completely original; some are made with specific goals in mind, such as assisting with creative writing or serving as a text-based adventure game. In May 2023, the service added a monthly subscription option and released a mobile app on both the Apple App Store and Google Play Store, which had over 1.7 million downloads within its first week.

OpenAI's AI Video Generator 'Sora' Shakes Up Hollywood


OpenAI's new AI video generator, Sora, has been showcased to actors and directors in Hollywood, causing concern among industry professionals such as Tyler Perry, who fears job losses. Sora uses text prompts to generate one-minute videos and has the potential to disrupt the film industry.

Voice engine (Wikipedia)


A voice engine is a software subsystem for bidirectional audio communication, typically used as part of a telecommunications system to simulate a telephone. It functions like a data pump for audio data, specifically voice data. The voice engine is typically used in an embedded system. The term became popularized after 2000 with the proliferation of voice over internet protocol technology in software DSP systems. Voice engines handle the voice processing for an IP Phone system on a standard processor, compared to prior generations of systems which required dedicated, math-optimized digital signal processor chips. Voice engines are highly optimized software subsystems due to the mathematically complex signal processing required for voice filtering and speech coding. The filter stages and coding elements within a voice engine are designed to work in conjunction with a larger telecommunications system, including only a specific and limited range of processing to minimize the voice engine's memory size and processor usage. Compared to software desktop applications which might employ plugins to continually add flexibility or extensibility, a voice engine is designed to meet specific industry standards for interoperability.

Microsoft and OpenAI plan $100 billion supercomputer known as Stargate


Microsoft and OpenAI are working together on a supercomputer project called Stargate, which could cost over $100 billion and launch by 2028. The plan involves the use of millions of specialized server chips, with Microsoft likely providing funding. Stargate is part of a larger five-phase project focused on developing advanced AI infrastructure. Challenges include procuring the necessary server chips, securing energy sources, and optimizing chip performance. The project reflects the scale of investment now going into AI technology, with potential implications for the future of artificial intelligence.

OpenAI (Wikipedia)


OpenAI is a U.S.-based artificial intelligence (AI) research organization founded in December 2015, researching artificial intelligence with the goal of developing "safe and beneficial" artificial general intelligence, which it defines as "highly autonomous systems that outperform humans at most economically valuable work". As one of the leading organizations of the AI spring, it has developed several large language models, advanced image generation models, and previously, released open-source models. Its release of ChatGPT has been credited with starting the AI spring. The organization consists of the non-profit OpenAI, Inc. registered in Delaware and its for-profit subsidiary OpenAI Global, LLC. It was founded by Ilya Sutskever, Greg Brockman, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, Jessica Livingston, John Schulman, Pamela Vagata, and Wojciech Zaremba, with Sam Altman and Elon Musk serving as the initial board members. Microsoft provided OpenAI Global, LLC with a $1 billion investment in 2019 and a $10 billion investment in 2023, with a significant portion of the investment in the form of computational resources on Microsoft's Azure cloud service. On November 17, 2023, the board removed Altman as CEO, while Brockman was removed as chairman and then resigned as president. Four days later, both returned after negotiations with the board, and most of the board members resigned. The new initial board included former Salesforce co-CEO Bret Taylor as chairman. It was also announced that Microsoft will have a non-voting board seat.

OpenAI's Sora Text-to-Video Platform Used by Artists to Create Surreal Videos


OpenAI collaborated with visual artists and filmmakers to utilize its new Sora text-to-video generation platform, resulting in a collection of conceptual videos with surreal characters and real-life settings. The artists produced abstract and entertaining short films, showcasing the platform's ability to merge the fantastical with reality. The platform has not been publicly released yet, but the experimental videos demonstrate its potential for creating innovative and surreal visual content.

Databricks Introduces New Generative AI Model DBRX


Databricks has unveiled a new generative AI model called DBRX, comparable to OpenAI's GPT series and Google's Gemini. The model is available on GitHub and Hugging Face for both research and commercial use, in base and fine-tuned versions. DBRX, optimized for English but capable of handling multiple languages, was trained over two months at a cost of $10 million. While it outperforms other open source models, it requires high-end hardware such as Nvidia H100 GPUs to run efficiently, which poses a challenge for non-Databricks users because of the substantial cost involved.

Artificial intelligence (Wikipedia)


Artificial intelligence (AI) is the intelligence of machines or software, as opposed to the intelligence of living beings, primarily of humans. It is a field of study in computer science that develops and studies intelligent machines. Such machines may be called AIs. AI technology is widely used throughout industry, government, and science. Some high-profile applications are: advanced web search engines (e.g., Google Search), recommendation systems (used by YouTube, Amazon, and Netflix), interacting via human speech (e.g., Google Assistant, Siri, and Alexa), self-driving cars (e.g., Waymo), generative and creative tools (e.g., ChatGPT and AI art), and superhuman play and analysis in strategy games (e.g., chess and Go). Alan Turing was the first person to conduct substantial research in the field that he called machine intelligence. Artificial intelligence was founded as an academic discipline in 1956. The field went through multiple cycles of optimism, followed by periods of disappointment and loss of funding, known as AI winter. Funding and interest vastly increased after 2012 when deep learning surpassed all previous AI techniques, and after 2017 with the transformer architecture. This led to the AI spring of the early 2020s, with companies, universities, and laboratories overwhelmingly based in the United States pioneering significant advances in artificial intelligence. The growing use of artificial intelligence in the 21st century is influencing a societal and economic shift towards increased automation, data-driven decision-making, and the integration of AI systems into various economic sectors and areas of life, impacting job markets, healthcare, government, industry, and education. This raises questions about the ethical implications and risks of AI, prompting discussions about regulatory policies to ensure the safety and benefits of the technology. The various sub-fields of AI research are centered around particular goals and the use of particular tools.
The traditional goals of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception, and support for robotics. General intelligence (the ability to complete any task performable by a human) is among the field's long-term goals. To solve these problems, AI researchers have adapted and integrated a wide range of problem-solving techniques, including search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics. AI also draws upon psychology, linguistics, philosophy, neuroscience and other fields.

Elon Musk's AI Startup xAI to Launch Enhanced Chatbot Grok-1.5 for Social Media Platform X


Elon Musk's artificial intelligence startup xAI is set to release an improved version of its chatbot Grok called Grok-1.5 for early testers and existing users on social media platform X, focusing on coding and math-related tasks. Musk's xAI aims to provide a "maximum truth-seeking AI" as an alternative to OpenAI and Google. Musk has made Grok available to X premium subscribers and plans to open-source it. The chatbot, named after a term meaning deep understanding, has faced controversy for its responses regarding transgender rights and racism.

Mistral AI (Wikipedia)


Mistral AI is a French company selling artificial intelligence (AI) products. It was founded in April 2023 by previous employees of Meta Platforms and Google DeepMind. The company raised 385 million euros in October 2023 and in December 2023 it was valued at more than $2 billion. It produces open source large language models, citing the foundational importance of open-source software, and as a response to proprietary models. As of March 2024, two models have been published and are available as weights. Three more models, Small, Medium and Large, are available via API only.

OpenAI Five (Wikipedia)


OpenAI Five is a computer program by OpenAI that plays the five-on-five video game Dota 2. Its first public appearance occurred in 2017, where it was demonstrated in a live one-on-one game against the professional player Dendi, who lost to it. The following year, the system had advanced to the point of performing as a full team of five, and began playing against and showing the capability to defeat professional teams. By choosing a game as complex as Dota 2 to study machine learning, OpenAI thought they could more accurately capture the unpredictability and continuity seen in the real world, thus constructing more general problem-solving systems. The algorithms and code used by OpenAI Five were eventually borrowed by another neural network in development by the company, one which controlled a physical robotic hand. OpenAI Five has been compared to other similar cases of artificial intelligence (AI) playing against and defeating humans, such as AlphaStar in the video game StarCraft II, AlphaGo in the board game Go, Deep Blue in chess, and Watson on the television game show Jeopardy!.

OpenAI Codex (Wikipedia)


OpenAI Codex is an artificial intelligence model developed by OpenAI. It parses natural language and generates code in response. It powers GitHub Copilot, a programming autocompletion tool for select IDEs, like Visual Studio Code and Neovim. Codex is a descendant of OpenAI's GPT-3 model, fine-tuned for use in programming applications. OpenAI released an API for Codex in closed beta. In March 2023, OpenAI shut down access to Codex. Due to public appeals from researchers, OpenAI reversed course. The Codex model can still be used by researchers of the OpenAI Research Access Program.

Apple to Focus on AI at WWDC Conference


Apple will highlight its approach to generative artificial intelligence at the upcoming Worldwide Developers Conference (WWDC), which runs from June 10th to 14th. The company has been investing heavily in training AI models and may partner with Google, OpenAI, Anthropic, or Baidu for cloud-based AI features while keeping generative features on-device. There are rumored updates for iOS, iPadOS, macOS, and watchOS, as well as new AirPods models and updated AirPods Max headphones. Previous WWDC events introduced innovations like the Vision Pro headset, a watchOS revamp, iOS 17 StandBy Mode, and the Mac Pro with M2 Ultra chip.


SOURCES

The Verge: OpenAI’s voice cloning AI model only needs a 15-second sample to work

NDTV: OpenAI Unveils Audio Feature That Read Texts, Clones Human Voices

Tech Radar: OpenAI's new voice synthesizer can copy your voice from just 15 seconds of audio

Mashable (Anna Iovine): OpenAI previews synthetic voice creator, Voice Engine
