Report: Meta Developing New AI Image and Video Model Mango

According to reports, Meta is developing a new AI image and video model called Mango. The model is designed to improve the processing and generation of image and video content and could be deployed across social media and other digital platforms. Its development marks a further step for AI in the image and video domain and promises richer, smarter experiences for users; further details about the model have not yet been disclosed.

Meta Platforms, Inc. is developing new AI models code-named Mango for image and video generation and Avocado for text processing, targeting a first-half 2026 release, according to a Wall Street Journal (WSJ) report on Thursday. The move signals a potential shift from Meta's long-standing open-source strategy toward proprietary models.

[Image: AI-generated illustration]

Meta's chief AI officer Alexandr Wang reportedly discussed the models during an internal Q&A on Thursday with Chris Cox, Meta's chief product officer. Wang said Avocado will focus on improving coding capabilities, and that the company is in early-stage exploration of world models (AI systems that learn about environments through visual information), WSJ reported.

The developments follow a string of recent reports that Meta is pivoting its AI strategy. CNBC reported last week that Meta is pursuing Avocado as a potentially proprietary model rather than an open-source one, while Bloomberg reported a week ago that the company is using third-party models, including Alibaba's Qwen, to train the new system.

These strategy shifts mark Meta's biggest departure from the open-source approach it has championed through its Llama model family, aligning the company with rivals Google and OpenAI, which operate closed, revenue-generating models.

New Models Target 2026 Launch

Meta restructured its AI team over the summer, hiring Wang from Scale AI to lead a newly created division called Meta Superintelligence Labs. CEO Mark Zuckerberg personally recruited more than 20 researchers from OpenAI and assembled a staff of over 50 new AI experts, according to WSJ.

The Mango model represents Meta's latest effort in image and video generation, a battleground among major AI companies. Meta launched an AI video generator called Vibes in September, developed with startup Midjourney. OpenAI countered less than a week later with its Sora video generator.

CNBC reported that many within Meta expected Avocado's release before year-end, but the model is now slated for first-quarter 2026 as it undergoes training-related performance testing. "Our model training efforts are going according to plan and have had no meaningful timing changes," a Meta spokesperson told CNBC.

Strategic Pivot From Open Source

The shift follows the disappointing reception of Meta's Llama 4 model earlier this year, which failed to captivate developers. Zuckerberg sidelined some team members who worked on that project and launched aggressive recruiting, offering top AI researchers multiyear packages worth hundreds of millions of dollars, Bloomberg reported.

Some Meta employees were upset that Chinese AI lab DeepSeek's R1 model incorporated pieces of Llama's architecture, underscoring open-source risks, CNBC reported. Wang and other new AI leaders have questioned the open-source strategy and favored creating a more powerful proprietary model.

Bloomberg reported that Meta's TBD Lab is using several third-party models including Google's Gemma, OpenAI's gpt-oss and Alibaba's Qwen in Avocado's training process—a notable shift for Zuckerberg, who raised concerns in January about Chinese models potentially being shaped by state censorship.

The pivot has created internal confusion and cultural shifts. Wang's TBD Lab operates like a separate startup and doesn't use Meta's internal Workplace network, according to CNBC. The company has also experienced layoffs and restructurings throughout the year, including 600 job cuts in October.

Pressure Mounts on AI Investments

Meta faces mounting Wall Street pressure to demonstrate returns on massive AI spending. The company raised its 2025 capital expenditure guidance to between $70 billion and $72 billion, after spending $14.3 billion in June to hire Wang and acquire a stake in Scale AI.

"In many ways, Meta has been the opposite of Alphabet, where it entered the year as an AI winner and now faces more questions around investment levels and ROI," KeyBanc Capital Markets analysts wrote in November.

Competitors have advanced their offerings. Google's Gemini 3 drew solid reviews last month, while OpenAI announced GPT-5 updates and Anthropic debuted Claude Opus 4.5 in November. Nvidia CEO Jensen Huang highlighted OpenAI, Anthropic and xAI as major customers during the chipmaker's November earnings call but made no reference to Llama.

Understanding World Models

World models are neural networks that understand real-world dynamics, including physics and spatial properties, according to Nvidia. They use input data—text, image, video and movement—to generate videos simulating realistic physical environments.

These models enable AI systems to simulate realistic cause-and-effect scenarios and predict how objects will move and interact in complex scenes. Physical AI developers use world models to generate custom synthetic data for training robots and autonomous vehicles.
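To make the idea concrete, below is a minimal sketch of a latent-dynamics world model in the spirit described above: encode a frame into a compact latent state, predict the next state from the current state plus a movement input, and decode the prediction back to pixels. Everything here (class names, layer sizes, the toy architecture) is an illustrative assumption, not Meta's or Nvidia's actual design.

```python
# A minimal, hypothetical world-model sketch: encode -> predict -> decode.
# Architecture and dimensions are illustrative, not any vendor's real model.
import torch
import torch.nn as nn

class TinyWorldModel(nn.Module):
    def __init__(self, latent_dim: int = 128, action_dim: int = 8):
        super().__init__()
        # Encoder: compress a 64x64 RGB frame into a latent state vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, latent_dim),
        )
        # Dynamics: predict the next latent state from (state, action).
        self.dynamics = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder: reconstruct the predicted next frame from the latent.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, frame: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        z = self.encoder(frame)                        # current latent state
        z_next = self.dynamics(torch.cat([z, action], dim=-1))
        return self.decoder(z_next)                    # predicted next frame

# Usage: roll the model forward to "imagine" a future observation,
# the kind of step used to generate synthetic training data.
model = TinyWorldModel()
frame = torch.rand(1, 3, 64, 64)   # dummy 64x64 RGB observation
action = torch.rand(1, 8)          # dummy movement/control input
next_frame = model(frame, action)  # shape (1, 3, 64, 64)
```

Chaining such predictions step by step is what lets a world model simulate cause-and-effect rollouts without touching the real environment.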

World models can take different forms, Nvidia explains: prediction models that synthesize continuous motion based on prompts, style transfer models that guide outputs using structured guidance like depth maps, and reasoning models that use multimodal inputs to analyze situations over time and space. Applications include autonomous vehicles, robotics, and video analytics for industrial safety and smart city monitoring.
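As a rough illustration of how the prediction and style-transfer forms differ at the interface level, the hypothetical sketch below conditions a single generation step on an optional depth map: with no guidance it behaves like a pure prediction step, while a supplied depth map acts as structured guidance that steers the output. The network is a stand-in for exposition only, not any vendor's real model.

```python
# A hypothetical sketch contrasting two world-model forms via conditioning.
import torch
import torch.nn as nn

class ConditionedFrameModel(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 RGB channels + 1 guidance channel (e.g. a depth map).
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, frame, depth_map=None):
        if depth_map is None:
            # Prediction form: no structured guidance, pad with zeros.
            depth_map = torch.zeros_like(frame[:, :1])
        # Style-transfer form: the depth map steers the output's structure.
        return self.net(torch.cat([frame, depth_map], dim=1))

model = ConditionedFrameModel()
frame = torch.rand(1, 3, 64, 64)
pred = model(frame)                              # prediction form
guided = model(frame, torch.rand(1, 1, 64, 64))  # style-transfer form
# A reasoning form would instead consume a sequence of frames plus text
# and return an analysis rather than pixels; omitted here for brevity.
```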
