Hands-on Video Breakdown - How Manus is Revolutionizing the 2025 AI Agent Market
Manus gives large language models hands and eyes, enabling them to not just think and speak, but also act.
Last week, an AI product called Manus once again set the Chinese tech community abuzz. Its universal-agent concept, impressive demo scenarios, and invitation-only access, combined with its Chinese team and the momentum left by DeepSeek, quickly propelled it to the top of trending searches. My first reaction upon seeing the Manus demo was that it’s essentially a multi-agent hybrid scheduling system that integrates various innovative technologies from the AI Agent field over the past year, as the official statement suggests:
“We firmly believe in and practice the philosophy of ‘less structure, more intelligence’: When your data is sufficiently high-quality, your models are powerful enough, your architecture is flexible enough, and your engineering is solid enough, then concepts like computer use, deep research, and coding agents naturally emerge as capabilities.”
That said, Manus has indeed excelled in human-computer interaction and user experience. It places no restrictions on what users apply it to and can handle general tasks such as creating Xiaohongshu ads, developing investment strategies, analyzing stock returns, conducting financial valuations, or even coding small games. It’s precisely the kind of product users have been expecting from AI, one that meets the everyday needs of ordinary people, hence its instant fame.
Video Demonstration - Automated Multi-language Website Creation
The following video showcases Manus automatically creating a multi-language website based on user-uploaded materials. The entire process is seamless, with Manus independently handling material reading and analysis, website construction and testing, and final deployment, fully demonstrating its high level of automation and autonomous execution capabilities as an AI Agent system.
What is an AI Agent?
For the average person, this is probably the first question when encountering products like Manus. Simply put, an AI Agent acts as the eyes, nose, and ears of large language models, and more crucially, their hands and feet. The biggest technical limitation of large language models has been that they can think and express themselves but cannot carry out actual operations. With the advent of DeepSeek, large models can now perform extremely complex thinking and planning, providing complete and detailed solutions to our problems, even outlining specific steps to complete tasks. However, no matter how advanced these models become, they’re ultimately limited to text-based interactions. They can’t open files to read documents, nor can they write and save documents; they can’t read coffee machine manuals or press buttons on the machine. While large models are incredibly intelligent, without eyes, ears, hands, and feet they remain like caged beasts - loud but ultimately ineffective.
Deconstructing Manus’s Execution Steps: Understanding AI Agent Mechanics
From the video example above, Manus has implemented at least the following agents to assist the large model in completing tasks:
1. File Reading Agent
Responsible for extracting useful information from user-provided files, as shown in this Manus log where it’s using the agent to read file content.
2. File Creation Agent
Handles the creation of various files needed to store information and provide output, as seen in this log where Manus is using the agent to create files.
3. Command Execution Agent
Executes commands on the computer to drive various tasks. Here, Manus is using the ‘cp’ command to manipulate the file system and organize folder structures.
Notably, the ability to execute system commands means AI Agents can fully leverage the operating system’s capabilities to perform complex operations.
4. Browser Agent
Used to operate the browser, open URLs, and read web content. In this example, Manus is using this tool to read its own website content and perform testing/validation.
In summary, Manus integrates various tools to provide large models with senses (for reading information) and limbs (for executing actions). This aligns with what the Manus founder mentioned in the video: the concept of “mind and hands.”
“Mind and hands” provides an intuitive explanation of Manus’s essence: equipping large models with hands and eyes, enabling them to not just think and speak, but also act.
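The tools listed above can be sketched as plain functions behind a small registry. This is only an illustration of the idea, not Manus’s actual implementation: the names `read_file`, `write_file`, `run_command`, `TOOLS`, and `call_tool` are all hypothetical, and the browser tool is left out to keep the example free of network access.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Dict


def read_file(path: str) -> str:
    """'Eyes': return the text content of a file."""
    return Path(path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> str:
    """'Hands': create or overwrite a file, creating parent folders as needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters to {target}"


def run_command(*argv: str) -> str:
    """'Hands': run a command (no shell) and return its standard output."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical registry: tool name -> callable. A real system would also
# sandbox these tools and validate their arguments before running them.
TOOLS: Dict[str, Callable[..., str]] = {
    "read_file": read_file,
    "write_file": write_file,
    "run_command": run_command,
}


def call_tool(name: str, *args: str) -> str:
    """Dispatch a tool request (as a model might emit it) to the matching function."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](*args)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        page = str(Path(workdir) / "site" / "index.html")
        print(call_tool("write_file", page, "<h1>Hello</h1>"))
        print(call_tool("read_file", page))
        print(call_tool("run_command", sys.executable, "-c", "print('ok')"))
```

The model never touches the file system directly in this sketch; it only names a tool and its arguments, and the surrounding program decides whether and how to run it.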
The PDCA Cycle in AI Agents
In Manus, we’ve discovered a crucial capability in AI Agent systems: the ability to self-construct task execution loops, essentially implementing the PDCA cycle (Plan-Do-Check-Act) from project management.
The PDCA cycle is vital for building self-improving systems. For AI Agent systems, only with this capability can they truly possess the ability to “correctly” execute tasks.
The challenge of determining task execution “correctness” has long been a major hurdle for AI systems. Traditionally, the approach has been to enhance model capabilities, hoping to improve task execution accuracy. However, as we often say, “nobody’s perfect.” The correct approach should be:
- Allow models to make mistakes during task execution
- Build a system that can “check” task execution
- Identify issues during task execution promptly
- Address and resolve them, achieving continuous optimization and improvement
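The loop described above might look roughly like the following sketch. It assumes three pluggable callables (`plan`, `do`, `check`) and a retry limit; these names, the `PDCAResult` type, and the toy stand-ins for the model are all invented for illustration and say nothing about how Manus implements its own loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PDCAResult:
    output: Optional[str]
    succeeded: bool
    attempts: int
    issues: List[str] = field(default_factory=list)


def run_pdca(
    plan: Callable[[str, List[str]], str],
    do: Callable[[str], str],
    check: Callable[[str], Optional[str]],
    task: str,
    max_rounds: int = 3,
) -> PDCAResult:
    """Plan -> Do -> Check -> Act until the check passes or rounds run out.

    `check` returns None when the output is acceptable, otherwise a short
    description of the problem. "Act" here means feeding every problem found
    so far back into the next planning round.
    """
    issues: List[str] = []
    output: Optional[str] = None
    for attempt in range(1, max_rounds + 1):
        step = plan(task, issues)   # Plan: may take earlier issues into account
        output = do(step)           # Do: execute the planned step
        problem = check(output)     # Check: verify the result
        if problem is None:
            return PDCAResult(output, True, attempt, issues)
        issues.append(problem)      # Act: record the issue for the next round
    return PDCAResult(output, False, max_rounds, issues)


# Toy stand-ins for the model and its tools (assumptions, not real APIs).
def toy_plan(task: str, issues: List[str]) -> str:
    needs_title = any("title" in issue for issue in issues)
    return task + (" with a title" if needs_title else "")


def toy_do(step: str) -> str:
    title = "<title>Demo</title>" if "with a title" in step else ""
    return f"<html>{title}<body>Hello</body></html>"


def toy_check(output: str) -> Optional[str]:
    return None if "<title>" in output else "page is missing a title"


result = run_pdca(toy_plan, toy_do, toy_check, "build a page")
print(result.succeeded, result.attempts, result.issues)
# prints: True 2 ['page is missing a title']
```

The first round is allowed to fail; what matters is that the failure is detected and turned into input for the next plan.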
Advice for General Users
For the average user, the first half of this article should suffice to help you fully understand Manus’s nature and working model. The next step is to unleash your imagination and learn how to interact with AI Agent systems like Manus, mastering the art of collaborating with AI.
While the Manus demos we see online appear smooth and flawless, when you actually operate such systems yourself, you’re bound to encounter various issues.
The fundamental reason is that these systems still face numerous challenges and require further technical exploration. As humans, we must:
- Stay informed about these systems’ capabilities and limitations
- Understand their working mechanisms
- Be among the first to benefit from the productivity boost AI Agents offer
Important Note:
We can’t wait until these systems are fully mature before using them, because by then, we’ll have already been surpassed by those who dared to explore, mastered human-AI interaction skills, and adapted to the characteristics of the digital symbiosis era.
You won’t be replaced by AI, but you might be replaced by others who have mastered AI technology.
Manus’s System Architecture
Another hot topic last week was the OpenManus open-source project, which claimed to replicate Manus in just three hours after dinner. The competition in the tech world is indeed fierce, with tech enthusiasts constantly oscillating between mutual appreciation and rivalry.
For developers, code has never been a barrier to replicating systems. Once we see your system, we can replicate it in no time. Of course, when it comes to product experience, stability, and reliability, that’s another story.
For systems like Manus, any developer in the AI application tech space can immediately recognize the technologies behind it, and these are essentially accumulated or mature technologies from the past two years. For example:
- Model Planning Capability: Derived from systems like OpenAI o1, DeepSeek V3/R1, Claude 3.5 Sonnet
- Agent Scheduling and Integration: Using function calling or the Model Context Protocol (MCP)
- Various Agent Tools: Common components from traditional software systems
Therefore, if a company wants to replicate such a system, the deciding factor is essentially just determination, as technology is no longer the main obstacle.
Here’s a highly simplified Manus system architecture. For each component in this architecture, I can find mature solutions. What remains is refining details, enriching application scenarios, and enhancing user experience. In this process, what’s being tested isn’t technological innovation, but rather the team’s understanding and accumulation of their specific scenarios.
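To make the planner-scheduler-tool split concrete, here is a minimal sketch in the spirit of function calling: a planner emits a list of tool calls as JSON, and a scheduler validates and executes them in order. Everything in it is an assumption made for illustration, including the tool names, the `TOOL_SPECS` format, the `fake_planner` stand-in for the model, and the made-up `$prev` placeholder; it does not reproduce Manus’s architecture or any vendor’s API.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical tool declarations: each tool lists the arguments it requires.
TOOL_SPECS: Dict[str, Dict[str, Any]] = {
    "translate": {"required": ["text", "lang"]},
    "render_page": {"required": ["title", "body"]},
}


def translate(text: str, lang: str) -> str:
    # Stand-in for a real translation step: just tags the text.
    return f"[{lang}] {text}"


def render_page(title: str, body: str) -> str:
    return f"<html><head><title>{title}</title></head><body>{body}</body></html>"


HANDLERS: Dict[str, Callable[..., str]] = {
    "translate": translate,
    "render_page": render_page,
}


def fake_planner(task: str) -> str:
    """Stand-in for the model: returns the plan as a JSON list of tool calls."""
    return json.dumps([
        {"tool": "translate", "args": {"text": task, "lang": "fr"}},
        {"tool": "render_page", "args": {"title": "Demo", "body": "$prev"}},
    ])


def schedule(plan_json: str) -> List[str]:
    """Validate each call against its spec, run it, and pass results forward."""
    results: List[str] = []
    for call in json.loads(plan_json):
        name = call["tool"]
        spec = TOOL_SPECS.get(name)
        if spec is None:
            raise ValueError(f"unknown tool: {name}")
        missing = [key for key in spec["required"] if key not in call["args"]]
        if missing:
            raise ValueError(f"{name} is missing arguments: {missing}")
        # "$prev" is a made-up placeholder meaning "output of the previous step".
        args = {
            key: (results[-1] if value == "$prev" and results else value)
            for key, value in call["args"].items()
        }
        results.append(HANDLERS[name](**args))
    return results


outputs = schedule(fake_planner("Welcome"))
print(outputs[-1])
# prints: <html><head><title>Demo</title></head><body>[fr] Welcome</body></html>
```

Because the plan is plain data, the scheduler can reject unknown tools or incomplete calls before anything runs, which is where much of the engineering effort around stability and reliability tends to go.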
The 2025 AI Agent Market Explosion
If November 2022’s ChatGPT launch ignited a global technological revolution around large language models, then 2023 to 2024 saw the gradual development and evolution of peripheral system architectures around these models. During this period, the capabilities of large models themselves also improved, making systems like Manus possible. Here, we must mention DeepSeek, a significant milestone in China’s exploration of large model capabilities. DeepSeek’s greatest significance lies in bringing large model technology into the mainstream consciousness. Technically speaking, DeepSeek’s models have shown relatively accurate planning for complex tasks, which is the fundamental capability needed to schedule AI Agents effectively.
Manus is essentially a product launched at the right time, allowing ordinary people to experience the powerful capabilities of combining large models with Agents, fully meeting public expectations for AI systems. This time, online evaluations of Manus show a clear polarization: some believe it’s incredibly powerful, while others see it as just a “shell” product. Regardless, following DeepSeek, Manus has given ordinary people a tangible experience of the productivity boost from combining large models with Agents. From this perspective, Manus is likely to trigger an explosive growth in the 2025 AI Agent market, as it has already created sufficient market demand for developers. Now it’s up to developers to meet these demands.
Let’s look forward to the 2025 AI Agent market explosion. Perhaps by the end of 2025, we won’t need to search for information online ourselves, write research reports manually, or create PowerPoint presentations from scratch. And that often-dreaded annual performance review? My guess is that over 80% of it will be generated by AI Agents.