Microsoft has unveiled JARVIS, a multimodal AI platform that connects and coordinates multiple artificial intelligence models, such as ChatGPT and t5-base, to deliver a final result. A demo is hosted on the AI platform Hugging Face, where users can explore and test JARVIS’s capabilities.
JARVIS: Microsoft’s Advanced AI System Taking AI Collaboration to the Next Level
Microsoft is developing JARVIS as a system that links multiple AI models and delivers a unified result. The project, hosted on GitHub, reflects Microsoft’s collaborative approach to building AI solutions. ChatGPT serves as the task controller and directs the operation.
To see JARVIS in action, users can visit Hugging Face, where Microsoft hosts a demo of the system.
Multimodal AI Integration: The Future of AI Task Management
JARVIS goes beyond the multimodal capabilities OpenAI demonstrated with GPT-4, which cover text and image processing, by incorporating various open-source models for images, video, audio, and more. It also connects to the internet, allowing access to files and data from various sources.
This approach lets users combine multiple tasks in a single query. For example, if a user asks JARVIS to create an image of an alien invasion and write poetry about it, ChatGPT analyzes the request, plans the tasks, selects an appropriate model (hosted on Hugging Face) for each one, and executes the tasks. Each chosen model completes its task and returns the result to ChatGPT.
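The flow described above (analyze the request, plan the tasks, select a model, execute, return the results) can be sketched in a few lines of Python. This is only an illustrative outline, not Microsoft's implementation: the names `Task`, `MODEL_REGISTRY`, `plan_tasks`, `select_model`, `run_model`, and `handle_request` are all hypothetical, the keyword-based planner stands in for ChatGPT, the model call is a stub that never touches the network, and the pairing of task types with model names is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Task:
    kind: str    # hypothetical task label, e.g. "text-to-image"
    prompt: str  # text handed to the selected model


# Hypothetical registry: task kind -> model name. The model names appear in
# the article; which model handles which task is an assumption.
MODEL_REGISTRY: Dict[str, str] = {
    "text-to-image": "stable-diffusion 1.5",
    "text-generation": "t5-base",
}


def plan_tasks(request: str) -> List[Task]:
    """Stand-in for the controller: split one request into tasks.

    The real system would ask ChatGPT to do this; a keyword check keeps
    the sketch offline and deterministic.
    """
    lowered = request.lower()
    tasks: List[Task] = []
    if "image" in lowered:
        tasks.append(Task("text-to-image", request))
    if "poem" in lowered or "poetry" in lowered:
        tasks.append(Task("text-generation", request))
    return tasks


def select_model(task: Task) -> str:
    """Pick the model registered for this kind of task."""
    return MODEL_REGISTRY[task.kind]


def run_model(model: str, task: Task) -> str:
    """Stub for a hosted model call; returns a placeholder result."""
    return f"<{model} result for {task.kind}>"


def handle_request(request: str) -> List[Tuple[str, str, str]]:
    """Plan, select, execute, and collect (task kind, model, result)."""
    results = []
    for task in plan_tasks(request):
        model = select_model(task)
        results.append((task.kind, model, run_model(model, task)))
    return results


if __name__ == "__main__":
    query = "Create an image of an alien invasion and write poetry about it"
    for kind, model, result in handle_request(query):
        print(f"{kind}: {model} -> {result}")
```

In this sketch the single query produces two tasks, each routed to a different model, and the controller collects both results before responding.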
A Network of 20 Powerful Models Linked to JARVIS
JARVIS, also known as HuggingGPT, is connected to as many as 20 different models, including t5-base, stable-diffusion 1.5, bert, Facebook’s bart-large-cnn, and Intel’s dpt-large. Users interested in its multimodal capabilities can try Microsoft JARVIS now.
JARVIS has performed well in repeated testing, but it requires significant resources: at least 16GB of VRAM and around 300GB of storage space for the various models. Consequently, JARVIS cannot be run locally on an average PC.
Hugging Face Queue and Subscription Requirements
At present, users with a free account cannot clone JARVIS on Hugging Face to bypass the queue. To run the model on an Nvidia A10G, a large GPU that costs $3.15 per hour, users need to subscribe to Hugging Face’s services.
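To put the hourly rate in perspective, here is a rough back-of-the-envelope calculation. The $3.15 figure is the one quoted above; the function name `gpu_cost` and the usage durations are hypothetical, and actual billing may work differently.

```python
A10G_HOURLY_USD = 3.15  # hourly rate quoted in the article


def gpu_cost(hours: float, hourly_rate: float = A10G_HOURLY_USD) -> float:
    """Estimated cost in USD for running the GPU for the given hours."""
    return round(hours * hourly_rate, 2)


if __name__ == "__main__":
    # Hypothetical usage durations: one hour, a working day, a full day.
    for hours in (1, 8, 24):
        print(f"{hours:>2} h -> ${gpu_cost(hours):.2f}")
```

At that rate, a full day of continuous use comes to roughly $75.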
Despite these limitations, Microsoft’s JARVIS project marks a significant step in the advancement of AI systems, making multimodal capabilities and model collaboration more widely accessible. It has the potential to change how we interact with and use AI technology, and its continued development is likely to push the boundaries of what is possible in the field of artificial intelligence.