Meet the AI-Agent, a native macOS application that integrates with Google's Gemini AI to provide a seamless assistant experience. The project is open source, and we encourage you to test it, modify it, and contribute to its development.
Testing macOS AI Agent with Google Gemini Live Web API
Gemini Assistant macOS App
A native macOS application that connects to Google's Gemini AI. The app automatically accesses your camera and microphone to provide a seamless AI assistant experience.
The Gemini Assistant is a macOS application designed to:
Capture audio input through your microphone.
Use your camera for visual context.
Provide AI-powered responses via text.
The app leverages Google's Gemini AI for natural language understanding and response generation, making it a powerful tool for productivity and interaction.
Features
Audio Input: Speak to the assistant using your microphone.
Visual Context: The app uses your camera to gather additional context.
Text Responses: Get responses displayed in the app.
Customizable: Modify the code to add new features or improve existing ones.
How It Works
The application is built using Python and integrates several libraries:
PyQt5: For the user interface.
OpenCV: For camera access and visual processing.
PyAudio: For capturing and playing audio.
Google Generative AI: For natural language processing.
Python-dotenv: For managing environment variables.
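As a rough illustration of how these pieces could fit together, the sketch below wires one round of input (an audio chunk and a camera frame) to a model call and a text display. It is not the app's actual code: `Turn`, `run_turn`, and the four callables are hypothetical names, and plain functions stand in for PyAudio, OpenCV, the Gemini client, and the PyQt5 window so that the example runs without any hardware or API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    """One round of input sent to the model (hypothetical structure)."""
    audio: bytes  # raw microphone samples, e.g. captured with PyAudio
    frame: bytes  # encoded camera image, e.g. a JPEG produced with OpenCV


def run_turn(
    capture_audio: Callable[[], bytes],
    capture_frame: Callable[[], bytes],
    ask_model: Callable[[Turn], str],
    show_text: Callable[[str], None],
) -> str:
    """Collect one audio chunk and one frame, query the model, display the reply."""
    turn = Turn(audio=capture_audio(), frame=capture_frame())
    reply = ask_model(turn)
    show_text(reply)
    return reply


if __name__ == "__main__":
    # Stand-ins so the sketch runs without a camera, microphone, or API key.
    run_turn(
        capture_audio=lambda: b"\x00" * 1024,
        capture_frame=lambda: b"fake-jpeg-bytes",
        ask_model=lambda t: f"got {len(t.audio)} audio bytes, {len(t.frame)} image bytes",
        show_text=print,
    )
```

Passing the capture, model, and display steps in as callables keeps each library behind a small boundary, which makes it easier to swap one out or to test the flow without real devices.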
The app reads your Google Gemini API key from a .env file, which keeps the key out of the source code. If the file doesn't exist, the app creates one for you.
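The create-if-missing behaviour can be sketched as follows. This is a minimal illustration rather than the app's own code: the variable name `GEMINI_API_KEY` and the helper functions are assumptions, and a small hand-written parser stands in for python-dotenv so that the example has no dependencies beyond the standard library.

```python
import os
from pathlib import Path
from typing import Dict, Optional

ENV_VAR = "GEMINI_API_KEY"  # assumed variable name, not confirmed by the project


def ensure_env_file(path: Path) -> bool:
    """Create a placeholder .env file if none exists. Return True if created."""
    if path.exists():
        return False
    path.write_text(f"{ENV_VAR}=\n", encoding="utf-8")
    return True


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(path: Path = Path(".env")) -> Optional[str]:
    """Return the key from the environment or the .env file, or None if unset."""
    ensure_env_file(path)
    key = os.environ.get(ENV_VAR) or read_env_file(path).get(ENV_VAR)
    return key or None
```

On a first run, `load_api_key()` writes a .env file containing an empty `GEMINI_API_KEY=` line and returns `None`, so the caller can prompt for a key. A value already set in the process environment takes precedence over the file.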