PetBot: AI-Powered Social Robot with Local LLM Processing
February 1, 2025

Creating an Autonomous Social Robot with Edge AI
PetBot represents the convergence of artificial intelligence, robotics, and human-computer interaction—a fully autonomous social robot capable of natural conversation, emotional expression, and adaptive learning. Built entirely with open-source technologies and local processing, it demonstrates how advanced AI can be deployed at the edge without relying on cloud services.
[Project demonstration video]
The Vision: Accessible Social Robotics
Social robots have traditionally been expensive, proprietary systems requiring constant internet connectivity. PetBot challenges this paradigm by providing:
- Privacy-First Design: All AI processing happens locally
- Open Source Architecture: Complete hardware and software designs available
- Affordable Components: Built with off-the-shelf electronics for under £300
- Extensible Platform: Modular design supports various applications
Technical Architecture
Edge AI Processing
Local LLM Implementation:
- Ollama Integration: Runs Gemma 3 1B/4B models entirely on a Raspberry Pi 5
- Real-Time Inference: Sub-2-second response times for natural conversation
- Memory Management: Efficient context handling for extended interactions
- No Internet Required: Complete functionality in offline environments
Speech Processing Pipeline:
Microphone → Vosk (Speech-to-Text) → Ollama (LLM) → Piper (Text-to-Speech) → Speaker
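Below is a minimal sketch of how this pipeline could be wired together in Python. It is an illustration rather than the project's actual code: the Vosk model path, Ollama model tag, and Piper voice file are assumptions, and error handling is omitted.

```python
# Minimal Vosk -> Ollama -> Piper loop (paths and model names are
# illustrative assumptions, not the project's actual configuration).
import json
import subprocess

import requests
import sounddevice as sd
from vosk import KaldiRecognizer, Model

stt = Model("vosk-model-small-en-us-0.15")  # assumed local model directory
rec = KaldiRecognizer(stt, 16000)

def listen() -> str:
    """Block until Vosk finalises one utterance from the microphone."""
    with sd.RawInputStream(samplerate=16000, blocksize=8000,
                           dtype="int16", channels=1) as stream:
        while True:
            data, _ = stream.read(8000)
            if rec.AcceptWaveform(bytes(data)):
                return json.loads(rec.Result())["text"]

def think(prompt: str) -> str:
    """Single-turn completion against the local Ollama server."""
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": "gemma3:1b", "prompt": prompt,
                            "stream": False})
    return r.json()["response"]

def speak(text: str) -> None:
    """Render speech with the Piper CLI and play it through ALSA."""
    subprocess.run(["piper", "--model", "en_GB-alba-medium.onnx",
                    "--output_file", "reply.wav"],
                   input=text.encode(), check=True)
    subprocess.run(["aplay", "reply.wav"], check=True)

while True:
    heard = listen()
    if heard:
        speak(think(heard))
```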
Multi-Modal Interface System
Three Operational Modes (a minimal mode-dispatch sketch follows the list):
- Robot Mode: Autonomous social interaction
  - Natural conversation with personality traits
  - Emotional expression through servo movements
  - Adaptive responses based on interaction history
- Developer Mode: Technical configuration and monitoring
  - Real-time system diagnostics
  - Parameter adjustment interface
  - Debug logging and performance metrics
- Demo Mode: Presentation and showcase functionality
  - Scripted interactions for demonstrations
  - Performance optimization for public display
  - Simplified user interface for non-technical users
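How a mode switch like this might be dispatched, as a hypothetical sketch (class and handler names are placeholders, not the project's code):

```python
# Hypothetical three-mode dispatch mirroring the list above.
from enum import Enum, auto

class Mode(Enum):
    ROBOT = auto()      # autonomous social interaction
    DEVELOPER = auto()  # diagnostics and parameter tuning
    DEMO = auto()       # scripted showcase behaviour

class PetBot:
    def __init__(self) -> None:
        self.mode = Mode.ROBOT

    def tick(self) -> None:
        # Route each control-loop iteration to the active mode's handler.
        {
            Mode.ROBOT: self.run_conversation,
            Mode.DEVELOPER: self.run_diagnostics,
            Mode.DEMO: self.run_scripted_demo,
        }[self.mode]()

    def run_conversation(self) -> None: ...
    def run_diagnostics(self) -> None: ...
    def run_scripted_demo(self) -> None: ...
```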
Hardware Integration
Mechanical Design:
- Custom 3D-Printed Chassis: Designed in Onshape CAD software
- Servo Motor Array: 3 servos for expressive ear and arm movement (see the servo sketch after this list)
- DC Motor System: 4 TT gearbox motors for mobility
- Compact Footprint: 20cm × 15cm × 18cm desktop-friendly form factor
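To illustrate the servo side, gpiozero's AngularServo maps naturally onto expressive ear movement; the GPIO pins and angles in this sketch are assumptions:

```python
# Expressive ear movement via gpiozero (pins and angles are assumed).
from time import sleep
from gpiozero import AngularServo

left_ear = AngularServo(17, min_angle=-90, max_angle=90)
right_ear = AngularServo(27, min_angle=-90, max_angle=90)

def perk_ears() -> None:
    """Raise both ears, e.g. when speech is detected."""
    left_ear.angle = 60
    right_ear.angle = 60

def droop_ears() -> None:
    """Lower both ears for a 'sad' expression."""
    left_ear.angle = -45
    right_ear.angle = -45

perk_ears()
sleep(1)
droop_ears()
```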
Electronic Systems:
- Raspberry Pi 5: 8GB RAM for AI processing and system control
- Computer Vision: ESP32 camera module with YOLOv8n object detection
- Audio System: USB microphone and amplified speakers
- Sensor Array: VL53L5CX Time-of-Flight sensor for spatial awareness
- Display: Raspberry Pi Touch 2 (800×480) for user interaction
- Power Management: Pi UPS Hat providing ~5 hours runtime
Software Engineering Excellence
Real-Time Communication Architecture
Flask Web Server with Socket.IO:
- Asynchronous Communication: Real-time updates between components
- WebSocket Protocol: Low-latency bidirectional communication
- RESTful API: Standard HTTP endpoints for configuration
- Session Management: Persistent connections across interactions
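A stripped-down Flask + Socket.IO server combining a REST endpoint with a WebSocket event might look like this (route and event names are assumptions):

```python
# Minimal Flask/Socket.IO server: one REST endpoint, one WebSocket event.
from flask import Flask, jsonify
from flask_socketio import SocketIO, emit

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")

@app.get("/api/status")
def status():
    # RESTful configuration/status endpoint.
    return jsonify(mode="robot", battery="ok")

@socketio.on("set_mode")
def handle_set_mode(data):
    # Push the change to every connected client in real time.
    emit("mode_changed", {"mode": data["mode"]}, broadcast=True)

if __name__ == "__main__":
    socketio.run(app, host="0.0.0.0", port=5000)
```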
MQTT Integration:
- Distributed System Design: Decoupled component communication
- Topic-Based Messaging: Organized data flow between subsystems
- Quality of Service: Guaranteed message delivery for critical commands
- Scalability: Easy addition of new sensors and actuators
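On the MQTT side, a paho-mqtt sketch that subscribes to and publishes a servo command at QoS 1 could look like the following; the topic layout is an assumption:

```python
# QoS 1 servo commands over MQTT (topic names assumed; paho-mqtt 1.x API).
import json
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    # The servo subsystem reacts to commands published by the AI layer.
    print(f"{msg.topic}: {json.loads(msg.payload)}")

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("petbot/servo/command", qos=1)  # QoS 1: at-least-once

# Guaranteed-delivery command for critical actuation.
client.publish("petbot/servo/command",
               json.dumps({"ear_left": 60, "ear_right": 60}), qos=1)
client.loop_forever()
```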
PyQt6 User Interface
Professional Desktop Application:
- Native Performance: Optimized Qt framework for responsive UI
- Multi-Window Management: Dedicated interfaces for different modes
- Real-Time Visualization: Live data streaming and status monitoring
- Cross-Platform Compatibility: Runs on Windows, macOS, and Linux
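As a shape for one of these windows, a bare PyQt6 shell (window and widget names are placeholders):

```python
# Bare PyQt6 shell for a developer-mode window (names are placeholders).
import sys
from PyQt6.QtWidgets import QApplication, QLabel, QMainWindow

class DeveloperWindow(QMainWindow):
    def __init__(self) -> None:
        super().__init__()
        self.setWindowTitle("PetBot - Developer Mode")
        self.setCentralWidget(QLabel("Waiting for telemetry..."))

app = QApplication(sys.argv)
window = DeveloperWindow()
window.show()
sys.exit(app.exec())
```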
AI and Machine Learning Integration
Conversational AI Implementation
Natural Language Understanding:
- Context Awareness: Maintains conversation history and context
- Personality Modeling: Consistent character traits across interactions
- Emotional Intelligence: Recognizes and responds to user emotional states
- Domain Adaptation: Specialized knowledge bases for specific applications
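One plausible way to keep context bounded on a small model is a rolling message history behind a fixed system prompt, sketched here against Ollama's chat endpoint (the persona text and turn limit are assumptions):

```python
# Rolling conversation history with a fixed personality prompt.
import requests

SYSTEM = {"role": "system",
          "content": "You are PetBot, a cheerful, curious desktop robot."}
MAX_TURNS = 10  # assumed cap to fit the small model's context window

history: list[dict] = []

def chat(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    del history[:-2 * MAX_TURNS]  # drop the oldest user/assistant pairs
    r = requests.post("http://localhost:11434/api/chat",
                      json={"model": "gemma3:1b", "stream": False,
                            "messages": [SYSTEM] + history})
    reply = r.json()["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply
```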
Computer Vision Capabilities:
- Person Tracking: Real-time human detection and following
- Object Recognition: YOLOv8n for environmental understanding
- Spatial Awareness: ToF sensor integration for navigation
- Expression Recognition: Visual feedback for emotional responses
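Person tracking with YOLOv8n can be sketched with the Ultralytics API; the camera source and the print-based steering hook below are illustrative:

```python
# Person detection and horizontal-offset tracking with YOLOv8n.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture(0)  # the robot itself streams from the ESP32 camera

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    for box in model(frame, verbose=False)[0].boxes:
        if model.names[int(box.cls)] == "person":
            x1, _, x2, _ = box.xyxy[0].tolist()
            # Steering error: person's offset from the frame centre.
            error = (x1 + x2) / 2 - frame.shape[1] / 2
            print(f"person at offset {error:+.0f}px")
```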
Performance Optimization
Edge Computing Constraints:
- Model Selection: Gemma 3 1B for real-time performance vs. 4B for accuracy
- Memory Optimization: Efficient context window management
- Thermal Management: CPU throttling prevention strategies
- Battery Optimization: Power-aware processing modes
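A thermal-aware model selector is one simple way to combine these strategies; the temperature threshold here is an assumption, not a measured value from the project:

```python
# Pick the LLM size based on SoC temperature to avoid sustained throttling.
from pathlib import Path

TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")

def cpu_temp_c() -> float:
    # The Pi exposes millidegrees Celsius via sysfs.
    return int(TEMP_PATH.read_text()) / 1000

def pick_model() -> str:
    # Fall back to the 1B model when the SoC runs hot (threshold assumed).
    return "gemma3:1b" if cpu_temp_c() > 70.0 else "gemma3:4b"
```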
Collaborative Development Process
Team Coordination (35% Individual Contribution)
Primary Technical Leadership:
- System Architecture: Designed overall software and hardware integration
- AI Implementation: Developed LLM integration and conversation management
- Real-Time Systems: Implemented communication protocols and timing
- Hardware Integration: Managed servo control and sensor interfaces
Collaborative Elements:
- Mechanical Design: Worked with team on chassis optimization
- User Experience: Coordinated interface design across modes
- Testing Protocol: Developed comprehensive validation procedures
- Documentation: Created technical specifications and user guides
Development Methodology
Agile Practices:
- Sprint Planning: Weekly development cycles with defined deliverables
- Continuous Integration: Automated testing and deployment pipelines
- Code Review: Peer review process for quality assurance
- Version Control: Git-based workflow with feature branching
Innovation and Technical Achievements
Edge AI Breakthrough
Local LLM Deployment:
- Successfully running 1-4B parameter models on embedded hardware
- Achieved conversational AI without cloud dependencies
- Demonstrated feasibility of privacy-preserving social robots
- Optimized inference pipeline for real-time interaction
System Integration Excellence
Multi-Modal Fusion:
- Seamless integration of speech, vision, and motor control
- Real-time coordination between AI and physical systems
- Robust error handling and recovery mechanisms
- Scalable architecture supporting future enhancements
Open Source Contribution
Community Impact:
- Complete project documentation and build instructions
- Reusable components for other robotics projects
- Educational resource for AI and robotics learning
- Platform for research and development collaboration
Real-World Applications
Educational Technology
- STEM Learning: Interactive programming and robotics education
- Language Learning: Conversational practice with AI tutor
- Special Needs Support: Assistive technology for communication
- Research Platform: Academic research in human-robot interaction
Commercial Potential
- Customer Service: Retail and hospitality applications
- Elder Care: Companionship and monitoring systems
- Entertainment: Interactive gaming and storytelling
- Therapy: Social skills development and emotional support
Performance Metrics
System Specifications:
- Response Time: <2 seconds for conversational AI
- Battery Life: ~5 hours continuous operation
- Processing Power: 8GB RAM with thermal management
- Weight: 850 g total
- Mobility: 4-wheel drive with differential steering
AI Performance:
- Speech Recognition: >95% accuracy in quiet environments
- LLM Inference: Context-aware responses with personality consistency
- Computer Vision: Real-time object detection at 15 FPS
- Motor Control: Precise servo positioning with emotional expression
Future Development Roadmap
Technical Enhancements
- Advanced Navigation: SLAM implementation for autonomous mapping
- Gesture Recognition: Hand tracking for enhanced interaction
- Multi-Language Support: Conversation in multiple languages
- Cloud Synchronization: Optional cloud backup while maintaining privacy
AI Capabilities
- Emotion Recognition: Visual and audio emotion detection
- Skill Learning: Dynamic capability acquisition through interaction
- Personality Customization: User-defined character traits and behaviors
- Social Learning: Group interaction and social behavior modeling
Impact and Recognition
PetBot demonstrates that sophisticated social robotics is accessible to individual developers and small teams. By combining cutting-edge AI with practical engineering, it proves that the future of human-robot interaction doesn't require massive corporate resources or cloud dependencies.
Technical Skills Demonstrated:
- AI/ML Engineering: Local LLM deployment and optimization
- Robotics Integration: Hardware-software system design
- Full-Stack Development: Web services and desktop applications
- Real-Time Systems: Low-latency communication and control
- Mechanical Design: 3D modeling and manufacturing
- Project Management: Collaborative development coordination
The success of PetBot opens new possibilities for privacy-preserving AI systems and demonstrates the potential for democratizing advanced robotics technology.
PetBot was developed as a collaborative project demonstrating the integration of modern AI technologies with practical robotics engineering. The complete source code, CAD files, and documentation are available for the open-source community.