key-feature
Model Management offers the following four core features:
- Model Registry: Model registration and management
- Model Serving: Model deployment and serving
- Playground: Model testing and validation
- Monitoring & Management: Model operation monitoring
💡 Tip: If you are a first-time user, we recommend learning in the order of Model Registry → Model Serving → Playground.
Model Registry
Securely store and version AI/ML models in a centralized repository.
Main Functions
- Model Storage: Import models from Hugging Face or register custom-developed models
- Version Control: Git-based versioning and tagging
- Metadata Management: Manage information such as model description, framework, and task type
- Access Control: Set sharing scope with private/public options
- Model Search: Search and filter models by project or tag
Use Cases
- Import pre-trained models from Hugging Face Hub
- Upload and version control custom-developed models
- Share and reuse models across teams
Model Serving
Deploy registered models to Kubernetes clusters, making them available as live services.
Main Functions
- One-Click Deployment: Deploy to production environments with simple configuration
- Resource Management: Optimize CPU, memory, and GPU resources
- Multi-Cluster Support: Distribute deployments across multiple clusters
- Endpoint Provisioning: Inference services via REST API
Use Cases
- Deploy finished models to staging or production environments
- Serve high-performance models utilizing GPU resources
Playground
An interactive environment for testing deployed models directly from your browser, without coding.
Main Functions
- Interactive Testing: Real-time interaction with models through a web UI
- Parameter Adjustment: Instantly modify parameters like Temperature, Max Tokens, etc.
- Performance Validation: Check metrics such as response time and token usage
- Supports Various Model Types:
- Chat (GPT, LLaMA, etc.)
- Text Completion
- Embedding
- Image Generation
- Audio (TTS, STT, Translation)
Use Cases
- Pre-deployment model performance validation
- Response testing with various input values
- Find optimal inference parameters
Monitoring & Management
Track and manage the status of deployed models in real time.
Main Functions
- Real-Time Monitoring: Monitor pod status and resource usage
- Log Management: Stream and search logs in real time
- Deployment Control: Functions to Start, Pause, Stop, and Delete deployments
Use Cases
- Monitor the status and performance of production models
- Analyze logs and troubleshoot issues when problems occur
- Optimize operations based on resource usage