
Key Features

Model Management offers the following four core features:

  • Model Registry: Model registration and management
  • Model Serving: Model deployment and serving
  • Playground: Model testing and validation
  • Monitoring & Management: Model operation monitoring

💡 Tip: If you are a first-time user, we recommend learning the features in this order: Model Registry → Model Serving → Playground.

Model Registry

Securely store and version AI/ML models in a centralized repository.

Main Functions

  • Model Storage: Import models from Hugging Face or register custom-developed models
  • Version Control: Git-based versioning and tagging
  • Metadata Management: Manage information such as model description, framework, and task type
  • Access Control: Set sharing scope with private/public options
  • Model Search: Search and filter models by project or tag
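
As a rough illustration of the metadata and search functions listed above, the sketch below models a registry entry and a project/tag filter in plain Python. The `ModelRecord` class, its field names, and `search_models` are hypothetical and chosen only for this example; they are not the product's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical registry entry; field names are illustrative assumptions."""
    name: str
    version: str
    project: str
    framework: str
    task_type: str
    description: str = ""
    tags: frozenset = field(default_factory=frozenset)
    public: bool = False  # assumed default: private unless shared explicitly


def search_models(records, project=None, tag=None):
    """Return records matching the project and/or tag (None means 'any')."""
    return [
        r for r in records
        if (project is None or r.project == project)
        and (tag is None or tag in r.tags)
    ]


# Made-up catalog entries used only to exercise the filter.
catalog = [
    ModelRecord("sentiment-bert", "v1.0.0", "nlp", "pytorch",
                "text-classification", tags=frozenset({"production"})),
    ModelRecord("sentiment-bert", "v1.1.0", "nlp", "pytorch",
                "text-classification", tags=frozenset({"staging"})),
    ModelRecord("defect-detector", "v0.3.0", "vision", "onnx",
                "image-classification", tags=frozenset({"staging"}), public=True),
]

staging_models = search_models(catalog, tag="staging")
```

Keeping each version as its own immutable record mirrors the idea of versioned, tagged models: a new version adds an entry rather than overwriting an old one.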

Use Cases

  • Import pre-trained models from Hugging Face Hub
  • Upload and version control custom-developed models
  • Share and reuse models across teams

Model Serving

Deploy registered models to Kubernetes clusters, making them available as live services.

Main Functions

  • One-Click Deployment: Deploy to production environments with simple configuration
  • Resource Management: Optimize CPU, memory, and GPU resources
  • Multi-Cluster Support: Distribute deployments across multiple clusters
  • Endpoint Provisioning: Expose inference services through REST API endpoints
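
To make the REST endpoint idea concrete, here is a minimal sketch of how a client might prepare an inference call using only the Python standard library. The URL, the payload shape, and the bearer-token header are assumptions for illustration; the real endpoint address and request schema come from your own deployment.

```python
import json
import urllib.request


def build_inference_request(endpoint_url, payload, api_key=None):
    """Build (but do not send) a JSON POST request for an inference endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the endpoint accepts a bearer token; adjust as needed.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url, data=body, headers=headers, method="POST"
    )


# Placeholder URL and payload; replace with values from your deployment.
request = build_inference_request(
    "https://models.example.com/v1/predict",
    {"inputs": "Hello"},
)

# Sending is left commented out because the URL above is not a real service:
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.loads(response.read())
```

Separating request construction from sending makes the call easy to inspect and test before pointing it at a live service.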

Use Cases

  • Deploy finished models to staging or production environments
  • Serve high-performance models utilizing GPU resources

Playground

An interactive environment for testing deployed models directly from your browser, without coding.

Main Functions

  • Interactive Testing: Real-time interaction with models through a web UI
  • Parameter Adjustment: Instantly modify parameters like Temperature, Max Tokens, etc.
  • Performance Validation: Check metrics such as response time and token usage
  • Supports Various Model Types:
    • Chat (GPT, LLaMA, etc.)
    • Text Completion
    • Embedding
    • Image Generation
    • Audio (TTS, STT, Translation)

Use Cases

  • Validate model performance before deployment
  • Test responses with a variety of input values
  • Find optimal inference parameters
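
The parameter search described above is done interactively in the web UI, but the underlying idea can be sketched in a few lines of Python: enumerate combinations of Temperature and Max Tokens, record response time and token usage for each, and pick a winner. The function names, the dictionary keys, and the selection rule (fastest run within a token budget) are assumptions for this example, and the measurements are invented placeholders, not real results.

```python
from itertools import product


def parameter_grid(temperatures, max_tokens_options):
    """Enumerate every (temperature, max_tokens) combination to try."""
    return [
        {"temperature": t, "max_tokens": m}
        for t, m in product(temperatures, max_tokens_options)
    ]


def fastest_within_budget(runs, token_budget):
    """Return the run with the lowest response time whose token usage
    fits the budget, or None when no run fits."""
    eligible = [r for r in runs if r["tokens_used"] <= token_budget]
    return min(eligible, key=lambda r: r["response_ms"], default=None)


grid = parameter_grid([0.2, 0.7], [128, 256])

# Placeholder measurements, invented purely to exercise the function.
runs = [
    {"params": grid[0], "response_ms": 420, "tokens_used": 110},
    {"params": grid[1], "response_ms": 610, "tokens_used": 230},
    {"params": grid[2], "response_ms": 390, "tokens_used": 125},
    {"params": grid[3], "response_ms": 350, "tokens_used": 300},
]

best = fastest_within_budget(runs, token_budget=256)
```

Other selection rules, such as weighting output quality against latency, are equally valid; this one was chosen only because it uses the two metrics the Playground reports.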

Monitoring & Management

Track and manage the status of deployed models in real time.

Main Functions

  • Real-Time Monitoring: Monitor pod status and resource usage
  • Log Management: Stream and search logs in real time
  • Deployment Control: Start, pause, stop, and delete deployments
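
One way to reason about the Start, Pause, Stop, and Delete controls is as a small state machine. The sketch below is an assumption-laden illustration: the state names (`running`, `paused`, `stopped`, `deleted`) and the set of allowed transitions are hypothetical and may differ from how the product actually behaves.

```python
# Hypothetical transition table: (current_state, action) -> next_state.
TRANSITIONS = {
    ("stopped", "start"): "running",
    ("paused", "start"): "running",
    ("running", "pause"): "paused",
    ("running", "stop"): "stopped",
    ("paused", "stop"): "stopped",
    ("stopped", "delete"): "deleted",
}


def apply_action(state, action):
    """Return the next state, or raise ValueError for a disallowed action."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action} a deployment that is {state}"
        ) from None


# Walk a deployment through a typical lifecycle.
state = "stopped"
for action in ("start", "pause", "start", "stop", "delete"):
    state = apply_action(state, action)
```

In this sketch a deployment must be stopped before it can be deleted, which guards against removing a service that is still handling traffic; that rule is a design choice of the example, not a documented constraint.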

Use Cases

  • Monitor the status and performance of production models
  • Analyze logs and troubleshoot issues when problems occur
  • Optimize operations based on resource usage