Machine learning operations (MLOps) is the practice of building and operating machine learning models using development operations (DevOps) principles. MLOps adds discipline to the development and deployment of machine learning models, making the process more reliable and productive.

dedicated ML tooling

data storage

typical ETL workflow
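
A minimal sketch of what such a workflow might look like with pandas; the file names and columns (raw_events.csv, user_id, timestamp) are placeholders, not an actual pipeline.

```python
import pandas as pd

# Extract: pull raw records from the source system (hypothetical CSV path)
raw = pd.read_csv("raw_events.csv")

# Transform: clean and aggregate into the shape downstream training expects
clean = (
    raw.dropna(subset=["user_id"])
       .assign(event_date=lambda df: pd.to_datetime(df["timestamp"]).dt.date)
       .groupby(["user_id", "event_date"], as_index=False)
       .size()
)

# Load: write to the analytics store / feature table (hypothetical Parquet target)
clean.to_parquet("user_daily_events.parquet", index=False)
```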

discussions

available architectures

  • cpu (slow)
  • gpu (cumbersome)
  • tensor core gpu
    • e.g. Nvidia A100 (see device sketch below)
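
A quick PyTorch sketch for checking which tier is available and opting into TF32 tensor-core matmuls on Ampere-class cards like the A100; the matrix sizes are arbitrary.

```python
import torch

# cpu vs gpu: pick whatever is available on this machine
device = "cuda" if torch.cuda.is_available() else "cpu"
if device == "cuda":
    print("GPU:", torch.cuda.get_device_name(0))

# On Ampere-class tensor core GPUs (e.g. A100), TF32 routes float32 matmuls
# through tensor cores for a large speedup at slightly reduced precision.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b  # runs on tensor cores when TF32 is enabled and the hardware supports it
```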

relevant cloud products

baseten

  • abstracts gpu autoscaling and hosting
  • multi-cluster hosting to add compute at the edge
  • team is excited about:
    • support for vLLM and TGI for model hosting
    • ray-llm (the new Ray Serve based LLM serving library)

TGI vs vLLM

vLLM is ~15% faster for Mistral and more stable under higher load: https://tunehq.ai/blog/comparing-vllm-and-tgi
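
For reference, a minimal vLLM offline-inference sketch; the Mistral checkpoint name and sampling settings are illustrative, not necessarily what the benchmark above used.

```python
from vllm import LLM, SamplingParams

# Load a Mistral checkpoint into vLLM (model name is an assumption)
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
params = SamplingParams(temperature=0.7, max_tokens=128)

# Batch generation; vLLM handles continuous batching and paged attention internally
outputs = llm.generate(["Explain MLOps in one sentence."], params)
print(outputs[0].outputs[0].text)
```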

tech used by companies

  • Nvidia
    • Enroot
    • Pyxis
  • baseten
    • Kubernetes
    • pytorch

MLOps design patterns

  • Data representation design patterns
    • #1 Hashed Feature (sketch after this list)
    • #2 Embedding
    • #3 Feature Cross
    • #4 Multimodal Input
  • Problem representation design patterns
    • #5 Reframing
    • #6 Multilabel
    • #7 Ensemble
    • #8 Cascade
    • #9 Neutral Class
    • #10 Rebalancing
  • Patterns that modify model training
    • #11 Useful Overfitting
    • #12 Checkpoints
    • #13 Transfer Learning
    • #14 Distribution Strategy
    • #15 Hyperparameter Tuning
  • Resilience patterns
    • #16 Stateless Serving Function
    • #17 Batch Serving
    • #18 Continuous Model Evaluation
    • #19 Two-Phase Predictions
    • #20 Keyed Predictions
  • Reproducibility patterns
    • #21 Transform
    • #22 Repeatable Sampling
    • #23 Bridged Schema
    • #24 Windowed Inference
    • #25 Workflow Pipeline
    • #26 Feature Store
    • #27 Model Versioning
  • Responsible AI
    • #28 Heuristic Benchmark
    • #29 Explainable Predictions
    • #30 Fairness Lens
  • https://github.com/GoogleCloudPlatform/ml-design-patterns
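
As a quick illustration of pattern #1 (Hashed Feature) from the list above: a small Python sketch that deterministically buckets a high-cardinality categorical value so no full vocabulary has to be maintained; the bucket count and the MD5 choice are illustrative.

```python
import hashlib

def hashed_feature(value: str, num_buckets: int = 10) -> int:
    """Map a high-cardinality categorical value to a fixed number of buckets."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

# e.g. bucket airport codes without maintaining a vocabulary of every code
print(hashed_feature("JFK"), hashed_feature("LHR"), hashed_feature("SFO"))
```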

private documents