cortex

v0.22.0
Published: Nov 11, 2020 License: Apache-2.0
Deploy machine learning models to production

Cortex is an open source platform for deploying, managing, and scaling machine learning in production.

Model serving infrastructure

  • Supports deploying TensorFlow, PyTorch, sklearn, and other models as realtime or batch APIs
  • Ensures high availability with availability zones and automated instance restarts
  • Scales to handle production workloads with request-based autoscaling
  • Runs inference on spot instances with on-demand backups
  • Manages traffic splitting for A/B testing
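
Traffic splitting for A/B testing is configured as its own API kind that routes requests across deployed APIs by weight. A minimal sketch (the API names and weights here are illustrative, not from this README):

```yaml
# traffic_splitter.yaml (illustrative; weights are expected to sum to 100)

name: text-generator
kind: TrafficSplitter
apis:
  - name: text-generator-a
    weight: 80
  - name: text-generator-b
    weight: 20
```
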

Configure your cluster:
# cluster.yaml

region: us-east-1
availability_zones: [us-east-1a, us-east-1b]
api_gateway: public
instance_type: g4dn.xlarge
min_instances: 10
max_instances: 100
spot: true

Spin up your cluster on your AWS account:
$ cortex cluster up --config cluster.yaml

○ configuring autoscaling ✓
○ configuring networking ✓
○ configuring logging ✓
○ configuring metrics dashboard ✓

cortex is ready!

Reproducible model deployments

  • Implement request handling in Python
  • Customize compute, autoscaling, and networking for each API
  • Package dependencies, code, and configuration for reproducible deployments
  • Test locally before deploying to your cluster

Implement a predictor:
# predictor.py

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        return self.model(payload["text"])[0]
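
The predictor contract is small: Cortex constructs the class once at startup and calls predict once per request. A minimal local sketch that exercises the same interface with a stub model instead of the transformers pipeline (EchoPredictor and its prefix key are illustrative, not part of Cortex):

```python
# Stub predictor with the same interface, runnable locally without
# transformers or a GPU (class name and config keys are illustrative).

class EchoPredictor:
    def __init__(self, config):
        # Cortex passes the API's optional predictor config dict here
        self.prefix = config.get("prefix", "")

    def predict(self, payload):
        # payload is the parsed JSON request body
        return self.prefix + payload["text"]


predictor = EchoPredictor({"prefix": "generated: "})
print(predictor.predict({"text": "hello"}))  # generated: hello
```

Because the class has no Cortex dependency, it can be unit-tested before deploying to the cluster.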

Configure an API:
# cortex.yaml

name: text-generator
kind: RealtimeAPI
predictor:
  path: predictor.py
compute:
  gpu: 1
  mem: 4Gi
autoscaling:
  min_replicas: 1
  max_replicas: 10
networking:
  api_gateway: public
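
The autoscaling block above is request-based: the replica count tracks in-flight request volume, clamped between min_replicas and max_replicas. A rough sketch of that policy (function and parameter names are illustrative, not Cortex internals):

```python
import math

def desired_replicas(in_flight_requests, per_replica_concurrency,
                     min_replicas, max_replicas):
    # Scale so each replica handles roughly `per_replica_concurrency`
    # concurrent requests, clamped to the configured bounds.
    raw = math.ceil(in_flight_requests / per_replica_concurrency)
    return max(min_replicas, min(max_replicas, raw))

print(desired_replicas(45, 8, 1, 10))  # with the bounds above -> 6
```
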

Deploy to production:
$ cortex deploy cortex.yaml

creating https://example.com/text-generator

$ curl https://example.com/text-generator \
    -X POST -H "Content-Type: application/json" \
    -d '{"text": "deploy machine learning models to"}'

"deploy machine learning models to production"
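
The same request can be issued from Python. The sketch below keeps the HTTP transport injectable so it runs without a live endpoint (query_text_generator and the post parameter are illustrative helpers, not part of Cortex):

```python
import json

def query_text_generator(post, url, text):
    # `post` is any callable(url, body, headers) -> response-body string,
    # e.g. a thin wrapper around an HTTP client; injected so this sketch
    # runs without a deployed API.
    body = json.dumps({"text": text})
    headers = {"Content-Type": "application/json"}
    return json.loads(post(url, body, headers))

# Fake transport standing in for the deployed endpoint:
fake_post = lambda url, body, headers: '"deploy machine learning models to production"'
print(query_text_generator(fake_post, "https://example.com/text-generator",
                           "deploy machine learning models to"))
```
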

API management

  • Monitor API performance
  • Aggregate and stream logs
  • Customize prediction tracking
  • Update APIs without downtime

Manage your APIs:
$ cortex get

realtime api       status     replicas   last update   latency   requests

text-generator     live       34         9h            247ms     71828
object-detector    live       13         15h           23ms      828459


batch api          running jobs   last update

image-classifier   5              10h

Get started

$ pip install cortex

See the installation guide for next steps.
