cortex

v0.22.0
Published: Nov 11, 2020 License: Apache-2.0
Deploy machine learning models to production

Cortex is an open source platform for deploying, managing, and scaling machine learning in production.

Model serving infrastructure

  • Supports deploying TensorFlow, PyTorch, sklearn, and other models as realtime or batch APIs
  • Ensures high availability with availability zones and automated instance restarts
  • Scales to handle production workloads with request-based autoscaling
  • Runs inference on spot instances with on-demand backups
  • Manages traffic splitting for A/B testing
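
Traffic splitting for A/B testing is configured as its own API kind that routes requests across deployed APIs by weight. A minimal sketch (the API names and weights here are illustrative, not from this README):

```yaml
# traffic_splitter.yaml (illustrative; weights are expected to sum to 100)

name: text-generator
kind: TrafficSplitter
apis:
  - name: text-generator-a
    weight: 80
  - name: text-generator-b
    weight: 20
```
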

Configure your cluster:
# cluster.yaml

region: us-east-1
availability_zones: [us-east-1a, us-east-1b]
api_gateway: public
instance_type: g4dn.xlarge
min_instances: 10
max_instances: 100
spot: true

Spin up your cluster on your AWS account:
$ cortex cluster up --config cluster.yaml

○ configuring autoscaling ✓
○ configuring networking ✓
○ configuring logging ✓
○ configuring metrics dashboard ✓

cortex is ready!

Reproducible model deployments

  • Implement request handling in Python
  • Customize compute, autoscaling, and networking for each API
  • Package dependencies, code, and configuration for reproducible deployments
  • Test locally before deploying to your cluster

Implement a predictor:
# predictor.py

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        return self.model(payload["text"])[0]
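
The predictor contract is small: Cortex constructs the class once at startup and calls predict once per request. A minimal local sketch that exercises the same interface with a stub model instead of the transformers pipeline (EchoPredictor and its prefix key are illustrative, not part of Cortex):

```python
# Stub predictor with the same interface, runnable locally without
# transformers or a GPU (class name and config keys are illustrative).

class EchoPredictor:
    def __init__(self, config):
        # Cortex passes the API's optional predictor config dict here
        self.prefix = config.get("prefix", "")

    def predict(self, payload):
        # payload is the parsed JSON request body
        return self.prefix + payload["text"]


predictor = EchoPredictor({"prefix": "generated: "})
print(predictor.predict({"text": "hello"}))  # generated: hello
```

Because the class has no Cortex dependency, it can be unit-tested before deploying to the cluster.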

Configure an API:
# cortex.yaml

name: text-generator
kind: RealtimeAPI
predictor:
  path: predictor.py
compute:
  gpu: 1
  mem: 4Gi
autoscaling:
  min_replicas: 1
  max_replicas: 10
networking:
  api_gateway: public
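
The autoscaling block above is request-based: the replica count tracks in-flight request volume, clamped between min_replicas and max_replicas. A rough sketch of that policy (function and parameter names are illustrative, not Cortex internals):

```python
import math

def desired_replicas(in_flight_requests, per_replica_concurrency,
                     min_replicas, max_replicas):
    # Scale so each replica handles roughly `per_replica_concurrency`
    # concurrent requests, clamped to the configured bounds.
    raw = math.ceil(in_flight_requests / per_replica_concurrency)
    return max(min_replicas, min(max_replicas, raw))

print(desired_replicas(45, 8, 1, 10))  # with the bounds above -> 6
```
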

Deploy to production:
$ cortex deploy cortex.yaml

creating https://example.com/text-generator

$ curl https://example.com/text-generator \
    -X POST -H "Content-Type: application/json" \
    -d '{"text": "deploy machine learning models to"}'

"deploy machine learning models to production"
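
The same request can be issued from Python. The sketch below keeps the HTTP transport injectable so it runs without a live endpoint (query_text_generator and the post parameter are illustrative helpers, not part of Cortex):

```python
import json

def query_text_generator(post, url, text):
    # `post` is any callable(url, body, headers) -> response-body string,
    # e.g. a thin wrapper around an HTTP client; injected so this sketch
    # runs without a deployed API.
    body = json.dumps({"text": text})
    headers = {"Content-Type": "application/json"}
    return json.loads(post(url, body, headers))

# Fake transport standing in for the deployed endpoint:
fake_post = lambda url, body, headers: '"deploy machine learning models to production"'
print(query_text_generator(fake_post, "https://example.com/text-generator",
                           "deploy machine learning models to"))
```
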

API management

  • Monitor API performance
  • Aggregate and stream logs
  • Customize prediction tracking
  • Update APIs without downtime

Manage your APIs:
$ cortex get

realtime api       status     replicas   last update   latency   requests

text-generator     live       34         9h            247ms     71828
object-detector    live       13         15h           23ms      828459


batch api          running jobs   last update

image-classifier   5              10h

Get started

$ pip install cortex

See the installation guide for next steps.
