Endpoint

sagemaker.services.k8s.aws/v1alpha1

Type | Link
GoDoc | sagemaker-controller/apis/v1alpha1#Endpoint

Metadata

Property | Value
Scope | Namespaced
Kind | Endpoint
ListKind | EndpointList
Plural | endpoints
Singular | endpoint
Singularendpoint

A hosted endpoint for real-time inference.

Spec

deploymentConfig: 
  autoRollbackConfiguration: 
    alarms:
    - alarmName: string
  blueGreenUpdatePolicy: 
    maximumExecutionTimeoutInSeconds: integer
    terminationWaitInSeconds: integer
    trafficRoutingConfiguration: 
      canarySize: 
        type_: string
        value: integer
      linearStepSize: 
        type_: string
        value: integer
      type_: string
      waitIntervalInSeconds: integer
  rollingUpdatePolicy: 
    maximumBatchSize: 
      type_: string
      value: integer
    maximumExecutionTimeoutInSeconds: integer
    rollbackMaximumBatchSize: 
      type_: string
      value: integer
    waitIntervalInSeconds: integer
endpointConfigName: string
endpointName: string
tags:
- key: string
  value: string
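
As a minimal sketch of the schema above, the manifest below sets only the two required fields (endpointName and endpointConfigName) plus one optional tag. The resource name, endpoint name, endpoint configuration name, and tag values are hypothetical, and the endpoint configuration is assumed to exist already.

```yaml
# Illustrative only: all names and tag values below are hypothetical.
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: Endpoint
metadata:
  name: example-endpoint
spec:
  endpointName: example-endpoint
  # Assumed to refer to an endpoint configuration that already exists.
  endpointConfigName: example-endpoint-config
  tags:
  - key: environment
    value: staging
```
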
Field | Description
deploymentConfig
Optional
object
The deployment configuration for an endpoint, which contains the desired
deployment strategy and rollback configurations.
deploymentConfig.autoRollbackConfiguration
Optional
object
Automatic rollback configuration for handling endpoint deployment failures
and recovery.
deploymentConfig.autoRollbackConfiguration.alarms
Optional
array
deploymentConfig.autoRollbackConfiguration.alarms.[]
Required
object
An Amazon CloudWatch alarm configured to monitor metrics on an endpoint.
deploymentConfig.blueGreenUpdatePolicy
Optional
object
Update policy for a blue/green deployment. If this update policy is specified,
SageMaker creates a new fleet during the deployment while maintaining the
old fleet. SageMaker flips traffic to the new fleet according to the specified
traffic routing configuration. Only one update policy should be used in the
deployment configuration. If no update policy is specified, SageMaker uses
a blue/green deployment strategy with all at once traffic shifting by default.
deploymentConfig.blueGreenUpdatePolicy.maximumExecutionTimeoutInSeconds
Optional
integer
deploymentConfig.blueGreenUpdatePolicy.terminationWaitInSeconds
Optional
integer
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration
Optional
object
Defines the traffic routing strategy during an endpoint deployment to shift
traffic from the old fleet to the new fleet.
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.canarySize
Optional
object
Specifies the type and size of the endpoint capacity to activate for a blue/green
deployment, a rolling deployment, or a rollback strategy. You can specify
your batches as either instance count or the overall percentage of your fleet.

For a rollback strategy, if you don’t specify the fields in this object,
or if you set the Value to 100%, then SageMaker uses a blue/green rollback
strategy and rolls all traffic back to the blue fleet.
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.canarySize.type_
Optional
string
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.canarySize.value
Optional
integer
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.linearStepSize
Optional
object
Specifies the type and size of the endpoint capacity to activate for a blue/green
deployment, a rolling deployment, or a rollback strategy. You can specify
your batches as either instance count or the overall percentage of your fleet.

For a rollback strategy, if you don’t specify the fields in this object,
or if you set the Value to 100%, then SageMaker uses a blue/green rollback
strategy and rolls all traffic back to the blue fleet.
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.linearStepSize.type_
Optional
string
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.linearStepSize.value
Optional
integer
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.type_
Optional
string
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.waitIntervalInSeconds
Optional
integer
deploymentConfig.rollingUpdatePolicy
Optional
object
Specifies a rolling deployment strategy for updating a SageMaker endpoint.
deploymentConfig.rollingUpdatePolicy.maximumBatchSize
Optional
object
Specifies the type and size of the endpoint capacity to activate for a blue/green
deployment, a rolling deployment, or a rollback strategy. You can specify
your batches as either instance count or the overall percentage of your fleet.

For a rollback strategy, if you don’t specify the fields in this object,
or if you set the Value to 100%, then SageMaker uses a blue/green rollback
strategy and rolls all traffic back to the blue fleet.
deploymentConfig.rollingUpdatePolicy.maximumBatchSize.type_
Optional
string
deploymentConfig.rollingUpdatePolicy.maximumBatchSize.value
Optional
integer
deploymentConfig.rollingUpdatePolicy.maximumExecutionTimeoutInSeconds
Optional
integer
deploymentConfig.rollingUpdatePolicy.rollbackMaximumBatchSize
Optional
object
Specifies the type and size of the endpoint capacity to activate for a blue/green
deployment, a rolling deployment, or a rollback strategy. You can specify
your batches as either instance count or the overall percentage of your fleet.

For a rollback strategy, if you don’t specify the fields in this object,
or if you set the Value to 100%, then SageMaker uses a blue/green rollback
strategy and rolls all traffic back to the blue fleet.
deploymentConfig.rollingUpdatePolicy.rollbackMaximumBatchSize.type_
Optional
string
deploymentConfig.rollingUpdatePolicy.rollbackMaximumBatchSize.value
Optional
integer
deploymentConfig.rollingUpdatePolicy.waitIntervalInSeconds
Optional
integer
endpointConfigName
Required
string
The name of an endpoint configuration. For more information, see CreateEndpointConfig
(https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html).
endpointName
Required
string
The name of the endpoint. The name must be unique within an Amazon Web Services
Region in your Amazon Web Services account. The name is case-insensitive
in CreateEndpoint, but the case is preserved and must be matched in InvokeEndpoint
(https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html).
tags
Optional
array
An array of key-value pairs. You can use tags to categorize your Amazon Web
Services resources in different ways, for example, by purpose, owner, or
environment. For more information, see Tagging Amazon Web Services Resources
(https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html).
tags.[]
Required
object
A tag object that consists of a key and an optional value, used to manage
metadata for SageMaker Amazon Web Services resources.

You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html).

For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy (https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf).
tags.[].key
Optional
string
tags.[].value
Optional
string
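
The deployment fields described above can be combined roughly as in the following sketch. It shows a blue/green canary update with automatic rollback, then a rolling update as an alternative, since only one update policy should be used in a deployment configuration. The enum strings (CANARY, CAPACITY_PERCENT, INSTANCE_COUNT) come from the SageMaker API reference rather than from this page, and every resource name, alarm name, and number is a hypothetical placeholder.

```yaml
# Illustrative only: names, alarm names, and numeric values are hypothetical.
# Option A: blue/green update that first shifts 10% of capacity (canary).
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: Endpoint
metadata:
  name: example-canary-endpoint
spec:
  endpointName: example-canary-endpoint
  endpointConfigName: example-endpoint-config-v2
  deploymentConfig:
    autoRollbackConfiguration:
      alarms:
      # Assumed to be an existing CloudWatch alarm on the endpoint.
      - alarmName: example-endpoint-5xx-alarm
    blueGreenUpdatePolicy:
      maximumExecutionTimeoutInSeconds: 1800
      terminationWaitInSeconds: 300
      trafficRoutingConfiguration:
        type_: CANARY
        canarySize:
          type_: CAPACITY_PERCENT
          value: 10
        waitIntervalInSeconds: 600
---
# Option B: rolling update that replaces one instance per batch.
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: Endpoint
metadata:
  name: example-rolling-endpoint
spec:
  endpointName: example-rolling-endpoint
  endpointConfigName: example-endpoint-config-v2
  deploymentConfig:
    autoRollbackConfiguration:
      alarms:
      - alarmName: example-endpoint-5xx-alarm
    rollingUpdatePolicy:
      maximumBatchSize:
        type_: INSTANCE_COUNT
        value: 1
      waitIntervalInSeconds: 300
      maximumExecutionTimeoutInSeconds: 3600
```

Leaving out both update policies falls back to the default described earlier: a blue/green deployment with all-at-once traffic shifting.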

Status

ackResourceMetadata: 
  arn: string
  ownerAccountID: string
  region: string
conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
creationTime: string
endpointStatus: string
failureReason: string
lastModifiedTime: string
pendingDeploymentSummary: 
  endpointConfigName: string
  productionVariants:
  - acceleratorType: string
    currentInstanceCount: integer
    currentServerlessConfig: 
      maxConcurrency: integer
      memorySizeInMB: integer
      provisionedConcurrency: integer
    currentWeight: number
    deployedImages:
    - resolutionTime: string
      resolvedImage: string
      specifiedImage: string
    desiredInstanceCount: integer
    desiredServerlessConfig: 
      maxConcurrency: integer
      memorySizeInMB: integer
      provisionedConcurrency: integer
    desiredWeight: number
    instanceType: string
    managedInstanceScaling: 
      maxInstanceCount: integer
      minInstanceCount: integer
      status: string
    routingConfig: 
      routingStrategy: string
    variantName: string
    variantStatus:
    - startTime: string
      status: string
      statusMessage: string
  startTime: string
productionVariants:
- currentInstanceCount: integer
  currentServerlessConfig: 
    maxConcurrency: integer
    memorySizeInMB: integer
    provisionedConcurrency: integer
  currentWeight: number
  deployedImages:
  - resolutionTime: string
    resolvedImage: string
    specifiedImage: string
  desiredInstanceCount: integer
  desiredServerlessConfig: 
    maxConcurrency: integer
    memorySizeInMB: integer
    provisionedConcurrency: integer
  desiredWeight: number
  managedInstanceScaling: 
    maxInstanceCount: integer
    minInstanceCount: integer
    status: string
  routingConfig: 
    routingStrategy: string
  variantName: string
  variantStatus:
  - startTime: string
    status: string
    statusMessage: string
Field | Description
ackResourceMetadata
Optional
object
All CRs managed by ACK have a common Status.ACKResourceMetadata member
that is used to contain resource sync state, account ownership, and the
constructed ARN for the resource.
ackResourceMetadata.arn
Optional
string
ARN is the Amazon Resource Name for the resource. This is a
globally-unique identifier and is set only by the ACK service controller
once the controller has orchestrated the creation of the resource OR
when it has verified that an “adopted” resource (a resource where the
ARN annotation was set by the Kubernetes user on the CR) exists and
matches the supplied CR’s Spec field values.
ackResourceMetadata.ownerAccountID
Required
string
OwnerAccountID is the AWS Account ID of the account that owns the
backend AWS service API resource.
ackResourceMetadata.region
Required
string
Region is the AWS region in which the resource exists or will exist.
conditions
Optional
array
All CRs managed by ACK have a common Status.Conditions member that
contains a collection of ackv1alpha1.Condition objects that describe
the various terminal states of the CR and its backend AWS service API
resource.
conditions.[]
Required
object
Condition is the common struct used by all CRDs managed by ACK service
controllers to indicate terminal states of the CR and its backend AWS
service API resource
conditions.[].message
Optional
string
A human readable message indicating details about the transition.
conditions.[].reason
Optional
string
The reason for the condition’s last transition.
conditions.[].status
Optional
string
Status of the condition, one of True, False, Unknown.
conditions.[].type
Optional
string
Type is the type of the Condition
creationTime
Optional
string
A timestamp that shows when the endpoint was created.
endpointStatus
Optional
string
The status of the endpoint.

* OutOfService: Endpoint is not available to take incoming requests.
* Creating: CreateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html)
is executing.
* Updating: UpdateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html)
or UpdateEndpointWeightsAndCapacities (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpointWeightsAndCapacities.html)
is executing.
* SystemUpdating: Endpoint is undergoing maintenance and cannot be updated
or deleted or re-scaled until it has completed. This maintenance operation
does not change any customer-specified values such as VPC config, KMS
encryption, model, instance type, or instance count.
* RollingBack: Endpoint fails to scale up or down or change its variant
weight and is in the process of rolling back to its previous configuration.
Once the rollback completes, endpoint returns to an InService status.
This transitional status only applies to an endpoint that has autoscaling
enabled and is undergoing variant weight or capacity changes as part of
an UpdateEndpointWeightsAndCapacities (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpointWeightsAndCapacities.html)
call or when the UpdateEndpointWeightsAndCapacities (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpointWeightsAndCapacities.html)
operation is called explicitly.
* InService: Endpoint is available to process incoming requests.
* Deleting: DeleteEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DeleteEndpoint.html)
is executing.
* Failed: Endpoint could not be created, updated, or re-scaled. Use the
FailureReason value returned by DescribeEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeEndpoint.html)
for information about the failure. DeleteEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DeleteEndpoint.html)
is the only operation that can be performed on a failed endpoint.
* UpdateRollbackFailed: Both the rolling deployment and auto-rollback
failed. Your endpoint is in service with a mix of the old and new endpoint
configurations. For information about how to remedy this issue and restore
the endpoint’s status to InService, see Rolling Deployments (https://docs.aws.amazon.com/sagemaker/latest/dg/deployment-guardrails-rolling.html).
failureReason
Optional
string
If the status of the endpoint is Failed, the reason why it failed.
lastModifiedTime
Optional
string
A timestamp that shows when the endpoint was last modified.
pendingDeploymentSummary
Optional
object
Returns the summary of an in-progress deployment. This field is only returned
when the endpoint is creating or updating with a new endpoint configuration.
pendingDeploymentSummary.endpointConfigName
Optional
string
pendingDeploymentSummary.productionVariants
Optional
array
pendingDeploymentSummary.productionVariants.[]
Required
object
The production variant summary for a deployment when an endpoint is creating
or updating with the CreateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html)
or UpdateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html)
operations. Describes the VariantStatus, weight, and capacity for a production
variant associated with an endpoint.
pendingDeploymentSummary.productionVariants.[].currentInstanceCount
Optional
integer
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig
Optional
object
Specifies the serverless configuration for an endpoint variant.
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig.maxConcurrency
Optional
integer
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig.memorySizeInMB
Optional
integer
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig.provisionedConcurrency
Optional
integer
pendingDeploymentSummary.productionVariants.[].currentWeight
Optional
number
pendingDeploymentSummary.productionVariants.[].deployedImages
Optional
array
pendingDeploymentSummary.productionVariants.[].deployedImages.[]
Required
object
Gets the Amazon Elastic Container Registry (Amazon ECR) path of the Docker image of the model
that is hosted in this ProductionVariant (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html).

If you used the registry/repository[:tag] form to specify the image path of the primary container when you created the model hosted in this ProductionVariant, the path resolves to a path of the form registry/repository[@digest]. A digest is a hash value that identifies a specific version of an image. For information about Amazon ECR paths, see Pulling an Image (https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-pull-ecr-image.html) in the Amazon ECR User Guide.
pendingDeploymentSummary.productionVariants.[].deployedImages.[].resolutionTime
Optional
string
pendingDeploymentSummary.productionVariants.[].deployedImages.[].resolvedImage
Optional
string
pendingDeploymentSummary.productionVariants.[].deployedImages.[].specifiedImage
Optional
string
pendingDeploymentSummary.productionVariants.[].desiredInstanceCount
Optional
integer
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig
Optional
object
Specifies the serverless configuration for an endpoint variant.
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig.maxConcurrency
Optional
integer
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig.memorySizeInMB
Optional
integer
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig.provisionedConcurrency
Optional
integer
pendingDeploymentSummary.productionVariants.[].desiredWeight
Optional
number
pendingDeploymentSummary.productionVariants.[].instanceType
Optional
string
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling
Optional
object
Settings that control the range in the number of instances that the endpoint
provisions as it scales up or down to accommodate traffic.
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling.maxInstanceCount
Optional
integer
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling.minInstanceCount
Optional
integer
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling.status
Optional
string
pendingDeploymentSummary.productionVariants.[].routingConfig
Optional
object
Settings that control how the endpoint routes incoming traffic to the instances
that the endpoint hosts.
pendingDeploymentSummary.productionVariants.[].routingConfig.routingStrategy
Optional
string
pendingDeploymentSummary.productionVariants.[].variantName
Optional
string
pendingDeploymentSummary.productionVariants.[].variantStatus
Optional
array
pendingDeploymentSummary.productionVariants.[].variantStatus.[]
Required
object
Describes the status of the production variant.
pendingDeploymentSummary.productionVariants.[].variantStatus.[].startTime
Optional
string
pendingDeploymentSummary.productionVariants.[].variantStatus.[].status
Optional
string
pendingDeploymentSummary.productionVariants.[].variantStatus.[].statusMessage
Optional
string
pendingDeploymentSummary.startTime
Optional
string
productionVariants
Optional
array
An array of ProductionVariantSummary (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariantSummary.html)
objects, one for each model hosted behind this endpoint.
productionVariants.[]
Required
object
Describes weight and capacities for a production variant associated with an endpoint. If you sent a request to the UpdateEndpointWeightsAndCapacities API and the endpoint status is Updating, you get different desired and current values.
productionVariants.[].currentInstanceCount
Optional
integer
productionVariants.[].currentServerlessConfig
Optional
object
Specifies the serverless configuration for an endpoint variant.
productionVariants.[].currentServerlessConfig.maxConcurrency
Optional
integer
productionVariants.[].currentServerlessConfig.memorySizeInMB
Optional
integer
productionVariants.[].currentServerlessConfig.provisionedConcurrency
Optional
integer
productionVariants.[].currentWeight
Optional
number
productionVariants.[].deployedImages
Optional
array
productionVariants.[].deployedImages.[]
Required
object
Gets the Amazon Elastic Container Registry (Amazon ECR) path of the Docker image of the model that is hosted in this ProductionVariant (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html).

If you used the registry/repository[:tag] form to specify the image path of the primary container when you created the model hosted in this ProductionVariant, the path resolves to a path of the form registry/repository[@digest]. A digest is a hash value that identifies a specific version of an image. For information about Amazon ECR paths, see Pulling an Image (https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-pull-ecr-image.html) in the Amazon ECR User Guide.
productionVariants.[].deployedImages.[].resolutionTime
Optional
string
productionVariants.[].deployedImages.[].resolvedImage
Optional
string
productionVariants.[].deployedImages.[].specifiedImage
Optional
string
productionVariants.[].desiredInstanceCount
Optional
integer
productionVariants.[].desiredServerlessConfig
Optional
object
Specifies the serverless configuration for an endpoint variant.
productionVariants.[].desiredServerlessConfig.maxConcurrency
Optional
integer
productionVariants.[].desiredServerlessConfig.memorySizeInMB
Optional
integer
productionVariants.[].desiredServerlessConfig.provisionedConcurrency
Optional
integer
productionVariants.[].desiredWeight
Optional
number
productionVariants.[].managedInstanceScaling
Optional
object
Settings that control the range in the number of instances that the endpoint
provisions as it scales up or down to accommodate traffic.
productionVariants.[].managedInstanceScaling.maxInstanceCount
Optional
integer
productionVariants.[].managedInstanceScaling.minInstanceCount
Optional
integer
productionVariants.[].managedInstanceScaling.status
Optional
string
productionVariants.[].routingConfig
Optional
object
Settings that control how the endpoint routes incoming traffic to the instances
that the endpoint hosts.
productionVariants.[].routingConfig.routingStrategy
Optional
string
productionVariants.[].variantName
Optional
string
productionVariants.[].variantStatus
Optional
array
productionVariants.[].variantStatus.[]
Required
object
Describes the status of the production variant.
productionVariants.[].variantStatus.[].startTime
Optional
string
productionVariants.[].variantStatus.[].status
Optional
string
productionVariants.[].variantStatus.[].statusMessage
Optional
string
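
For orientation, the status of an endpoint that has finished deploying might look roughly like the sketch below. This is not captured output: the account ID, ARN, Region, timestamps, and variant name are hypothetical placeholders, and the condition type ACK.ResourceSynced is assumed from the common ACK conditions rather than listed on this page.

```yaml
# Illustrative only: not real output; all values are hypothetical placeholders.
status:
  ackResourceMetadata:
    arn: arn:aws:sagemaker:us-west-2:111122223333:endpoint/example-endpoint
    ownerAccountID: "111122223333"
    region: us-west-2
  conditions:
  - type: ACK.ResourceSynced   # assumed condition type, not documented above
    status: "True"
    lastTransitionTime: "2024-01-01T00:10:00Z"
  creationTime: "2024-01-01T00:00:00Z"
  lastModifiedTime: "2024-01-01T00:10:00Z"
  endpointStatus: InService
  productionVariants:
  - variantName: example-variant
    currentInstanceCount: 1
    desiredInstanceCount: 1
    currentWeight: 1
    desiredWeight: 1
```

pendingDeploymentSummary is absent from this sketch because it is only returned while the endpoint is creating or updating with a new endpoint configuration.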