Endpoint
sagemaker.services.k8s.aws/v1alpha1
Type | Link |
---|---|
GoDoc | sagemaker-controller/apis/v1alpha1#Endpoint |
Metadata
Property | Value |
---|---|
Scope | Namespaced |
Kind | Endpoint |
ListKind | EndpointList |
Plural | endpoints |
Singular | endpoint |
A hosted endpoint for real-time inference.
Spec
deploymentConfig:
  autoRollbackConfiguration:
    alarms:
    - alarmName: string
  blueGreenUpdatePolicy:
    maximumExecutionTimeoutInSeconds: integer
    terminationWaitInSeconds: integer
    trafficRoutingConfiguration:
      canarySize:
        type_: string
        value: integer
      linearStepSize:
        type_: string
        value: integer
      type_: string
      waitIntervalInSeconds: integer
  rollingUpdatePolicy:
    maximumBatchSize:
      type_: string
      value: integer
    maximumExecutionTimeoutInSeconds: integer
    rollbackMaximumBatchSize:
      type_: string
      value: integer
    waitIntervalInSeconds: integer
endpointConfigName: string
endpointName: string
tags:
- key: string
  value: string
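As an illustration of the schema above, a minimal manifest might look like the following sketch. Only the two required fields are set. The resource name `my-endpoint` and the endpoint configuration name `my-endpoint-config` are hypothetical placeholders, and the referenced endpoint configuration is assumed to exist already.

```yaml
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: Endpoint
metadata:
  name: my-endpoint                        # hypothetical Kubernetes resource name
spec:
  endpointName: my-endpoint                # hypothetical; must be unique per Region and account
  endpointConfigName: my-endpoint-config   # hypothetical; assumed to exist already
```

With no deploymentConfig specified, SageMaker falls back to its default blue/green deployment with all-at-once traffic shifting, as described for blueGreenUpdatePolicy in the field table.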
Field | Description |
---|---|
deploymentConfig Optional | object The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations. |
deploymentConfig.autoRollbackConfiguration Optional | object Automatic rollback configuration for handling endpoint deployment failures and recovery. |
deploymentConfig.autoRollbackConfiguration.alarms Optional | array |
deploymentConfig.autoRollbackConfiguration.alarms.[] Required | object An Amazon CloudWatch alarm configured to monitor metrics on an endpoint. |
deploymentConfig.blueGreenUpdatePolicy Optional | object Update policy for a blue/green deployment. If this update policy is specified, SageMaker creates a new fleet during the deployment while maintaining the old fleet. SageMaker flips traffic to the new fleet according to the specified traffic routing configuration. Only one update policy should be used in the deployment configuration. If no update policy is specified, SageMaker uses a blue/green deployment strategy with all at once traffic shifting by default. |
deploymentConfig.blueGreenUpdatePolicy.maximumExecutionTimeoutInSeconds Optional | integer |
deploymentConfig.blueGreenUpdatePolicy.terminationWaitInSeconds Optional | integer |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration Optional | object Defines the traffic routing strategy during an endpoint deployment to shift traffic from the old fleet to the new fleet. |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.canarySize Optional | object Specifies the type and size of the endpoint capacity to activate for a blue/green deployment, a rolling deployment, or a rollback strategy. You can specify your batches as either an instance count or an overall percentage of your fleet. For a rollback strategy, if you don’t specify the fields in this object, or if you set the Value to 100%, then SageMaker uses a blue/green rollback strategy and rolls all traffic back to the blue fleet. |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.canarySize.type_ Optional | string |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.canarySize.value Optional | integer |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.linearStepSize Optional | object Specifies the type and size of the endpoint capacity to activate for a blue/green deployment, a rolling deployment, or a rollback strategy. You can specify your batches as either an instance count or an overall percentage of your fleet. For a rollback strategy, if you don’t specify the fields in this object, or if you set the Value to 100%, then SageMaker uses a blue/green rollback strategy and rolls all traffic back to the blue fleet. |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.linearStepSize.type_ Optional | string |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.linearStepSize.value Optional | integer |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.type_ Optional | string |
deploymentConfig.blueGreenUpdatePolicy.trafficRoutingConfiguration.waitIntervalInSeconds Optional | integer |
deploymentConfig.rollingUpdatePolicy Optional | object Specifies a rolling deployment strategy for updating a SageMaker endpoint. |
deploymentConfig.rollingUpdatePolicy.maximumBatchSize Optional | object Specifies the type and size of the endpoint capacity to activate for a blue/green deployment, a rolling deployment, or a rollback strategy. You can specify your batches as either an instance count or an overall percentage of your fleet. For a rollback strategy, if you don’t specify the fields in this object, or if you set the Value to 100%, then SageMaker uses a blue/green rollback strategy and rolls all traffic back to the blue fleet. |
deploymentConfig.rollingUpdatePolicy.maximumBatchSize.type_ Optional | string |
deploymentConfig.rollingUpdatePolicy.maximumBatchSize.value Optional | integer |
deploymentConfig.rollingUpdatePolicy.maximumExecutionTimeoutInSeconds Optional | integer |
deploymentConfig.rollingUpdatePolicy.rollbackMaximumBatchSize Optional | object Specifies the type and size of the endpoint capacity to activate for a blue/green deployment, a rolling deployment, or a rollback strategy. You can specify your batches as either an instance count or an overall percentage of your fleet. For a rollback strategy, if you don’t specify the fields in this object, or if you set the Value to 100%, then SageMaker uses a blue/green rollback strategy and rolls all traffic back to the blue fleet. |
deploymentConfig.rollingUpdatePolicy.rollbackMaximumBatchSize.type_ Optional | string |
deploymentConfig.rollingUpdatePolicy.rollbackMaximumBatchSize.value Optional | integer |
deploymentConfig.rollingUpdatePolicy.waitIntervalInSeconds Optional | integer |
endpointConfigName Required | string The name of an endpoint configuration. For more information, see CreateEndpointConfig (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html). |
endpointName Required | string The name of the endpoint. The name must be unique within an Amazon Web Services Region in your Amazon Web Services account. The name is case-insensitive in CreateEndpoint, but the case is preserved and must be matched in InvokeEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html). |
tags Optional | array An array of key-value pairs. You can use tags to categorize your Amazon Web Services resources in different ways, for example, by purpose, owner, or environment. For more information, see Tagging Amazon Web Services Resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). |
tags.[] Required | object A tag object that consists of a key and an optional value, used to manage metadata for SageMaker Amazon Web Services resources. You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html). For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy (https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf). |
tags.[].key Optional | string |
tags.[].value Optional | string |
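The deploymentConfig and tags fields described in the table above can be combined as in the following sketch of a canary blue/green update with automatic rollback. All names (`my-endpoint`, `my-endpoint-config-v2`, `my-endpoint-5xx-alarm`) and all numeric values are hypothetical. The enum strings `CANARY` and `CAPACITY_PERCENT` are taken from the SageMaker API rather than from this page, so treat them as assumptions to verify against the API reference.

```yaml
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: Endpoint
metadata:
  name: my-endpoint                            # hypothetical resource name
spec:
  endpointName: my-endpoint
  endpointConfigName: my-endpoint-config-v2    # hypothetical new configuration to roll out
  deploymentConfig:
    autoRollbackConfiguration:
      alarms:
      - alarmName: my-endpoint-5xx-alarm       # hypothetical CloudWatch alarm that triggers rollback
    blueGreenUpdatePolicy:
      maximumExecutionTimeoutInSeconds: 1800   # illustrative upper bound for the whole deployment
      terminationWaitInSeconds: 300            # illustrative wait before the old fleet is terminated
      trafficRoutingConfiguration:
        type_: CANARY                          # assumed enum value from the SageMaker API
        canarySize:
          type_: CAPACITY_PERCENT              # assumed enum value from the SageMaker API
          value: 10                            # shift 10% of capacity to the new fleet first
        waitIntervalInSeconds: 600             # illustrative bake time before shifting the rest
  tags:
  - key: environment
    value: staging
```

Only one update policy is set here, in line with the note that a deployment configuration should use either blueGreenUpdatePolicy or rollingUpdatePolicy, not both.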
Status
ackResourceMetadata:
  arn: string
  ownerAccountID: string
  region: string
conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
creationTime: string
endpointStatus: string
failureReason: string
lastModifiedTime: string
pendingDeploymentSummary:
  endpointConfigName: string
  productionVariants:
  - acceleratorType: string
    currentInstanceCount: integer
    currentServerlessConfig:
      maxConcurrency: integer
      memorySizeInMB: integer
      provisionedConcurrency: integer
    currentWeight: number
    deployedImages:
    - resolutionTime: string
      resolvedImage: string
      specifiedImage: string
    desiredInstanceCount: integer
    desiredServerlessConfig:
      maxConcurrency: integer
      memorySizeInMB: integer
      provisionedConcurrency: integer
    desiredWeight: number
    instanceType: string
    managedInstanceScaling:
      maxInstanceCount: integer
      minInstanceCount: integer
      status: string
    routingConfig:
      routingStrategy: string
    variantName: string
    variantStatus:
    - startTime: string
      status: string
      statusMessage: string
  startTime: string
productionVariants:
- currentInstanceCount: integer
  currentServerlessConfig:
    maxConcurrency: integer
    memorySizeInMB: integer
    provisionedConcurrency: integer
  currentWeight: number
  deployedImages:
  - resolutionTime: string
    resolvedImage: string
    specifiedImage: string
  desiredInstanceCount: integer
  desiredServerlessConfig:
    maxConcurrency: integer
    memorySizeInMB: integer
    provisionedConcurrency: integer
  desiredWeight: number
  managedInstanceScaling:
    maxInstanceCount: integer
    minInstanceCount: integer
    status: string
  routingConfig:
    routingStrategy: string
  variantName: string
  variantStatus:
  - startTime: string
    status: string
    statusMessage: string
Field | Description |
---|---|
ackResourceMetadata Optional | object All CRs managed by ACK have a common Status.ACKResourceMetadata member that is used to contain resource sync state, account ownership, and the constructed ARN for the resource. |
ackResourceMetadata.arn Optional | string ARN is the Amazon Resource Name for the resource. This is a globally-unique identifier and is set only by the ACK service controller once the controller has orchestrated the creation of the resource OR when it has verified that an “adopted” resource (a resource where the ARN annotation was set by the Kubernetes user on the CR) exists and matches the supplied CR’s Spec field values. TODO(vijat@): Find a better strategy for resources that do not have ARN in CreateOutputResponse https://github.com/aws/aws-controllers-k8s/issues/270 |
ackResourceMetadata.ownerAccountID Required | string OwnerAccountID is the AWS Account ID of the account that owns the backend AWS service API resource. |
ackResourceMetadata.region Required | string Region is the AWS region in which the resource exists or will exist. |
conditions Optional | array All CRs managed by ACK have a common Status.Conditions member that contains a collection of ackv1alpha1.Condition objects that describe the various terminal states of the CR and its backend AWS service API resource. |
conditions.[] Required | object Condition is the common struct used by all CRDs managed by ACK service controllers to indicate terminal states of the CR and its backend AWS service API resource. |
conditions.[].message Optional | string A human readable message indicating details about the transition. |
conditions.[].reason Optional | string The reason for the condition’s last transition. |
conditions.[].status Optional | string Status of the condition, one of True, False, Unknown. |
conditions.[].type Optional | string Type is the type of the Condition |
creationTime Optional | string A timestamp that shows when the endpoint was created. |
endpointStatus Optional | string The status of the endpoint. * OutOfService: Endpoint is not available to take incoming requests. * Creating: CreateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html) is executing. * Updating: UpdateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html) or UpdateEndpointWeightsAndCapacities (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpointWeightsAndCapacities.html) is executing. * SystemUpdating: Endpoint is undergoing maintenance and cannot be updated or deleted or re-scaled until it has completed. This maintenance operation does not change any customer-specified values such as VPC config, KMS encryption, model, instance type, or instance count. * RollingBack: Endpoint fails to scale up or down or change its variant weight and is in the process of rolling back to its previous configuration. Once the rollback completes, endpoint returns to an InService status. This transitional status only applies to an endpoint that has autoscaling enabled and is undergoing variant weight or capacity changes as part of an UpdateEndpointWeightsAndCapacities (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpointWeightsAndCapacities.html) call or when the UpdateEndpointWeightsAndCapacities (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpointWeightsAndCapacities.html) operation is called explicitly. * InService: Endpoint is available to process incoming requests. * Deleting: DeleteEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DeleteEndpoint.html) is executing. * Failed: Endpoint could not be created, updated, or re-scaled. Use the FailureReason value returned by DescribeEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeEndpoint.html) for information about the failure. DeleteEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DeleteEndpoint.html) is the only operation that can be performed on a failed endpoint. * UpdateRollbackFailed: Both the rolling deployment and auto-rollback failed. Your endpoint is in service with a mix of the old and new endpoint configurations. For information about how to remedy this issue and restore the endpoint’s status to InService, see Rolling Deployments (https://docs.aws.amazon.com/sagemaker/latest/dg/deployment-guardrails-rolling.html). |
failureReason Optional | string If the status of the endpoint is Failed, the reason why it failed. |
lastModifiedTime Optional | string A timestamp that shows when the endpoint was last modified. |
pendingDeploymentSummary Optional | object Returns the summary of an in-progress deployment. This field is only returned when the endpoint is creating or updating with a new endpoint configuration. |
pendingDeploymentSummary.endpointConfigName Optional | string |
pendingDeploymentSummary.productionVariants Optional | array |
pendingDeploymentSummary.productionVariants.[] Required | object The production variant summary for a deployment when an endpoint is creating or updating with the CreateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpoint.html) or UpdateEndpoint (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html) operations. Describes the VariantStatus, weight, and capacity for a production variant associated with an endpoint. |
pendingDeploymentSummary.productionVariants.[].currentInstanceCount Optional | integer |
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig Optional | object Specifies the serverless configuration for an endpoint variant. |
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig.maxConcurrency Optional | integer |
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig.memorySizeInMB Optional | integer |
pendingDeploymentSummary.productionVariants.[].currentServerlessConfig.provisionedConcurrency Optional | integer |
pendingDeploymentSummary.productionVariants.[].currentWeight Optional | number |
pendingDeploymentSummary.productionVariants.[].deployedImages Optional | array |
pendingDeploymentSummary.productionVariants.[].deployedImages.[] Required | object Gets the Amazon EC2 Container Registry path of the docker image of the model that is hosted in this ProductionVariant (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html). If you used the registry/repository[:tag] form to specify the image path of the primary container when you created the model hosted in this ProductionVariant, the path resolves to a path of the form registry/repository[@digest]. A digest is a hash value that identifies a specific version of an image. For information about Amazon ECR paths, see Pulling an Image (https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-pull-ecr-image.html) in the Amazon ECR User Guide. |
pendingDeploymentSummary.productionVariants.[].deployedImages.[].resolutionTime Optional | string |
pendingDeploymentSummary.productionVariants.[].deployedImages.[].resolvedImage Optional | string |
pendingDeploymentSummary.productionVariants.[].deployedImages.[].specifiedImage Optional | string |
pendingDeploymentSummary.productionVariants.[].desiredInstanceCount Optional | integer |
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig Optional | object Specifies the serverless configuration for an endpoint variant. |
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig.maxConcurrency Optional | integer |
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig.memorySizeInMB Optional | integer |
pendingDeploymentSummary.productionVariants.[].desiredServerlessConfig.provisionedConcurrency Optional | integer |
pendingDeploymentSummary.productionVariants.[].desiredWeight Optional | number |
pendingDeploymentSummary.productionVariants.[].instanceType Optional | string |
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling Optional | object Settings that control the range in the number of instances that the endpoint provisions as it scales up or down to accommodate traffic. |
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling.maxInstanceCount Optional | integer |
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling.minInstanceCount Optional | integer |
pendingDeploymentSummary.productionVariants.[].managedInstanceScaling.status Optional | string |
pendingDeploymentSummary.productionVariants.[].routingConfig Optional | object Settings that control how the endpoint routes incoming traffic to the instances that the endpoint hosts. |
pendingDeploymentSummary.productionVariants.[].routingConfig.routingStrategy Optional | string |
pendingDeploymentSummary.productionVariants.[].variantName Optional | string |
pendingDeploymentSummary.productionVariants.[].variantStatus Optional | array |
pendingDeploymentSummary.productionVariants.[].variantStatus.[] Required | object Describes the status of the production variant. |
pendingDeploymentSummary.productionVariants.[].variantStatus.[].startTime Optional | string |
pendingDeploymentSummary.productionVariants.[].variantStatus.[].status Optional | string |
pendingDeploymentSummary.productionVariants.[].variantStatus.[].statusMessage Optional | string |
pendingDeploymentSummary.startTime Optional | string |
productionVariants Optional | array An array of ProductionVariantSummary (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariantSummary.html) objects, one for each model hosted behind this endpoint. |
productionVariants.[] Required | object Describes weight and capacities for a production variant associated with an endpoint. If you sent a request to the UpdateEndpointWeightsAndCapacities API and the endpoint status is Updating, you get different desired and current values. |
productionVariants.[].currentInstanceCount Optional | integer |
productionVariants.[].currentServerlessConfig Optional | object Specifies the serverless configuration for an endpoint variant. |
productionVariants.[].currentServerlessConfig.maxConcurrency Optional | integer |
productionVariants.[].currentServerlessConfig.memorySizeInMB Optional | integer |
productionVariants.[].currentServerlessConfig.provisionedConcurrency Optional | integer |
productionVariants.[].currentWeight Optional | number |
productionVariants.[].deployedImages Optional | array |
productionVariants.[].deployedImages.[] Required | object Gets the Amazon EC2 Container Registry path of the docker image of the model that is hosted in this ProductionVariant (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ProductionVariant.html). If you used the registry/repository[:tag] form to specify the image path of the primary container when you created the model hosted in this ProductionVariant, the path resolves to a path of the form registry/repository[@digest]. A digest is a hash value that identifies a specific version of an image. For information about Amazon ECR paths, see Pulling an Image (https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-pull-ecr-image.html) in the Amazon ECR User Guide. |
productionVariants.[].deployedImages.[].resolutionTime Optional | string |
productionVariants.[].deployedImages.[].resolvedImage Optional | string |
productionVariants.[].deployedImages.[].specifiedImage Optional | string |
productionVariants.[].desiredInstanceCount Optional | integer |
productionVariants.[].desiredServerlessConfig Optional | object Specifies the serverless configuration for an endpoint variant. |
productionVariants.[].desiredServerlessConfig.maxConcurrency Optional | integer |
productionVariants.[].desiredServerlessConfig.memorySizeInMB Optional | integer |
productionVariants.[].desiredServerlessConfig.provisionedConcurrency Optional | integer |
productionVariants.[].desiredWeight Optional | number |
productionVariants.[].managedInstanceScaling Optional | object Settings that control the range in the number of instances that the endpoint provisions as it scales up or down to accommodate traffic. |
productionVariants.[].managedInstanceScaling.maxInstanceCount Optional | integer |
productionVariants.[].managedInstanceScaling.minInstanceCount Optional | integer |
productionVariants.[].managedInstanceScaling.status Optional | string |
productionVariants.[].routingConfig Optional | object Settings that control how the endpoint routes incoming traffic to the instances that the endpoint hosts. |
productionVariants.[].routingConfig.routingStrategy Optional | string |
productionVariants.[].variantName Optional | string |
productionVariants.[].variantStatus Optional | array |
productionVariants.[].variantStatus.[] Required | object Describes the status of the production variant. |
productionVariants.[].variantStatus.[].startTime Optional | string |
productionVariants.[].variantStatus.[].status Optional | string |
productionVariants.[].variantStatus.[].statusMessage Optional | string |
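To make the status fields above more concrete, the following sketch shows what the status of a healthy endpoint might look like once the controller has synced it. Every value is invented for illustration (account ID, Region, timestamps, variant name), and the condition type `ACK.ResourceSynced` is an assumption based on the common ACK runtime conditions rather than something stated on this page.

```yaml
status:
  ackResourceMetadata:
    arn: arn:aws:sagemaker:us-west-2:111122223333:endpoint/my-endpoint   # hypothetical ARN
    ownerAccountID: "111122223333"      # hypothetical account ID
    region: us-west-2                   # hypothetical Region
  conditions:
  - type: ACK.ResourceSynced            # assumed ACK condition type
    status: "True"
  creationTime: "2024-01-01T00:00:00Z"      # illustrative timestamp
  lastModifiedTime: "2024-01-01T00:10:00Z"  # illustrative timestamp
  endpointStatus: InService
  productionVariants:
  - variantName: AllTraffic             # hypothetical variant name
    currentInstanceCount: 1
    desiredInstanceCount: 1
    currentWeight: 1
    desiredWeight: 1
```

While an update is in progress, endpointStatus would instead read Updating, pendingDeploymentSummary would be populated, and the current and desired values under productionVariants could differ.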