TransformJob
sagemaker.services.k8s.aws/v1alpha1
Type | Link |
---|---|
GoDoc | sagemaker-controller/apis/v1alpha1#TransformJob |
Metadata
Property | Value |
---|---|
Scope | Namespaced |
Kind | TransformJob |
ListKind | TransformJobList |
Plural | transformjobs |
Singular | transformjob |
A batch transform job. For information about SageMaker batch transform, see Use Batch Transform (https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html).
Spec
batchStrategy: string
dataProcessing:
  inputFilter: string
  joinSource: string
  outputFilter: string
environment: {}
experimentConfig:
  experimentName: string
  trialComponentDisplayName: string
  trialName: string
maxConcurrentTransforms: integer
maxPayloadInMB: integer
modelClientConfig:
  invocationsMaxRetries: integer
  invocationsTimeoutInSeconds: integer
modelName: string
tags:
- key: string
  value: string
transformInput:
  compressionType: string
  contentType: string
  dataSource:
    s3DataSource:
      s3DataType: string
      s3URI: string
  splitType: string
transformJobName: string
transformOutput:
  accept: string
  assembleWith: string
  kmsKeyID: string
  s3OutputPath: string
transformResources:
  instanceCount: integer
  instanceType: string
  volumeKMSKeyID: string
Field | Description |
---|---|
batchStrategy Optional | string Specifies the number of records to include in a mini-batch for an HTTP inference request. A record is a single unit of input data that inference can be made on. For example, a single line in a CSV file is a record. To enable the batch strategy, you must set the SplitType property to Line, RecordIO, or TFRecord. To use only one record when making an HTTP invocation request to a container, set BatchStrategy to SingleRecord and SplitType to Line. To fit as many records in a mini-batch as can fit within the MaxPayloadInMB limit, set BatchStrategy to MultiRecord and SplitType to Line. |
dataProcessing Optional | object The data structure used to specify the data to be used for inference in a batch transform job and to associate the data that is relevant to the prediction results in the output. The input filter provided allows you to exclude input data that is not needed for inference in a batch transform job. The output filter provided allows you to include input data relevant to interpreting the predictions in the output from the job. For more information, see Associate Prediction Results with their Corresponding Input Records (https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html). |
dataProcessing.inputFilter Optional | string |
dataProcessing.joinSource Optional | string |
dataProcessing.outputFilter Optional | string |
environment Optional | object The environment variables to set in the Docker container. The map supports up to 16 key-value entries. |
experimentConfig Optional | object Associates a SageMaker job as a trial component with an experiment and trial. Specified when you call the following APIs: * CreateProcessingJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html) * CreateTrainingJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html) * CreateTransformJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTransformJob.html) |
experimentConfig.experimentName Optional | string |
experimentConfig.trialComponentDisplayName Optional | string |
experimentConfig.trialName Optional | string |
maxConcurrentTransforms Optional | integer The maximum number of parallel requests that can be sent to each instance in a transform job. If MaxConcurrentTransforms is set to 0 or left unset, Amazon SageMaker checks the optional execution-parameters to determine the settings for your chosen algorithm. If the execution-parameters endpoint is not enabled, the default value is 1. For more information on execution-parameters, see How Containers Serve Requests (https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-batch-code.html#your-algorithms-batch-code-how-containe-serves-requests). For built-in algorithms, you don’t need to set a value for MaxConcurrentTransforms. |
maxPayloadInMB Optional | integer The maximum allowed size of the payload, in MB. A payload is the data portion of a record (without metadata). The value in MaxPayloadInMB must be greater than, or equal to, the size of a single record. To estimate the size of a record in MB, divide the size of your dataset by the number of records. To ensure that the records fit within the maximum payload size, we recommend using a slightly larger value. The default value is 6 MB. The value of MaxPayloadInMB cannot be greater than 100 MB. If you specify the MaxConcurrentTransforms parameter, the value of (MaxConcurrentTransforms * MaxPayloadInMB) also cannot exceed 100 MB. For cases where the payload might be arbitrarily large and is transmitted using HTTP chunked encoding, set the value to 0. This feature works only in supported algorithms. Currently, Amazon SageMaker built-in algorithms do not support HTTP chunked encoding. |
modelClientConfig Optional | object Configures the timeout and maximum number of retries for processing a transform job invocation. |
modelClientConfig.invocationsMaxRetries Optional | integer |
modelClientConfig.invocationsTimeoutInSeconds Optional | integer |
modelName Required | string The name of the model that you want to use for the transform job. ModelName must be the name of an existing Amazon SageMaker model within an Amazon Web Services Region in an Amazon Web Services account. |
tags Optional | array (Optional) An array of key-value pairs. For more information, see Using Cost Allocation Tags (https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html#allocation-what) in the Amazon Web Services Billing and Cost Management User Guide. |
tags.[] Required | object A tag object that consists of a key and an optional value, used to manage metadata for SageMaker Amazon Web Services resources. You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html). For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy (https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf). |
tags.[].key Optional | string |
tags.[].value Optional | string |
transformInput Required | object Describes the input source and the way the transform job consumes it. |
transformInput.compressionType Optional | string |
transformInput.contentType Optional | string |
transformInput.dataSource Optional | object Describes the location of the channel data. |
transformInput.dataSource.s3DataSource Optional | object Describes the S3 data source. |
transformInput.dataSource.s3DataSource.s3DataType Optional | string |
transformInput.dataSource.s3DataSource.s3URI Optional | string |
transformInput.splitType Optional | string |
transformJobName Required | string The name of the transform job. The name must be unique within an Amazon Web Services Region in an Amazon Web Services account. |
transformOutput Required | object Describes the results of the transform job. |
transformOutput.accept Optional | string |
transformOutput.assembleWith Optional | string |
transformOutput.kmsKeyID Optional | string |
transformOutput.s3OutputPath Optional | string |
transformResources Required | object Describes the resources, including ML instance types and ML instance count, to use for the transform job. |
transformResources.instanceCount Optional | integer |
transformResources.instanceType Optional | string |
transformResources.volumeKMSKeyID Optional | string |
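Several of the Spec fields above carry constraints that are easy to violate by accident: the five required fields, the 16-entry limit on environment, the 100 MB ceilings on maxPayloadInMB and on its product with maxConcurrentTransforms, and the SplitType values that batchStrategy depends on. The following is a minimal Python sketch of a pre-submission check over a spec expressed as a plain dictionary. The function name check_transform_job_spec and every value in EXAMPLE_SPEC (model name, job name, bucket paths, instance type) are hypothetical and chosen only for illustration; the controller and the SageMaker API perform their own validation, which this sketch does not replace.

```python
from typing import Any, Dict, List

MAX_PAYLOAD_MB = 100    # stated upper bound for maxPayloadInMB
MAX_ENV_ENTRIES = 16    # stated upper bound for entries in environment
REQUIRED_FIELDS = (
    "modelName",
    "transformInput",
    "transformJobName",
    "transformOutput",
    "transformResources",
)
# SplitType values that the batchStrategy description says enable batching.
BATCHABLE_SPLIT_TYPES = {"Line", "RecordIO", "TFRecord"}


def check_transform_job_spec(spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in spec; an empty list means none."""
    problems: List[str] = []

    for field in REQUIRED_FIELDS:
        if field not in spec:
            problems.append(f"missing required field: {field}")

    if len(spec.get("environment", {})) > MAX_ENV_ENTRIES:
        problems.append(f"environment has more than {MAX_ENV_ENTRIES} entries")

    payload = spec.get("maxPayloadInMB")
    concurrent = spec.get("maxConcurrentTransforms")
    if payload is not None and payload > MAX_PAYLOAD_MB:
        problems.append(f"maxPayloadInMB exceeds {MAX_PAYLOAD_MB}")
    # The product rule applies only when both values are set and non-zero.
    if payload and concurrent and payload * concurrent > MAX_PAYLOAD_MB:
        problems.append(
            f"maxConcurrentTransforms * maxPayloadInMB exceeds {MAX_PAYLOAD_MB}"
        )

    split_type = spec.get("transformInput", {}).get("splitType")
    if "batchStrategy" in spec and split_type not in BATCHABLE_SPLIT_TYPES:
        problems.append("batchStrategy requires splitType Line, RecordIO, or TFRecord")

    return problems


# Hypothetical spec; every name and path below is a placeholder.
EXAMPLE_SPEC = {
    "transformJobName": "example-transform-job",
    "modelName": "example-model",
    "batchStrategy": "MultiRecord",
    "maxPayloadInMB": 6,
    "transformInput": {
        "contentType": "text/csv",
        "splitType": "Line",
        "dataSource": {
            "s3DataSource": {
                "s3DataType": "S3Prefix",
                "s3URI": "s3://example-bucket/input/",
            }
        },
    },
    "transformOutput": {"s3OutputPath": "s3://example-bucket/output/"},
    "transformResources": {"instanceCount": 1, "instanceType": "ml.m5.large"},
}

if __name__ == "__main__":
    for problem in check_transform_job_spec(EXAMPLE_SPEC):
        print(problem)
```

The dictionary mirrors the contents of the spec section of a TransformJob manifest, so the same check can be run on a manifest after loading it with any YAML parser.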
Status
ackResourceMetadata:
  arn: string
  ownerAccountID: string
  region: string
conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
failureReason: string
transformJobStatus: string
Field | Description |
---|---|
ackResourceMetadata Optional | object All CRs managed by ACK have a common Status.ACKResourceMetadata member that contains the resource sync state, account ownership, and constructed ARN for the resource. |
ackResourceMetadata.arn Optional | string ARN is the Amazon Resource Name for the resource. This is a globally-unique identifier and is set only by the ACK service controller once the controller has orchestrated the creation of the resource OR when it has verified that an “adopted” resource (a resource where the ARN annotation was set by the Kubernetes user on the CR) exists and matches the supplied CR’s Spec field values. |
ackResourceMetadata.ownerAccountID Required | string OwnerAccountID is the AWS Account ID of the account that owns the backend AWS service API resource. |
ackResourceMetadata.region Required | string Region is the AWS region in which the resource exists or will exist. |
conditions Optional | array All CRs managed by ACK have a common Status.Conditions member that contains a collection of ackv1alpha1.Condition objects that describe the various terminal states of the CR and its backend AWS service API resource. |
conditions.[] Required | object Condition is the common struct used by all CRDs managed by ACK service controllers to indicate terminal states of the CR and its backend AWS service API resource. |
conditions.[].message Optional | string A human readable message indicating details about the transition. |
conditions.[].reason Optional | string The reason for the condition’s last transition. |
conditions.[].status Optional | string Status of the condition, one of True, False, Unknown. |
conditions.[].type Optional | string Type is the type of the Condition. |
failureReason Optional | string If the transform job failed, FailureReason describes why it failed. A transform job creates a log file, which includes error messages, and stores it as an Amazon S3 object. For more information, see Log Amazon SageMaker Events with Amazon CloudWatch (https://docs.aws.amazon.com/sagemaker/latest/dg/logging-cloudwatch.html). |
transformJobStatus Optional | string The status of the transform job. If the transform job failed, the reason is returned in the FailureReason field. |
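The Status fields above are typically read together: transformJobStatus says where the job is, failureReason explains a failure, and conditions reports the controller's view of the resource. The following Python sketch interprets a status dictionary of that shape. The function names summarize_status and active_condition_types are hypothetical. The status strings in TERMINAL_STATUSES come from the SageMaker API and are not listed on this page, so treat that set as an assumption to verify against the API reference.

```python
from typing import Any, Dict, List, Tuple

# Assumed terminal values of transformJobStatus, taken from the SageMaker API.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def summarize_status(status: Dict[str, Any]) -> Tuple[bool, str]:
    """Return (finished, message) for a TransformJob status dictionary."""
    job_status = status.get("transformJobStatus")
    if job_status is None:
        # The controller has not recorded a job status yet.
        return False, "no transformJobStatus reported yet"
    if job_status == "Failed":
        reason = status.get("failureReason") or "no failureReason given"
        return True, f"Failed: {reason}"
    return job_status in TERMINAL_STATUSES, job_status


def active_condition_types(status: Dict[str, Any]) -> List[str]:
    """Return the type of every condition whose status is the string "True"."""
    return [
        condition.get("type", "")
        for condition in status.get("conditions", [])
        if condition.get("status") == "True"
    ]
```

Because the condition status is one of the strings True, False, or Unknown, it is compared as a string here and not as a boolean.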