TransformJob

sagemaker.services.k8s.aws/v1alpha1

Type	Link
GoDoc	sagemaker-controller/apis/v1alpha1#TransformJob

Metadata

Property	Value
Scope	Namespaced
Kind	TransformJob
ListKind	TransformJobList
Plural	transformjobs
Singular	transformjob

A batch transform job. For information about SageMaker batch transform, see Use Batch Transform (https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html).
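As an illustrative sketch, a minimal manifest covering the required fields might look like the following. All names, S3 locations, and the instance type are placeholders; the model named in modelName must already exist in SageMaker.

```yaml
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: TransformJob
metadata:
  name: my-transform-job           # placeholder Kubernetes object name
spec:
  transformJobName: my-transform-job   # must be unique per Region per account
  modelName: my-existing-model         # an existing SageMaker model
  transformInput:
    contentType: text/csv
    splitType: Line
    dataSource:
      s3DataSource:
        s3DataType: S3Prefix
        s3URI: s3://my-bucket/input/
  transformOutput:
    s3OutputPath: s3://my-bucket/output/
  transformResources:
    instanceCount: 1
    instanceType: ml.m5.large
```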

Spec

batchStrategy: string
dataProcessing: 
  inputFilter: string
  joinSource: string
  outputFilter: string
environment: {}
experimentConfig: 
  experimentName: string
  trialComponentDisplayName: string
  trialName: string
maxConcurrentTransforms: integer
maxPayloadInMB: integer
modelClientConfig: 
  invocationsMaxRetries: integer
  invocationsTimeoutInSeconds: integer
modelName: string
tags:
- key: string
  value: string
transformInput: 
  compressionType: string
  contentType: string
  dataSource: 
    s3DataSource: 
      s3DataType: string
      s3URI: string
  splitType: string
transformJobName: string
transformOutput: 
  accept: string
  assembleWith: string
  kmsKeyID: string
  s3OutputPath: string
transformResources: 
  instanceCount: integer
  instanceType: string
  volumeKMSKeyID: string
Field	Description
batchStrategy
Optional
string
Specifies the number of records to include in a mini-batch for an HTTP inference
request. A record is a single unit of input data that inference can be made
on. For example, a single line in a CSV file is a record.


To enable the batch strategy, you must set the SplitType property to Line,
RecordIO, or TFRecord.


To use only one record when making an HTTP invocation request to a container,
set BatchStrategy to SingleRecord and SplitType to Line.


To fit as many records in a mini-batch as can fit within the MaxPayloadInMB
limit, set BatchStrategy to MultiRecord and SplitType to Line.
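Putting the two settings together, a multi-record CSV batch could be sketched as follows (only the two relevant fields are shown):

```yaml
spec:
  batchStrategy: MultiRecord   # pack as many records as fit within maxPayloadInMB
  transformInput:
    splitType: Line            # batchStrategy takes effect only with Line, RecordIO, or TFRecord
```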
dataProcessing
Optional
object
The data structure used to specify the data to be used for inference in a
batch transform job and to associate the data that is relevant to the prediction
results in the output. The input filter provided allows you to exclude input
data that is not needed for inference in a batch transform job. The output
filter provided allows you to include input data relevant to interpreting
the predictions in the output from the job. For more information, see Associate
Prediction Results with their Corresponding Input Records (https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html).
dataProcessing.inputFilter
Optional
string
dataProcessing.joinSource
Optional
string
dataProcessing.outputFilter
Optional
string
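These three fields take JSONPath expressions (SageMaker supports a subset of JSONPath; see the linked data-processing page). The attribute names below are hypothetical, for illustration only:

```yaml
spec:
  dataProcessing:
    inputFilter: "$.features"                  # send only this attribute to the model
    joinSource: Input                          # join each prediction with its input record
    outputFilter: "$['id','SageMakerOutput']"  # keep the record ID plus the prediction
```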
environment
Optional
object
The environment variables to set in the Docker container. We support up to
16 key-value entries in the map.
experimentConfig
Optional
object
Associates a SageMaker job as a trial component with an experiment and trial.
Specified when you call the following APIs:


* CreateProcessingJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html)


* CreateTrainingJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html)


* CreateTransformJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTransformJob.html)
experimentConfig.experimentName
Optional
string
experimentConfig.trialComponentDisplayName
Optional
string
experimentConfig.trialName
Optional
string
maxConcurrentTransforms
Optional
integer
The maximum number of parallel requests that can be sent to each instance
in a transform job. If MaxConcurrentTransforms is set to 0 or left unset,
Amazon SageMaker checks the optional execution-parameters to determine the
settings for your chosen algorithm. If the execution-parameters endpoint
is not enabled, the default value is 1. For more information on execution-parameters,
see How Containers Serve Requests (https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-batch-code.html#your-algorithms-batch-code-how-containe-serves-requests).
For built-in algorithms, you don’t need to set a value for MaxConcurrentTransforms.
maxPayloadInMB
Optional
integer
The maximum allowed size of the payload, in MB. A payload is the data portion
of a record (without metadata). The value in MaxPayloadInMB must be greater
than, or equal to, the size of a single record. To estimate the size of a
record in MB, divide the size of your dataset by the number of records. To
ensure that the records fit within the maximum payload size, we recommend
using a slightly larger value. The default value is 6 MB.


The value of MaxPayloadInMB cannot be greater than 100 MB. If you specify
the MaxConcurrentTransforms parameter, the value of (MaxConcurrentTransforms
* MaxPayloadInMB) also cannot exceed 100 MB.


For cases where the payload might be arbitrarily large and is transmitted
using HTTP chunked encoding, set the value to 0. This feature works only
in supported algorithms. Currently, Amazon SageMaker built-in algorithms
do not support HTTP chunked encoding.
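The sizing rule above is straightforward to check ahead of time. A small sketch of that validation, using only the limits stated in the text (not an actual API call):

```python
MAX_TOTAL_MB = 100  # documented ceiling for maxConcurrentTransforms * maxPayloadInMB


def payload_settings_valid(max_payload_in_mb: int,
                           max_concurrent_transforms: int = 1) -> bool:
    """Return True if the combination respects the documented 100 MB ceiling.

    A payload size of 0 signals HTTP chunked encoding and is exempt
    from the product rule.
    """
    if max_payload_in_mb == 0:
        return True
    if max_payload_in_mb > MAX_TOTAL_MB:
        return False
    return max_payload_in_mb * max_concurrent_transforms <= MAX_TOTAL_MB
```

For example, the default 6 MB payload allows up to 16 concurrent transforms per instance (6 * 16 = 96 MB), while 17 would exceed the limit.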
modelClientConfig
Optional
object
Configures the timeout and maximum number of retries for processing a transform
job invocation.
modelClientConfig.invocationsMaxRetries
Optional
integer
modelClientConfig.invocationsTimeoutInSeconds
Optional
integer
modelName
Required
string
The name of the model that you want to use for the transform job. ModelName
must be the name of an existing Amazon SageMaker model within an Amazon Web
Services Region in an Amazon Web Services account.
tags
Optional
array
(Optional) An array of key-value pairs. For more information, see Using Cost
Allocation Tags (https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html#allocation-what)
in the Amazon Web Services Billing and Cost Management User Guide.
tags.[]
Required
object
A tag object that consists of a key and an optional value, used to manage
metadata for SageMaker Amazon Web Services resources.

You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html).

For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy (https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf).
tags.[].key
Optional
string
tags.[].value
Optional
string
transformInput
Required
object
Describes the input source and the way the transform job consumes it.
transformInput.compressionType
Optional
string
transformInput.contentType
Optional
string
transformInput.dataSource
Optional
object
Describes the location of the channel data.
transformInput.dataSource.s3DataSource
Optional
object
Describes the S3 data source.
transformInput.dataSource.s3DataSource.s3DataType
Optional
string
transformInput.dataSource.s3DataSource.s3URI
Optional
string
transformInput.splitType
Optional
string
transformJobName
Required
string
The name of the transform job. The name must be unique within an Amazon Web
Services Region in an Amazon Web Services account.
transformOutput
Required
object
Describes the results of the transform job.
transformOutput.accept
Optional
string
transformOutput.assembleWith
Optional
string
transformOutput.kmsKeyID
Optional
string
transformOutput.s3OutputPath
Optional
string
transformResources
Required
object
Describes the resources, including ML instance types and ML instance count,
to use for the transform job.
transformResources.instanceCount
Optional
integer
transformResources.instanceType
Optional
string
transformResources.volumeKMSKeyID
Optional
string

Status

ackResourceMetadata: 
  arn: string
  ownerAccountID: string
  region: string
conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
failureReason: string
transformJobStatus: string
Field	Description
ackResourceMetadata
Optional
object
All CRs managed by ACK have a common Status.ACKResourceMetadata member
that is used to contain resource sync state, account ownership, and the
constructed ARN for the resource.
ackResourceMetadata.arn
Optional
string
ARN is the Amazon Resource Name for the resource. This is a
globally-unique identifier and is set only by the ACK service controller
once the controller has orchestrated the creation of the resource OR
when it has verified that an “adopted” resource (a resource where the
ARN annotation was set by the Kubernetes user on the CR) exists and
matches the supplied CR’s Spec field values.
ackResourceMetadata.ownerAccountID
Required
string
OwnerAccountID is the AWS Account ID of the account that owns the
backend AWS service API resource.
ackResourceMetadata.region
Required
string
Region is the AWS region in which the resource exists or will exist.
conditions
Optional
array
All CRs managed by ACK have a common Status.Conditions member that
contains a collection of ackv1alpha1.Condition objects that describe
the various terminal states of the CR and its backend AWS service API
resource.
conditions.[]
Required
object
Condition is the common struct used by all CRDs managed by ACK service
controllers to indicate terminal states of the CR and its backend AWS
service API resource
conditions.[].message
Optional
string
A human readable message indicating details about the transition.
conditions.[].reason
Optional
string
The reason for the condition’s last transition.
conditions.[].status
Optional
string
Status of the condition, one of True, False, Unknown.
conditions.[].type
Optional
string
Type is the type of the Condition
failureReason
Optional
string
If the transform job failed, FailureReason describes why it failed. A transform
job creates a log file, which includes error messages, and stores it as an
Amazon S3 object. For more information, see Log Amazon SageMaker Events with
Amazon CloudWatch (https://docs.aws.amazon.com/sagemaker/latest/dg/logging-cloudwatch.html).
transformJobStatus
Optional
string
The status of the transform job. If the transform job failed, the reason
is returned in the FailureReason field.
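As a sketch, the observed status on a finished job might look like the following. All values are illustrative; the account ID, Region, and job name are placeholders:

```yaml
status:
  transformJobStatus: Completed
  ackResourceMetadata:
    arn: arn:aws:sagemaker:us-west-2:123456789012:transform-job/my-transform-job
    ownerAccountID: "123456789012"
    region: us-west-2
  conditions:
  - type: ACK.ResourceSynced
    status: "True"
```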