ProcessingJob
sagemaker.services.k8s.aws/v1alpha1
Type | Link |
---|---|
GoDoc | sagemaker-controller/apis/v1alpha1#ProcessingJob |
Metadata
Property | Value |
---|---|
Scope | Namespaced |
Kind | ProcessingJob |
ListKind | ProcessingJobList |
Plural | processingjobs |
Singular | processingjob |
An Amazon SageMaker processing job that is used to analyze data and evaluate models. For more information, see Process Data and Evaluate Models (https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job.html).
Spec
appSpecification:
  containerArguments:
  - string
  containerEntrypoint:
  - string
  imageURI: string
environment: {}
experimentConfig:
  experimentName: string
  trialComponentDisplayName: string
  trialName: string
networkConfig:
  enableInterContainerTrafficEncryption: boolean
  enableNetworkIsolation: boolean
  vpcConfig:
    securityGroupIDs:
    - string
    subnets:
    - string
processingInputs:
- appManaged: boolean
  datasetDefinition:
    athenaDatasetDefinition:
      catalog: string
      database: string
      kmsKeyID: string
      outputCompression: string
      outputFormat: string
      outputS3URI: string
      queryString: string
      workGroup: string
    dataDistributionType: string
    inputMode: string
    localPath: string
    redshiftDatasetDefinition:
      clusterID: string
      clusterRoleARN: string
      database: string
      dbUser: string
      kmsKeyID: string
      outputCompression: string
      outputFormat: string
      outputS3URI: string
      queryString: string
  inputName: string
  s3Input:
    localPath: string
    s3CompressionType: string
    s3DataDistributionType: string
    s3DataType: string
    s3InputMode: string
    s3URI: string
processingJobName: string
processingOutputConfig:
  kmsKeyID: string
  outputs:
  - appManaged: boolean
    featureStoreOutput:
      featureGroupName: string
    outputName: string
    s3Output:
      localPath: string
      s3URI: string
      s3UploadMode: string
processingResources:
  clusterConfig:
    instanceCount: integer
    instanceType: string
    volumeKMSKeyID: string
    volumeSizeInGB: integer
roleARN: string
stoppingCondition:
  maxRuntimeInSeconds: integer
tags:
- key: string
  value: string
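As a rough illustration of how this schema maps onto a manifest, the sketch below sets only the top-level fields marked Required (appSpecification, processingJobName, processingResources, roleARN), plus the clusterConfig values a job needs in practice. The resource name, account ID, region, image URI, role name, and instance settings are all placeholder assumptions, not values taken from this reference.

```yaml
# Minimal sketch of a ProcessingJob manifest. All concrete values are hypothetical.
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: ProcessingJob
metadata:
  name: example-processing-job          # assumed Kubernetes object name
spec:
  # Must be unique within the Region and account.
  processingJobName: example-processing-job
  # Placeholder role; SageMaker assumes this role to run the job.
  roleARN: arn:aws:iam::111122223333:role/example-sagemaker-execution-role
  appSpecification:
    # Placeholder image in a private ECR repository.
    imageURI: 111122223333.dkr.ecr.us-west-2.amazonaws.com/example-processing:latest
    containerEntrypoint:
    - python3
    - /opt/ml/processing/code/process.py   # assumed script path inside the image
  processingResources:
    clusterConfig:
      instanceCount: 1
      instanceType: ml.m5.large
      volumeSizeInGB: 30
```

Assuming the manifest is saved as `processing-job.yaml` (a hypothetical file name), it could be applied with `kubectl apply -f processing-job.yaml` in a cluster where the SageMaker controller is installed.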
Field | Description |
---|---|
appSpecification Required | object Configures the processing job to run a specified Docker container image. |
appSpecification.containerArguments Optional | array |
appSpecification.containerArguments.[] Required | string |
appSpecification.containerEntrypoint.[] Required | string |
environment Optional | object The environment variables to set in the Docker container. The map supports up to 100 key-value entries. |
experimentConfig Optional | object Associates a SageMaker job as a trial component with an experiment and trial. Specified when you call the following APIs: * CreateProcessingJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html) * CreateTrainingJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html) * CreateTransformJob (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTransformJob.html) |
experimentConfig.experimentName Optional | string |
experimentConfig.trialComponentDisplayName Optional | string |
experimentConfig.trialName Optional | string |
networkConfig Optional | object Networking options for a processing job, such as whether to allow inbound and outbound network calls to and from processing containers, and the VPC subnets and security groups to use for VPC-enabled processing jobs. |
networkConfig.enableInterContainerTrafficEncryption Optional | boolean |
networkConfig.enableNetworkIsolation Optional | boolean |
networkConfig.vpcConfig Optional | object Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC (https://docs.aws.amazon.com/sagemaker/latest/dg/infrastructure-give-access.html). |
networkConfig.vpcConfig.securityGroupIDs Optional | array |
networkConfig.vpcConfig.securityGroupIDs.[] Required | string |
networkConfig.vpcConfig.subnets.[] Required | string |
processingInputs.[] Required | object The inputs for a processing job. The processing input must specify exactly one of either S3Input or DatasetDefinition types. |
processingInputs.[].datasetDefinition Optional | object Configuration for Dataset Definition inputs. The Dataset Definition input must specify exactly one of either AthenaDatasetDefinition or RedshiftDatasetDefinition types. |
processingInputs.[].datasetDefinition.athenaDatasetDefinition Optional | object Configuration for Athena Dataset Definition input. |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.catalog Optional | string The name of the data catalog used in Athena query execution. |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.database Optional | string The name of the database used in the Athena query execution. |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.kmsKeyID Optional | string |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.outputCompression Optional | string The compression used for Athena query results. |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.outputFormat Optional | string The data storage format for Athena query results. |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.outputS3URI Optional | string |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.queryString Optional | string The SQL query statements to be executed. |
processingInputs.[].datasetDefinition.athenaDatasetDefinition.workGroup Optional | string The name of the workgroup in which the Athena query is being started. |
processingInputs.[].datasetDefinition.dataDistributionType Optional | string |
processingInputs.[].datasetDefinition.inputMode Optional | string |
processingInputs.[].datasetDefinition.localPath Optional | string |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition Optional | object Configuration for Redshift Dataset Definition input. |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.clusterID Optional | string The Redshift cluster Identifier. |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.clusterRoleARN Optional | string |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.database Optional | string The name of the Redshift database used in Redshift query execution. |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.dbUser Optional | string The database user name used in Redshift query execution. |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.kmsKeyID Optional | string |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.outputCompression Optional | string The compression used for Redshift query results. |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.outputFormat Optional | string The data storage format for Redshift query results. |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.outputS3URI Optional | string |
processingInputs.[].datasetDefinition.redshiftDatasetDefinition.queryString Optional | string The SQL query statements to be executed. |
processingInputs.[].inputName Optional | string |
processingInputs.[].s3Input Optional | object Configuration for downloading input data from Amazon S3 into the processing container. |
processingInputs.[].s3Input.localPath Optional | string |
processingInputs.[].s3Input.s3CompressionType Optional | string |
processingInputs.[].s3Input.s3DataDistributionType Optional | string |
processingInputs.[].s3Input.s3DataType Optional | string |
processingInputs.[].s3Input.s3InputMode Optional | string |
processingInputs.[].s3Input.s3URI Optional | string |
processingJobName Required | string The name of the processing job. The name must be unique within an Amazon Web Services Region in the Amazon Web Services account. |
processingOutputConfig Optional | object Output configuration for the processing job. |
processingOutputConfig.kmsKeyID Optional | string |
processingOutputConfig.outputs Optional | array |
processingOutputConfig.outputs.[] Required | object Describes the results of a processing job. The processing output must specify exactly one of either S3Output or FeatureStoreOutput types. |
processingOutputConfig.outputs.[].featureStoreOutput Optional | object Configuration for processing job outputs in Amazon SageMaker Feature Store. |
processingOutputConfig.outputs.[].featureStoreOutput.featureGroupName Optional | string |
processingOutputConfig.outputs.[].outputName Optional | string |
processingOutputConfig.outputs.[].s3Output Optional | object Configuration for uploading output data to Amazon S3 from the processing container. |
processingOutputConfig.outputs.[].s3Output.localPath Optional | string |
processingOutputConfig.outputs.[].s3Output.s3URI Optional | string |
processingOutputConfig.outputs.[].s3Output.s3UploadMode Optional | string |
processingResources Required | object Identifies the resources, ML compute instances, and ML storage volumes to deploy for a processing job. For distributed processing, specify more than one instance. |
processingResources.clusterConfig Optional | object Configuration for the cluster used to run a processing job. |
processingResources.clusterConfig.instanceCount Optional | integer |
processingResources.clusterConfig.instanceType Optional | string |
processingResources.clusterConfig.volumeKMSKeyID Optional | string |
processingResources.clusterConfig.volumeSizeInGB Optional | integer |
roleARN Required | string The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf. |
stoppingCondition Optional | object The time limit for how long the processing job is allowed to run. |
stoppingCondition.maxRuntimeInSeconds Optional | integer |
tags Optional | array (Optional) An array of key-value pairs. For more information, see Using Cost Allocation Tags (https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html#allocation-whatURL) in the Amazon Web Services Billing and Cost Management User Guide. |
tags.[] Required | object A tag object that consists of a key and an optional value, used to manage metadata for SageMaker Amazon Web Services resources. You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html). For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy (https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf). |
tags.[].key Optional | string |
tags.[].value Optional | string |
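The "exactly one of" constraints on inputs and outputs can be easier to see in a concrete manifest. The sketch below is one possible configuration, assuming S3-backed data: each input sets only s3Input (no datasetDefinition) and each output sets only s3Output (no featureStoreOutput). Bucket names, paths, the account ID, the image URI, the role name, and the tag are hypothetical placeholders.

```yaml
# Sketch of a ProcessingJob with one S3 input and one S3 output.
# All names, ARNs, URIs, and sizes are illustrative assumptions.
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: ProcessingJob
metadata:
  name: example-processing-job-s3
spec:
  processingJobName: example-processing-job-s3
  roleARN: arn:aws:iam::111122223333:role/example-sagemaker-execution-role
  appSpecification:
    imageURI: 111122223333.dkr.ecr.us-west-2.amazonaws.com/example-processing:latest
  environment:
    LOG_LEVEL: info                      # hypothetical variable read by the container
  processingResources:
    clusterConfig:
      instanceCount: 1
      instanceType: ml.m5.large
      volumeSizeInGB: 30
  processingInputs:
  - inputName: raw-data
    # s3Input only; datasetDefinition is deliberately omitted.
    s3Input:
      s3URI: s3://example-bucket/raw/
      localPath: /opt/ml/processing/input
      s3DataType: S3Prefix
      s3InputMode: File
      s3DataDistributionType: FullyReplicated
      s3CompressionType: None
  processingOutputConfig:
    outputs:
    - outputName: processed-data
      # s3Output only; featureStoreOutput is deliberately omitted.
      s3Output:
        s3URI: s3://example-bucket/processed/
        localPath: /opt/ml/processing/output
        s3UploadMode: EndOfJob
  stoppingCondition:
    maxRuntimeInSeconds: 3600            # stop the job after one hour
  tags:
  - key: project
    value: example
```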
Status
ackResourceMetadata:
  arn: string
  ownerAccountID: string
  region: string
conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
failureReason: string
processingJobStatus: string
Field | Description |
---|---|
ackResourceMetadata Optional | object All CRs managed by ACK have a common Status.ACKResourceMetadata member that is used to contain resource sync state, account ownership, and the constructed ARN for the resource. |
ackResourceMetadata.arn Optional | string ARN is the Amazon Resource Name for the resource. This is a globally-unique identifier and is set only by the ACK service controller once the controller has orchestrated the creation of the resource OR when it has verified that an “adopted” resource (a resource where the ARN annotation was set by the Kubernetes user on the CR) exists and matches the supplied CR’s Spec field values. TODO(vijat@): Find a better strategy for resources that do not have ARN in CreateOutputResponse https://github.com/aws/aws-controllers-k8s/issues/270 |
ackResourceMetadata.ownerAccountID Required | string OwnerAccountID is the AWS Account ID of the account that owns the backend AWS service API resource. |
ackResourceMetadata.region Required | string Region is the AWS region in which the resource exists or will exist. |
conditions Optional | array All CRs managed by ACK have a common Status.Conditions member that contains a collection of ackv1alpha1.Condition objects that describe the various terminal states of the CR and its backend AWS service API resource. |
conditions.[] Required | object Condition is the common struct used by all CRDs managed by ACK service controllers to indicate terminal states of the CR and its backend AWS service API resource. |
conditions.[].message Optional | string A human readable message indicating details about the transition. |
conditions.[].reason Optional | string The reason for the condition’s last transition. |
conditions.[].status Optional | string Status of the condition, one of True, False, Unknown. |
conditions.[].type Optional | string Type is the type of the Condition |
failureReason Optional | string A string, up to one KB in size, that contains the reason a processing job failed, if it failed. |
processingJobStatus Optional | string Provides the status of a processing job. |
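For orientation, the sketch below shows roughly what the status block of a finished job might look like when the resource is inspected, for example with `kubectl get processingjob <name> -o yaml`. Every value here is an illustrative assumption (account ID, region, timestamp, job name); the actual content is written by the controller and will differ.

```yaml
# Illustrative status only; the controller populates these fields.
status:
  ackResourceMetadata:
    arn: arn:aws:sagemaker:us-west-2:111122223333:processing-job/example-processing-job
    ownerAccountID: "111122223333"
    region: us-west-2
  conditions:
  - lastTransitionTime: "2024-01-01T00:00:00Z"
    status: "True"
    type: ACK.ResourceSynced
  processingJobStatus: Completed
```

If a job fails, failureReason would be expected to carry the explanation alongside a processingJobStatus of Failed.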