LabelingJob

sagemaker.services.k8s.aws/v1alpha1

Type | Link
--- | ---
GoDoc | sagemaker-controller/apis/v1alpha1#LabelingJob

Metadata

Property | Value
--- | ---
Scope | Namespaced
Kind | LabelingJob
ListKind | LabelingJobList
Plural | labelingjobs
Singular | labelingjob

Spec

humanTaskConfig: 
  annotationConsolidationConfig: 
    annotationConsolidationLambdaARN: string
  maxConcurrentTaskCount: integer
  numberOfHumanWorkersPerDataObject: integer
  preHumanTaskLambdaARN: string
  publicWorkforceTaskPrice: 
    amountInUsd: 
      cents: integer
      dollars: integer
      tenthFractionsOfACent: integer
  taskAvailabilityLifetimeInSeconds: integer
  taskDescription: string
  taskKeywords:
  - string
  taskTimeLimitInSeconds: integer
  taskTitle: string
  uiConfig: 
    humanTaskUIARN: string
    uiTemplateS3URI: string
  workteamARN: string
inputConfig: 
  dataAttributes: 
    contentClassifiers:
    - string
  dataSource: 
    s3DataSource: 
      manifestS3URI: string
    snsDataSource: 
      snsTopicARN: string
labelAttributeName: string
labelCategoryConfigS3URI: string
labelingJobAlgorithmsConfig: 
  initialActiveLearningModelARN: string
  labelingJobAlgorithmSpecificationARN: string
  labelingJobResourceConfig: 
    volumeKMSKeyID: string
    vpcConfig: 
      securityGroupIDs:
      - string
      subnets:
      - string
labelingJobName: string
outputConfig: 
  kmsKeyID: string
  s3OutputPath: string
  snsTopicARN: string
roleARN: string
stoppingConditions: 
  maxHumanLabeledObjectCount: integer
  maxPercentageOfInputDatasetLabeled: integer
tags:
- key: string
  value: string
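
As an illustration of how the fields above fit together, here is a minimal manifest sketch for a one-time labeling job. Every name, account ID, ARN, S3 path, and numeric value is a hypothetical placeholder. The choice of nested humanTaskConfig fields is also an assumption: this reference marks them Optional, but the underlying SageMaker API may still reject a job that omits some of them.

```yaml
# Sketch only: all names, account IDs, ARNs, and S3 paths are placeholders.
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: LabelingJob
metadata:
  name: example-labeling-job
spec:
  labelingJobName: example-labeling-job
  labelAttributeName: example-label
  roleARN: arn:aws:iam::111122223333:role/ExampleLabelingRole
  inputConfig:
    dataSource:
      s3DataSource:
        manifestS3URI: s3://example-bucket/input/dataset.manifest
  outputConfig:
    s3OutputPath: s3://example-bucket/output/
  humanTaskConfig:
    workteamARN: arn:aws:sagemaker:us-west-2:111122223333:workteam/private-crowd/example-team
    uiConfig:
      uiTemplateS3URI: s3://example-bucket/templates/task.liquid.html
    taskTitle: Classify each image
    taskDescription: Pick the category that best describes the image.
    numberOfHumanWorkersPerDataObject: 3
    taskTimeLimitInSeconds: 300
```
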
Field Description
humanTaskConfig
Required
object
Configures the labeling task and how it is presented to workers, including,
but not limited to, price, keywords, and batch size (task count).
humanTaskConfig.annotationConsolidationConfig
Optional
object
Configures how labels are consolidated across human workers and processes
output data.
humanTaskConfig.annotationConsolidationConfig.annotationConsolidationLambdaARN
Optional
string
humanTaskConfig.maxConcurrentTaskCount
Optional
integer
humanTaskConfig.numberOfHumanWorkersPerDataObject
Optional
integer
humanTaskConfig.preHumanTaskLambdaARN
Optional
string
humanTaskConfig.publicWorkforceTaskPrice
Optional
object
Defines the amount of money paid to an Amazon Mechanical Turk worker for
each task performed.

Use one of the following prices for bounding box tasks. Prices are in US
dollars and should be based on the complexity of the task; the longer it
takes in your initial testing, the more you should offer.

* 0.036

* 0.048

* 0.060

* 0.072

* 0.120

* 0.240

* 0.360

* 0.480

* 0.600

* 0.720

* 0.840

* 0.960

* 1.080

* 1.200

Use one of the following prices for image classification, text classification,
and custom tasks. Prices are in US dollars.

* 0.012

* 0.024

* 0.036

* 0.048

* 0.060

* 0.072

* 0.120

* 0.240

* 0.360

* 0.480

* 0.600

* 0.720

* 0.840

* 0.960

* 1.080

* 1.200

Use one of the following prices for semantic segmentation tasks. Prices are
in US dollars.

* 0.840

* 0.960

* 1.080

* 1.200

Use one of the following prices for Textract AnalyzeDocument Important Form
Key Amazon Augmented AI review tasks. Prices are in US dollars.

* 2.400

* 2.280

* 2.160

* 2.040

* 1.920

* 1.800

* 1.680

* 1.560

* 1.440

* 1.320

* 1.200

* 1.080

* 0.960

* 0.840

* 0.720

* 0.600

* 0.480

* 0.360

* 0.240

* 0.120

* 0.072

* 0.060

* 0.048

* 0.036

* 0.024

* 0.012

Use one of the following prices for Rekognition DetectModerationLabels Amazon
Augmented AI review tasks. Prices are in US dollars.

* 1.200

* 1.080

* 0.960

* 0.840

* 0.720

* 0.600

* 0.480

* 0.360

* 0.240

* 0.120

* 0.072

* 0.060

* 0.048

* 0.036

* 0.024

* 0.012

Use one of the following prices for Amazon Augmented AI custom human review
tasks. Prices are in US dollars.

* 1.200

* 1.080

* 0.960

* 0.840

* 0.720

* 0.600

* 0.480

* 0.360

* 0.240

* 0.120

* 0.072

* 0.060

* 0.048

* 0.036

* 0.024

* 0.012
humanTaskConfig.publicWorkforceTaskPrice.amountInUsd
Optional
object
Represents an amount of money in United States dollars.
humanTaskConfig.publicWorkforceTaskPrice.amountInUsd.cents
Optional
integer
humanTaskConfig.publicWorkforceTaskPrice.amountInUsd.dollars
Optional
integer
humanTaskConfig.publicWorkforceTaskPrice.amountInUsd.tenthFractionsOfACent
Optional
integer
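
To make the price encoding concrete, the sketch below expresses the listed price of 0.036 US dollars with the three amountInUsd fields. It assumes the fields add up as their names suggest (whole dollars, plus cents, plus tenths of a cent); that reading is an interpretation of the field names, not something this reference states.

```yaml
humanTaskConfig:
  publicWorkforceTaskPrice:
    amountInUsd:
      dollars: 0                 # whole dollars
      cents: 3                   # 0.03 USD
      tenthFractionsOfACent: 6   # 0.006 USD, for a total of 0.036 USD
```

Under the same assumption, the listed price of 1.200 would be written as dollars: 1, cents: 20, tenthFractionsOfACent: 0.
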
humanTaskConfig.taskAvailabilityLifetimeInSeconds
Optional
integer
humanTaskConfig.taskDescription
Optional
string
humanTaskConfig.taskKeywords
Optional
array
humanTaskConfig.taskKeywords.[]
Required
string
humanTaskConfig.taskTitle
Optional
string
humanTaskConfig.uiConfig
Optional
object
Provides configuration information for the worker UI for a labeling job.
Provide either HumanTaskUiArn or UiTemplateS3Uri.

For named entity recognition, 3D point cloud and video frame labeling jobs,
use HumanTaskUiArn.

For all other Ground Truth built-in task types and custom task types, use
UiTemplateS3Uri to specify the location of a worker task template in Amazon
S3.
humanTaskConfig.uiConfig.humanTaskUIARN
Optional
string
humanTaskConfig.uiConfig.uiTemplateS3URI
Optional
string
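
The either/or rule for uiConfig can be sketched as two alternative fragments. This assumes that the API names HumanTaskUiArn and UiTemplateS3Uri used in the description correspond to the humanTaskUIARN and uiTemplateS3URI fields of this resource. The ARN is a placeholder pattern and the S3 path is hypothetical.

```yaml
# Option A: named entity recognition, 3D point cloud, or video frame jobs.
# The ARN below is a placeholder pattern, not a real resource.
humanTaskConfig:
  uiConfig:
    humanTaskUIARN: "arn:aws:sagemaker:<region>:<account-id>:human-task-ui/<ui-name>"
---
# Option B: other built-in task types and custom task types.
humanTaskConfig:
  uiConfig:
    uiTemplateS3URI: s3://example-bucket/templates/task.liquid.html
```
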
humanTaskConfig.workteamARN
Optional
string
inputConfig
Required
object
Input data for the labeling job, such as the Amazon S3 location of the data
objects and the location of the manifest file that describes the data objects.

You must specify at least one of the following: S3DataSource or SnsDataSource.

* Use SnsDataSource to specify an SNS input topic for a streaming labeling
job. If you do not specify an SNS input topic ARN, Ground Truth will
create a one-time labeling job that stops after all data objects in the
input manifest file have been labeled.

* Use S3DataSource to specify an input manifest file for both streaming
and one-time labeling jobs. Adding an S3DataSource is optional if you
use SnsDataSource to create a streaming labeling job.

If you use the Amazon Mechanical Turk workforce, your input data should not
include confidential information, personal information or protected health
information. Use ContentClassifiers to specify that your data is free of
personally identifiable information and adult content.
inputConfig.dataAttributes
Optional
object
Attributes of the data specified by the customer. Use these to describe the
data to be labeled.
inputConfig.dataAttributes.contentClassifiers
Optional
array
inputConfig.dataAttributes.contentClassifiers.[]
Required
string
inputConfig.dataSource.s3DataSource
Optional
object
The Amazon S3 location of the input data objects.
inputConfig.dataSource.s3DataSource.manifestS3URI
Optional
string
inputConfig.dataSource.snsDataSource
Optional
object
An Amazon SNS data source used for streaming labeling jobs.
inputConfig.dataSource.snsDataSource.snsTopicARN
Optional
string
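
The two kinds of input described for inputConfig might look like the fragments below. Bucket names, the topic ARN, and the account ID are hypothetical. The contentClassifiers values are the ones the SageMaker API defines for this setting; they do not appear elsewhere on this page, so confirm them against the API reference before relying on them.

```yaml
# One-time job: manifest only, so the job stops once every object in it is labeled.
inputConfig:
  dataSource:
    s3DataSource:
      manifestS3URI: s3://example-bucket/input/dataset.manifest
  dataAttributes:
    contentClassifiers:
    - FreeOfPersonallyIdentifiableInformation
    - FreeOfAdultContent
---
# Streaming job: an SNS input topic; adding an s3DataSource here is optional.
inputConfig:
  dataSource:
    snsDataSource:
      snsTopicARN: arn:aws:sns:us-west-2:111122223333:example-input-topic
```
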
labelAttributeName
Required
string
The attribute name to use for the label in the output manifest file. This
is the key for the key/value pair formed with the label that a worker assigns
to the object. The LabelAttributeName must meet the following requirements.

* The name can’t end with “-metadata”.

* If you are using one of the following built-in task types (https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html),
the attribute name must end with “-ref”. If the task type you are using
is not listed below, the attribute name must not end with “-ref”.

  * Image semantic segmentation (SemanticSegmentation), and adjustment (AdjustmentSemanticSegmentation)
  and verification (VerificationSemanticSegmentation) labeling jobs for
  this task type.

  * Video frame object detection (VideoObjectDetection), and adjustment
  and verification (AdjustmentVideoObjectDetection) labeling jobs for
  this task type.

  * Video frame object tracking (VideoObjectTracking), and adjustment
  and verification (AdjustmentVideoObjectTracking) labeling jobs for
  this task type.

  * 3D point cloud semantic segmentation (3DPointCloudSemanticSegmentation),
  and adjustment and verification (Adjustment3DPointCloudSemanticSegmentation)
  labeling jobs for this task type.

  * 3D point cloud object tracking (3DPointCloudObjectTracking), and
  adjustment and verification (Adjustment3DPointCloudObjectTracking)
  labeling jobs for this task type.

If you are creating an adjustment or verification labeling job, you must
use a different LabelAttributeName than the one used in the original labeling
job. The original labeling job is the Ground Truth labeling job that produced
the labels that you want verified or adjusted. To learn more about adjustment
and verification labeling jobs, see Verify and Adjust Labels (https://docs.aws.amazon.com/sagemaker/latest/dg/sms-verification-data.html).

Regex Pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,126}$
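
A few hypothetical names show how the “-ref” rules above play out. Each fragment is a separate example, and the names themselves are invented for illustration.

```yaml
# Semantic segmentation (a listed task type): the name must end with "-ref".
labelAttributeName: road-scene-ref
---
# Image classification (not a listed task type): the name must not end with "-ref".
labelAttributeName: animal-category
---
# Adjustment job for the segmentation job above: a different name, still ending in "-ref".
labelAttributeName: road-scene-adjusted-ref
```
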
labelCategoryConfigS3URI
Optional
string
The S3 URI of the file, referred to as a label category configuration file,
that defines the categories used to label the data objects.

For 3D point cloud and video frame task types, you can add label category
attributes and frame attributes to your label category configuration file.
To learn how, see Create a Labeling Category Configuration File for 3D Point
Cloud Labeling Jobs (https://docs.aws.amazon.com/sagemaker/latest/dg/sms-point-cloud-label-category-config.html).

For named entity recognition jobs, in addition to “labels”, you must provide
worker instructions in the label category configuration file using the “instructions”
parameter: "instructions": {"shortInstruction": "<h1>Add header</h1><p>Add Instructions</p>", "fullInstruction": "<p>Add additional instructions.</p>"}.
For details and an example, see Create a Named Entity Recognition Labeling
Job (API) (https://docs.aws.amazon.com/sagemaker/latest/dg/sms-named-entity-recg.html#sms-creating-ner-api).

For all other built-in task types (https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html)
and custom tasks (https://docs.aws.amazon.com/sagemaker/latest/dg/sms-custom-templates.html),
your label category configuration file must be a JSON file in the following
format. Identify the labels you want to use by replacing label_1, label_2,…,label_n
with your label categories.

{
  "document-version": "2018-11-28",
  "labels": [{"label": "label_1"},{"label": "label_2"},…{"label": "label_n"}]
}

Note the following about the label category configuration file:

* For image classification and text classification (single and multi-label)
you must specify at least two label categories. For all other task types,
the minimum number of label categories required is one.

* Each label category must be unique; you cannot specify duplicate label
categories.

* If you create a 3D point cloud or video frame adjustment or verification
labeling job, you must include auditLabelAttributeName in the label category
configuration. Use this parameter to enter the LabelAttributeName (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html#sagemaker-CreateLabelingJob-request-LabelAttributeName)
of the labeling job you want to adjust or verify annotations of.

Regex Pattern: `^(https
labelingJobAlgorithmsConfig
Optional
object
Configures the information required to perform automated data labeling.
labelingJobAlgorithmsConfig.initialActiveLearningModelARN
Optional
string
labelingJobAlgorithmsConfig.labelingJobAlgorithmSpecificationARN
Optional
string
labelingJobAlgorithmsConfig.labelingJobResourceConfig
Optional
object
Configure encryption on the storage volume attached to the ML compute instance
used to run automated data labeling model training and inference.
labelingJobAlgorithmsConfig.labelingJobResourceConfig.volumeKMSKeyID
Optional
string
labelingJobAlgorithmsConfig.labelingJobResourceConfig.vpcConfig
Optional
object
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs,
hosted models, and compute resources have access to. You can control access
to and from your resources by configuring a VPC. For more information, see
Give SageMaker Access to Resources in your Amazon VPC (https://docs.aws.amazon.com/sagemaker/latest/dg/infrastructure-give-access.html).
labelingJobAlgorithmsConfig.labelingJobResourceConfig.vpcConfig.securityGroupIDs
Optional
array
labelingJobAlgorithmsConfig.labelingJobResourceConfig.vpcConfig.securityGroupIDs.[]
Required
string
labelingJobAlgorithmsConfig.labelingJobResourceConfig.vpcConfig.subnets.[]
Required
string
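
For orientation, an automated data labeling block could be laid out as follows. All identifiers are hypothetical; the specification ARN in particular is a placeholder pattern to replace with the ARN that matches your task type.

```yaml
labelingJobAlgorithmsConfig:
  # Placeholder pattern; substitute the specification ARN for your task type.
  labelingJobAlgorithmSpecificationARN: "arn:aws:sagemaker:<region>:<account-id>:labeling-job-algorithm-specification/<task-type>"
  labelingJobResourceConfig:
    volumeKMSKeyID: 1234abcd-12ab-34cd-56ef-1234567890ab   # hypothetical KMS key ID
    vpcConfig:
      securityGroupIDs:
      - sg-0123456789abcdef0       # hypothetical security group
      subnets:
      - subnet-0123456789abcdef0   # hypothetical subnet
```
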
outputConfig
Required
object
The location of the output data and the Amazon Web Services Key Management
Service key ID for the key used to encrypt the output data, if any.
outputConfig.kmsKeyID
Optional
string
outputConfig.s3OutputPath
Optional
string
outputConfig.snsTopicARN
Optional
string
roleARN
Required
string
The Amazon Resource Name (ARN) of the role that Amazon SageMaker assumes to perform
tasks on your behalf during data labeling. You must grant this role the necessary
permissions so that Amazon SageMaker can successfully complete data labeling.

Regex Pattern: ^arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+$
stoppingConditions
Optional
object
A set of conditions for stopping the labeling job. If any of the conditions
are met, the job is automatically stopped. You can use these conditions to
control the cost of data labeling.
stoppingConditions.maxHumanLabeledObjectCount
Optional
integer
stoppingConditions.maxPercentageOfInputDatasetLabeled
Optional
integer
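
As a small sketch of cost control with stopping conditions, the fragment below sets both limits; per the description above, the job stops as soon as either one is reached. The numbers are arbitrary examples.

```yaml
stoppingConditions:
  maxHumanLabeledObjectCount: 1000          # stop after 1,000 human-labeled objects
  maxPercentageOfInputDatasetLabeled: 50    # or once half of the input dataset is labeled
```
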
tags
Optional
array
An array of key/value pairs. For more information, see Using Cost Allocation
Tags (https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html#allocation-what)
in the Amazon Web Services Billing and Cost Management User Guide.
tags.[]
Required
object
A tag object that consists of a key and an optional value, used to manage
metadata for SageMaker Amazon Web Services resources.

You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html).

For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy (https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf).
tags.[].key
Optional
string
tags.[].value
Optional
string
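
A tags list might look like the following; the keys and values are hypothetical.

```yaml
tags:
- key: project          # hypothetical key
  value: image-labeling
- key: cost-center
  value: "1234"         # quoted so YAML keeps it a string
```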

Status

ackResourceMetadata: 
  arn: string
  ownerAccountID: string
  region: string
conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
failureReason: string
jobReferenceCode: string
labelCounters: 
  failedNonRetryableError: integer
  humanLabeled: integer
  machineLabeled: integer
  totalLabeled: integer
  unlabeled: integer
labelingJobOutput: 
  finalActiveLearningModelARN: string
  outputDatasetS3URI: string
labelingJobStatus: string
Field Description
ackResourceMetadata
Optional
object
All CRs managed by ACK have a common Status.ACKResourceMetadata member
that is used to contain resource sync state, account ownership, and the
constructed ARN for the resource.
ackResourceMetadata.arn
Optional
string
ARN is the Amazon Resource Name for the resource. This is a
globally-unique identifier and is set only by the ACK service controller
once the controller has orchestrated the creation of the resource OR
when it has verified that an “adopted” resource (a resource where the
ARN annotation was set by the Kubernetes user on the CR) exists and
matches the supplied CR’s Spec field values.
https://github.com/aws/aws-controllers-k8s/issues/270
ackResourceMetadata.ownerAccountID
Required
string
OwnerAccountID is the AWS Account ID of the account that owns the
backend AWS service API resource.
ackResourceMetadata.region
Required
string
Region is the AWS region in which the resource exists or will exist.
conditions
Optional
array
All CRs managed by ACK have a common Status.Conditions member that
contains a collection of ackv1alpha1.Condition objects that describe
the various terminal states of the CR and its backend AWS service API
resource.
conditions.[]
Required
object
Condition is the common struct used by all CRDs managed by ACK service
controllers to indicate terminal states of the CR and its backend AWS
service API resource.
conditions.[].message
Optional
string
A human readable message indicating details about the transition.
conditions.[].reason
Optional
string
The reason for the condition’s last transition.
conditions.[].status
Optional
string
Status of the condition, one of True, False, Unknown.
conditions.[].type
Optional
string
Type is the type of the Condition.
failureReason
Optional
string
If the job failed, the reason that it failed.
jobReferenceCode
Optional
string
A unique identifier for work done as part of a labeling job.

Regex Pattern: ^.+$
labelCounters
Optional
object
Provides a breakdown of the number of data objects labeled by humans, the
number of objects labeled by machine, the number of objects that couldn’t
be labeled, and the total number of objects labeled.
labelCounters.failedNonRetryableError
Optional
integer
labelCounters.humanLabeled
Optional
integer
labelCounters.machineLabeled
Optional
integer
labelCounters.totalLabeled
Optional
integer
labelCounters.unlabeled
Optional
integer
labelingJobOutput
Optional
object
The location of the output produced by the labeling job.
labelingJobOutput.finalActiveLearningModelARN
Optional
string
labelingJobOutput.outputDatasetS3URI
Optional
string
labelingJobStatus
Optional
string
The processing status of the labeling job.
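
To show how the status fields read together, here is an illustrative status for a finished job. Every value is made up. The condition type ACK.ResourceSynced and the status value Completed come from ACK and the SageMaker API rather than from this page, and the sketch assumes totalLabeled is the sum of humanLabeled and machineLabeled.

```yaml
# Illustrative status of a finished job; every value is made up.
status:
  ackResourceMetadata:
    arn: arn:aws:sagemaker:us-west-2:111122223333:labeling-job/example-labeling-job
    ownerAccountID: "111122223333"
    region: us-west-2
  conditions:
  - type: ACK.ResourceSynced
    status: "True"
    lastTransitionTime: "2024-01-01T00:00:00Z"
  labelingJobStatus: Completed
  labelCounters:
    humanLabeled: 800
    machineLabeled: 150
    failedNonRetryableError: 0
    unlabeled: 50
    totalLabeled: 950        # assumed to be humanLabeled + machineLabeled
  labelingJobOutput:
    outputDatasetS3URI: s3://example-bucket/output/example-labeling-job/output.manifest
```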