FeatureGroup

sagemaker.services.k8s.aws/v1alpha1

Type | Link
GoDoc | sagemaker-controller/apis/v1alpha1#FeatureGroup

Metadata

Property | Value
Scope | Namespaced
Kind | FeatureGroup
ListKind | FeatureGroupList
Plural | featuregroups
Singular | featuregroup

Amazon SageMaker Feature Store stores features in a collection called a Feature Group. A Feature Group can be visualized as a table in which each column is a feature and each row has a unique identifier. In principle, a Feature Group is composed of features and the values recorded for each feature.

Spec

description: string
eventTimeFeatureName: string
featureDefinitions:
  collectionConfig: 
    vectorConfig: 
      dimension: integer
  collectionType: string
  featureName: string
  featureType: string
featureGroupName: string
offlineStoreConfig: 
  dataCatalogConfig: 
    catalog: string
    database: string
    tableName: string
  disableGlueTableCreation: boolean
  s3StorageConfig: 
    kmsKeyID: string
    resolvedOutputS3URI: string
    s3URI: string
onlineStoreConfig: 
  enableOnlineStore: boolean
  securityConfig: 
    kmsKeyID: string
  storageType: string
  ttlDuration: 
    unit: string
    value: integer
recordIdentifierFeatureName: string
roleARN: string
tags:
- key: string
  value: string
throughputConfig: 
  provisionedReadCapacityUnits: integer
  provisionedWriteCapacityUnits: integer
  throughputMode: string
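
As a rough illustration of how the schema above maps to a manifest, the following is a minimal sketch that sets only the required fields plus an online store. The object name, namespace, feature names, and description are hypothetical placeholders; the apiVersion and kind come from the metadata at the top of this page.

```yaml
# Minimal sketch; every name and value below is a hypothetical placeholder.
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: FeatureGroup
metadata:
  name: customer-features            # Kubernetes object name (assumed)
  namespace: default                 # FeatureGroup is a namespaced resource
spec:
  featureGroupName: customer-features
  description: Example customer features
  # Both names below must also appear in featureDefinitions.
  recordIdentifierFeatureName: customer_id
  eventTimeFeatureName: event_time
  featureDefinitions:
    - featureName: customer_id
      featureType: String
    - featureName: event_time
      featureType: Fractional        # Unix timestamp in seconds
    - featureName: total_orders
      featureType: Integral
  onlineStoreConfig:
    enableOnlineStore: true
```

The event-time feature is declared as Fractional here; declaring it as String instead would mean records carry ISO-8601 timestamps.
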
Field | Description
description
Optional
string
A free-form description of a FeatureGroup.
eventTimeFeatureName
Required
string
The name of the feature that stores the EventTime of a Record in a FeatureGroup.


An EventTime is a point in time when a new event occurs that corresponds
to the creation or update of a Record in a FeatureGroup. All Records in the
FeatureGroup must have a corresponding EventTime.


An EventTime can be a String or Fractional.


* Fractional: EventTime feature values must be a Unix timestamp in seconds.


* String: EventTime feature values must be an ISO-8601 string in one of the
following supported formats: yyyy-MM-dd'T'HH:mm:ssZ or yyyy-MM-dd'T'HH:mm:ss.SSSZ,
where yyyy, MM, and dd represent the year, month, and day respectively,
and HH, mm, ss, and, if applicable, SSS represent the hour, minute, second,
and milliseconds respectively. 'T' and Z are constants.
featureDefinitions
Required
array
A list of Feature names and types. Name and Type are compulsory per Feature.


Valid FeatureTypes are Integral, Fractional and String.


FeatureNames cannot be any of the following: is_deleted, write_time, api_invocation_time.


You can create up to 2,500 FeatureDefinitions per FeatureGroup.
featureDefinitions.[]
Required
object
A list of features. You must include FeatureName and FeatureType. Valid
FeatureTypes are Integral, Fractional and String.
featureDefinitions.[].collectionConfig.vectorConfig
Optional
object
Configuration for your vector collection type.
featureDefinitions.[].collectionConfig.vectorConfig.dimension
Optional
integer
featureDefinitions.[].collectionType
Optional
string
featureDefinitions.[].featureName
Optional
string
featureDefinitions.[].featureType
Optional
string
featureGroupName
Required
string
The name of the FeatureGroup. The name must be unique within an Amazon Web
Services Region in an Amazon Web Services account. The name:


* Must start and end with an alphanumeric character.


* Can only contain alphanumeric characters and hyphens. Spaces are not
allowed.
offlineStoreConfig
Optional
object
Use this to configure an OfflineFeatureStore. This parameter allows you to
specify:


* The Amazon Simple Storage Service (Amazon S3) location of an OfflineStore.


* A configuration for an Amazon Web Services Glue or Amazon Web Services
Hive data catalog.


* A KMS encryption key to encrypt the Amazon S3 location used for OfflineStore.
If a KMS encryption key is not specified, by default we encrypt all data
at rest using an Amazon Web Services KMS key. By defining your bucket-level
key (https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-key.html)
for SSE, you can reduce Amazon Web Services KMS request costs by up to
99 percent.


* Format for the offline store table. Supported formats are Glue (Default)
and Apache Iceberg (https://iceberg.apache.org/).


To learn more about this parameter, see OfflineStoreConfig (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_OfflineStoreConfig.html).
offlineStoreConfig.dataCatalogConfig
Optional
object
The metadata of the Glue table that serves as the data catalog for the OfflineStore.
offlineStoreConfig.dataCatalogConfig.catalog
Optional
string
offlineStoreConfig.dataCatalogConfig.database
Optional
string
offlineStoreConfig.dataCatalogConfig.tableName
Optional
string
offlineStoreConfig.disableGlueTableCreation
Optional
boolean
offlineStoreConfig.s3StorageConfig
Optional
object
The Amazon Simple Storage Service (Amazon S3) location and security configuration
for OfflineStore.
offlineStoreConfig.s3StorageConfig.kmsKeyID
Optional
string
offlineStoreConfig.s3StorageConfig.resolvedOutputS3URI
Optional
string
offlineStoreConfig.s3StorageConfig.s3URI
Optional
string
onlineStoreConfig
Optional
object
You can turn the OnlineStore on by specifying True for the EnableOnlineStore
flag in OnlineStoreConfig, or off by specifying False.


You can also include an Amazon Web Services KMS key ID (KMSKeyId) for at-rest
encryption of the OnlineStore.


The default value is False.
onlineStoreConfig.enableOnlineStore
Optional
boolean
onlineStoreConfig.securityConfig
Optional
object
The security configuration for OnlineStore.
onlineStoreConfig.securityConfig.kmsKeyID
Optional
string
onlineStoreConfig.storageType
Optional
string
onlineStoreConfig.ttlDuration
Optional
object
Time to live duration, where the record is hard deleted after the expiration
time is reached; ExpiresAt = EventTime + TtlDuration. For information on
HardDelete, see the DeleteRecord (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_feature_store_DeleteRecord.html)
API in the Amazon SageMaker API Reference guide.
onlineStoreConfig.ttlDuration.unit
Optional
string
onlineStoreConfig.ttlDuration.value
Optional
integer
recordIdentifierFeatureName
Required
string
The name of the Feature whose value uniquely identifies a Record defined
in the FeatureStore. Only the latest record per identifier value will be
stored in the OnlineStore. RecordIdentifierFeatureName must match the name
of one of the feature definitions.


You use the RecordIdentifierFeatureName to access data in a FeatureStore.


This name:


* Must start and end with an alphanumeric character.


* Can only contain alphanumeric characters, hyphens, and underscores. Spaces
are not allowed.
roleARN
Optional
string
The Amazon Resource Name (ARN) of the IAM execution role used to persist
data into the OfflineStore if an OfflineStoreConfig is provided.
tags
Optional
array
Tags used to identify Features in each FeatureGroup.
tags.[]
Required
object
A tag object that consists of a key and an optional value, used to manage
metadata for SageMaker Amazon Web Services resources.

You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AddTags.html).

For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources (https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html). For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy (https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf).
tags.[].key
Optional
string
tags.[].value
Optional
string
throughputConfig
Optional
object
Used to set feature group throughput configuration. There are two modes:
ON_DEMAND and PROVISIONED. With on-demand mode, you are charged for data
reads and writes that your application performs on your feature group. You
do not need to specify read and write throughput because Feature Store accommodates
your workloads as they ramp up and down. You can switch a feature group to
on-demand only once in a 24 hour period. With provisioned throughput mode,
you specify the read and write capacity per second that you expect your application
to require, and you are billed based on those limits. Exceeding provisioned
throughput will result in your requests being throttled.


Note: PROVISIONED throughput mode is supported only for feature groups that
are offline-only, or use the Standard (https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_OnlineStoreConfig.html#sagemaker-Type-OnlineStoreConfig-StorageType)
tier online store.
throughputConfig.provisionedReadCapacityUnits
Optional
integer
throughputConfig.provisionedWriteCapacityUnits
Optional
integer
throughputConfig.throughputMode
Optional
string
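
To show how the optional store, throughput, and tag fields might fit together, here is a hedged spec fragment. The role ARN, bucket, tag, and capacity numbers are hypothetical placeholders. The enum strings (Standard, Days, Provisioned) are assumed from the SageMaker API reference rather than stated on this page, so verify them against the API before relying on them.

```yaml
# Spec fragment only; required fields such as featureGroupName are omitted.
spec:
  # Execution role used to persist data into the OfflineStore (placeholder ARN).
  roleARN: arn:aws:iam::123456789012:role/example-feature-store-role
  offlineStoreConfig:
    s3StorageConfig:
      s3URI: s3://example-bucket/feature-store/   # placeholder bucket
  onlineStoreConfig:
    enableOnlineStore: true
    storageType: Standard            # assumed enum value
    ttlDuration:                     # ExpiresAt = EventTime + TtlDuration
      unit: Days                     # assumed enum value
      value: 30
  throughputConfig:
    throughputMode: Provisioned      # assumed enum value
    provisionedReadCapacityUnits: 10
    provisionedWriteCapacityUnits: 10
  tags:
    - key: team
      value: ml-platform
```

Provisioned mode is paired with the Standard storage tier in this sketch because the note above restricts provisioned throughput to offline-only feature groups or those using the Standard tier online store.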

Status

ackResourceMetadata: 
  arn: string
  ownerAccountID: string
  region: string
conditions:
- lastTransitionTime: string
  message: string
  reason: string
  status: string
  type: string
failureReason: string
featureGroupStatus: string
Field | Description
ackResourceMetadata
Optional
object
All CRs managed by ACK have a common Status.ACKResourceMetadata member
that is used to contain resource sync state, account ownership, and the
constructed ARN for the resource.
ackResourceMetadata.arn
Optional
string
ARN is the Amazon Resource Name for the resource. This is a
globally-unique identifier and is set only by the ACK service controller
once the controller has orchestrated the creation of the resource OR
when it has verified that an “adopted” resource (a resource where the
ARN annotation was set by the Kubernetes user on the CR) exists and
matches the supplied CR’s Spec field values.
ackResourceMetadata.ownerAccountID
Required
string
OwnerAccountID is the AWS Account ID of the account that owns the
backend AWS service API resource.
ackResourceMetadata.region
Required
string
Region is the AWS region in which the resource exists or will exist.
conditions
Optional
array
All CRs managed by ACK have a common Status.Conditions member that
contains a collection of ackv1alpha1.Condition objects that describe
the various terminal states of the CR and its backend AWS service API
resource.
conditions.[]
Required
object
Condition is the common struct used by all CRDs managed by ACK service
controllers to indicate terminal states of the CR and its backend AWS
service API resource
conditions.[].message
Optional
string
A human readable message indicating details about the transition.
conditions.[].reason
Optional
string
The reason for the condition’s last transition.
conditions.[].status
Optional
string
Status of the condition, one of True, False, Unknown.
conditions.[].type
Optional
string
Type is the type of the Condition
failureReason
Optional
string
The reason that the FeatureGroup failed to be replicated in the OfflineStore.
This failure can occur because:


* The FeatureGroup could not be created in the OfflineStore.


* The FeatureGroup could not be deleted from the OfflineStore.
featureGroupStatus
Optional
string
The status of the feature group.
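
For orientation, a status block reported by the controller might look roughly like the sketch below. All values are illustrative placeholders: the account ID, region, ARN, and timestamp are invented, and the condition type (ACK.ResourceSynced) and status string (Created) are assumptions based on common ACK and SageMaker conventions, not values taken from this page.

```yaml
# Illustrative only; these fields are set by the controller, not by the user.
status:
  ackResourceMetadata:
    arn: arn:aws:sagemaker:us-west-2:123456789012:feature-group/customer-features
    ownerAccountID: "123456789012"
    region: us-west-2
  conditions:
    - type: ACK.ResourceSynced       # assumed condition type
      status: "True"                 # one of True, False, Unknown
      lastTransitionTime: "2024-01-15T10:30:00Z"
  featureGroupStatus: Created        # assumed status value
```

When failureReason is present, it describes why replication to the OfflineStore failed, as listed above.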