Tokens, Records, and Aliases

Tokenization, when applied to data security, is the process of substituting a sensitive data element with a non-sensitive equivalent, referred to as a token, that has no extrinsic or exploitable meaning or value. The token is a reference (i.e. identifier) that maps back to the sensitive data through a tokenization system. (Source: Wikipedia)

At VGS, a token is called a Record. Each Record stores a Value, and the reference identifier that points to it is called an Alias. A Record can have one or many Aliases, allowing you to represent a single element of sensitive data in many ways depending on the permissions, use case, or sensitivity of the data.

Record(alias="tok_sandbox_asdf1234", value="your sensitive data here")

Records are stored securely within your VGS Vault and are subject to data retention policies and storage configuration based on user preference or compliance requirements.
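As a rough illustration of the Record, Alias, and Value relationship described above, the following sketch models one value that is reachable through several aliases. The class, field, and method names are hypothetical and are not part of any VGS SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical in-memory model: one sensitive value, one or more aliases."""
    value: str
    aliases: list[str] = field(default_factory=list)

    def add_alias(self, alias: str) -> None:
        # The same value can be represented by several aliases.
        if alias not in self.aliases:
            self.aliases.append(alias)


record = Record(value="your sensitive data here")
record.add_alias("tok_sandbox_asdf1234")
record.add_alias("76b8a972-f0a7-4e87-a07f-a51c92314d56")  # a RAW_UUID-style alias
```

Both aliases in this sketch point to the same value; which one you hand out could depend on the consumer's permissions or use case.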

Retention Policies

A retention policy (also known as a 'schedule') is a key part of the lifecycle of a record. It describes how long a business needs to keep a piece of information (a record), where it is stored, and how to dispose of the record when it's time.

Records within your VGS Vault are kept in either a persistent or a volatile storage medium, which gives you custom control over data retention policies and allows compliant storage of any sensitive data. This includes full PCI Level 1 compliance when storing card security code (CSC) data for both merchant processing and card issuing scenarios.

Classification and Tagging

Data classification is the process of organizing data into categories that make it easy to retrieve, sort, and store for future use. A well-planned data classification system makes essential data easy to find and retrieve. This can be of particular importance for risk management, legal discovery, and compliance.

Through the use of the VGS Tokenization API and VGS Routes you can classify and tag data using either static or dynamic tagging policies.

Routes allow you to restrict certain classes of data to specific destinations. This gives you fine-grained, policy-based control over where data within your VGS Vault is routed, managed from a single centralized location.
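To make the idea of restricting classes of data to specific destinations more concrete, here is a minimal sketch of a tag-based policy check. It is not VGS Route configuration: the policy table, tag names, hostnames, and the `may_route` function are all assumptions made for illustration.

```python
# Hypothetical policy table: data class (tag) -> destinations allowed to receive it.
ALLOWED_DESTINATIONS = {
    "card-number": {"payments.example.com"},
    "ssn": {"kyc.example.com"},
}


def may_route(tags: set[str], destination: str) -> bool:
    """Allow routing only if every tag on the data permits the destination.

    Tags missing from the policy table permit no destination (deny by default).
    Data with no tags at all is unrestricted in this sketch.
    """
    return all(destination in ALLOWED_DESTINATIONS.get(tag, set()) for tag in tags)


print(may_route({"card-number"}, "payments.example.com"))
print(may_route({"card-number"}, "kyc.example.com"))
```

Keeping the policy in one table mirrors the "single centralized location" idea: changing where a class of data may go means editing one entry rather than every caller.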

Alias Formats

Formats come in several varieties to choose from based on your use case.

Format
Description

UUID

Represents any piece of data. This format returns a surrogate value like `tok_sandbox_xxxxxxxxxxxxxxxxxxxxxxxxx`. You can match this format using the regex `(vgs|tok)_[A-Za-z0-9-]+_[a-zA-Z0-9\-]{4,32}`.

NUM_LENGTH_PRESERVING

Can be used for any number that needs to have its length maintained, for form validation or for other reasons where the length of the returned value matters. This does not support numbers less than 3.

FPE_SIX_T_FOUR

To be used for payment cards when the alias must still pass a validation check and retain the BIN (Bank Identification Number) and the last four digits. For example, `4111111111111111` becomes something like `4111119381251111`.

FPE_T_FOUR

To be used for payment cards when you do not need the BIN but the alias must still be Luhn valid to pass validation checks on your system. For example, `5555555555554444` would become something like `9399630812244444`.

PFPT

This format makes it easy to distinguish between real sensitive data and the surrogate values. For example, `4012888888881881` turns into `9914040119524511881`. The prefix here is `99`, with `1` reserved for versioning of this format. The 4th and 5th digits represent the first two digits of the original PAN, and the last four digits represent the last four from the original PAN.

NON_LUHN_FPE_ALPHANUMERIC

This format generates a format-preserving card number that **does not pass the [Luhn validation](https://en.wikipedia.org/wiki/Luhn_algorithm)**. This is useful for making sure that surrogate data can be identified algorithmically. Example: `7858402423279985`.

FPE_SSN_T_FOUR

Can be used for Social Security numbers (SSNs), with or without dashes. For example, `567-34-5672` would have an alias like `123-945-5672`; the last four digits are the same.

FPE_ACC_NUM_T_FOUR

Can be used for numeric account numbers. The length of the value can range from 7 to 17. The last four digits of the value are kept untouched.

FPE_ALPHANUMERIC_ACC_NUM_T_FOUR

Generates an alphanumeric account number string with a length of 7 to 17 characters. The last four characters of the value are kept untouched.

GENERIC_T_FOUR

Can be used for any type of data. This generates an alias with the last four characters of the original value appended after the alias: `tok_sandbox_xxxxxxxxxxxxxxxxxxxxxxxxx_:last_four`, where the x’s are alphanumeric characters. The length of the value must be greater than or equal to 7.

RAW_UUID

Can be used for any type of data. This will generate an Alias with a UUID format. Example: `76b8a972-f0a7-4e87-a07f-a51c92314d56`.

ALPHANUMERIC_SIX_T_FOUR

This format generates an alias that combines the BIN, an alphanumeric part, and the last 4 digits. Example of a generated alias: `785840aLpH4nUmV9985`. The last character of the alphanumeric part indicates whether the original value was Luhn valid: `V` means it was valid and `N` means it was not. An alias in this format will be created _only_ if the value is numeric and its length is more than 13 symbols.

VGS_FIXED_LEN_GENERIC

This format generates an alias with a fixed length of 29 characters regardless of input. The first 3 characters of every alias are `vgs`. The 4th through 6th characters identify the environment (e.g. `sbx`, `l01`). Example of a generated alias: `vgsl0100VwAc1nxucgPiPAhUcZ3AF`.
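The formats above can often be recognized by their shape. As a sketch, the snippet below applies the regex given for the UUID format and splits a PFPT-style alias into the parts described in its row. The function names are hypothetical, and the PFPT check assumes only what is stated above: a `99` prefix, a version digit, and a fixed length of 19 digits.

```python
import re

# Regex quoted in the UUID format description.
UUID_FORMAT_RE = re.compile(r"(vgs|tok)_[A-Za-z0-9-]+_[a-zA-Z0-9\-]{4,32}")

# Assumed shape of a PFPT alias: "99" followed by 17 more digits.
PFPT_RE = re.compile(r"99[0-9]{17}")


def looks_like_uuid_format_alias(text: str) -> bool:
    """True if the whole string matches the documented UUID-format regex."""
    return UUID_FORMAT_RE.fullmatch(text) is not None


def split_pfpt(alias: str) -> dict:
    """Break a PFPT-style alias into the parts described in the table."""
    if PFPT_RE.fullmatch(alias) is None:
        raise ValueError("not a PFPT-style alias")
    return {
        "prefix": alias[:2],            # always "99"
        "version": alias[2],            # digit reserved for format versioning
        "pan_first_two": alias[3:5],    # 4th and 5th digits
        "pan_last_four": alias[-4:],
    }


print(looks_like_uuid_format_alias("tok_sandbox_asdf1234"))
print(split_pfpt("9914040119524511881"))
```

For the PFPT example from the table, the split yields `40` and `1881`, which match the first two and last four digits of the original `4012888888881881`.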

Length- or format-preserving formats will generate a RAW_UUID alias if you provide an invalid value or if you exceed the possible unique combinations of available aliases. This applies to the following formats:

1. Generic - Numeric Length Preserving
2. Payment Card - Format Preserving, Luhn Valid (T4)
3. Payment Card - Prefixed, Luhn Valid, 19 Digits Fixed Length
4. SSN - Format Preserving (A4)
5. Account Number - Numeric Length Preserving (A4)
6. Account Number - Alphanumeric Length Preserving (A4)

Under the same conditions, the following formats will generate a UUID alias instead:

1. Generic - VGS Alias last 4
2. Payment Card - Format Preserving, Luhn Valid (6T4)
3. Numeric - Include Alphanumeric, 19 symbols length (6T4)
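Several of the payment card formats are described as Luhn valid, and one as deliberately not Luhn valid. The following sketch is a plain implementation of the standard Luhn checksum, run against the example aliases quoted in the format descriptions. The function name is our own; nothing here calls a VGS API.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    if not number.isdigit():
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for position, ch in enumerate(reversed(number)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111119381251111"))     # FPE_SIX_T_FOUR example -> True
print(luhn_valid("9399630812244444"))     # FPE_T_FOUR example -> True
print(luhn_valid("9914040119524511881"))  # PFPT example -> True
print(luhn_valid("7858402423279985"))     # NON_LUHN_FPE_ALPHANUMERIC example -> False
```

A check like this is one way to tell the non-Luhn surrogate apart from a real card number algorithmically, as the NON_LUHN_FPE_ALPHANUMERIC description suggests.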

Fingerprinting

When the fingerprinting feature is enabled, it ensures that every time a specific value is redacted, the same token alias is returned. This helps to minimize token duplication, which occurs when multiple aliases refer to the same value.

For example, if you redact the string 4111111111111111 twice with the fingerprinting feature enabled, both times you will get the same alias, tok_live_5TsdDFxbATPKOTJFvRSHGn.

However, if the fingerprinting feature is disabled, redacting 4111111111111111 twice will yield two different aliases. The first redaction might return tok_live_5TsdDFxbATPKOTJFvRSHGn, while the second redaction might return tok_live_4PdsSGsfZKQLROLDcETGEf.

It’s important to note that this feature only applies to new redactions. If a value was stored while the feature was off, it will continue to be retrievable (via reveal) with the previous aliases assigned to it. To ensure consistent behavior, you would need to delete the previous aliases.
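The behavior described in this section can be sketched with a toy in-memory vault. This is not how VGS implements fingerprinting: the `ToyVault` class, its methods, and the alias generation are assumptions for illustration only, and a real system would not use the raw value as a lookup key.

```python
import secrets


class ToyVault:
    """Hypothetical in-memory stand-in for a vault; not the VGS implementation."""

    def __init__(self, fingerprinting: bool) -> None:
        self.fingerprinting = fingerprinting
        self._value_by_alias: dict[str, str] = {}
        # Populated only for redactions made while fingerprinting is on.
        self._alias_by_value: dict[str, str] = {}

    def redact(self, value: str) -> str:
        # With fingerprinting on, reuse the alias already issued for this value.
        if self.fingerprinting and value in self._alias_by_value:
            return self._alias_by_value[value]
        alias = "tok_sandbox_" + secrets.token_hex(11)
        self._value_by_alias[alias] = value
        if self.fingerprinting:
            self._alias_by_value[value] = alias
        return alias

    def reveal(self, alias: str) -> str:
        return self._value_by_alias[alias]


vault = ToyVault(fingerprinting=False)
old_alias = vault.redact("4111111111111111")

vault.fingerprinting = True  # the feature only affects redactions from here on
new_alias = vault.redact("4111111111111111")

print(new_alias == vault.redact("4111111111111111"))  # same alias on repeat
print(vault.reveal(old_alias) == vault.reveal(new_alias))  # old alias still works
```

In this sketch the alias created before the switch keeps resolving to the value, which is why the older aliases would have to be deleted to get fully consistent behavior.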

Record usage and failed reveals

When revealing aliases, an error might occur indicating that the reveal failed. VGS lets you track both successful and failed record usage via Observability (Prometheus integration).

Failed reveals (record_usage_failure metric) might occur due to the following errors:

  • NOT_FOUND - Exception raised if searching for an object and it’s not found.

  • ACCESS_DENIED - Exception raised if searching for an object and it’s found but access is denied based on request classifiers and tags on a token.

  • TOKENIZATION_FAILED - Exception raised if there are issues during tokenization or detokenization of a value.
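The error codes above can be modeled as follows. This is a simplified sketch, not the VGS reveal API or its Prometheus exporter: the store, the tag check, the exception class, and the use of a `Counter` as a stand-in for the `record_usage_failure` metric are all assumptions.

```python
from collections import Counter
from enum import Enum


class RevealError(Enum):
    NOT_FOUND = "NOT_FOUND"
    ACCESS_DENIED = "ACCESS_DENIED"
    TOKENIZATION_FAILED = "TOKENIZATION_FAILED"  # listed for completeness; never raised below


class RevealFailed(Exception):
    def __init__(self, code: RevealError) -> None:
        super().__init__(code.value)
        self.code = code


# Hypothetical store: alias -> (value, tags a caller must hold to reveal it).
STORE = {"tok_sandbox_asdf1234": ("your sensitive data here", {"pii"})}

# Stand-in for the record_usage_failure metric, counted per error code.
record_usage_failure = Counter()


def reveal(alias: str, caller_tags: set) -> str:
    try:
        if alias not in STORE:
            raise RevealFailed(RevealError.NOT_FOUND)
        value, required_tags = STORE[alias]
        if not required_tags <= caller_tags:
            raise RevealFailed(RevealError.ACCESS_DENIED)
        return value
    except RevealFailed as exc:
        # Count the failure by error code, then let the caller handle it.
        record_usage_failure[exc.code.value] += 1
        raise


try:
    reveal("tok_sandbox_unknown", {"pii"})
except RevealFailed as exc:
    print("reveal failed:", exc.code.value)
```

Counting failures per error code makes it possible to distinguish, for example, a spike in missing aliases from a spike in permission problems when reading the metric.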
