Configuration

Service configuration structure
Default configuration

EPTS loads configuration data from a file (typically backed by a Kubernetes config map). The location of the configuration file is sourced from an environment variable. If the service is unable to retrieve the configuration data, it will not start. Refer to the configuration for details.

Service configuration structure

The recommended configuration format is YAML. The sections below describe the officially supported configuration options that influence various aspects of the service functionality. The service may support options other than the ones listed below, but those are not part of the public API and may be changed or deleted at any time.

General

EPTS uses its service instance name and replica ID for communication with other services and load balancing across replicas (for example, see 3/ISM). You can control the values with the following configuration options:

service:
  instance:
    name: <string>          # Service instance name, should be unique cluster-wide. "epts" by default.
    replica.id: <string>    # Service instance replica ID, should be unique cluster-wide. Random string by default.
  graceful-shutdown: <uint> # Allowed service graceful shutdown period in seconds. 30 seconds by default.

The REST API server is controlled with the following options.

server:
  port: <integer>           # Server port to expose the REST API on. 80 by default.
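
As an illustration, the following sketch overrides the general and server options described above. The instance name and the numeric values are hypothetical and chosen only for the example; the replica ID is left out so that the random default applies.

```yaml
service:
  instance:
    name: epts-telemetry     # hypothetical instance name, must be unique cluster-wide
  graceful-shutdown: 60      # allow up to 60 seconds for graceful shutdown instead of the default 30
server:
  port: 8090                 # expose the REST API on port 8090 instead of the default 80
```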

Kaa applications

Many Kaa services can be configured for different behavior depending on the application version of the endpoint the processed data relates to. This is called application-specific behavior and is handled in service configurations under kaa.applications. Alternatively, the application-specific configuration can be sourced from Kaa Tekton. See the Tekton configuration section to find out how to configure such integration.

kaa:
  applications:
    <application 1 name>:
      versions:
        <application 1 version 1 name>:
        <application 1 version 2 name>:  # Multiple application versions can be configured under the "versions" key
          ...
    <application 2 name>:                # Multiple applications can be configured under the "applications" key
      ...

For example:

kaa:
  applications:
    smart_kettle:
      versions:
        kettle_v1:
        kettle_v2:
        kettle_v3:

For compatibility reasons, application and application version names must be limited to lowercase Latin letters (a-z), digits (0-9), dashes (-) and underscores (_).

Time series configuration

Time series in EPTS are defined within the scope of their applications.

The configuration consists of two logical parts:

- time series definition specifies the structure of a given time series: which values exist and what their types are;
- time series extraction configuration specifies the rules for extracting time series data points from data samples received from endpoints.

Both time series definition and extraction configuration can be specified within an application. However, only the extraction configuration can be overridden in an application version. The time series definition is the same application-wide for consistency and convenience reasons. If you need a change to the time series structure in a certain application version, that deserves a new time series name.

The full structure of the time series configuration is shown below. We will review it in more detail in the following subsections.

kaa:
  applications:
    <application name>:
      time-series:
        auto-extract: <boolean>                   # Whether to auto-extract all numeric values from data sample. `false` by default.
        timestamp:
          path: <gjson-path>                      # Path of the timestamp field in data samples. Optional.
          format: <timestamp-format>              # Format of the timestamp in data sample. "iso8601" by default.
          fallback-strategy: <fallback-strategy>  # Fallback strategy to use. `fail` by default.
        names:
          <time series name>:
            values:
            - name: <string>          # Time series value name. "value" by default.
              type: <string>          # Time series value type. "number" by default.
              path: <gjson-path>      # Path to the value in the incoming data sample. Same as `name` by default.
      versions:
        <application version name>:
          time-series:                # Time series extraction configuration override for an application version.
            auto-extract: <boolean>
            timestamp:
              path: <gjson-path>
              format: <timestamp-format>
              fallback-strategy: <fallback-strategy>
            names:
              <time series name>:     # One of time series names defined at the application level.
                values:
                - name: <string>      # Name of a time series value field. Must be defined at the application level.
                  path: <gjson-path>  # Path to the value in the incoming data sample for the given application version. Required.

Time series definition

Time series structure is defined once per application under kaa.applications.<application name>.time-series.names. Configuration fields that contribute to the definition are:

- <time series name>: the time series name. Must be unique application-wide and limited to Latin letters (a-z, A-Z), digits (0-9), dashes (-) and underscores (_), e.g. temperature, ground_speed, or 402-metres-time.
- <time series name>.values: list of time series values.
- <time series name>.values[].name: time series value name. Must be unique time series-wide and must not include any of: backslash (\), circumflex accent (^), dollar sign ($), single quotation mark ('), double quotation mark ("), equal sign (=), or comma (,). Defaults to value. The names time, kaaEndpointID, kaaEmpty, _field, and _measurement are reserved and cannot be used.
- <time series name>.values[].type: time series value type. One of number, string, or bool, where number is stored as a 64-bit float. null values are permitted for any value type.
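
A minimal sketch of a time series definition that follows the fields above. The application name reuses the earlier smart_kettle example; the time series and value names are hypothetical. The path option is omitted, so each value is looked up by its name.

```yaml
kaa:
  applications:
    smart_kettle:
      time-series:
        names:
          temperature:          # time series with a single numeric value
            values:
            - name: value       # same as the default value name
              type: number      # stored as a 64-bit float
          status:               # time series with two values of different types
            values:
            - name: mode
              type: string
            - name: heating
              type: bool
```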

Time series extraction

Upon receiving data samples (which are basically arbitrary JSON records), EPTS extracts data points according to the time series extraction configuration.

IMPORTANT: In the process of time series extraction, EPTS attempts to match the data sample against each of the time series configured in the corresponding application. For the match to occur and a data point to be extracted into a given time series, all time series values must be found in the data sample by their respective configured paths, and the type of each found field must match the expected value type. If the processed data sample does not contain a field when searched by the value path, or the field type does not match the time series specification, EPTS concludes that the data sample does not contain the given time series data, skips the extraction of a data point for that time series, and proceeds to the next configured time series.

number time-series values can be extracted from JSON integers or numbers, or parsed from JSON strings, when possible. Note that an explicit null value in a data sample satisfies any value type and will be processed by EPTS as an empty value.

As the above shows, you can think of the time series configured in EPTS as filters applied to the received data samples. Data points are only added to a time series when a data sample contains all of its configured values with appropriate data types. This is especially useful when your connected devices take and report different measurements at different intervals. For example, a device can submit environmental conditions (e.g. temperature and humidity) every 10 minutes and the battery level every hour. Every hour, a data sample reported from the device would then contain an additional battery field. If you configure one time series to extract environmental conditions and another to extract the battery level, EPTS will extract the former from data samples every 10 minutes and the latter every hour.
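
The environmental conditions and battery scenario could be configured roughly as follows. This is a sketch under assumptions: the application name, time series names, and data sample field names are hypothetical.

```yaml
kaa:
  applications:
    weather_station:              # hypothetical application name
      time-series:
        names:
          environment:            # matches samples that carry both temperature and humidity
            values:
            - name: temperature
              type: number
            - name: humidity
              type: number
          battery:                # matches only samples that carry a battery field
            values:
            - name: level
              type: number
              path: battery       # read the value from the "battery" field of the sample
# {"temperature": 21.5, "humidity": 40}                 -> data point in "environment" only
# {"temperature": 21.5, "humidity": 40, "battery": 87}  -> data points in "environment" and "battery"
```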

Base time series extraction configuration is defined application-wide and is applied by default, unless you override it for a specific application version.

kaa.applications.<application name>.time-series.names.<time series name>.values[].path defines the path to the value in the data sample. EPTS uses GJSON for extracting time series values, so you can use the path syntax specified by GJSON. This setting can be overridden for an application version using the kaa.applications.<application name>.versions.<application version name>.time-series.names.<time series name>.values[].path option.
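
A sketch of a per-version path override, assuming a hypothetical case where kettle_v2 devices nest the reading under a sensors object while other versions report it at the top level. The field names are illustrative only.

```yaml
kaa:
  applications:
    smart_kettle:
      time-series:
        names:
          temperature:
            values:
            - name: value
              type: number
              path: temperature              # base path, e.g. {"temperature": 95}
      versions:
        kettle_v1:                           # no override, the application-level path applies
        kettle_v2:
          time-series:
            names:
              temperature:
                values:
                - name: value                # must be defined at the application level
                  path: sensors.temperature  # GJSON dot path, e.g. {"sensors": {"temperature": 95}}
```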

Timestamp extraction

Every time series data point must contain a timestamp. EPTS supports extracting timestamps from received data samples or using the server receipt time. The timestamp extraction can be defined at the application level under kaa.applications.<application name>.time-series.timestamp and overridden for an application version under kaa.applications.<application name>.versions.<application version name>.time-series.timestamp. The following options are supported:

- path: path to the timestamp field in the data sample. As with the time series value paths, GJSON path syntax is supported. If path is not specified, EPTS does not attempt to extract a timestamp from the data sample and uses the server receipt timestamp instead.
- format: format to use for parsing the timestamp located at path. Only used when path is set. Supported formats include:
  - iso8601: ISO 8601 (default);
  - millis-unix-epoch: integer milliseconds elapsed since the Unix epoch;
  - sec-unix-epoch: integer seconds elapsed since the Unix epoch.
- fallback-strategy: defines EPTS behavior when timestamp parsing fails for any reason (e.g. missing field or wrong format). Supported strategies are:
  - server-timestamp: fall back to using the server receipt timestamp;
  - fail: fail the data sample processing and respond to the DSTP transmitter with an error (default).
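
For instance, a device that reports its own clock reading in milliseconds could be handled with a configuration along these lines. The ts field name is an assumption made for the example.

```yaml
kaa:
  applications:
    smart_kettle:
      time-series:
        timestamp:
          path: ts                             # hypothetical field, e.g. {"ts": 1700000000000, "temperature": 95}
          format: millis-unix-epoch            # parse the field as integer milliseconds since the Unix epoch
          fallback-strategy: server-timestamp  # use the server receipt time if "ts" is missing or malformed
```

With the default fail strategy, a sample without a parseable ts field would instead be rejected with an error.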

Time series auto-extraction

The kaa.applications.<application name>.time-series.auto-extract option enables time series auto-extraction from data samples. If set to true, EPTS automatically creates a time series for each top-level numeric field in the data sample JSON whose key is limited to Latin letters (a-z, A-Z), digits (0-9), dashes (-) and underscores (_). This setting can be overridden for an application version using the kaa.applications.<application name>.versions.<application version name>.time-series.auto-extract option.

An auto-extracted time series is named after the source field with the auto~ prefix (e.g. auto~temperature) to prevent collisions with user-defined time series names.
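
A sketch that turns auto-extraction on for an application and off again for one of its versions. The version name comes from the earlier example; the sample in the comment is hypothetical.

```yaml
kaa:
  applications:
    smart_kettle:
      time-series:
        auto-extract: true         # e.g. {"temperature": 95, "mode": "boil"} yields auto~temperature;
                                   # "mode" is not numeric and is not auto-extracted
      versions:
        kettle_v1:
          time-series:
            auto-extract: false    # override: no auto-extraction for this application version
```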

Data samples receiver

Use the following options to configure the data samples receiver interface.

kaa:
  dstp.receiver:
    from: <string>        # Name of the data samples transmission service instance EPTS will subscribe to.
    concurrency: <uint>   # The maximum number of concurrent workers that consume 13/DSTP messages from the message pool. 256 by default.
    queue-length: <uint>  # The maximum number of messages in the DSTP receiver queue. When the queue is full, new messages are dropped. 64 * `kaa.dstp.receiver.concurrency` by default.
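
An illustrative receiver setup, assuming a data samples transmission service instance named dcx (a hypothetical name) and a reduced worker count:

```yaml
kaa:
  dstp.receiver:
    from: dcx             # hypothetical name of the transmitting service instance
    concurrency: 128      # half of the default 256 workers
    queue-length: 4096    # explicit limit; if omitted, it would default to 64 * 128 = 8192
```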

Time series receiver

Use the following options to configure the time series receiver interface that EPTS uses for time series consumption. You can configure subscription to multiple transmitters and either define specific time series to listen to, or use a “*” wildcard.

NOTE: EPTS will only process time series data points that match the configuration according to the time series configuration section.

kaa:
  tstp.receiver:
    concurrency: <uint>               # The maximum number of concurrent workers that consume 14/TSTP messages from the message pool. 16 by default.
    queue-length: <uint>              # The maximum number of messages in the TSTP receiver queue. When the queue is full, new messages are dropped. 64 * `kaa.tstp.receiver.concurrency` by default.
    from:                             # Map of time series transmission service instances EPTS will subscribe to.
      <transmission service instance name>:
        time-series: <list of string> # Time series names that EPTS will consume. Optional, "*" by default.
      ...
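
A sketch of subscribing to two transmitters, one restricted to specific time series and one using the wildcard. Both service instance names and the time series names are hypothetical.

```yaml
kaa:
  tstp.receiver:
    concurrency: 16
    from:
      epts-upstream:          # hypothetical transmitter: consume only the listed time series
        time-series:
        - temperature
        - battery
      analytics:              # hypothetical transmitter: consume every time series it sends
        time-series:
        - "*"                 # quoted, because a bare * is special YAML syntax
```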

Tekton

EPTS supports integration with Kaa Tekton for centralized application configuration management. The configuration options below set up the integration interface.

kaa:
  tekton:
    enabled: <boolean>    # Enables Tekton integration. False by default. Can also be set with the KAA_TEKTON_ENABLED environment variable.
    url: <string>         # URL of the Tekton service. "http://tekton" by default. Can also be set with the KAA_TEKTON_URL environment variable.
  scmp.consumer:
    provider.service-instance-name: <string>  # Service instance name of the Tekton service. "tekton" by default.
    queue-length: <uint>                      # Maximum queue length for 17/SCMP messages from Tekton. 256 by default.

For the Tekton integration to function, there must be no kaa.applications key in the configuration file. Such configuration, when present, takes precedence over the Tekton-supplied application-specific configs.
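
A minimal sketch of a Tekton-backed configuration. All values other than enabled are the documented defaults, spelled out for clarity, and the file deliberately contains no kaa.applications key.

```yaml
kaa:
  tekton:
    enabled: true
    url: http://tekton
  scmp.consumer:
    provider.service-instance-name: tekton
    queue-length: 256
  # No "applications" key here: application-specific configuration is supplied by Tekton.
```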

Data persistence interface

EPTS uses InfluxDB for persisting time series data.

kaa:
  influx:
    precision: <string>       # The data points timestamp precision in InfluxDB.
                              # All timestamps stored in the InfluxDB are truncated to the given precision.
                              # `h` (hours), `m` (minutes), `s` (seconds), `ms` (milliseconds), `u` (microseconds), `ns` (nanoseconds).
                              # Defaults to `ns`.

    url: <string>             # InfluxDB URL. "http://influxdb:8086" by default.
    user: <string>            # InfluxDB user (see note below)
    password: <string>        # InfluxDB password (see note below)
    idle-connections: <uint>  # Maximum idle connections in the InfluxDB client HTTP transport pool. 128 by default.
    read:
      chunk.size: <uint>      # InfluxDB request chunk size (number of time series to write to or read from InfluxDB). 10000 by default.
    write:
      concurrency: <uint>     # The maximum number of concurrent InfluxDB writes; 4 by default.
      queue.size: <uint>      # The maximum length of the DB write queue per application.
                              # Each item in the queue is based on a received DSTP or TSTP message. 1024 by default.

      batch.size: <uint>      # The optimal desired write batch size in data points. 10000 by default.
                              # EPTS uses this configuration option as a recommendation, not as a strict rule.
                              # It will attempt to collect write batches as close to this number as possible without sacrificing the write latency.
                              # Some writes may exceed the configured batch size, which is by design.

      timeout: <uint>         # Write timeout for InfluxDB requests (in ms). Default 30000 ms.

NOTE: For security reasons, the InfluxDB username and password must be sourced from environment variables.
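
As an example, the following sketch stores timestamps with millisecond precision and tunes the write path. The numeric values are illustrative rather than recommendations, and the credentials are intentionally absent from the file.

```yaml
kaa:
  influx:
    precision: ms               # truncate stored timestamps to milliseconds
    url: http://influxdb:8086   # the documented default URL
    write:
      concurrency: 8            # up to 8 concurrent writes instead of the default 4
      batch.size: 5000          # target batch size; some writes may still exceed it
      timeout: 10000            # write timeout in milliseconds
```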

NATS

The parameters below configure EPTS’s connection to NATS. Note that for security reasons the NATS username and password are sourced from environment variables.

nats:
  urls: <comma-separated list of URLs>  # NATS connection URLs.
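
For illustration, a connection to two NATS servers might look like this. The host names are hypothetical, and 4222 is the conventional NATS client port.

```yaml
nats:
  urls: nats://nats-1:4222,nats://nats-2:4222   # hypothetical hosts, comma-separated
```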

Authentication, authorization, and multi-tenancy

EPTS’s REST API security is implemented according to the OAuth 2.0 protocol with a UMA profile. Authentication and authorization are handled within the scope of a given Kaa tenant. Each tenant has a separate OAuth 2.0 issuer, managed by the Kaa Tenant Manager. When multi-tenancy is disabled, all authentication and authorization is conducted in the default system tenant (“kaa”).

EPTS security is controlled with the following configuration options (for security reasons it is advised to set these via environment variables).

kaa:
  security:
    enabled: <boolean>      # Enables authentication and authorization. False by default.
    issuer: <string>        # OAuth 2.0 issuer URL for the system tenant ("kaa").
    client-id: <string>     # Client ID for making requests in the system tenant scope.
    client-secret: <string> # Client secret for making requests in the system tenant scope.

    multitenancy:
      enabled: <boolean>    # Enables multitenancy via integration with the Kaa Tenant Manager. Only effective when kaa.security.enabled is set to true. False by default.
      tenant-manager:
        url: <string>       # URL of the Kaa Tenant Manager that provides security configurations for tenants. "http://tenant-manager" by default.
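
A sketch of enabling security together with multitenancy. The issuer URL is a placeholder invented for the example. The client ID and secret are left out of the file, in line with the advice to provide them through environment variables; the corresponding variable names are not listed in this section.

```yaml
kaa:
  security:
    enabled: true
    issuer: https://auth.example.com/kaa   # hypothetical issuer URL for the system tenant
    # client-id and client-secret are intentionally not stored in the file
    multitenancy:
      enabled: true                        # only effective because kaa.security.enabled is true
      tenant-manager:
        url: http://tenant-manager         # the documented default
```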

Management

To control the EPTS management interface, use the following configuration options.

service.monitoring:
  disabled: <boolean>   # Disables the monitoring interface entirely. False by default (enabled).
  port: <uint>          # TCP port to expose the monitoring server on. 8080 by default.

Logging

EPTS writes logs to stdout. By default, the service only produces logs at startup, shutdown, or in case of errors. To enable debug level logging, use the following option:

service:
  debug: <boolean>    # Enables debug level logging, false by default (disabled).

Keep in mind that enabling debug level logging will produce significantly more log output and degrade the service performance.

Default configuration

Summarizing the above, the default EPTS configuration is as follows. Note that no Kaa applications are defined by default—you have to configure those for any specific Kaa-based solution.

service:
  instance:
    name: "epts"
    replica.id: "<random string generated on boot>"
  monitoring:
    disabled: false
    port: 8080
  debug: false
kaa:
  influx:
    precision: "ns"
    write.timeout: 30000
    read.chunk.size: 10000
    idle-connections: 128
