Usage

Setup Prerequisites

Apache NiFi requires Apache ZooKeeper to run. This page assumes you have already installed the Stackable operators for both of these products.

For details regarding the Stackable Operator for Apache ZooKeeper see the documentation.

Create an Apache NiFi cluster

An Apache NiFi cluster running on three nodes with SingleUser authentication is described as follows:

apiVersion: nifi.stackable.tech/v1alpha1
kind: NifiCluster
metadata:
  name: simple-nifi # (1)
spec:
  version: 1.16.3-stackable0.1.0 # (2)
  zookeeperConfigMapName: simple-nifi-znode # (3)
  config:
    authentication:
      method:
        singleUser:
          adminCredentialsSecret: nifi-admin-credentials-simple # (4)
          autoGenerate: true
    sensitiveProperties:
      keySecret: nifi-sensitive-property-key
  nodes:
    roleGroups:
      default:
        selector:
          matchLabels:
            kubernetes.io/os: linux
        config:
          log:
            rootLogLevel: INFO
        replicas: 3
1 Name of the NiFi cluster.
2 Version of the Apache NiFi container image to use. This contains the NiFi version (1.16.3) as well as the Stackable image version (stackable0.1.0). For a list of available versions, please check our image registry.
3 A reference to a ZNode resource. This resource must exist and cannot be reused across NiFi clusters. For details on how to create ZNode resources see this page.
4 Administrator credentials for logging into the NiFi web interface. This is the name of a Secret resource with two fields: username and password. This Secret must exist, but its entries can be populated by the operator when autoGenerate is true.
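
The ZNode referenced in callout 3 could be declared roughly as follows. This is a minimal sketch, assuming the ZookeeperZnode resource provided by the Stackable Operator for Apache ZooKeeper and an existing ZooKeeper cluster named simple-zk. The cluster name is hypothetical and does not appear elsewhere on this page.

```yaml
---
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperZnode
metadata:
  # Matches zookeeperConfigMapName in the NifiCluster above.
  name: simple-nifi-znode
spec:
  clusterRef:
    # Hypothetical name of an existing ZookeeperCluster; adjust to your setup.
    name: simple-zk
```

The ZooKeeper operator should then publish a ConfigMap with the same name as the ZNode, which is what zookeeperConfigMapName refers to. Check the ZooKeeper operator documentation for the exact fields supported by your version.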

Authentication

Every user has to authenticate before using NiFi. There are multiple options to set up the authentication of users.

Single user

The default setting is to provision only a single user with administrative privileges. You need to specify the username and password of this user. Further users cannot be added.
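
As a sketch of what the credentials Secret for this user might look like, assuming the Secret name from the cluster example above and placeholder values for both fields:

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  # Referenced by adminCredentialsSecret in the NifiCluster definition.
  name: nifi-admin-credentials-simple
stringData:
  # Placeholder values for illustration only; choose your own.
  username: admin
  password: change-me
```

When autoGenerate is set to true, the Secret still has to exist, but the operator can populate its entries.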

LDAP

NiFi supports authentication of users against an LDAP server. Have a look at the LDAP test and the general Stackable Authentication documentation on how to set it up. In general, it requires you to specify an AuthenticationClass which is used to authenticate the users. The following example shows its usage:

---
apiVersion: nifi.stackable.tech/v1alpha1
kind: NifiCluster
metadata:
  name: test-nifi
spec:
  config:
    authentication:
      method:
        authenticationClass: ldap-without-tls
---
apiVersion: authentication.stackable.tech/v1alpha1
kind: AuthenticationClass
metadata:
  name: ldap-without-tls
spec:
  provider:
    ldap:
      hostname: openldap.default.svc.cluster.local
      searchBase: ou=users,dc=example,dc=org
      bindCredentials:
        secretClass: nifi-with-ldap-bind
      port: 1389
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: nifi-with-ldap-bind
spec:
  backend:
    k8sSearch:
      searchNamespace:
        pod: {}
---
apiVersion: v1
kind: Secret
metadata:
  name: nifi-with-ldap-bind
  labels:
    secrets.stackable.tech/class: nifi-with-ldap-bind
stringData:
  user: cn=integrationtest,ou=users,dc=example,dc=org
  password: integrationtest

Authorization

NiFi supports multiple authorization methods documented here. The available authorization methods depend on the chosen authentication method.

Authorization is not fully implemented by the Stackable Operator for Apache NiFi.

Single user

With this authorization method, a single user has administrator capabilities.

LDAP

The operator uses the FileUserGroupProvider and FileAccessPolicyProvider to bind the LDAP user to the NiFi administrator group. This user is then able to create and modify groups and policies in the web interface. These changes are local to the Pod running NiFi and are not persistent.

Updating NiFi

Updating (or downgrading, for that matter) the deployed version of NiFi is as simple as changing the version stated in the custom resource. Continuing the example above, to change the deployed version from 1.16.3 to 1.16.2, deploy the following resource:

apiVersion: nifi.stackable.tech/v1alpha1
kind: NifiCluster
metadata:
  name: simple-nifi
spec:
  version: 1.16.2-stackable0.1.0 # (1)
  zookeeperConfigMapName: simple-nifi-znode
  config:
    authentication:
      method:
        singleUser:
          adminCredentialsSecret: nifi-admin-credentials-simple
    sensitiveProperties:
      keySecret: nifi-sensitive-property-key
  nodes:
    roleGroups:
      default:
        selector:
          matchLabels:
            kubernetes.io/os: linux
        config:
          log:
            rootLogLevel: INFO
        replicas: 3
1 Change the NiFi version here.

Due to a limitation in NiFi itself, it is not possible to upgrade or downgrade a NiFi cluster in a rolling fashion. Any change to the NiFi version you make in this resource will therefore result in a full cluster restart with a short downtime. This does not apply to the Stackable image version, which can be changed in a rolling fashion as long as the underlying NiFi version remains unchanged.

Monitoring

The managed NiFi cluster is automatically configured to export Prometheus metrics. See Monitoring for more details.

Configuration & Environment Overrides

The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).

Do not override port numbers. This will lead to cluster malfunction.

Configuration Overrides

Apache NiFi runtime configuration is stored in the files bootstrap.conf and nifi.properties. The configOverrides block enables you to customize parameters in these files. The complete list of the configuration options can be found in the Apache NiFi documentation.

Overrides are key-value pairs defined under a NiFi configuration file such as bootstrap.conf or nifi.properties. The keys and values must match the names and values expected by NiFi. In the example below, the property nifi.flow.configuration.archive.enabled is explicitly set to false, overriding the default value.

The following snippet shows how to disable workflow file backups in the NifiCluster definition:

configOverrides:
  nifi.properties:
    nifi.flow.configuration.archive.enabled: false

Be aware that overriding configuration settings in this section carries a very high risk of breaking things, because the product may no longer behave the way the Stackable Operator for Apache NiFi expects.
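
Since overrides can be given per role or per role group, the two levels can be combined. The following is a minimal sketch, assuming that configOverrides is placed at the same positions as envOverrides in the examples further down this page. The property values are chosen purely for illustration.

```yaml
nodes:
  # Role level: applies to every role group of the nodes role.
  configOverrides:
    nifi.properties:
      nifi.flow.configuration.archive.enabled: false
  roleGroups:
    default:
      # Role group level: takes precedence over the role-level value.
      configOverrides:
        nifi.properties:
          nifi.flow.configuration.archive.enabled: true
      config: {}
      replicas: 1
```

In this sketch, Pods in the default role group would keep flow archiving enabled, while any other role group would inherit the role-level setting.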

Environment Variables

Environment variables can be (over)written by adding the envOverrides property.

For example per role group:

nodes:
  roleGroups:
    default:
      config: {}
      replicas: 1
      envOverrides:
        MY_ENV_VAR: "MY_VALUE"

or per role:

nodes:
  envOverrides:
    MY_ENV_VAR: "MY_VALUE"
  roleGroups:
    default:
      config: {}
      replicas: 1

Volume storage

By default, a NiFi cluster creates five persistent volume claims (PVCs), one each for the flow file, provenance, database, content and state folders. Each PVC requests 2Gi. It is recommended that you configure these volume requests according to your needs.

Storage requests can be configured at role or group level, for one or more of the persistent volumes as follows:

nodes:
  roleGroups:
    default:
      config:
        resources:
          storage:
            flowfile_repo:
              capacity: 12Gi
            provenance_repo:
              capacity: 12Gi
            database_repo:
              capacity: 12Gi
            content_repo:
              capacity: 12Gi
            state_repo:
              capacity: 12Gi

In the above example, all nodes in the default group will request 12Gi of storage for each of the various folders.
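
The same settings can also be placed at the role level so that every role group inherits them. This is a sketch only, assuming the role-level config block sits directly under nodes, in the same way the generic resource example on this page places config under the role. The 20Gi value is an arbitrary illustration.

```yaml
nodes:
  # Role level: inherited by all role groups unless they override it.
  config:
    resources:
      storage:
        content_repo:
          capacity: 20Gi
  roleGroups:
    default:
      replicas: 3
```

Here only the content repository is resized. Volumes that are not listed should keep the default request of 2Gi.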

Resource Requests

Stackable operators handle resource requests in a slightly different manner than Kubernetes. Resource requests are defined on the role or role group level. See Roles and role groups for details on these concepts. On the role level this means that, for example, all workers will use the same resource requests and limits. This can be further specified on the role group level (which takes priority over the role level) to apply different resources.

This is an example on how to specify CPU and memory resources using the Stackable Custom Resources:

---
apiVersion: example.stackable.tech/v1alpha1
kind: ExampleCluster
metadata:
  name: example
spec:
  workers: # role-level
    config:
      resources:
        cpu:
          min: 300m
          max: 600m
        memory:
          limit: 3Gi
    roleGroups: # role-group-level
      resources-from-role: # role-group 1
        replicas: 1
      resources-from-role-group: # role-group 2
        replicas: 1
        config:
          resources:
            cpu:
              min: 400m
              max: 800m
            memory:
              limit: 4Gi

In this case, the role group resources-from-role will inherit the resources specified on the role level, resulting in a maximum of 3Gi memory and 600m CPU resources.

The role group resources-from-role-group has a maximum of 4Gi memory and 800m CPU resources, which overrides the resources set on the role level.

For Java products, the heap memory actually used is lower than the specified memory limit, because other processes in the Container require memory to run as well. Currently, 80% of the specified memory limit is passed to the JVM.

For memory, only a limit can be specified, which will be set as both the memory request and the memory limit in the Container. This guarantees that a Container always gets the full amount of memory during Kubernetes scheduling.
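
To make the mapping concrete, the role group resources-from-role-group from the example above would be expected to produce container resources along the following lines. This is an illustrative sketch, assuming that cpu.min maps to the CPU request and cpu.max to the CPU limit; it is not output captured from a running cluster.

```yaml
# Expected resources section of the resulting Container (illustrative).
resources:
  requests:
    cpu: 400m     # from cpu.min (assumed mapping)
    memory: 4Gi   # memory.limit is used as the request ...
  limits:
    cpu: 800m     # from cpu.max (assumed mapping)
    memory: 4Gi   # ... and as the limit
# JVM heap: 80% of 4Gi, i.e. roughly 3.2Gi
```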

If no resource requests are configured explicitly, the NiFi operator uses the following defaults:

nodes:
  roleGroups:
    default:
      config:
        resources:
          cpu:
            min: "500m"
            max: "4"
          memory:
            limit: '2Gi'
          storage:
            flowfile_repo:
              capacity: 2Gi
            provenance_repo:
              capacity: 2Gi
            database_repo:
              capacity: 2Gi
            content_repo:
              capacity: 2Gi
            state_repo:
              capacity: 2Gi