Usage
Setup Prerequisites
Apache NiFi requires Apache ZooKeeper to run. This page assumes that you have already installed the Stackable operators for both of these products.
For details on the Stackable Operator for Apache ZooKeeper, see the documentation.
Create an Apache NiFi cluster
An Apache NiFi cluster running on three nodes with SingleUser authentication is described as follows:
apiVersion: nifi.stackable.tech/v1alpha1
kind: NifiCluster
metadata:
  name: simple-nifi # (1)
spec:
  version: 1.16.3-stackable0.1.0 # (2)
  zookeeperConfigMapName: simple-nifi-znode # (3)
  config:
    authentication:
      method:
        singleUser:
          adminCredentialsSecret: nifi-admin-credentials-simple # (4)
          autoGenerate: true
    sensitiveProperties:
      keySecret: nifi-sensitive-property-key
  nodes:
    roleGroups:
      default:
        selector:
          matchLabels:
            kubernetes.io/os: linux
        config:
          log:
            rootLogLevel: INFO
        replicas: 3
(1) Name of the NiFi cluster.
(2) Version of the Apache NiFi container image to use. This contains the NiFi version (1.16.3) as well as the Stackable image version (stackable0.1.0). For a list of available versions, please check our image registry.
(3) A reference to a ZNode resource. This resource must exist and cannot be reused across NiFi clusters. For details on how to create ZNode resources, see this page.
(4) Administrator credentials for logging into the NiFi web interface. This is the name of a Secret resource with two fields: username and password. This Secret must exist, but its entries can be populated by the operator when autoGenerate is true.
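For reference, such a credentials Secret could be sketched as follows. The values shown are placeholders; with autoGenerate set to true, the operator can populate missing entries for you:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: nifi-admin-credentials-simple
stringData:
  username: admin            # placeholder value
  password: change-me-please # placeholder value
```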
Authentication
Every user has to authenticate themselves before using NiFi. There are multiple options for setting up user authentication.
Single user
The default setting is to provision only a single user with administrative privileges. You need to specify the username and password of this user. Further users cannot be added.
LDAP
NiFi supports authenticating users against an LDAP server.
Have a look at the LDAP test and the general Stackable Authentication documentation for how to set it up.
In general, it requires you to specify an AuthenticationClass
which is used to authenticate the users.
Here is an example:
---
apiVersion: nifi.stackable.tech/v1alpha1
kind: NifiCluster
metadata:
  name: test-nifi
spec:
  config:
    authentication:
      method:
        authenticationClass: ldap-without-tls
---
apiVersion: authentication.stackable.tech/v1alpha1
kind: AuthenticationClass
metadata:
  name: ldap-without-tls
spec:
  provider:
    ldap:
      hostname: openldap.default.svc.cluster.local
      searchBase: ou=users,dc=example,dc=org
      bindCredentials:
        secretClass: nifi-with-ldap-bind
      port: 1389
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: nifi-with-ldap-bind
spec:
  backend:
    k8sSearch:
      searchNamespace:
        pod: {}
---
apiVersion: v1
kind: Secret
metadata:
  name: nifi-with-ldap-bind
  labels:
    secrets.stackable.tech/class: nifi-with-ldap-bind
stringData:
  user: cn=integrationtest,ou=users,dc=example,dc=org
  password: integrationtest
Authorization
NiFi supports multiple authorization methods, documented here. The available authorization methods depend on the chosen authentication method.
Authorization is not fully implemented by the Stackable Operator for Apache NiFi.
LDAP
The operator uses the FileUserGroupProvider
and FileAccessPolicyProvider to bind the LDAP user to the NiFi administrator group. This user is then able to create and modify groups and policies in the web interface. These changes are local to the Pod
running NiFi and are not persistent.
Updating NiFi
Updating (or downgrading, for that matter) the deployed version of NiFi is as simple as changing the version stated in the CRD.
Continuing the example above, to change the deployed version from 1.16.3
to 1.16.2
you would simply deploy the following CRD.
apiVersion: nifi.stackable.tech/v1alpha1
kind: NifiCluster
metadata:
  name: simple-nifi
spec:
  version: 1.16.2-stackable0.1.0 # (1)
  zookeeperConfigMapName: simple-nifi-znode
  config:
    authentication:
      method:
        singleUser:
          adminCredentialsSecret: nifi-admin-credentials-simple
    sensitiveProperties:
      keySecret: nifi-sensitive-property-key
  nodes:
    roleGroups:
      default:
        selector:
          matchLabels:
            kubernetes.io/os: linux
        config:
          log:
            rootLogLevel: INFO
        replicas: 3
(1) Change the NiFi version here.
Due to a limitation in NiFi itself, it is not possible to upgrade or downgrade a NiFi cluster in a rolling fashion, so any change to the NiFi version in this CRD will result in a full cluster restart with a short downtime. This does not affect the Stackable image version, which can be changed in a rolling fashion as long as the underlying NiFi version remains unchanged.
Monitoring
The managed NiFi cluster is automatically configured to export Prometheus metrics. See Monitoring for more details.
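If you run the Prometheus Operator, the exported metrics could be collected with a ServiceMonitor along the lines of the following sketch. The label selector and port name used here are assumptions for illustration only; verify them against the Services the operator actually creates in your cluster:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: simple-nifi-metrics
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: simple-nifi # assumed label, check your Services
  endpoints:
    - port: metrics # assumed port name, check your Services
```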
Configuration & Environment Overrides
The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).
Do not override port numbers. This will lead to cluster malfunction.
Configuration Overrides
Apache NiFi runtime configuration is stored in the files bootstrap.conf and nifi.properties.
The configOverrides
block enables you to customize parameters in these files.
The complete list of the configuration options can be found in the Apache NiFi documentation.
Overrides are key-value pairs defined under the name of a NiFi configuration file, such as bootstrap.conf
or nifi.properties.
They must match the property names and values expected by NiFi.
The following snippet shows how to disable workflow file backups in the NifiCluster definition by overriding the default value of nifi.flow.configuration.archive.enabled:
configOverrides:
  nifi.properties:
    nifi.flow.configuration.archive.enabled: false
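Parameters in bootstrap.conf can be overridden the same way. As a sketch, the following adjusts the initial JVM heap size; the key java.arg.2 follows NiFi's bootstrap.conf convention for numbered JVM arguments and should be verified against the default bootstrap.conf of your NiFi version:

```yaml
configOverrides:
  bootstrap.conf:
    java.arg.2: -Xms2g # initial JVM heap; numbered java.arg.* keys per NiFi convention
```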
Please be aware that by overriding config settings in this section you run a very high risk of breaking things, because the product may no longer behave the way the Stackable Operator for Apache NiFi expects it to.
Environment Variables
Environment variables can be (over)written by adding the envOverrides
property.
For example per role group:
nodes:
  roleGroups:
    default:
      config: {}
      replicas: 1
      envOverrides:
        MY_ENV_VAR: "MY_VALUE"
or per role:
nodes:
  envOverrides:
    MY_ENV_VAR: "MY_VALUE"
  roleGroups:
    default:
      config: {}
      replicas: 1
Volume storage
By default, a NiFi cluster creates five different persistent volume claims for the flow file, provenance, database, content and state folders, each requesting 2Gi.
It is recommended that you configure these volume requests according to your needs.
Storage requests can be configured at role or group level, for one or more of the persistent volumes as follows:
nodes:
  roleGroups:
    default:
      config:
        resources:
          storage:
            flowfile_repo:
              capacity: 12Gi
            provenance_repo:
              capacity: 12Gi
            database_repo:
              capacity: 12Gi
            content_repo:
              capacity: 12Gi
            state_repo:
              capacity: 12Gi
In the above example, all nodes in the default group will request 12Gi
of storage for each of the various folders.
Resource Requests
Stackable operators handle resource requests in a slightly different manner than Kubernetes. Resource requests are defined on role or role group level. See Roles and role groups for details on these concepts. On a role level this means that e.g. all workers will use the same resource requests and limits. This can be further specified on role group level (which takes priority over the role level) to apply different resources.
This is an example on how to specify CPU and memory resources using the Stackable Custom Resources:
---
apiVersion: example.stackable.tech/v1alpha1
kind: ExampleCluster
metadata:
  name: example
spec:
  workers: # role-level
    config:
      resources:
        cpu:
          min: 300m
          max: 600m
        memory:
          limit: 3Gi
    roleGroups: # role-group-level
      resources-from-role: # role-group 1
        replicas: 1
      resources-from-role-group: # role-group 2
        replicas: 1
        config:
          resources:
            cpu:
              min: 400m
              max: 800m
            memory:
              limit: 4Gi
In this case, the role group resources-from-role
will inherit the resources specified on the role level, resulting in a maximum of 3Gi
memory and 600m
CPU resources.
The role group resources-from-role-group
has a maximum of 4Gi
memory and 800m
CPU resources (which overrides the role CPU resources).
For Java products the actual usable heap memory is lower than the specified memory limit because other processes in the container require memory to run as well. Currently, 80% of the specified memory limit is passed to the JVM.
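To illustrate the 80% rule with a quick calculation (this helper is purely illustrative and not part of the operator):

```python
# Illustrative sketch: compute the heap passed to the JVM from a Kubernetes
# memory limit, assuming 80% of the limit is handed to the JVM.

UNITS = {"Mi": 1024**2, "Gi": 1024**3}

def jvm_heap_bytes(limit: str, fraction: float = 0.8) -> int:
    """Parse a quantity like '2Gi' and apply the heap fraction."""
    for suffix, factor in UNITS.items():
        if limit.endswith(suffix):
            return int(float(limit[: -len(suffix)]) * factor * fraction)
    raise ValueError(f"unsupported quantity: {limit}")

# A 2Gi limit leaves roughly 1638Mi for the JVM heap.
print(jvm_heap_bytes("2Gi") // 1024**2)  # 1638
```

With the default 2Gi memory limit, the JVM therefore receives roughly 1.6Gi of heap.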
For memory, only a limit can be specified, which will be set as both memory request and limit in the container. This guarantees the container the full amount of memory during Kubernetes scheduling.
If no resource requests are configured explicitly, the NiFi operator uses the following defaults:
nodes:
  roleGroups:
    default:
      config:
        resources:
          cpu:
            min: "500m"
            max: "4"
          memory:
            limit: '2Gi'
          storage:
            flowfile_repo:
              capacity: 2Gi
            provenance_repo:
              capacity: 2Gi
            database_repo:
              capacity: 2Gi
            content_repo:
              capacity: 2Gi
            state_repo:
              capacity: 2Gi