Release notes for the Stackable Data Platform
The Stackable platform consists of multiple operators that work together. Periodically a platform release is made, including all components of the platform at a specific version.
Release 22.11
This is the third release of the Stackable Data Platform. It focuses on resource management.
New platform features
The following new major platform features were added:
- CPU and memory limits configurable
  The operators now request resources from Kubernetes for the products, and the required CPU and memory can now be configured for all products. If your product instances perform worse after the update, the new defaults might be too low; in that case we recommend setting custom requests for your cluster.
- Orphaned Resources
  The operators now properly clean up after scaling down products, for example by deleting StatefulSets that were left over after scaling down.
- New Versions
  New product versions are supported.
- Product features
  Additionally, some individual product features are noteworthy:
  - NiFi: repository sizes are now adjusted based on declared PVC sizes.
  - The GitHub repositories contain new and improved READMEs.
Supported Kubernetes versions
This release supports the following Kubernetes versions:
- 1.25 (new)
- 1.24
- 1.23
- 1.22
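Before upgrading, it may help to confirm that the cluster's Kubernetes version is on this list. The following is a minimal sketch in POSIX shell, not part of the Stackable tooling: the `SUPPORTED_K8S` variable and the `is_supported_k8s` helper are hypothetical names, and the version string is assumed to be supplied by hand (for example, copied from the server version that `kubectl version` reports).

```shell
# Kubernetes versions supported by release 22.11, as listed above.
SUPPORTED_K8S="1.22 1.23 1.24 1.25"

# is_supported_k8s VERSION
# Hypothetical helper: succeeds if VERSION (MAJOR.MINOR or MAJOR.MINOR.PATCH,
# with an optional leading "v") is in the supported list.
is_supported_k8s() {
  version=${1#v}                                      # strip a leading "v"
  minor=$(printf '%s' "$version" | cut -d. -f1,2)     # keep MAJOR.MINOR only
  for supported in $SUPPORTED_K8S; do
    if [ "$minor" = "$supported" ]; then
      return 0
    fi
  done
  return 1
}

# Example: check a version string that was copied from the cluster.
if is_supported_k8s "v1.24.7"; then
  echo "supported"
else
  echo "not supported"
fi
```

The comparison is on the minor version only, so patch releases of a supported minor version are accepted.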
Upgrade from 22.09
Using stackablectl
You can list the available releases as follows:
$ stackablectl release list
RELEASE RELEASE DATE DESCRIPTION
22.11 2022-11-08 Third release focusing on resource management
22.09 2022-09-09 Second release focusing on security and OpenShift support
22.06 2022-06-30 First official release of the Stackable Data Platform
To uninstall the 22.09 release, run:
$ stackablectl release uninstall 22.09
[INFO ] Uninstalling release 22.09
[INFO ] Uninstalling airflow operator
[INFO ] Uninstalling commons operator
# ...
Afterwards you will need to update the CustomResourceDefinitions (CRDs) installed by the Stackable Platform. This is necessary because helm uninstalls the operators but not the CRDs.
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/airflow-operator/0.6.0/deploy/helm/airflow-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/commons-operator/0.4.0/deploy/helm/commons-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/druid-operator/0.8.0/deploy/helm/druid-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hbase-operator/0.5.0/deploy/helm/hbase-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hdfs-operator/0.6.0/deploy/helm/hdfs-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hive-operator/0.8.0/deploy/helm/hive-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/kafka-operator/0.8.0/deploy/helm/kafka-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/nifi-operator/0.8.0/deploy/helm/nifi-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/opa-operator/0.11.0/deploy/helm/opa-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/secret-operator/0.6.0/deploy/helm/secret-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/0.6.0/deploy/helm/spark-k8s-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/superset-operator/0.7.0/deploy/helm/superset-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/trino-operator/0.8.0/deploy/helm/trino-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/0.12.0/deploy/helm/zookeeper-operator/crds/crds.yaml
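The commands above differ only in the operator name and version, so they can also be generated from a single list. The sketch below shows one way to do that in POSIX shell; the `OPERATORS` variable and the `crd_url` helper are hypothetical names introduced for illustration, and the loop only prints the commands (a dry run) instead of running them.

```shell
# Operator names and versions, taken from the CRD URLs listed above.
OPERATORS="airflow:0.6.0 commons:0.4.0 druid:0.8.0 hbase:0.5.0 hdfs:0.6.0 hive:0.8.0 kafka:0.8.0 nifi:0.8.0 opa:0.11.0 secret:0.6.0 spark-k8s:0.6.0 superset:0.7.0 trino:0.8.0 zookeeper:0.12.0"

# crd_url NAME VERSION
# Hypothetical helper: prints the CRD manifest URL for one operator.
crd_url() {
  printf 'https://raw.githubusercontent.com/stackabletech/%s-operator/%s/deploy/helm/%s-operator/crds/crds.yaml\n' "$1" "$2" "$1"
}

# Print one "kubectl apply" command per operator (dry run).
# Remove the leading "echo" to actually apply the CRDs.
for entry in $OPERATORS; do
  name=${entry%%:*}      # text before the colon
  version=${entry#*:}    # text after the colon
  echo kubectl apply -f "$(crd_url "$name" "$version")"
done
```

Keeping the name and version pairs in one place makes it easier to review them against the release before anything is applied to the cluster.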
To install the 22.11 release, run:
$ stackablectl release install 22.11
[INFO ] Installing release 22.11
[INFO ] Installing airflow operator in version 0.6.0
[INFO ] Installing commons operator in version 0.4.0
[INFO ] Installing druid operator in version 0.8.0
[INFO ] Installing hbase operator in version 0.5.0
[INFO ] Installing hdfs operator in version 0.6.0
[INFO ] Installing hive operator in version 0.8.0
[INFO ] Installing kafka operator in version 0.8.0
[INFO ] Installing nifi operator in version 0.8.0
[INFO ] Installing opa operator in version 0.11.0
[INFO ] Installing secret operator in version 0.6.0
[INFO ] Installing spark-k8s operator in version 0.6.0
[INFO ] Installing superset operator in version 0.7.0
[INFO ] Installing trino operator in version 0.8.0
[INFO ] Installing zookeeper operator in version 0.12.0
# ...
Using helm
Use helm list to list the currently installed operators.
You can use the following command to uninstall all of the operators that are part of the release 22.09:
$ helm uninstall airflow-operator commons-operator druid-operator hbase-operator hdfs-operator hive-operator kafka-operator nifi-operator opa-operator secret-operator spark-k8s-operator superset-operator trino-operator zookeeper-operator
release "airflow-operator" uninstalled
release "commons-operator" uninstalled
# ...
Afterwards you will need to update the CustomResourceDefinitions (CRDs) installed by the Stackable Platform. This is because helm will uninstall the operators but not the CRDs.
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/airflow-operator/0.6.0/deploy/helm/airflow-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/commons-operator/0.4.0/deploy/helm/commons-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/druid-operator/0.8.0/deploy/helm/druid-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hbase-operator/0.5.0/deploy/helm/hbase-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hdfs-operator/0.6.0/deploy/helm/hdfs-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hive-operator/0.8.0/deploy/helm/hive-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/kafka-operator/0.8.0/deploy/helm/kafka-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/nifi-operator/0.8.0/deploy/helm/nifi-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/opa-operator/0.11.0/deploy/helm/opa-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/secret-operator/0.6.0/deploy/helm/secret-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/0.6.0/deploy/helm/spark-k8s-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/superset-operator/0.7.0/deploy/helm/superset-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/trino-operator/0.8.0/deploy/helm/trino-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/0.12.0/deploy/helm/zookeeper-operator/crds/crds.yaml
To install release 22.11, run:
$ helm repo add stackable https://repo.stackable.tech/repository/helm-stable/
$ helm repo update stackable
$ helm install --wait airflow-operator stackable/airflow-operator --version 0.6.0
$ helm install --wait commons-operator stackable/commons-operator --version 0.4.0
$ helm install --wait druid-operator stackable/druid-operator --version 0.8.0
$ helm install --wait hbase-operator stackable/hbase-operator --version 0.5.0
$ helm install --wait hdfs-operator stackable/hdfs-operator --version 0.6.0
$ helm install --wait hive-operator stackable/hive-operator --version 0.8.0
$ helm install --wait kafka-operator stackable/kafka-operator --version 0.8.0
$ helm install --wait nifi-operator stackable/nifi-operator --version 0.8.0
$ helm install --wait opa-operator stackable/opa-operator --version 0.11.0
$ helm install --wait secret-operator stackable/secret-operator --version 0.6.0
$ helm install --wait spark-k8s-operator stackable/spark-k8s-operator --version 0.6.0
$ helm install --wait superset-operator stackable/superset-operator --version 0.7.0
$ helm install --wait trino-operator stackable/trino-operator --version 0.8.0
$ helm install --wait zookeeper-operator stackable/zookeeper-operator --version 0.12.0
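The install commands follow the same pattern for every operator, so they can be generated rather than typed individually. The following is a minimal sketch in POSIX shell; `OPERATORS` and `helm_install_cmd` are hypothetical names, the `stackable` repository is assumed to have been added as shown above, and the trino-operator version is set to 0.8.0 to match the CRD manifest URL used in the previous step.

```shell
# Operator names and chart versions for release 22.11.
OPERATORS="airflow:0.6.0 commons:0.4.0 druid:0.8.0 hbase:0.5.0 hdfs:0.6.0 hive:0.8.0 kafka:0.8.0 nifi:0.8.0 opa:0.11.0 secret:0.6.0 spark-k8s:0.6.0 superset:0.7.0 trino:0.8.0 zookeeper:0.12.0"

# helm_install_cmd NAME VERSION
# Hypothetical helper: prints the helm install command for one operator.
helm_install_cmd() {
  printf 'helm install --wait %s-operator stackable/%s-operator --version %s\n' "$1" "$1" "$2"
}

# Print the commands only (dry run); nothing is installed by this loop.
for entry in $OPERATORS; do
  helm_install_cmd "${entry%%:*}" "${entry#*:}"
done
```

Once the printed commands look right, the output can be piped to `sh` to run them one after another.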
Stackable Operator for Apache Spark
The configuration of pod resource requests has been changed to be consistent with other operators that are part of the Stackable Data Platform (#174).
In the previous version, these were configured like this:
driver:
  cores: 1
  coreLimit: "1200m"
  memory: "512m"
From now on, Pod resources can be configured in two different ways. The first and recommended way is to add a resources section for each role, as the following example shows:
driver:
  resources:
    cpu:
      min: "1"
      max: "1500m"
    memory:
      limit: "1Gi"
The second method is to use the sparkConf section and set them individually as Spark properties:
sparkConf:
  spark.kubernetes.submission.waitAppCompletion: "false"
  spark.kubernetes.driver.pod.name: "resources-sparkconf-driver"
  spark.kubernetes.executor.podNamePrefix: "resources-sparkconf"
  spark.kubernetes.driver.request.cores: "2"
  spark.kubernetes.driver.limit.cores: "3"
When both methods are used, the settings in the sparkConf section override the resources configuration.
Note that none of the settings above have any influence over the parallelism used by Spark itself. The only supported way to affect this is as follows:
sparkConf:
  spark.driver.cores: "3"
  spark.executor.cores: "3"
Release 22.09
This is the second release of the Stackable Data Platform. It contains lots of new features and bugfixes. The main features focus on OpenShift support and security.
New platform features
The following new major platform features were added:
- OpenShift compatibility
  We have made continued progress towards OpenShift compatibility, and the following operators can now be previewed on OpenShift. Further improvements are expected in future releases, but no stability or compatibility guarantees are currently made for OpenShift clusters.
- Support for internal and external TLS
  The following operators support operating the products at a maximal level of transport security by using TLS certificates to secure internal and external communication:
- LDAP authentication
  Use a central LDAP server to manage all of your user identities in a single place. The following operators added support for LDAP authentication:
stackablectl
stackablectl now supports deploying ready-to-use demos, which give an end-to-end demonstration of the usage of the Stackable Data Platform.
The quickstart guide shows how to get started with stackablectl. Here you can see the available demos.
Supported Kubernetes versions
This release supports the following Kubernetes versions:
- 1.24
- 1.23
- 1.22
Support for 1.21 was dropped.
Upgrade from 22.06
Using stackablectl
You can list the available releases as follows:
$ stackablectl release list
RELEASE RELEASE DATE DESCRIPTION
22.11 2022-11-08 Third release candidate of 22.11
22.09 2022-09-09 Second release focusing on security and OpenShift support
22.06 2022-06-30 First official release of the Stackable Data Platform
To uninstall the 22.06 release, run:
$ stackablectl release uninstall 22.06
[INFO ] Uninstalling release 22.06
[INFO ] Uninstalling airflow operator
[INFO ] Uninstalling commons operator
# ...
Afterwards you will need to update the CustomResourceDefinitions (CRDs) installed by the Stackable Platform. This is necessary because helm uninstalls the operators but not the CRDs.
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/airflow-operator/0.5.0/deploy/helm/airflow-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/commons-operator/0.3.0/deploy/helm/commons-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/druid-operator/0.7.0/deploy/helm/druid-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hbase-operator/0.4.0/deploy/helm/hbase-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hdfs-operator/0.5.0/deploy/helm/hdfs-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/hive-operator/0.7.0/deploy/helm/hive-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/kafka-operator/0.7.0/deploy/helm/kafka-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/nifi-operator/0.7.0/deploy/helm/nifi-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/opa-operator/0.10.0/deploy/helm/opa-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/secret-operator/0.5.0/deploy/helm/secret-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/0.5.0/deploy/helm/spark-k8s-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/superset-operator/0.6.0/deploy/helm/superset-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/trino-operator/0.6.0/deploy/helm/trino-operator/crds/crds.yaml
$ kubectl apply -f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/0.11.0/deploy/helm/zookeeper-operator/crds/crds.yaml
To install the 22.09 release, run:
$ stackablectl release install 22.09
[INFO ] Installing release 22.09
[INFO ] Installing airflow operator in version 0.5.0
[INFO ] Installing commons operator in version 0.3.0
[INFO ] Installing druid operator in version 0.7.0
[INFO ] Installing hbase operator in version 0.4.0
[INFO ] Installing hdfs operator in version 0.5.0
[INFO ] Installing hive operator in version 0.7.0
[INFO ] Installing kafka operator in version 0.7.0
[INFO ] Installing nifi operator in version 0.7.0
[INFO ] Installing opa operator in version 0.10.0
[INFO ] Installing secret operator in version 0.5.0
[INFO ] Installing spark-k8s operator in version 0.5.0
[INFO ] Installing superset operator in version 0.6.0
[INFO ] Installing trino operator in version 0.6.0
[INFO ] Installing zookeeper operator in version 0.11.0
# ...
Using helm
Use helm list to list the currently installed operators.
You can use the following command to uninstall all of the operators that are part of the release 22.06:
$ helm uninstall airflow-operator commons-operator druid-operator hbase-operator hdfs-operator hive-operator kafka-operator nifi-operator opa-operator secret-operator spark-k8s-operator superset-operator trino-operator zookeeper-operator
release "airflow-operator" uninstalled
release "commons-operator" uninstalled
# ...
Afterwards you will need to update the CustomResourceDefinitions (CRDs) installed by the Stackable Platform. This is necessary because helm uninstalls the operators but not the CRDs.
$ kubectl apply \
-f https://raw.githubusercontent.com/stackabletech/airflow-operator/0.5.0/deploy/helm/airflow-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/commons-operator/0.3.0/deploy/helm/commons-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/druid-operator/0.7.0/deploy/helm/druid-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/hbase-operator/0.4.0/deploy/helm/hbase-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/hdfs-operator/0.5.0/deploy/helm/hdfs-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/hive-operator/0.7.0/deploy/helm/hive-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/kafka-operator/0.7.0/deploy/helm/kafka-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/nifi-operator/0.7.0/deploy/helm/nifi-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/opa-operator/0.10.0/deploy/helm/opa-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/secret-operator/0.5.0/deploy/helm/secret-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/spark-k8s-operator/0.5.0/deploy/helm/spark-k8s-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/superset-operator/0.6.0/deploy/helm/superset-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/trino-operator/0.6.0/deploy/helm/trino-operator/crds/crds.yaml \
-f https://raw.githubusercontent.com/stackabletech/zookeeper-operator/0.11.0/deploy/helm/zookeeper-operator/crds/crds.yaml
To install release 22.09, run:
$ helm repo add stackable https://repo.stackable.tech/repository/helm-stable/
$ helm repo update stackable
$ helm install --wait airflow-operator stackable/airflow-operator --version 0.5.0
$ helm install --wait commons-operator stackable/commons-operator --version 0.3.0
$ helm install --wait druid-operator stackable/druid-operator --version 0.7.0
$ helm install --wait hbase-operator stackable/hbase-operator --version 0.4.0
$ helm install --wait hdfs-operator stackable/hdfs-operator --version 0.5.0
$ helm install --wait hive-operator stackable/hive-operator --version 0.7.0
$ helm install --wait kafka-operator stackable/kafka-operator --version 0.7.0
$ helm install --wait nifi-operator stackable/nifi-operator --version 0.7.0
$ helm install --wait opa-operator stackable/opa-operator --version 0.10.0
$ helm install --wait secret-operator stackable/secret-operator --version 0.5.0
$ helm install --wait spark-k8s-operator stackable/spark-k8s-operator --version 0.5.0
$ helm install --wait superset-operator stackable/superset-operator --version 0.6.0
$ helm install --wait trino-operator stackable/trino-operator --version 0.6.0
$ helm install --wait zookeeper-operator stackable/zookeeper-operator --version 0.11.0
druid-operator
- HDFS deep storage is now configurable via the HDFS discovery config map instead of a URL to an HDFS name node (#262). Instead of
deepStorage:
  hdfs:
    storageDirectory: hdfs://druid-hdfs-namenode-default-0:8020/data
use
deepStorage:
  hdfs:
    configMapName: druid-hdfs
    directory: /druid
kafka-operator
- Add TLS encryption and authentication support for internal and client communications. This is a breaking change for clients because the cluster is secured by default, which results in a client port change (#442). If you don’t want to use TLS to secure your Kafka cluster, you can restore the old behavior by using the tls attribute as follows:
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
# ...
spec:
  config:
    tls: null
  # ...
trino-operator
- TrinoCatalogs now have their own CRD object and get referenced by the TrinoCluster (#263). Instead of
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
# ...
spec:
  hiveConfigMapName: hive
  s3:
    inline:
      host: minio
      port: 9000
      accessStyle: Path
      credentials:
        secretClass: s3-credentials
  # ...
use
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
# ...
spec:
  catalogLabelSelector:
    trino: trino
  # ...
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: trino
spec:
  connector:
    hive:
      metastore:
        configMap: hive
      s3:
        inline:
          host: minio
          port: 9000
          accessStyle: Path
          credentials:
            secretClass: s3-credentials
Release 22.06
This is our first release of the Stackable Data Platform, bringing Kubernetes operators for 12 products as well as stackablectl, the command-line tool to easily install data products in Kubernetes. Operators spin up production-ready product applications. There are also some common features across all operators, such as monitoring, service discovery and configuration overrides. The platform features, stackablectl features and operators are described below.
Please report any issues you find in the specific operator repositories or in our dedicated issues repository (github.com/stackabletech/issues/). You may also join us in our Slack community or contact us via our homepage.
While we are very proud of this release, it is our first one. We will keep adding new features and fixing bugs, and will publish regular releases from now on.
Platform features
- Easily install production-ready data applications
  Using a familiar declarative approach, users can easily install data applications such as Apache Kafka or Trino across multiple cloud Kubernetes providers or in their own data centers. The installation process is fully automated while also providing the flexibility for the user to tune relevant aspects of each application.
- Monitoring
  All products have monitoring with Prometheus enabled.
- Service discovery
  Products on the Stackable platform use service discovery to easily interconnect with each other.
- Configuration overrides
  All operators support configuration overrides. These are documented in the specific operator documentation pages.
- Common S3 configuration
  Many products support connecting to S3 to load and/or store data. There is a common resource for S3 connections and buckets across all operators that can be reused.
- Roles and role groups
  To support hybrid hardware clusters, the Stackable platform uses the concept of role groups. Services and applications can be configured to maximize hardware efficiency.
- Standardized
  Learn once, reuse everywhere. We use the same conventions in all our operators. Configure your LDAP or S3 connections once and reuse them everywhere. All our operators reuse the same CRD structure as well.
stackablectl
stackablectl is used to install and interact with the operators, either individually or with multiple at once.
Operators
This is the list of all operators in this release, with their versions.
- Stackable Operator for Apache Airflow (0.4.0)
  - Load DAGs from ConfigMaps or PersistentVolumeClaims
- Stackable Operator for Apache Druid (0.6.0)
  - S3 and HDFS as deep storage options
  - Ingestion from S3 buckets
  - Authorization using OPA
- Stackable Operator for Apache Hive (0.6.0)
  - Hive Metastore can index S3
- Stackable Operator for Apache Kafka (0.6.0)
  - Seamless integration with NiFi and Druid
  - Supports OPA authorization
- Stackable Operator for Apache Superset (0.5.0)
  - Connects to Druid as a backend
  - Supports LDAP authentication
- Stackable Operator for Trino (0.4.0)
  - Supports OPA and file-based authorization
  - Connects to the Hive Metastore
  - Query data from S3
  - TLS support
- Stackable Operator for Apache ZooKeeper (0.10.0)
  - Supports creating ZNodes with CRDs
Read up on the supported versions for each of these products.
- Stackable Operator for OPA (OpenPolicyAgent) (0.9.0)
  - Create RegoRules in ConfigMaps
- Stackable Commons Operator (0.2.0)
- Stackable Secret Operator (0.5.0)