Usage
Authentication
Every user has to authenticate before using Superset. There are multiple options for setting up user authentication.
LDAP
Superset supports authentication of users against an LDAP server. This requires setting up an AuthenticationClass for the LDAP server. The AuthenticationClass is then referenced in the SupersetCluster resource as follows:
apiVersion: superset.stackable.tech/v1alpha1
kind: SupersetCluster
metadata:
  name: superset-with-ldap-server
spec:
  image:
    productVersion: 1.5.1
    stackableVersion: 23.4.0-rc1
  [...]
  authenticationConfig:
    authenticationClass: ldap    # (1)
    userRegistrationRole: Admin  # (2)
(1) The reference to an AuthenticationClass called ldap
(2) The default role that all users are assigned to
Users that log in with LDAP are assigned to a default role, which is specified with the userRegistrationRole property.
To learn how to set up an AuthenticationClass for an LDAP server, follow the Authentication with OpenLDAP tutorial or consult the AuthenticationClass reference.
Authorization
Superset has a concept called Roles which allows you to grant user permissions based on roles.
Have a look at the Superset documentation on Security.
Web interface
You can view all the available roles in the web interface of Superset and can also assign users to these roles.
LDAP
Superset supports assigning Roles to users based on their LDAP group membership, though this is not yet supported by the Stackable operator.
All users logging in via LDAP are assigned to the same role, which you can configure via the attribute authenticationConfig.userRegistrationRole on the SupersetCluster object:
apiVersion: superset.stackable.tech/v1alpha1
kind: SupersetCluster
metadata:
  name: superset-with-ldap-server
spec:
  [...]
  authenticationConfig:
    authenticationClass: ldap
    userRegistrationRole: Admin  # (1)
(1) All users are assigned to the Admin role
Connecting Apache Druid Clusters
The operator can automatically connect Superset to Apache Druid clusters managed by the Stackable operator for Apache Druid.
To do so, create a DruidConnection resource:
apiVersion: superset.stackable.tech/v1alpha1
kind: DruidConnection
metadata:
  name: superset-druid-connection
spec:
  superset:
    name: superset
    namespace: default
  druid:
    name: my-druid-cluster
    namespace: default
The name and namespace in spec.superset refer to the Superset cluster that you want to connect. Following the example above, the name is superset.
In spec.druid you specify the name and namespace of your Druid cluster.
The namespace is optional in both cases; if it is omitted, the operator assumes that the cluster is in the same namespace as the DruidConnection.
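As an illustration of the namespace defaulting described above, a minimal sketch of a DruidConnection that omits both namespace fields might look like this. The `analytics` namespace is hypothetical, and both clusters are assumed to run in the same namespace as the DruidConnection.

```yaml
apiVersion: superset.stackable.tech/v1alpha1
kind: DruidConnection
metadata:
  name: superset-druid-connection
  namespace: analytics        # hypothetical namespace of the DruidConnection
spec:
  superset:
    name: superset            # namespace omitted: defaults to "analytics"
  druid:
    name: my-druid-cluster    # namespace omitted: defaults to "analytics"
```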
Once the database is initialized, the connection will be added to the cluster by the operator. You can see it in the user interface under Data > Databases:
Monitoring
The managed Superset instances are automatically configured to export Prometheus metrics. See Monitoring for more details.
Configuration & Environment Overrides
The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).
Overriding certain properties which are set by the operator (such as the STATS_LOGGER) can interfere with the operator and can lead to problems.
Configuration Properties
For a role or role group, at the same level of config, you can specify configOverrides for the superset_config.py. For example, if you want to set the CSV export encoding and the preferred databases, adapt the nodes section of the cluster resource as follows:
nodes:
  roleGroups:
    default:
      config: {}
      configOverrides:
        superset_config.py:
          CSV_EXPORT: "{'encoding': 'utf-8'}"
          PREFERRED_DATABASES: |-
            [
              'PostgreSQL',
              'Presto',
              'MySQL',
              'SQLite',
              # etc.
            ]
Just as for the config, it is possible to specify this at the role level as well:
nodes:
  configOverrides:
    superset_config.py:
      CSV_EXPORT: "{'encoding': 'utf-8'}"
      PREFERRED_DATABASES: |-
        [
          'PostgreSQL',
          'Presto',
          'MySQL',
          'SQLite',
          # etc.
        ]
  roleGroups:
    default:
      config: {}
All override property values must be strings. Because they are treated as Python expressions, take care not to produce an invalid configuration.
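To illustrate how string values are interpreted as Python expressions, the following sketch sets one numeric and one string-valued option. ROW_LIMIT and APP_NAME are used here only as examples of Superset settings, and the values are arbitrary. Note the nested quotes: the outer quotes make the value a YAML string, while the inner quotes make it a Python string literal.

```yaml
nodes:
  roleGroups:
    default:
      config: {}
      configOverrides:
        superset_config.py:
          ROW_LIMIT: "10000"         # YAML string holding a Python integer expression
          APP_NAME: "'My Superset'"  # YAML string holding a Python string literal
```

Without the inner quotes, a value such as My Superset would presumably not be a valid Python expression in the generated configuration file.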
For a full list of configuration options, see the main config file for Superset.
Environment Variables
In a similar fashion, environment variables can be (over)written. For example per role group:
nodes:
  roleGroups:
    default:
      config: {}
      envOverrides:
        FLASK_ENV: development
or per role:
nodes:
  envOverrides:
    FLASK_ENV: development
  roleGroups:
    default:
      config: {}
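The precedence rule for overrides can be sketched by setting the same variable on both levels. In this hypothetical fragment, the role group name debug is an assumption made for illustration. Per the rule stated earlier, the role group value is expected to win for that group, while other role groups keep the role-level value.

```yaml
nodes:
  envOverrides:
    FLASK_ENV: production        # role level: applies to all role groups
  roleGroups:
    default:
      config: {}                 # inherits FLASK_ENV=production from the role
    debug:                       # hypothetical second role group
      config: {}
      envOverrides:
        FLASK_ENV: development   # role group level: takes precedence here
```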
Storage for data volumes
The Superset operator currently does not support using PersistentVolumeClaims for internal storage.
Resource Requests
Stackable operators handle resource requests in a slightly different manner than Kubernetes. Resource requests are defined on role or role group level. See Roles and role groups for details on these concepts. On a role level this means that e.g. all workers will use the same resource requests and limits. This can be further specified on role group level (which takes priority over the role level) to apply different resources.
The following example shows how to specify CPU and memory resources using the Stackable Custom Resources:
---
apiVersion: example.stackable.tech/v1alpha1
kind: ExampleCluster
metadata:
  name: example
spec:
  workers: # role-level
    config:
      resources:
        cpu:
          min: 300m
          max: 600m
        memory:
          limit: 3Gi
    roleGroups: # role-group-level
      resources-from-role: # role-group 1
        replicas: 1
      resources-from-role-group: # role-group 2
        replicas: 1
        config:
          resources:
            cpu:
              min: 400m
              max: 800m
            memory:
              limit: 4Gi
In this case, the role group resources-from-role inherits the resources specified on the role level, resulting in a maximum of 3Gi memory and 600m CPU resources.
The role group resources-from-role-group has a maximum of 4Gi memory and 800m CPU resources, which override the resources set on the role level.
For Java products, the heap memory actually used is lower than the specified memory limit, because other processes in the container also require memory to run. Currently, 80% of the specified memory limit is passed to the JVM.
For memory, only a limit can be specified, which is set as both the memory request and the memory limit in the container. This always guarantees a container the full amount of memory during Kubernetes scheduling.
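As a rough sketch of what this means in practice, the role-level settings from the example above (300m to 600m CPU, 3Gi memory) would be expected to translate into container resources along the following lines. This assumes that cpu.min maps to the CPU request and cpu.max to the CPU limit; the fragment is illustrative and not copied from a generated Pod.

```yaml
# Expected shape of the container's resources (illustrative only)
resources:
  requests:
    cpu: 300m      # from cpu.min (assumed mapping)
    memory: 3Gi    # memory.limit is used as the request
  limits:
    cpu: 600m      # from cpu.max (assumed mapping)
    memory: 3Gi    # memory.limit is used as the limit
```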
If no resource requests are configured explicitly, the Superset operator uses the following defaults:
nodes:
  roleGroups:
    default:
      config:
        resources:
          cpu:
            min: '200m'
            max: "4"
          memory:
            limit: '2Gi'
The default values are most likely not sufficient to run a proper cluster in production. Please adapt according to your requirements.