Creating a Trino cluster

Define an insecure cluster (testing)

Create an insecure single-node Trino cluster for testing. It can be accessed from the UI or the CLI via HTTP without user/password credentials or authorization.

For testing purposes we use the Trino CLI.

First, ensure that all necessary operators have been deployed:

stackablectl operator install \
  secret commons hive trino

The Trino cluster can now be deployed:

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: "396"
    stackableVersion: "23.4.0-rc1"
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: "23.4.0-rc1"
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1

This defines a single catalog, Hive, which uses an embedded Derby database.

To interact with Trino, first obtain the host and port for the Trino coordinator service (in this and following examples, https://172.18.0.3:31748):

stackablectl services list

 PRODUCT  NAME               NAMESPACE  ENDPOINTS                                     EXTRA INFOS

 hive     simple-hive-derby  default    hive                172.18.0.4:32186
                                        metrics             172.18.0.4:30109

 trino    simple-trino       default    coordinator-metrics 172.18.0.3:32123
                                        coordinator-https   https://172.18.0.3:31748
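
The endpoint can also be picked out of the listing with a small filter instead of being copied by hand. The following is a minimal sketch, assuming the output layout shown above (a coordinator-https label followed by the URL on the same line); the function name coordinator_endpoint is hypothetical and not part of stackablectl:

```shell
# Hypothetical helper: read `stackablectl services list` output on stdin and
# print the value that follows the first "coordinator-https" label.
# Assumes the label and the URL appear on the same line, as in the listing above.
coordinator_endpoint() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "coordinator-https") { print $(i + 1); exit }
    }
  }'
}

# Usage (requires a running cluster):
#   endpoint=$(stackablectl services list | coordinator_endpoint)
```

If the output format differs in your stackablectl version, adjust the label or copy the endpoint manually.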

Next, download the Trino CLI tool (this can be obtained from the Stackable repository, as shown below):

curl --output trino.jar https://repo.stackable.tech/repository/packages/trino-cli/trino-cli-396-executable.jar

Execute some CLI commands to verify operation, such as returning the names of all catalogs. Note that an insecure connection is specified:

./trino.jar --insecure --debug --server https://172.18.0.3:31748 --user=admin --execute "SHOW CATALOGS" --output-format=CSV_UNQUOTED

hive
system
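
For scripted smoke tests, the one-name-per-line output above can be checked automatically. This is a sketch only; has_catalogs is a hypothetical helper, and it assumes the CSV_UNQUOTED output contains nothing but catalog names:

```shell
# Hypothetical helper: read catalog names (one per line) on stdin and succeed
# only if every name passed as an argument is present as a whole line.
has_catalogs() {
  listing=$(cat)
  for name in "$@"; do
    # -x: match the whole line, -F: treat the name as a fixed string
    printf '%s\n' "$listing" | grep -qxF -- "$name" || {
      echo "missing catalog: $name" >&2
      return 1
    }
  done
}

# Usage (requires a running cluster):
#   ./trino.jar --insecure --server https://172.18.0.3:31748 --user=admin \
#     --execute "SHOW CATALOGS" --output-format=CSV_UNQUOTED | has_catalogs hive system
```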

Define a secure cluster (production)

For secure connections the following steps must be taken:

  1. Enable authentication

  2. Enable TLS between the clients and coordinator

  3. Enable internal TLS for communication between coordinators and workers

Via authentication

If authentication is enabled, TLS for the coordinator must be configured, together with a shared secret for internal communications (the shared secret is only base64-encoded, not encrypted).

Securing the Trino cluster disables all HTTP ports, including the web interface on the HTTP port. In the definition below, authentication uses the trino-users Secret, and TLS communication uses a certificate signed by the Secret Operator (indicated by autoTls).

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: "396"
    stackableVersion: "23.4.0-rc1"
  config:
    tls:
      secretClass: trino-tls (1)
  authentication:
    method:
      multiUser:
        userCredentialsSecret:
          name: trino-users (2)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino (3)
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-tls (1)
spec:
  backend:
    autoTls: (4)
      ca:
        secret:
          name: secret-provisioner-trino-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-users (2)
type: kubernetes.io/opaque
stringData:
  # admin:admin
  admin: $2y$10$89xReovvDLacVzRGpjOyAOONnayOgDAyIS2nW9bs5DJT98q17Dy5i
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: "23.4.0-rc1"
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
1 The name of (and reference to) the SecretClass
2 The name of (and reference to) the Secret
3 TrinoCatalog reference
4 TLS mechanism
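
The value stored for each user in the trino-users Secret is a password hash, not the plain-text password; the `$2y$10$` prefix in the example denotes bcrypt with a cost of 10. The sketch below checks that a candidate value has this shape before it is placed in the Secret. It assumes that the hash is consumed by Trino's password file authentication, which expects bcrypt hashes starting with `$2y$` and a minimum cost of 8; the helper name is_bcrypt_hash is hypothetical:

```shell
# Hypothetical helper: succeed if the argument looks like a bcrypt hash
# ("$2y$", two-digit cost, "$", 53 characters of salt and digest) and the
# cost is at least 8 (assumed minimum for Trino password file authentication).
is_bcrypt_hash() {
  printf '%s\n' "$1" | grep -Eq '^\$2y\$[0-9]{2}\$[./A-Za-z0-9]{53}$' || return 1
  # Splitting on "$" gives: empty field, "2y", then the cost
  cost=$(printf '%s' "$1" | cut -d'$' -f3)
  [ "${cost#0}" -ge 8 ]
}

# A hash such as the one above can be produced with Apache's htpasswd, e.g.:
#   htpasswd -nbBC 10 admin admin
```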

The CLI now requires that a path to the keystore and a password be provided:

./trino.jar --debug --server https://172.18.0.3:31748 \
  --user=admin --keystore-path=<path-to-keystore.p12> --keystore-password=<password>

Via TLS only

This will disable the HTTP port and UI access and encrypt client-server communications.

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: "396"
    stackableVersion: "23.4.0-rc1"
  config:
    tls:
      secretClass: trino-tls (1)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino (2)
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-tls (1)
spec:
  backend:
    autoTls: (3)
      ca:
        secret:
          name: secret-provisioner-trino-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: "23.4.0-rc1"
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
1 The name of (and reference to) the SecretClass
2 TrinoCatalog reference
3 TLS mechanism

The CLI call then looks like this:

./trino.jar --debug --server https://172.18.0.3:31748 --keystore-path=<path-to-keystore.p12> --keystore-password=<password>

Via internal TLS

Internal TLS provides encrypted and authenticated communication between coordinators and workers. Since it applies to all data sent and processed between these processes, it may reduce performance significantly.

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: "396"
    stackableVersion: "23.4.0-rc1"
  config:
    internalTls:
      secretClass: trino-internal-tls (1)
  authentication:
    method:
      multiUser:
        userCredentialsSecret:
          name: trino-users (2)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-internal-tls (1)
spec:
  backend:
    autoTls: (3)
      ca:
        secret:
          name: secret-provisioner-trino-internal-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-users (2)
type: kubernetes.io/opaque
stringData:
  # admin:admin
  admin: $2y$10$89xReovvDLacVzRGpjOyAOONnayOgDAyIS2nW9bs5DJT98q17Dy5i
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: "23.4.0-rc1"
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
1 The name of (and reference to) the SecretClass
2 The name of (and reference to) the Secret
3 TLS mechanism

Since Trino has internal and external communications running over a single port, this will enable the HTTPS port but not expose it. Cluster access is only possible via HTTP.

./trino.jar --debug --server http://172.18.0.3:31748 --user=admin

S3 connection specification

You can specify S3 connection details directly inside the TrinoCatalog specification or by referring to an external S3Connection custom resource.

To specify S3 connection details directly as part of the TrinoCatalog resource, you add an inline connection configuration as shown below:

s3: (1)
  inline:
    host: test-minio (2)
    port: 9000 (3)
    pathStyleAccess: true (4)
    secretClass: minio-credentials (5)
    tls:
      verification:
        server:
          caCert:
            secretClass: minio-tls-certificates (6)
1 Entry point for the connection configuration
2 Connection host
3 Optional connection port
4 Optional flag indicating whether path-style URLs should be used; this defaults to false, which means virtual hosted-style URLs are used.
5 Name of the Secret object expected to contain the following keys: accessKey and secretKey
6 Optional TLS settings for encrypted traffic. The secretClass can be provided by the Secret Operator or by yourself.

A self-provided S3 TLS secret can be specified like this:

---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: minio-tls-certificates
spec:
  backend:
    k8sSearch:
      searchNamespace:
        pod: {}
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-tls-certificates
  labels:
    secrets.stackable.tech/class: minio-tls-certificates
data:
  ca.crt: <your-base64-encoded-ca>
  tls.crt: <your-base64-encoded-public-key>
  tls.key: <your-base64-encoded-private-key>
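
Values under data must be base64-encoded and placed on a single line. A minimal sketch for producing them from local files follows; the helper name b64_oneline and the file names in the usage comment are assumptions for illustration:

```shell
# Hypothetical helper: print the base64 encoding of a file as a single line,
# suitable for pasting into the data section of a Secret.
# Line wrapping added by some base64 implementations is removed with tr.
b64_oneline() {
  base64 < "$1" | tr -d '\n'
}

# Usage (file names are placeholders):
#   b64_oneline ca.crt
#   b64_oneline tls.crt
#   b64_oneline tls.key
```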

It is also possible to configure the bucket connection details as a separate Kubernetes resource and only refer to that object from the TrinoCatalog specification like this:

s3:
  reference: my-connection-resource (1)
1 Name of the connection resource with connection details

The resource named my-connection-resource is then defined as shown below:

---
apiVersion: s3.stackable.tech/v1alpha1
kind: S3Connection
metadata:
  name: my-connection-resource
spec:
  host: test-minio
  port: 9000
  accessStyle: Path
  credentials:
    secretClass: minio-credentials

This has the advantage that the connection configuration can be shared across applications, which reduces the cost of updating these details.