
Installing Semantic Treehouse

caution

This guide and the Helm chart are a work in progress. If you're stuck, or you want to contribute, please reach out to us via Discord.

TL;DR

To deploy Semantic Treehouse (STH) with Helm, run:

helm repo add sth https://charts.semantic-treehouse.nl
helm install mysth sth/semantic-treehouse

caution

After this command the application is not ready to use yet, because it needs a database installation and configuration. This is not covered by the STH Helm chart. See the sections below about the database.

Introduction

This documentation page describes how to install/deploy a Semantic Treehouse environment. It might seem daunting, but we hope this document clarifies how to get STH up and running. As this is a work in progress, we will improve this guide in the coming period and are open to suggestions.

Semantic Treehouse is a web application with many functionalities, components and customisations. We use Helm for the installation process. Helm is the package manager for Kubernetes which facilitates deploying applications to a cloud infrastructure.

What is included in the installation?

  • Semantic Treehouse web application, including:
    • Kubernetes service, deployment and persistent volume claim
    • Environment configurations
    • Initial database schema

What is not included (yet)?

  • Load balancer / ingress
  • Database server
  • Additional services for Semantic Treehouse:
    • Validator - for validating XML messages
    • WebVOWL - for visualizing ontologies
    • JSON schema preprocessor

Preconditions - what do you need to know (and have)?

  1. This is a fairly technical process, so a decent level of technical skill is required. You need to be able to install tools, execute terminal commands and SQL statements, and edit configuration files.
  2. Access to a Kubernetes cluster.
    • If you want to run it on your local machine, we suggest looking at Minikube. For Minikube on Windows, you'll need local admin rights, Docker Desktop and preferably WSL. See the Minikube installation guide for setting it up.
    • You'll need a basic understanding of Kubernetes and a cluster management tool, e.g. the kubectl CLI or a management app like Lens.
  3. Resources for running the STH web application:
    • CPU: 2 vCPU is more than enough
    • RAM: 256 MB is OK, 512 MB is better
    • Disk: defaults to 2 GiB, but can be customized
  4. Locally installed Helm CLI.
  5. Access to a database server running MariaDB v10.6.

Step 1: Add the STH charts repository

To add the Semantic Treehouse charts repository, run:

helm repo add sth https://charts.semantic-treehouse.nl

About the chart and values files

The chart is a set of templates and instructions that Helm uses to install the application. Helm charts are configured mainly through a values file. The chart contains a default values file with explanations, which you can copy under a custom name and adapt. You only need to modify the values file and don't need to look at the other files of the chart.

The values file contains customisations like the server URL where your environment will be located, e.g. http://localhost:8080 or https://domain.semantic-treehouse.nl.

To see details about the chart (latest version), run:

helm show chart sth/semantic-treehouse

To use the default values file as template for customizations, run:

helm show values sth/semantic-treehouse > mysth-values.yaml

To list all available chart versions, run:

helm search repo sth -l

About Semantic Treehouse chart and app versions

The chart specifies the chart version (x.y.z) and refers to a specific app version of Semantic Treehouse (x.y.z). App versions follow the releases.

E.g. chart version 0.1.0 uses STH app version v3.0.0.

note

Don't worry, the app will not update itself automatically when you pin an x.y.z version of the chart. To upgrade, choose a newer version of the chart, which points to a newer version of Semantic Treehouse. After you apply a Helm upgrade to the release, the migration container should take care of migrating the database. Please read the Semantic Treehouse changelog to see what changes in every update.
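As a rough sketch of such an upgrade, assuming the release name mysth, the namespace mysth-ns and the values file mysth-values.yaml used elsewhere in this guide (the function name sth_upgrade is purely illustrative and not part of the chart):

```shell
# Illustrative helper (not part of the chart): upgrade the release to a pinned chart version.
# Usage: sth_upgrade <chart-version>
sth_upgrade() {
  chart_version="$1"
  if [ -z "$chart_version" ]; then
    echo "usage: sth_upgrade <chart-version>" >&2
    return 1
  fi
  # Refresh the local repo index so newer chart versions become visible
  helm repo update || return 1
  # --version pins the chart version; the chart in turn refers to one app version
  helm upgrade mysth sth/semantic-treehouse \
    --version "$chart_version" \
    --namespace mysth-ns \
    -f mysth-values.yaml
}
```

For example, sth_upgrade 0.1.0 would move the release to chart version 0.1.0. Pick the version from the output of helm search repo sth -l.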

Step 2: Prepare the database

Semantic Treehouse requires an initialised database schema in a correctly configured MariaDB server. This section describes the procedure to get a correct schema and credentials. The database is used to store most of the application data and holds the state of the app (like specifications you upload and user accounts).

Deploy a MariaDB server

The Semantic Treehouse Helm chart does not install the database server; we assume it is running and you have access to it. We plan to include the database and the initialisation into this Chart, but we're not there yet, so this section describes how to get it done manually.

You can use the Bitnami Helm chart to deploy a MariaDB server. See the Bitnami MariaDB chart documentation for details. Important: use MariaDB version 10.6.

Verify access to the database server. Accessible means that pods in your Kubernetes cluster are able to reach the database server. On Minikube you can test this by opening a shell on the node with minikube ssh and running a netcat command against the IP and port of the database server. If a timeout occurs, that might indicate a firewall issue. To inspect the database itself, you might want to use a database management tool with a UI, like HeidiSQL, phpMyAdmin or MySQL Workbench.
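A minimal sketch of such a netcat check, meant to be run from inside the cluster network (for example after minikube ssh). It assumes a netcat build that supports -z (connect without sending data) and -w (timeout in seconds); these flags vary between netcat variants. The function name sth_check_db_port is illustrative.

```shell
# Illustrative helper: report whether a TCP port on the database host accepts connections.
# Usage: sth_check_db_port <host> [port]   (port defaults to 3306)
sth_check_db_port() {
  db_host="$1"
  db_port="${2:-3306}"
  # -z: only test the connection, -w 5: give up after 5 seconds
  if nc -z -w 5 "$db_host" "$db_port"; then
    echo "reachable: $db_host:$db_port"
  else
    echo "NOT reachable: $db_host:$db_port (timeout or refused, check firewall rules)"
    return 1
  fi
}
```

A successful check only shows that the port is open. It does not prove that the database users can log in.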

For Semantic Treehouse to function properly we need some specific MariaDB configuration settings that are non-default, so be sure to set them on your MariaDB server:

configuration: |-
  [mysqld]
  ## Customized for Semantic Treehouse use
  lower-case-table-names=1
  sql_mode='ANSI,TRADITIONAL'
  character-set-server=utf8mb4
  collation-server=utf8mb4_nopad_bin
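To confirm that the server picked up these settings, you could query the corresponding system variables. This is a sketch that assumes the mariadb command-line client is installed and that the account you pass in may connect to the server; the function name sth_check_db_settings is illustrative.

```shell
# Illustrative helper: print the server settings that STH relies on.
# Usage: sth_check_db_settings <host> <user>   (the client prompts for the password)
sth_check_db_settings() {
  mariadb --host="$1" --port=3306 --user="$2" --password \
    --execute="SELECT @@lower_case_table_names, @@sql_mode, @@character_set_server, @@collation_server;"
}
```

Note that the server may report sql_mode in expanded form, listing the individual modes that ANSI and TRADITIONAL stand for.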

Create a new database schema

Create a new schema with a name of your choice with default charset utf8mb4 and collation utf8mb4_nopad_bin. Example SQL command:

CREATE DATABASE "mysth" DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_nopad_bin;

You'll need to configure the schema name later on in the Helm values file.

Create database users

The STH backend application and the database migration script (init container) both need access to the newly created schema. Therefore we need two database users:

  1. A database user with the correct permissions to read, write and delete database records. Used by the Semantic Treehouse backend application
    • Default username: sth-user
  2. A database user with the correct permissions to perform schema upgrades. Used by the migration script that runs as an init container before the application starts. The Semantic Treehouse container will not start if the init container cannot access the database, so this is important!
    • Default username: sth-migration-user

You need the user credentials in the next step, where they are stored in Kubernetes secrets.
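The two users could be created along the following lines. This is a sketch only: the privilege lists and the '%' host pattern are assumptions, not taken from the STH documentation, and the schema name mysth and the passwords are the example values used in this guide. The snippet writes the statements to a file (create-sth-users.sql, a hypothetical name) so you can review them before running them.

```shell
# Write an example SQL script; review it before running it as a database administrator.
# The privilege lists and the '%' host pattern are assumptions, adjust them to your policy.
cat > create-sth-users.sql <<'SQL'
-- Application user: read, write and delete records in the schema
CREATE USER 'sth-user'@'%' IDENTIFIED BY 'MyOtherStrongPassword__98701243..';
GRANT SELECT, INSERT, UPDATE, DELETE ON mysth.* TO 'sth-user'@'%';

-- Migration user: needs to change the schema itself during upgrades
CREATE USER 'sth-migration-user'@'%' IDENTIFIED BY 'MyStrongPassword__123401243..';
GRANT ALL PRIVILEGES ON mysth.* TO 'sth-migration-user'@'%';
SQL
```

You could then apply the file with the mariadb client, for example mariadb --host=database-host-here.local --user=root --password < create-sth-users.sql, and delete it afterwards since it contains passwords.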

Step 3: Store database user credentials in Kubernetes secrets

User credentials are sensitive information, so we recommend storing them in Kubernetes secrets. The name of each secret and the username chosen for each user are important, because we'll configure them in the Helm values file.

 kubectl create secret generic mysth-migration-db-user \
--from-literal=username='sth-migration-user' \
--from-literal=password='MyStrongPassword__123401243..'
 kubectl create secret generic mysth-sth-db-user \
--from-literal=username='sth-user' \
--from-literal=password='MyOtherStrongPassword__98701243..'

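Note the space character preceding each command. This is intentional: in shells configured to skip such lines (for example bash with HISTCONTROL set to ignorespace or ignoreboth), it keeps the command, including the password, out of the shell history.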
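Secrets are namespaced, so a pod can only use secrets that live in its own namespace. Assuming the release is installed into the mysth-ns namespace used in Step 5, you would create that namespace first with kubectl create namespace mysth-ns and add --namespace mysth-ns to both kubectl create secret commands. A small sketch to confirm the result (the function name sth_check_secrets is illustrative):

```shell
# Illustrative helper: check that both database secrets exist in the release namespace.
# Usage: sth_check_secrets [namespace]   (namespace defaults to mysth-ns)
sth_check_secrets() {
  ns="${1:-mysth-ns}"
  for secret in mysth-migration-db-user mysth-sth-db-user; do
    if kubectl get secret "$secret" --namespace "$ns" >/dev/null 2>&1; then
      echo "found: $secret"
    else
      echo "missing: $secret (create it with --namespace $ns)"
    fi
  done
}
```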

Step 4: Configure Helm values file for the environment

Get a default values file by running the following command:

helm show values sth/semantic-treehouse > mysth-values.yaml

The most important values that need to be changed are addressed below, grouped into three categories:

  • STH application parameters
  • Database parameters
  • Data persistence parameters

The Helm chart does not contain an ingress resource. We'll need to configure external traffic (from the internet) manually.

STH application parameters

# Configurations for the Semantic Treehouse core container
sth:
  # Specifies the URL where the application will be deployed, e.g. `https://mysth.semantic-treehouse.nl`
  # For a local deployment this is most often `http://localhost:8080` or any other HTTP port you configure
  serverUrl: "http://localhost:8080"

  # Specifies if detailed log information is presented to end user. Do not use this in production environment
  debugMode: false

  # Specifies if the application is in production mode. Initialisation and admin features are disabled in production mode
  # Set to false the first time, so you can initialize Semantic Treehouse and login with the admin account (without OAuth login)
  productionMode: false

  # This URL points to the json schema preprocessor service, not yet added to the Helm chart
  jsonSchemaPreprocessorEndpoint: "http://path-to-jschema-preprocessor.local"

  # This URL points to the validator service, not yet added to the Helm chart
  validatorEndpoint: "http://path-to-validator.local/validation-request"

Database parameters

database:
  # Database server endpoint (IP or hostname) without port
  existingService: database-host-here.local

  # Database port number
  port: 3306

  # Database schema name that holds the STH application data
  schemaName: mysth

  # Database user with permissions to read, write, delete on schema tables
  sthUser:
    # Name of existing secret that holds user credentials (username + password)
    # When set to empty string Helm generates user credentials
    existingSecret: ""

  # Database user with permissions to perform schema upgrades
  # Used by the migration script that runs as an init container before the application starts
  migrationUser:
    # Name of existing secret that holds user credentials (username + password)
    # When set to empty string Helm generates user credentials
    existingSecret: ""
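To connect these parameters to the secrets from Step 3, you could keep the database settings in a small override file. The sketch below assumes the key nesting inferred from the default values file (database.sthUser.existingSecret and database.migrationUser.existingSecret) and uses a hypothetical file name, mysth-db-overrides.yaml. The host name is a placeholder.

```shell
# Write a small override file that points the chart at the secrets created in Step 3.
# Key paths are inferred from the default values file; the host name is a placeholder.
cat > mysth-db-overrides.yaml <<'EOF'
database:
  existingService: database-host-here.local
  port: 3306
  schemaName: mysth
  sthUser:
    existingSecret: "mysth-sth-db-user"
  migrationUser:
    existingSecret: "mysth-migration-db-user"
EOF
```

Helm accepts several -f options and lets later files override earlier ones, so the install command could become helm install mysth --namespace mysth-ns -f mysth-values.yaml -f mysth-db-overrides.yaml sth/semantic-treehouse.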

Data persistence parameters

Besides the database, the application also stores files on a file system. Typically this involves user uploads, like PDF documents, XML schema files for validation or specification icons. We recommend configuring a persistent volume claim to keep this data safe.

persistence:
  # Specifies existing persistent volume claim (PVC) to use for file storage
  # If this is not empty, the other configs below are ignored
  # If empty, Helm will create a new PVC with the configs below
  existingClaim: ""

  # Size of the new PVC
  size: 2Gi

  # Reclaim policy for the new PVC
  reclaimPolicy: "Retain"

Step 5: Run the Helm installation

Install the chart by running the command below from the folder that contains your values file. Note that:

  • mysth is the Helm release name in our example
  • --namespace mysth-ns specifies the Kubernetes namespace in which to deploy the resources (the namespace must already exist, or add --create-namespace)
  • -f mysth-values.yaml refers to the values file configured in the previous step
  • sth/semantic-treehouse points to the Helm chart in the repo sth added in step 1

helm install mysth --namespace mysth-ns -f mysth-values.yaml sth/semantic-treehouse

If you first want to test the installation without applying it, you can use the --dry-run option of Helm. Helm will try to generate the resource templates, but will not apply them to the cluster. If it reports errors, you can fix them first.

helm install mysth --dry-run --namespace mysth-ns -f mysth-values.yaml sth/semantic-treehouse

This should create the Kubernetes resources necessary for STH: a deployment, a service (hostname exposed in the cluster internal network), some config maps, a persistent volume claim and secrets.

Step 6: Monitoring the rollout

You can monitor the Helm install process by looking at your Kubernetes management app or using kubectl.

Some things that can go wrong:

  • The secrets for sth and migration database accounts don't exist. This will cause the Helm install to fail.
  • A database user account (username/password combination) can't access the schema in the database server. Check with your database management tool whether remote login is allowed for both users. Also check whether the schema mentioned in your values.yaml file exists in the database server.
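A minimal sketch for getting an overview with the CLI tools, assuming the release mysth in namespace mysth-ns from Step 5 (the function name sth_rollout_overview is illustrative):

```shell
# Illustrative helper: show release status, pods and recent events in the namespace.
# Usage: sth_rollout_overview [namespace]   (namespace defaults to mysth-ns)
sth_rollout_overview() {
  ns="${1:-mysth-ns}"
  helm status mysth --namespace "$ns"
  kubectl get pods --namespace "$ns"
  # Events often explain why a pod is stuck, e.g. a missing secret or a failing init container
  kubectl get events --namespace "$ns" --sort-by=.lastTimestamp
}
```

If a pod stays in an init state, kubectl describe pod <pod-name> --namespace mysth-ns lists its init containers, and kubectl logs <pod-name> --namespace mysth-ns -c <init-container-name> shows the migration output. The pod and container names depend on the chart, so look them up rather than guessing.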

Step 7: Accessing the service using port-forward

If the application is running, we can create a port-forward to access the server from localhost.

Since we didn't include an ingress (yet), we'll have to allow access to our service inside the cluster before a browser can reach our app. The kubectl port-forward command creates a listening port on the host that routes to the application inside your cluster. Your terminal window will output the local address that is being forwarded if it succeeds. That address should lead to the frontend of the STH environment.
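A sketch of such a port-forward, assuming namespace mysth-ns and local port 8080 (matching the serverUrl example). The service name and service port are not given in this guide, so they are arguments here; look them up first with kubectl get service --namespace mysth-ns. The function name sth_port_forward is illustrative.

```shell
# Illustrative helper: forward a local port to the STH service inside the cluster.
# Usage: sth_port_forward <service-name> <service-port> [local-port]
sth_port_forward() {
  svc_name="$1"
  svc_port="$2"
  local_port="${3:-8080}"
  # Runs in the foreground; stop it with Ctrl+C
  kubectl port-forward --namespace mysth-ns "service/$svc_name" "$local_port:$svc_port"
}
```

Keep the terminal open while you use the application, because the forward ends as soon as the command stops.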

Step 8: Initialise database

The UI is installed now, but we still need to initialise the database (create all tables). We do this by making the following API call:

GET http://localhost:8080/api/v1/admin/installer?defaultPop=true

This operation is only available when the STH variable productionMode is set to false and the database user has permissions.

After visiting this page, a technical response should appear indicating the application is (re-)installed and we can proceed to login.
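Instead of a browser, the same call can be made from a terminal. A minimal sketch, assuming curl is installed and the port-forward from Step 7 is listening on localhost:8080 (the function name sth_init_database is illustrative):

```shell
# Illustrative helper: call the installer endpoint and print the response body and HTTP status.
# Usage: sth_init_database [base-url]   (base URL defaults to http://localhost:8080)
sth_init_database() {
  base_url="${1:-http://localhost:8080}"
  curl --silent --show-error --write-out '\nHTTP status: %{http_code}\n' \
    "$base_url/api/v1/admin/installer?defaultPop=true"
}
```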

Step 9: Login for the first time

Semantic Treehouse uses OAuth for user authentication. This needs to be configured for each environment individually. See Configure OAuth identity providers.

Disabling production mode allows us to circumvent this. This happens through the Helm value config productionMode: false. It unlocks the API endpoint GET api/v1/admin/test/login/{accountId}. You can use admin as accountId. This call should be confirmed by a 200 OK response after which you can navigate to the application; you are not redirected automatically. The same browser will be logged in based on the browser session.

SUCCESS

This concludes the guide to installing an STH environment.

Uninstalling

danger

Removing the application from the Kubernetes cluster is fortunately one of the simplest steps. It can be done simply by typing:

helm delete -n mysth-ns mysth

If you've forgotten the release name, execute helm ls --all-namespaces to see which applications are deployed by Helm.

Existing secrets (as referenced in the values file) will not be removed. But beware, Helm-generated resources will be removed (including the persistent volume that holds the uploads). The database will not be removed since it is not included in the Helm chart.
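To check what is actually left in the namespace after uninstalling, you could list the remaining secrets and persistent volume claims. A small sketch, assuming the namespace mysth-ns (the function name sth_list_leftovers is illustrative):

```shell
# Illustrative helper: list secrets and persistent volume claims still present in the namespace.
# Usage: sth_list_leftovers [namespace]   (namespace defaults to mysth-ns)
sth_list_leftovers() {
  ns="${1:-mysth-ns}"
  kubectl get secret,persistentvolumeclaim --namespace "$ns"
}
```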