Service Offering
metalstack.cloud lets you provision and manage Kubernetes clusters in an easy, developer-friendly manner and takes care of IP addresses and persistent storage. Because we provide Kubernetes in its vanilla flavor, you will find many references to the official Kubernetes documentation.
The platform is based on the open source project metal-stack.io, which manages the underlying bare metal resources.
Our servers are located in an Equinix data center in Munich, Germany. The location is GDPR-compliant and ISO 27001 certified, and offers redundant power from renewable sources, a redundant internet uplink and HVAC measures.
1. Prerequisites
To use our platform, you need an existing GitHub account and a valid email address. You can then register and log in to our platform via an OAuth authentication flow.
Furthermore, a valid credit card is required, as well as your company’s VAT ID, if you want to use our service after the trial phase.
2. User Management
Note: There is no dedicated user administration on our platform. Roles and permissions for accessing the platform, and thus your Kubernetes clusters, are defined in GitHub.
On the platform you see two organizational elements: tenants and projects. Each user can be a member or owner of multiple tenants. Each tenant can contain multiple projects and each project can contain many clusters.
There is a direct mapping between the GitHub organization structure and metalstack.cloud, as shown in the following table.
GitHub | metalstack.cloud |
---|---|
Organization: “Organizations are shared accounts where businesses and open-source projects can collaborate across many projects at once, with sophisticated security and administrative features.” (from the GitHub docs) | Tenant: A tenant is the logical counterpart to the organization in GitHub. GitHub organization owners can configure tenant-wide settings on the platform (e.g. billing information). |
Team: “Teams are groups of organization members that reflect your company or group’s structure with cascading access permissions and mentions.” (from the GitHub docs) | Project: Every GitHub team in an organization automatically gets a project in metalstack.cloud and every maintainer of a GitHub team can create, update or delete Kubernetes clusters in their project in metalstack.cloud. |
Regular team members | Can see their project in metalstack.cloud, but cannot create or delete clusters. |
3. Managed Kubernetes
The base costs for Kubernetes clusters arise from the worker nodes and the Kubernetes control-plane.
3.1 Machine Types
The platform offers machine types with these hardware specifications:
Name | CPU | Memory | Storage | Price/min |
---|---|---|---|---|
n1-medium-x86 | 1x Intel Xeon D-2141I | 32GB RAM | 960GB NVMe | 0.01250€/min |
c1-medium-x86 | 1x Intel Xeon D-2141I | 128GB RAM | 960GB NVMe | 0.01917€/min |
c1-large-x86 | 2x Intel Xeon Silver 4214 (12 Core) | 192GB RAM | 960GB NVMe | 0.02916€/min |
3.2 Provisioning
Supported Versions
Version | Expiration |
---|---|
1.24.14 | expired |
1.25.13 | November 2023 |
1.25.15 | November 2023 |
1.25.16 | November 2023 |
1.26.10 | TBA |
1.26.11 | TBA |
1.27.8 | TBA |
Patch release upgrades are carried out automatically, for example from v1.23.1 to v1.23.2. This happens in-place, i.e. without rolling worker nodes.
Major and minor upgrades are done by rolling the worker nodes, which your applications need to tolerate. Major and minor release upgrades have to be started manually from the cluster details view or the CLI. Only newer versions can be specified; there is no downgrade possibility.
Please refer to the cluster operation section for further details regarding cluster updates.
Creating a Cluster
If you want to create a new cluster, first navigate to the cluster overview by clicking on Kubernetes in the navigation. Then click the Create Cluster button and select the Kubernetes version that you require to run your cluster.
In the following form you can create your desired Kubernetes cluster.
First you have to specify a name; then you can choose a location, the server type and the number of nodes for the cluster. The name must be between 2 and 10 characters long, lowercase, and must not contain whitespace or special characters other than ‘-’. This restriction is necessary due to DNS constraints of your cluster’s API server.
Attention: You should not rely on the IP address of your API server, as it is not guaranteed to stay the same forever. Use the DNS name inside your cluster’s kubeconfig instead.
Lastly, you can specify the Kubernetes version to use and then create the cluster with the Submit button.
Clusters are provisioned in the location of your choice. The cluster creation may take a couple of minutes to complete. You can follow the progress in the cluster overview.
Clusters that are placed inside the same project are allowed to announce the same IP addresses for services of type load balancer, which enables ECMP load balancing through the BGP routing protocol for external services inside your clusters. Clusters placed in different projects, on the other hand, cannot announce the same IP address. Please refer to the IP addresses section for further details on IP addresses.
3.3 Kubernetes Cluster & Kubeconfig
After you have submitted the cluster, it is shown in the cluster overview. There you can see your new cluster, and on the left side there is an indication of whether the cluster is already running or still being created.
Under the “Actions” column you can open a menu to view the details of your cluster, generate the kubeconfig to access the cluster and delete the cluster if it is no longer needed.
Attention: Please be aware that the downloadable kubeconfig has cluster-admin privileges! To mitigate the impact of leaked credentials, it is required to define an expiration time for the kubeconfig. You can use the admin kubeconfig to define more fine-grained permissions with service accounts.
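As a minimal sketch of such fine-grained access, the following manifests create a service account that may only read pods in a single namespace. The namespace and all names are hypothetical placeholders; adjust them to your own setup:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-a-viewer              # hypothetical service account name
  namespace: team-a                # hypothetical namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: team-a
rules:
- apiGroups: [""]                  # core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-viewer-pod-reader
  namespace: team-a
subjects:
- kind: ServiceAccount
  name: team-a-viewer
  namespace: team-a
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

A kubeconfig that authenticates with a token of this service account then only carries the permissions granted by the role instead of cluster-admin.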
In the cluster details view you can see all the available information about your cluster. It is also possible to update some of the cluster properties like the cluster version.
For the time being, we support only one server type within a cluster (worker groups will follow soon). The costs of a cluster change in accordance with the chosen server type, and changing the server type causes a worker roll. metalstack.cloud offers auto updates and auto scaling for your clusters by default: Kubernetes patch versions as well as operating systems are updated automatically. You can specify a maintenance time window during which these updates may be performed. The number of worker nodes is scaled within the range you provide, depending on your workload. For further information, see the Interruption-Free Cluster Operation section below.
Workers
Choose the range of servers your cluster can utilize. For production use cases we recommend configuring at least two worker nodes in order to spread your applications across multiple worker nodes. This is important for interruption-free operation during cluster maintenance.
A cluster can currently have at most 32 workers (the theoretical limit is 1024, which we can raise at a later point in time).
Our platform scales your cluster within the specified range as long as sufficient workers are available. Local storage is bound to the lifecycle of a worker: it is ephemeral and will be wiped when the worker is rotated out of your cluster. You can change the number of guaranteed workers via the minimum setting. You only pay for the number of workers you actually use.
Control-Plane
The Kubernetes control-plane of every cluster is managed outside of your cluster and falls under the responsibility of metalstack.cloud.
The control-plane is billed for the whole lifetime of a cluster. It includes a highly-available, regularly backed-up Kubernetes control-plane (kube-apiserver, kube-controller-manager, kube-scheduler, etcd, …), a dedicated firewall, IDS events and private networking with an internet gateway.
Firewall
A firewall is always deployed along with your cluster as a physical server of the type n1-medium-x86. The firewall secures your cluster from external networks like the internet.
The firewall can be configured through a custom resource called ClusterwideNetworkPolicy (CWNP). With CWNPs you can control which egress traffic to and which ingress traffic from external networks should be allowed. Ingress traffic for services of type load balancer is allowed automatically without the need to define an extra CWNP resource.
Packet drops that occur on the firewall are forwarded to a special pod called droptailer, which is deployed into the firewall namespace of your cluster.
The state of your firewall can be checked through another custom resource called FirewallMonitor, which also resides in the firewall namespace, e.g.:
kubectl get fwmon -n firewall
NAME MACHINE ID IMAGE SIZE LAST EVENT AGE
shoot--f8e67080bc--test-firewall-d2a72 77abee12-5c0d-4adf-91f2-e48ffa4f3449 firewall-ubuntu-3.0 n1-medium-x86 Phoned Home 68d
When being provisioned, a firewall automatically gets an internet IP for outgoing communication. Your outgoing cluster traffic is masqueraded behind this IP address (SNAT). When the firewall gets rolled, the source IP of your outgoing cluster traffic may change. We will provide static egress IP addresses in the near future; if you require this feature before it is generally available, please contact us.
Example CWNP
apiVersion: metal-stack.io/v1
kind: ClusterwideNetworkPolicy
metadata:
  namespace: firewall
  name: clusterwidenetworkpolicy-egress
spec:
  egress:
  - to:
    - cidr: 154.41.192.0/23
    - cidr: 185.164.161.0/24
    ports:
    - protocol: TCP
      port: 5432
Full examples and documentation can be found at https://github.com/metal-stack/firewall-controller.
Interruption-Free Cluster Operation
To keep service interruptions as small as possible during cluster upgrades or within your maintenance time windows, we recommend reading the following section.
Kubelet Restart
During both a maintenance time window and a Kubernetes patch version upgrade, the kubelet service on the worker nodes gets restarted (jittered within a 5-minute time window).
Hence, we advise you to verify that your workload tolerates a restart of the kubelet service. The restart of the kubelet service can be manually tested using the following node annotation:
kubectl annotate node <node-name> worker.gardener.cloud/restart-systemd-services=kubelet
Additionally, when a kubelet gets restarted, Kubernetes changes the status of the worker node to NotReady for a couple of seconds (see here). Effectively, this leads to the temporary withdrawal of external IP announcements for this worker node, and active network connections to this node are interrupted. To reduce the impact of the restart, we recommend spreading services that receive external network traffic onto more than a single node in the cluster, which ensures your external service stays reachable during this operation.
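One way to achieve this spreading, assuming your externally reachable workload runs as a Deployment, is a topology spread constraint across worker nodes. The deployment name, labels and image below are placeholders only:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-public-app                         # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-public-app
  template:
    metadata:
      labels:
        app: my-public-app
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname   # spread replicas across worker nodes
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: my-public-app
      containers:
      - name: app
        image: nginx:1.25                     # placeholder image
        ports:
        - containerPort: 80

With at least two replicas spread over different nodes, the temporary NotReady status of a single node does not take down the whole service.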
MetalLB Speaker Restart
In order to offer services of type load balancer in our clusters, we manage an installation of MetalLB in the metallb-system namespace of your cluster. During the maintenance time window there is a chance that we update the resources of the MetalLB deployment. This operation can potentially trigger a rolling update of the metallb-speaker daemon set.
When a speaker shuts down, the external IP announcements for this worker node are withdrawn until the pod is up and running again.
In general, this is not a huge deal if the cluster has more than one worker node and your service of type load balancer is deployed with externalTrafficPolicy: Cluster. However, we recommend spreading services that receive external network traffic onto more than a single node in the cluster in order to keep potential service interruptions as small as possible.
You can simply test this behavior by running a restart of the daemon set manually:
kubectl rollout restart -n metallb-system ds speaker
Worker Node Rolls
There can be multiple reasons that cause a roll of the worker nodes of your cluster:
- Major and minor upgrades of the Kubernetes version
- Significant changes to the worker group (e.g. updating the worker’s OS image)
When this happens, a new worker node is added to your cluster. Then, an old worker node gets drained (StatefulSets are drained sequentially) and removed. This procedure repeats until all worker nodes of your cluster have been updated.
To make this procedure as smooth as possible, we recommend taking the following actions:
- Refrain from using the local storage on the worker nodes and instead use cloud storage (see the volumes section)
- Local storage cannot be restored once a worker node has been removed from the cluster!
- Spread your workloads across the cluster such that you can tolerate draining a worker node
- Configure PodDisruptionBudgets for your workloads (see the sketch below)
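A minimal PodDisruptionBudget sketch is shown below; the name and selector labels are placeholders and must match your own workload:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb                # placeholder name
  namespace: default
spec:
  minAvailable: 1                 # keep at least one replica running while a node is drained
  selector:
    matchLabels:
      app: my-app                 # placeholder label, must match your pods

During a worker roll, the drain respects this budget and waits until the displaced pods are rescheduled elsewhere before evicting further replicas.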
3.4 Deleting a Cluster
To delete a cluster, select the cluster you wish to terminate and click “Delete”. Please be aware that for this action an extra confirmation (typing in the name of the current cluster) is needed in order to be sure you really REALLY mean it. Once issued, a cluster deletion cannot be cancelled anymore. This process usually takes a couple of minutes.
Be aware that all of the cluster’s PersistentVolumeClaims (PVCs) are deleted during cluster deletion. Depending on the ReclaimPolicy, this may cause the associated volumes referenced by the claims to be deleted as well. If you set the ReclaimPolicy to Retain, the underlying volumes are not deleted, which allows re-using them in another cluster at a later point in time. For further information please refer to the official documentation and our volumes section.
Please be aware that, if the reclaim policy is set to Retain, you must delete the volumes yourself in the console once you no longer require them. Unused volumes are also subject to billing.
4. Volumes & Snapshots
4.1 Volumes
In the volumes view you can see the volumes of your clusters. It is not possible to create volumes through the web console as they are actually managed and created through the Kubernetes resources in your cluster.
Only those volumes that were created from our storage classes utilizing the csi.lightbitslabs.com provisioner are visible in the volumes view:
❯ k get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION
csi-lvm metal-stack.io/csi-lvm Delete WaitForFirstConsumer false
premium (default) csi.lightbitslabs.com Delete Immediate true
premium-encrypted csi.lightbitslabs.com Delete Immediate true
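For reference, a PersistentVolumeClaim that is provisioned through the default premium storage class (and therefore appears in the volumes view) could look like the following sketch; the name, namespace and size are placeholders:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume               # placeholder name
  namespace: default              # placeholder namespace
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: premium       # provisioned by csi.lightbitslabs.com
  resources:
    requests:
      storage: 20Gi               # placeholder size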
The volumes used inside your cluster may survive the lifespan of the cluster itself by utilizing ReclaimPolicy: Retain in your PersistentVolume (PV) resources. With this policy, you can also detach volumes and use them in other clusters inside the same project. However, a volume can never be attached to multiple clusters or worker nodes at the same time.
Please be aware that if the reclaim policy is set to Retain, you must delete the volumes yourself in the volumes view once you no longer require them. Unused volumes are also subject to billing.
4.2 Volume Encryption
With metalstack.cloud you can bring your own key to encrypt a PersistentVolume. The encryption is done client-side on the worker node and uses the Linux-kernel-native LUKS2 encryption method.
To use this feature you have to choose the premium-encrypted storage class in your PersistentVolumeClaim and create a secret in the namespace where the encrypted volume will be used:
---
apiVersion: v1
kind: Secret
metadata:
  name: storage-encryption-key
  namespace: default # please fill in the namespace where this volume is going to be used
stringData:
  host-encryption-passphrase: please-change-me # change to a safe password
type: Opaque
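A matching PersistentVolumeClaim sketch that uses the encrypted storage class could look like this; the claim must be created in the same namespace as the secret above, and the name and size are placeholders:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: encrypted-data            # placeholder name
  namespace: default              # must be the namespace that contains the encryption secret
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: premium-encrypted
  resources:
    requests:
      storage: 20Gi               # placeholder size, should not exceed 1TB (see hint below)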
Please be aware that in case of key loss it is not possible to decrypt your data anymore; the data will therefore be rendered useless.
Hint: The storage performance of encrypted volumes is lower than that of unencrypted volumes. Also, the size of an encrypted volume should not exceed 1TB, as otherwise provisioner pods or processes on the node may consume excessive memory, effectively preventing your volume from being mounted.
Creating, Managing and Deletion
For the operations mentioned, please refer to the official Kubernetes documentation.
4.3 Snapshots
Snapshots are shown in the snapshots tab within the volumes view.
The process and usage of snapshots are described in the official Kubernetes documentation.
5. IP Addresses
In the IP addresses view you can allocate internet IPs for your clusters. You can give your new IP a name and add an optional description. Click the Allocate button to acquire an IP. After the IP is created, you can see it in the IP addresses view. By default, IP addresses are ephemeral. Via the “Actions” menu you can open the IP’s details view, make the IP static, or delete the IP.
If your Kubernetes Service resource is of type LoadBalancer, metalstack.cloud automatically assigns an ephemeral internet IP address to your service. Ephemeral IP addresses are cleaned up automatically as soon as your service (or cluster, respectively) is deleted. Please be aware that it is not guaranteed that you will receive the same IP address again when the service is recreated.
In order to assign an IP address that was created through the IP addresses view, define the IP address in the loadBalancerIP field of your service resource.
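For example, a Service that binds a previously allocated IP address might look like the following sketch; the IP, names and labels are placeholders for values from your IP addresses view and your own workload:

apiVersion: v1
kind: Service
metadata:
  name: my-public-service          # placeholder name
  namespace: default
spec:
  type: LoadBalancer
  loadBalancerIP: 203.0.113.10     # placeholder, use an IP allocated in the IP addresses view
  selector:
    app: my-public-app             # placeholder label
  ports:
  - protocol: TCP
    port: 443
    targetPort: 8443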
If you would like to keep an IP address longer than the lifetime of the Service resource or the cluster, you need to turn it into a static IP address through the metalstack.cloud console.
Attention: Within the same project, an IP address can be used in several clusters and locations at the same time. The traffic is routed using ECMP load balancing through BGP. Ephemeral IPs are deleted automatically as soon as no service references the IP address anymore. An IP address is allocated exclusively for one project and cannot be used in other projects.
Public internet IP addresses are subject to billing.