Service Offering
metalstack.cloud lets you provision and manage Kubernetes clusters in an easy, developer-friendly manner and takes care of IP addresses and persistent storage. Because we provide Kubernetes in its vanilla flavor, you will find many references to the official Kubernetes documentation.
The platform is based on the open source project metal-stack.io to manage the underlying bare metal resources.
Our servers are located in an Equinix data center in Munich, Germany. The location is GDPR-compliant, ISO 27001 certified, has redundant power from renewable sources, a redundant internet uplink and offers HVAC measures.
1. Prerequisites
To use our platform, you need an existing GitHub, Microsoft or Google account and a valid email address. With an OAuth authentication flow you can then register and log in to our platform.
Furthermore, a valid credit card is required, as well as your company's VAT ID, if you want to use our service beyond the trial phase.
2. User and Project Management
On the platform you see two organizational elements: tenants and projects. Users and organizations are tenants. Each user can be a member or owner of multiple tenants. Each tenant can contain multiple projects and each project can contain many clusters. Users can be invited to other tenants or into single projects.
2.1 Roles
Every tenant or project membership of a user has a role.
The following roles exist:
- A tenant guest has access to at least one project. Only basic information of the tenant is accessible.
- A viewer can only display resources and can generate a kubeconfig.
- An editor can change and create resources.
- A project owner is allowed to invite new members.
- The tenant owner can access billing data and has access to the onboarding.
2.2 Tenant Invitations
In case you are an owner of a tenant like an organization or your user, you are able to invite users into your tenant. Open the web UI and click on the tenant image in the upper right corner. A dropdown will pop up; in there, select Manage Organizations and click on the tenant you want to invite a user to.
Now hit the Invite member button above the table containing all members. You are now able to select the role of the member to be invited.
Once you click on the Create link button, a link will be generated that will expire if not used. The link can only be used to invite a single tenant member. Share this link with the person you want to invite.
Every member will be able to see the tenant and all containing projects with the selected role.
2.3 Project Invitations
If you are the owner of the current project, you are able to invite other users into the current project. Make sure to select the desired tenant and project in the web UI.
Navigate to Project Members under SETTINGS in the navigation. Click the Invite new member button to access the correct form. You are now able to select the role of the member to be invited.
Once you click on the Create link button, a link will be generated that will expire if not used. The link can only be used to invite a single project member. Share this link with the person you want to invite.
Every member will be able to see the tenant of your project. If the newly added member was not already a member of that tenant, they gain the guest role within it.
2.4 Creating Organizations
Every user is able to create new organization tenants to group their projects.
Open the web UI and click on the tenant image in the upper right corner. A dropdown will pop up; in there, select Manage Organizations.
Now hit the Create new organization button, fill in all information and click on the Create Organization button to finish.
2.5 Creating Projects
To create a new project within a tenant, you either need to be an editor or an owner of the tenant. Make sure to select the desired tenant in the web UI.
Now open the project switch dropdown and select the Manage Projects entry. Hit the Create new project button, fill in all information and click Create project.
3. Managed Kubernetes
The base costs for Kubernetes clusters arise from the worker nodes and the Kubernetes control-plane.
3.1 Machine Types
The platform offers machine types with these hardware specifications:
Name | CPU | Memory | Storage | Price/min |
---|---|---|---|---|
n1-medium-x86 | 1x Intel Xeon D-2141I | 32GB RAM | 960GB NVMe | 0.01250€/min |
c1-medium-x86 | 1x Intel Xeon D-2141I | 128GB RAM | 960GB NVMe | 0.01917€/min |
c1-large-x86 | 2x Intel Xeon Silver 4214 (12 Core) | 192GB RAM | 960GB NVMe | 0.02916€/min |
3.2 Provisioning
Creating a Cluster
If you want to create a new cluster, you first have to navigate to the cluster overview by clicking on Kubernetes in the navigation. Then click on the Create Cluster button and select the version of Kubernetes that you require to run your cluster.
In the following form you can create your desired Kubernetes cluster.
First you have to specify a name, then you can choose a location, the server type and the number of nodes for the cluster. The name must be between 2 and 10 characters long, in lower case, and no special characters are allowed except '-'; whitespace is not supported. This restriction is necessary due to DNS constraints of your cluster's API server.
Attention: You should not rely on the IP address of your API server, as it is not guaranteed to stay the same forever. Use the DNS name inside your cluster's kubeconfig instead.
Lastly you can adjust the selected Kubernetes version and then create the cluster with the Submit button.
Clusters will be provisioned in the location of your choice. The cluster creation may take a couple of minutes to complete. You can follow the process in the cluster overview.
Clusters which are placed inside the same project are allowed to announce the same IP addresses for services of type load balancer, which allows ECMP load balancing through the BGP routing protocol for external services inside your clusters.
Clusters placed in different projects, on the other hand, cannot announce the same IP address. Please refer to the IP Addresses section for further details.
3.3 Kubernetes Cluster & Kubeconfig
After you have submitted the cluster, it is shown in the cluster overview. There you can see your new cluster, and on the left side there is an indication of whether the cluster is already running or still being created.
Under the “Actions” column you can open a menu to view the details of your cluster, generate the kubeconfig to access the cluster and delete the cluster if it is no longer needed.
Attention: Please be aware that the downloadable kubeconfig has cluster-admin privileges! To mitigate the impact of leaked credentials, it is required to define an expiration time for the kubeconfig. You can use the admin kubeconfig to define more fine-grained permissions with service accounts.
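For instance, instead of distributing the admin kubeconfig, you can create a dedicated service account with read-only access. A minimal sketch using the built-in view cluster role; the account name, namespace and token duration are examples, adjust them to your needs:

```
# create a dedicated service account (name and namespace are placeholders)
kubectl create serviceaccount ci-readonly -n default

# grant it read-only access via the built-in "view" cluster role
kubectl create clusterrolebinding ci-readonly-view \
  --clusterrole=view --serviceaccount=default:ci-readonly

# issue a short-lived token for this service account (Kubernetes v1.24+)
kubectl create token ci-readonly -n default --duration=24h
```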
In the cluster details view you can see all the available information about your cluster. It is also possible to update some of the cluster properties like the cluster version.
For the time being, we support only one server type per cluster (worker groups will follow soon). The costs of a cluster change in accordance with the chosen server type, and changing the server type causes a worker roll. metalstack.cloud offers auto updates and auto scaling for your clusters by default: Kubernetes patch versions as well as operating systems are updated automatically. Specify a maintenance time window during which these updates may be performed. The number of worker nodes is scaled within the range you provided, depending on your workload. For further information on interruption-free cluster operation, read here.
Workers
Choose the range of servers your cluster can utilize. For production use cases we recommend configuring at least two worker nodes so that your applications can be spread across multiple worker nodes. This is important for interruption-free operation during cluster maintenance operations.
A cluster can currently have at most 32 workers (the theoretical limit is 1024, which we can raise at a later point in time).
Our platform scales your cluster within the specified range if sufficient workers are available. Local storage depends on the lifecycle of a worker: it is ephemeral and will be wiped when the worker is rotated out of your cluster. You can change the number of guaranteed workers in the minimum setting. You only pay for the number of workers you actually use.
Control-Plane
The Kubernetes control-plane of every cluster is managed outside of your cluster under the responsibility of metalstack.cloud.
The control-plane needs to be paid for the whole lifetime of a cluster. It includes a highly-available, regularly backed-up Kubernetes control-plane (kube-apiserver, kube-controller-manager, kube-scheduler, etcd, …), a dedicated firewall, IDS events and private networking with an internet gateway.
Firewall
A firewall is always deployed along with your cluster as a physical server of the type n1-medium-x86. The firewall secures your cluster from external networks like the internet.
The firewall can be configured through the custom resource called ClusterwideNetworkPolicy (CWNP). With CWNPs you can control which egress and ingress traffic to external networks should be allowed. Ingress traffic for services of type load balancer is allowed automatically without the need to define an extra CWNP resource.
Packet drops that occur on the firewall are forwarded to a special pod called droptailer, which is deployed into the firewall namespace of your cluster.
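To inspect the reported packet drops, you can read the logs of that pod. A minimal sketch; the app=droptailer label is an assumption, so verify the actual pod name in the firewall namespace first:

```
# list the pods in the firewall namespace to locate the droptailer pod
kubectl get pods -n firewall

# follow the dropped-packet log (label assumed; use the concrete pod name if it differs)
kubectl logs -n firewall -l app=droptailer -f
```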
The state of your firewall can be checked through another custom resource called FirewallMonitor, which also resides in the firewall namespace, e.g.:

```
kubectl get fwmon -n firewall
NAME                                     MACHINE ID                             IMAGE                 SIZE            LAST EVENT    AGE
shoot--f8e67080bc--test-firewall-d2a72   77abee12-5c0d-4adf-91f2-e48ffa4f3449   firewall-ubuntu-3.0   n1-medium-x86   Phoned Home   68d
```
When being provisioned, a firewall automatically gets an internet IP for outgoing communication. Your outgoing cluster traffic is masqueraded behind this IP address (SNAT). When the firewall gets rolled, it is possible that the source IP of your outgoing cluster traffic changes. We will be able to provide static egress IP addresses in the near future. If you require this feature before it is GA, please contact us.
Example CWNP:

```
apiVersion: metal-stack.io/v1
kind: ClusterwideNetworkPolicy
metadata:
  namespace: firewall
  name: clusterwidenetworkpolicy-egress
spec:
  egress:
  - to:
    - cidr: 154.41.192.0/23
    - cidr: 185.164.161.0/24
    ports:
    - protocol: TCP
      port: 5432
```
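Ingress rules to external networks follow the same pattern. A hedged sketch of a CWNP that would allow inbound TCP traffic on port 443 from anywhere; the CIDR and port are placeholders, not part of the original example:

```
apiVersion: metal-stack.io/v1
kind: ClusterwideNetworkPolicy
metadata:
  namespace: firewall
  name: clusterwidenetworkpolicy-ingress
spec:
  ingress:
  - from:
    - cidr: 0.0.0.0/0    # placeholder source range, restrict this in real setups
    ports:
    - protocol: TCP
      port: 443          # placeholder port
```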
Full examples and documentation can be found at https://github.com/metal-stack/firewall-controller.
Interruption-Free Cluster Operation
To keep service interruptions as small as possible during cluster upgrades or within maintenance time windows, we recommend reading the following section.
Kubelet Restart
Both during a maintenance time window and during a Kubernetes patch version upgrade, the kubelet service on the worker nodes gets restarted (jittered within a 5-minute time window).
Hence, we advise you to verify that your workload tolerates a restart of the kubelet service. The restart of the kubelet service can be manually tested using the following node annotation:

```
kubectl annotate node <node-name> worker.gardener.cloud/restart-systemd-services=kubelet
```
Additionally, when a kubelet gets restarted, Kubernetes changes the status of the worker node to NotReady for a couple of seconds (see here). Effectively, this leads to the temporary withdrawal of external IP announcements for this worker node, and active network connections to this node are interrupted. To reduce the impact of the restart, our recommendation is to spread services that receive external network traffic onto more than a single node in the cluster, which ensures your external service stays reachable during this operation.
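One way to achieve this spreading is a topology spread constraint on the hostname. A minimal sketch for a hypothetical deployment; the names, replica count and image are placeholders:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web              # placeholder name
spec:
  replicas: 2            # at least two replicas so one can keep serving traffic
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname   # spread pods across worker nodes
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web
      containers:
      - name: web
        image: nginx:1.27    # placeholder image
```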
MetalLB Speaker Restart
In order to offer services of type load balancer in our clusters, we manage an installation of MetalLB in the metallb-system namespace of your cluster. During the maintenance time window there is a chance that we update the resources of the MetalLB deployment. This operation can potentially trigger a rolling update of the metallb-speaker daemon set.
When a speaker shuts down, the external IP announcements for this worker node are withdrawn until the pod comes back up running.
In general, this is not a huge deal if the cluster has more than one worker node and your service of type load balancer is deployed with externalTrafficPolicy: Cluster. However, we recommend spreading services that receive external network traffic onto more than a single node in the cluster in order to keep potential service interruptions as small as possible.
You can simply test this behavior by running a restart of the daemon set manually:
```
kubectl rollout restart -n metallb-system ds speaker
```
Worker Node Rolls
There can be multiple reasons that cause a roll of the worker nodes of your cluster:
- Major and minor upgrades of the Kubernetes version
- Significant changes to the worker group (e.g. updating the worker’s OS image)
When this happens, a new worker node is added to your cluster. Then, an old worker node gets drained (StatefulSets are drained sequentially) and removed. This procedure repeats until all worker nodes of your cluster have been updated.
To make this procedure as smooth as possible, we recommend taking the following actions:
- Refrain from using the local storage on the worker nodes and instead use cloud storage (see volumes section)
  - Local storage cannot be restored once a worker node was removed from the cluster!
  - As local storage volumes cannot be transferred to different worker nodes, they might cause issues with StatefulSets during worker rolls.
- Spread your workloads across the cluster such that you can tolerate draining a worker node
- Configure PodDisruptionBudgets (see the sketch below)
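A minimal sketch of a PodDisruptionBudget that keeps at least one replica of a hypothetical app running while a node is drained; the name and selector are placeholders:

```
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb          # placeholder name
spec:
  minAvailable: 1        # never evict the last remaining replica during a drain
  selector:
    matchLabels:
      app: web           # placeholder selector, must match your workload's labels
```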
3.4 Deleting a Cluster
To delete a cluster, select the cluster you wish to terminate and click “Delete”. Please be aware that this action requires an extra confirmation (typing in the name of the cluster) to make sure you really mean it. Once issued, a cluster deletion cannot be cancelled. This process usually takes a couple of minutes.
Be aware that all of the cluster's PersistentVolumeClaims (PVCs) are deleted during cluster deletion. Depending on the ReclaimPolicy, this might cause the associated volumes referenced by the claims to be deleted as well. If you set the ReclaimPolicy to Retain, the volumes are not deleted, which allows re-using them in another cluster at a later point in time. For further information please refer to the official documentation and our volumes section.
Please be aware that, if the reclaim policy is set to Retain, you must delete the volumes yourself in the console after usage if you do not require them anymore. Unused volumes are also subject to your bill.
Attention: As clusters are deleted gracefully, deployed mutating webhooks may result in a delay in the deletion of the cluster. This is because controllers may be deleted before the deletion of resources intercepted by these webhooks.
4. Volumes & Snapshots
4.1 Volumes
In the volumes view you can see the volumes of your clusters. It is not possible to create volumes through the web console as they are actually managed and created through the Kubernetes resources in your cluster.
Only those volumes that were created from our storage classes utilizing the csi.lightbitslabs.com provisioner are visible in the volumes view:
```
kubectl get sc
NAME                PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION
csi-lvm             metal-stack.io/csi-lvm   Delete          WaitForFirstConsumer   false
premium (default)   csi.lightbitslabs.com    Delete          Immediate              true
premium-encrypted   csi.lightbitslabs.com    Delete          Immediate              true
```
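For example, a volume on the default premium storage class is requested with a regular PersistentVolumeClaim. A minimal sketch; the name, namespace and size are placeholders:

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data             # placeholder name
  namespace: default     # placeholder namespace
spec:
  storageClassName: premium
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi      # placeholder size
```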
The volumes used inside your cluster may survive the lifespan of the cluster itself by utilizing ReclaimPolicy: Retain in your PersistentVolume (PV) resources. With this policy, you can also detach volumes and use them in other clusters inside the same project. However, a volume can never be attached to multiple clusters or worker nodes at the same time.
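The reclaim policy of an existing PersistentVolume can be switched to Retain with a standard kubectl patch, for example:

```
# replace <pv-name> with the name of the PersistentVolume backing your claim
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```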
Please be aware that if the reclaim policy is set to Retain, you must delete the volumes yourself in the volumes view after usage if you do not require them anymore. Unused volumes are also subject to billing.
4.2 Volume Encryption
With metalstack.cloud you can bring your own key to encrypt a PersistentVolume. The encryption is done client-side on the worker node and uses the Linux-kernel-native LUKS2 encryption method.
To use this feature you have to choose the premium-encrypted storage class in your PersistentVolumeClaim and create a secret in the namespace where the encrypted volume will be used:
```
---
apiVersion: v1
kind: Secret
metadata:
  name: storage-encryption-key
  namespace: default # please fill in the namespace where this volume is going to be used
stringData:
  host-encryption-passphrase: please-change-me # change to a safe password
type: Opaque
```
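The encrypted volume itself is then requested through a PersistentVolumeClaim that references the premium-encrypted storage class in the same namespace as the secret. A minimal sketch; the name and size are placeholders:

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: encrypted-data   # placeholder name
  namespace: default     # must be the namespace that contains the encryption secret
spec:
  storageClassName: premium-encrypted
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi      # placeholder size, keep it below 1TB (see the hint below)
```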
Please be aware that in the case of key loss it is not possible to decrypt your data afterwards; the data is therefore rendered useless.
Hint: The performance of encrypted volumes is lower than the performance of unencrypted volumes. Also, the size of an encrypted volume should not exceed 1TB, as otherwise provisioner pods or processes on the node may exceed their memory limits, effectively preventing your volume from being mounted.
Creating, Managing and Deleting
For the operations mentioned, please refer to the official Kubernetes documentation.
4.3 Snapshots
Snapshots are shown in the snapshots tab within the volumes view.
The process and usage of snapshots are described in the Kubernetes documentation.
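As a rough sketch, a snapshot of an existing claim is requested with a VolumeSnapshot resource. The snapshot class name shown here is an assumption; check the classes available in your cluster with kubectl get volumesnapshotclass:

```
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-snapshot            # placeholder name
  namespace: default             # placeholder namespace
spec:
  volumeSnapshotClassName: csi-lightbits   # assumed class name, verify in your cluster
  source:
    persistentVolumeClaimName: data        # the PVC to snapshot
```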
5. IP Addresses
In the IP addresses view you can allocate internet IPs for your clusters. You can give your new IP a name and add an optional description. Click the Allocate button to acquire an IP. After the IP is created, you can see it in the IP addresses view. By default, IP addresses are ephemeral. Via the “Actions” menu you can open the IP's details view, make the IP static and delete the IP.
If your Kubernetes Service resource is of type LoadBalancer, metalstack.cloud automatically assigns an ephemeral internet IP address to your service. Ephemeral IP addresses are cleaned up automatically as soon as your service (or cluster, respectively) is deleted. Please be aware that it is not guaranteed that you receive the same IP address again when the service is recreated.
In order to assign an IP address that was created through the IP addresses view, define the IP address in the loadBalancerIP field of your service resource, as shown below.
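A minimal sketch of such a service; the IP is a documentation placeholder (use the address you allocated), and the name, selector and ports are examples:

```
apiVersion: v1
kind: Service
metadata:
  name: web              # placeholder name
spec:
  type: LoadBalancer
  loadBalancerIP: 203.0.113.10   # the IP allocated in the IP addresses view (placeholder)
  selector:
    app: web             # placeholder selector
  ports:
  - protocol: TCP
    port: 443
    targetPort: 8443
```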
If you would like to keep an IP address longer than the lifetime of the Service resource or the cluster, you need to turn it into a static IP address through the metalstack.cloud console.
Attention: Within the same project, an IP address can be used in several clusters and locations at the same time. The traffic is routed using ECMP load balancing through BGP. Ephemeral IPs are deleted automatically as soon as no service references the IP address anymore. An IP address is allocated exclusively for one project and cannot be used in other projects.
Public internet IP addresses are subject to billing.
6. API Access
6.1 Access Token
It is possible to generate tokens from the UI. Navigate to Access Tokens under SETTINGS in the navigation. Click the Generate new token button to access the correct form. Please make sure you are in the correct project that you want the token for.
Now you can provide a short description and set the expiration of the token in days. After that you can control the scope of the token and which methods it should be allowed to use.
After clicking the Generate token button, copy the token and store it somewhere safe. You will not be able to see it again. If you lose a token, you can always delete it and generate a new one.
When leaving the token form you should see the list of your tokens for this project. You can check the details of your tokens or delete them.
6.2 API
Here is our API documentation with examples. The endpoint of the API is https://api.metalstack.cloud. You can use this guide to learn how to access the API with your own tools.
Sometimes you need the Project ID to access some parts of the API, for example to create a cluster. You can either extract it from your token programmatically or navigate to the Dashboard, where you will find a button beside the heading that lets you copy the Project ID.
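As a rough, tool-agnostic sketch, the API can be called over HTTPS with the access token sent as a bearer token; bearer authentication, the path and the request body below are assumptions and placeholders only, so please take the real service and method names from the API documentation:

```
# <service>/<method> and the body are placeholders, see the API documentation for real endpoints
curl https://api.metalstack.cloud/<service>/<method> \
  -H "Authorization: Bearer $METALSTACK_CLOUD_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"project": "<project-id>"}'
```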
6.3 Terraform Provider
Here is a guide on how to use our Terraform provider, which is also compatible with OpenTofu.