physicles 10 months ago

We’ve basically solved this where I work, with these steps:

- Each environment gets its own directory. We use kustomize to share config between environments.

- direnv “sets” the current context when you cd under a cluster’s directory (it sets an environment variable that a kubectl alias uses). Nobody calls kubectl directly; that wouldn’t work anyway, because we’ve banned the current context from the yaml files. You switch clusters by changing to that cluster’s directory.

- Most of the time, the only command you run is ‘make’, which just does kubectl apply -k (or whatever). 100% of cluster config is checked into git (with git-crypt for secrets), so the worst that can happen is that you reapply something that’s already there.

I’ve also colored the command prompt according to the current cluster.

But anyway it’s essentially impossible to apply a config change to the wrong cluster. I haven’t worried about this in years.
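
A minimal sketch of the direnv piece (the variable name, paths, and wrapper are illustrative, not our exact setup):

    # .envrc in each cluster's directory -- direnv loads/unloads it on cd
    export KUBE_CTX=prod-us-east

    # wrapper in ~/.bashrc -- nobody calls the real kubectl directly
    kubectl() {
        if [ -z "${KUBE_CTX:-}" ]; then
            echo "KUBE_CTX not set; cd into a cluster directory first" >&2
            return 1
        fi
        command kubectl --context "$KUBE_CTX" "$@"
    }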

  • noja 10 months ago

    what if you forget to run direnv the second time?

    • Gooblebrai 10 months ago

      You don't have to run anything. The point of direnv is that it loads/unloads automatically when you enter/leave a directory.

    • ijustlovemath 10 months ago

      I imagine they use an alias or bash function for cd, which uses direnv under the hood

      • physicles 10 months ago

        Yep. Direnv inserts itself into your prompt function.
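
        For reference, the standard hook line (bash shown; zsh is analogous):

            # at the end of ~/.bashrc -- direnv wires itself into PROMPT_COMMAND
            eval "$(direnv hook bash)"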

ed_mercer 10 months ago

I got burned by this recently and came to the conclusion that the concept of a current context is evil. Now I always specify --context when running kubectl commands.

  • lukaslalinsky 10 months ago

    I also got burned by this, pretty badly, and ever since it happened I don't even have a default kubeconfig; I have to specify it for every single kubectl run.
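
    Every run then looks something like this (path illustrative):

        kubectl --kubeconfig ~/.kube/clusters/prod.yaml get pods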

    • cryptonector 10 months ago

      I never even set up a default context. I sussed out that problem from the get-go and always use `--context`. But that's not really enough if you use shell history, or if your cluster names differ by a few letters that are easy to typo.

    • vdm 10 months ago

      This is the way. Same for awscli profiles.

  • Telemaco019 10 months ago

    Got burned too, we've all been there I guess :)

    I also tried to avoid current context initially, but it just slowed me down. Switching between clusters is so much easier with the current context and kubectx.

    That’s why I built kubesafe: this way I can keep using the current context without worrying about screwing up. If I accidentally target the wrong context, at least I get a warning before executing the command.

    The only hassle now is remembering to add new prod contexts to the safe list, but that’s about to change with regex support coming soon :)

  • cyberpunk 10 months ago

    I actually go a step further and keep multiple kubeconfigs and have a load of shell aliases for managing them.

    Active one is in $PS1 somewhere.
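
    Something along these lines (cluster names and paths illustrative):

        # one kubeconfig per cluster, one alias per cluster
        alias kdev='kubectl --kubeconfig ~/.kube/dev.yaml'
        alias kprod='kubectl --kubeconfig ~/.kube/prod.yaml'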

  • remram 10 months ago

    I found that some cloud providers and other tools like minikube don't play nice with other clusters in the same config. I now use a tiny shell function that selects KUBECONFIG out of a folder, and adds the current cluster's name to my prompt.
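
    A minimal sketch of such a function, assuming one config file per cluster in a folder (names illustrative):

        # select a kubeconfig from ~/.kube/configs/ and prefix the prompt
        kc() {
            export KUBECONFIG="$HOME/.kube/configs/$1.yaml"
            PS1="($1) ${PS1#(*) }"   # drop any previous "(cluster) " prefix
        }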

  • physicles 10 months ago

    Check out direnv, and use a shell alias for kubectl. And yeah, current context is evil.

    • 8organicbits 10 months ago

      This is a good suggestion, but keep in mind that you can accidentally run a command in the wrong directory. I've certainly done that too, with painful results.

      • physicles 10 months ago

        What kind of command was it?

        If I’m doing something more involved, I’ve got a k9s window open in another pane, making sure the command is having the intended effect.

        I guess the riskiest commands would be things like deleting persistent volumes. But our storage class doesn’t automatically clean up the disk in the cloud provider, so we could recover from that too.

    • remram 10 months ago

      What if you have dev and prod clusters/namespaces for the same project (and thus directory)?

      • physicles 10 months ago

        We’ve avoided that situation with kustomize. Common resources go into a ‘bases’ directory, and if two clusters have identical resources, then they both have their own directories and reference all the base resources from there.

        In practice, there are always slight differences in cluster config between test and prod (different S3 buckets, for example), so this is needed anyway.
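
        Roughly this layout (directory names illustrative):

            # shared resources in bases/, one overlay directory per cluster
            deploy/
            ├── bases/
            │   ├── deployment.yaml
            │   └── kustomization.yaml
            ├── test/
            │   └── kustomization.yaml   # resources: ["../bases"] + test-only patches
            └── prod/
                └── kustomization.yaml   # resources: ["../bases"] + prod-only patches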

  • rad_gruchalski 10 months ago

    I just print the current context in my shell, next to the git branch.

adolph 10 months ago

Don't keep anything in the default .kube/config. Set the KUBECONFIG env var instead. Keep every cluster in a separate config. Set an indicator in PS1. Helm et al follow the env var. Roast my zsh:

  k8x() {
    # no param: clear KUBECONFIG (if set) and strip the prompt prefix
    if [ -z "$1" ]; then
      if [ -z "${KUBECONFIG+x}" ]; then
        echo "Need param of a k8s environment";
        return 1
      else
        echo "Removing KUBECONFIG variable";
        PS1="$(echo "$PS1" | sed -e 's;^([^)]*) ;;')";
        unset KUBECONFIG;
        return 0
      fi
    fi;
    # exit if no config file exists for the param
    local env="$1";
    local cfgPath="$HOME/.kube/config.${env}";
    if [ ! -f "$cfgPath" ]; then
      echo "No config at $cfgPath";
      return 1
    fi;
    # swap any existing "(env) " prompt prefix for the new one
    PS1="$(echo "$PS1" | sed -e 's;^([^)]*) ;;' -e 's;^;('"$env"') ;')";
    export KUBECONFIG="$cfgPath";
  }

cduzz 10 months ago

In the early 1990s I ran a math department's 4 servers and 50 workstations and (with a few exceptions) only ever did administrative actions through scripts.

I've worked in lots of places since and the world's matured from scripts and rsync to ansible and puppet and similar.

Have we regressed to the point where we've turned big clusters of systems back into "oops, I ran a command as superuser in the wrong directory"?

renewiltord 10 months ago

Someone here showed me this cool technique with `fzf`:

    #!/usr/bin/env bash

    set -e

    # list context names and pick one; the preview switches context to list its namespaces
    context=$(kubectl config get-contexts -o name | fzf --preview 'kubectl config use-context {} && kubectl get namespaces')
    kubectl config use-context "$context"
You get a two-pane window with the context on the left and the namespaces on the right. That's all I need to find what I'm looking at. It's destructive, though.

evnix 10 months ago

Have been burnt by this; I have to deal with close to 8 clusters and it is very easy to make a mistake.

Would highly recommend kubie, it allows you to switch and shows you the name of the cluster in the prompt. It's probably a more visual way of solving the same problem.

https://github.com/sbstp/kubie

  • terinjokes 10 months ago

    It also solves a problem many of the other solutions here miss: the prompt is printed once, so it can easily show stale information if you change the current context in another shell.

    With kubie entering a context copies the configuration to a new file and sets KUBECONFIG appropriately, so it is not affected by changes in another shell.
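
    The underlying trick is roughly this (a sketch of the idea, not kubie's actual code):

        # copy the config, point KUBECONFIG at the copy, spawn a subshell;
        # context switches in other shells can't affect this one
        tmpcfg=$(mktemp)
        cp ~/.kube/config "$tmpcfg"
        kubectl --kubeconfig "$tmpcfg" config use-context my-context
        KUBECONFIG="$tmpcfg" bash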

  • Yasuraka 10 months ago

    I do this with kubectx to switch, and kube-ps1 with ohmyzsh to display cluster/namespace in my usual prompt.

  • glitchcrab 10 months ago

    'close to 8 clusters' is a strange turn of phrase. So you manage 6 or 7?

    • elygre 10 months ago

      It might also be a trick — maybe it’s nine?

    • evnix 10 months ago

      7 during good times. 8 when things go south.

  • glitchcrab 10 months ago

    I toyed with the idea of having a kubeconfig per cluster some time ago, but I work with tens of clusters on a daily basis (often with multiple terminals targeting the same cluster) and having to auth every single time would have been too much of a pain.

    Instead I went with kubeswitch, which still gives you a different kubeconfig per terminal but allows you to re-use existing sessions.

    https://github.com/danielfoehrKn/kubeswitch

    • Telemaco019 10 months ago

      Cool project, I didn't know it. I love the idea, thanks for sharing it!

    • arccy 10 months ago

      Whether a reauth is necessary depends on your k8s setup. A lot of the cloud ones only configure kubeconfig to call an external command, which can share auth state between terminals.
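
      For instance, an EKS kubeconfig written by the aws CLI authenticates through an exec plugin rather than a baked-in token, so cached credentials are shared across terminals (cluster name illustrative):

          # writes a kubeconfig user entry that is roughly:
          #   exec:
          #     command: aws
          #     args: [eks, get-token, --cluster-name, my-cluster]
          aws eks update-kubeconfig --name my-cluster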

      • glitchcrab 10 months ago

        Sure, but I'm switching between AWS, Azure and vSphere clusters regularly and they all behave differently.

decasia 10 months ago

I like to print the k8s context and current namespace in the shell prompt.

It's still possible I could mess something up with kubectl, but it provides constant reminders of what I'm working with.
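
Without plugins, a bash sketch of this looks something like:

    # single quotes so the substitution re-runs at every prompt
    PS1='[$(kubectl config current-context 2>/dev/null)] \$ '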

  • Telemaco019 10 months ago

    I also have it in my zsh config, but that didn't stop me from screwing up in the past. Having an active confirmation prompt for potentially risky commands is what works best for me.

millerm 10 months ago

Hah! I accidentally deleted a production deployment the other day, because I thought I was mucking with my local Colima Kubernetes cluster. I forgot that I had my context set to one of my AWS clusters. I had been meaning to write a command to wrap helm and kubectl to prompt me with info before committing, so I will have to take a peek at this.

thewisenerd 10 months ago

haha

I added the following to my bashrc a few days ago for similar reasons; it forces me to be explicit about the cluster. Now I mess up the wrong namespace instead :)

    # new shells start with no current context at all
    if [[ -e "/opt/homebrew/bin/kubectl" ]]; then
        /opt/homebrew/bin/kubectl config unset current-context >/dev/null
    fi

JohnMakin 10 months ago

I am not trying to shit on this, sorry - but can't you achieve the same thing with rudimentary automation or, barring that, rudimentary scripting? This seems to just be adding y/n prompts to certain contexts. How's that different from a bash wrapper script that does something like this?

    # wrap the command: prompt before running anything against a prod context
    context=$(grep "current-context:" ~/.kube/config | grep "prod")

    if [[ -z "${context}" ]]; then
        kubectl "$@"            # not prod: do the command
    else
        read -r -p "Context looks like prod -- continue? [y/N] " reply
        [[ $reply == [Yy] ]] && kubectl "$@"
    fi

Am I missing something?

  • Telemaco019 10 months ago

    Thanks for the feedback John! You're right, that's pretty much it :)

    I developed kubesafe because (1) I was tired of tinkering with shell aliases and scripts (especially when I wanted to define protected commands) and (2) I needed something that worked smoothly with all Kubernetes tools like kubectl, helm, kubecolor, etc.

    Kubesafe is just a convenient way to manage protected commands and contexts. Nothing too fancy!

    Btw - I also found a kubectl plugin written in Bash that’s similar to what you mentioned, in case you're interested: https://github.com/jordanwilson230/kubectl-plugins/blob/krew...

    • JohnMakin 10 months ago

      thanks for the explanation, I like the idea

      • Telemaco019 10 months ago

        You're welcome! And thanks again for the feedback!

jasonhansel 10 months ago

Can you use this with kubecolor? https://github.com/kubecolor/kubecolor

Incidentally: I have no idea why something like kubecolor isn't built into kubectl itself.

  • Telemaco019 10 months ago

    Absolutely! kubesafe is simply a wrapper, so you can use it with any Kubernetes tool by passing the tool as the first argument to kubesafe.

    Example with kubecolor:

    `kubesafe kubecolor get pods --all-namespaces`

acedTrex 10 months ago

I handle this by never keeping production kubeconfigs on my local device. I pull them down on demand.

robertlagrant 10 months ago

This seems good, but can it also be done via ACLs in vanilla Kubernetes?

  • Telemaco019 10 months ago

    Thanks Robert! Yes, you can achieve this with RBAC in vanilla Kubernetes, but it requires setting up multiple Roles and contexts. Even then, you might switch to a higher-permission Role and accidentally run a risky command, thinking you're in a different cluster or using a low-permission user.

    Kubesafe is just an extra safety net to prevent those kinds of accidents :)
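
    If you do go the RBAC route, one common pattern is binding day-to-day users to the built-in read-only ClusterRole, so anything destructive requires a deliberate escalation (user name illustrative):

        # bind a user to the built-in read-only "view" ClusterRole
        kubectl create clusterrolebinding me-view --clusterrole=view --user=me@example.com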

    • robertlagrant 10 months ago

      That makes sense - thanks for the reply.

coding123 10 months ago

Another option: just give prod's creds to CI only.

  • Telemaco019 10 months ago

    I think it’s a tradeoff between safety and speed. Having only the CI/CD with production access can significantly slow you down, especially in the early stages when you’re focused on the product and still building out your tooling/infrastructure.