Integration Options with RBAC and Google-Hosted IAM Access Tokens
Cheaper CPU costs than AWS (without contractual discounts)
Project Structure vs. Account Structure
Container Registry
Google Support
Issues With GKE
Default Clusters are Not Production Ready
Networking complexities to bridge cloud and on-prem networks
Familiarity with GCP is still not as widespread as with AWS or Azure
SSL is not as easy as ACM, but what is?
How to make auth easy for users and keep things secure?
Best Practices For GKE
Use Shared VPC networks for private clusters
Create small subnets for each cluster. Default firewall rules do not allow east-west traffic between clusters in different subnets inside the same network.
Turn off plugins that are security risks: the Dashboard, legacy auth, basic auth (see the hardening sketch after this list).
Manage the cluster and node pools separately.
Use the GCP Ingress Controller as much as possible and learn about its features.
Create regional clusters over zonal clusters, unless you know that your workloads can only run in specific zones in a region.
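A minimal hardening sketch for the items above, assuming hypothetical names (prod-cluster, us-central1, the master CIDR) and gcloud flags as of this writing:

    # Create a private, regional cluster with basic auth and legacy ABAC disabled
    gcloud container clusters create prod-cluster \
      --region us-central1 \
      --enable-ip-alias \
      --enable-private-nodes \
      --master-ipv4-cidr 172.16.0.16/28 \
      --no-enable-basic-auth \
      --no-enable-legacy-authorization

    # Disable the Dashboard add-on on an existing cluster
    gcloud container clusters update prod-cluster --region us-central1 \
      --update-addons KubernetesDashboard=DISABLED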
Shared VPC: a very useful tool that allows total control over networking and can be shared out to individual projects
Access can be granted to individual users for individual subnets.
Requires Cloud NAT to be installed in the network; only one NAT per network is needed. There are two options for NAT: auto-scaled or self-managed. We opted for self-managed, so we know which IPs our traffic is coming from. The downside is we have to add IPs if our egress gets too high and we start dropping packets (~64k connections per IP). See the sketch after this list.
One subnet per GKE cluster. This does mean you should use as few subnets as possible, so you don't run out of IP space.
Use VPC Service Controls
Enable Flow Logs per subnet and network
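A sketch of a self-managed Cloud NAT setup; the address, router, and network names are hypothetical:

    # Reserve a static IP so we always know our egress addresses
    gcloud compute addresses create nat-ip-1 --region us-central1

    # Cloud NAT hangs off a Cloud Router in the network
    gcloud compute routers create nat-router \
      --network shared-net --region us-central1

    # Self-managed: NAT only uses the IPs we hand it, so we add more
    # addresses to the pool if egress grows and packets start dropping
    gcloud compute routers nats create nat-config \
      --router nat-router --router-region us-central1 \
      --nat-external-ip-pool nat-ip-1 \
      --nat-all-subnet-ip-ranges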
Master API Security
Lock down the master API to trusted networks, or don't let any networks hit the public IP address of the master and communicate only with the private IP address.
Allows for a zero-trust methodology on the k8s APIs
Perform master credential rotation; it recreates the master IP at the same time (commands below)
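For example, with a hypothetical cluster name and trusted CIDR:

    # Only allow trusted networks to reach the master endpoint
    gcloud container clusters update prod-cluster --region us-central1 \
      --enable-master-authorized-networks \
      --master-authorized-networks 203.0.113.0/24

    # Rotate master credentials; this also recreates the master IP
    gcloud container clusters update prod-cluster --region us-central1 \
      --start-credential-rotation
    # ...after nodes are recreated and kubeconfigs are updated:
    gcloud container clusters update prod-cluster --region us-central1 \
      --complete-credential-rotation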
K8s Internal Security
Use firewalls between each GKE cluster in the same network. Do not allow GKE clusters to talk to other GKE clusters in your network at layer 4; make them go through layer 7 to talk to other applications.
Use Network Policies (see the sketch after this list)
Use Pod Security Policies
Using Istio's automatic sidecar injector on all pods is another option to secure east-west traffic at a higher layer
For node security, either create a new service account with limited permissions for each cluster to use, or set node IAM permissions to least privilege.
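A minimal default-deny NetworkPolicy sketch (the team-a namespace is hypothetical; network policy enforcement must be enabled on the cluster, e.g. via --enable-network-policy):

    # Deny all ingress to pods in team-a; allow traffic back in explicitly
    kubectl apply -f - <<EOF
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-ingress
      namespace: team-a
    spec:
      podSelector: {}   # selects every pod in the namespace
      policyTypes:
      - Ingress         # no ingress rules listed, so all ingress is denied
    EOF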
You can take it a step further and create applications that return access_tokens to users via an API or local clients, using a Google-hosted OIDC application
Map users to namespaces via RBAC. Automate this step.
Create least-privileged RoleBindings that allow users to do only what is necessary (sketch after this list)
You get all audit logs by default
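A sketch of one such RoleBinding, with hypothetical user and namespace names; it binds the built-in edit ClusterRole for brevity, but a custom least-privileged Role is better wherever the built-ins allow too much:

    # Grant alice@example.com edit rights, scoped to the team-a namespace only
    kubectl apply -f - <<EOF
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: team-a-edit
      namespace: team-a
    subjects:
    - kind: User
      name: alice@example.com   # Google account email in GKE
      apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: edit                # built-in role; swap in a custom Role for least privilege
      apiGroup: rbac.authorization.k8s.io
    EOF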
Node Pools
Create Clusters
Delete default node pool
Create node pools for specific uses (example after this list)
Use preemptible nodes for nonprod
Use Auto Upgrades
Use Auto Node Repair
Use AutoScaling
Opens up many options for running different types of workloads in the same cluster, via taints and tolerations and other k8s pod-targeting systems
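Putting the node pool items together, a sketch with hypothetical names and sizes:

    # Drop the default pool and replace it with purpose-built pools
    gcloud container node-pools delete default-pool \
      --cluster prod-cluster --region us-central1

    # Preemptible, auto-upgrading, auto-repairing, autoscaled batch pool,
    # tainted so only workloads that tolerate it land here
    gcloud container node-pools create batch-pool \
      --cluster prod-cluster --region us-central1 \
      --preemptible \
      --enable-autoupgrade --enable-autorepair \
      --enable-autoscaling --min-nodes 0 --max-nodes 10 \
      --node-taints dedicated=batch:NoSchedule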
GCP Ingress Controller
Creates Layer 7 LoadBalancers
Create a BackendConfig for each Service (see the sketch after this list)
Use Cloud Armor and Cloud IAP for Private Applications
Use Cloud CDN if needed
Set Timeouts, Session Affinity, and Connection Draining when applicable
Automatic Stackdriver Metrics
Automatic Stackdriver Logs on the LB
You can still create internal or external network load balancers via k8s Service objects if that is a requirement; you just miss all the goodies the layer 7 LB brings.
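A BackendConfig sketch using the beta CRD field names as of this writing; resource names and values are placeholders:

    kubectl apply -f - <<EOF
    apiVersion: cloud.google.com/v1beta1
    kind: BackendConfig
    metadata:
      name: web-backendconfig
    spec:
      timeoutSec: 60              # backend response timeout
      connectionDraining:
        drainingTimeoutSec: 30
      sessionAffinity:
        affinityType: "GENERATED_COOKIE"
      iap:
        enabled: true             # Cloud IAP for private applications
        oauthclientCredentials:
          secretName: iap-oauth-secret
    EOF

    # Tie the Service to the BackendConfig via annotation
    kubectl annotate service web \
      beta.cloud.google.com/backend-config='{"default": "web-backendconfig"}'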
Regional/Zonal Clusters
Regional clusters automatically provision nodes in each zone in the region. This gives HA by default.
Zonal clusters can do the same as regional clusters; you just have to specify that you want to run your node pools in all zones. The benefit of using zonal clusters is the ability to choose specific zones. Why would you not want to run in all zones in a region? Because not all node types are available in all zones. (See the commands below.)
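For example, with placeholder names and locations:

    # Regional: control plane and node pools replicated across the region's zones
    gcloud container clusters create ha-cluster --region us-central1

    # Zonal, but spreading node pools across specific zones only
    # (useful when a required node type isn't available in every zone)
    gcloud container clusters create gpu-cluster \
      --zone us-central1-a \
      --node-locations us-central1-a,us-central1-b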
Project Structure / Container Registry
Create Projects Per Logical Group
Allows more open IAM access for each group, without concern of a team muddling up another team's resources
Create registries per team that can only be accessed by them and their clusters (sketch below)
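GCR stores a project's images in a GCS bucket (artifacts.PROJECT_ID.appspot.com), so per-team access can be granted on that bucket. A sketch with hypothetical group, project, and service account names:

    # Let the team and its clusters' node service account pull images
    gsutil iam ch group:team-a@example.com:objectViewer \
      gs://artifacts.team-a-project.appspot.com
    gsutil iam ch serviceAccount:gke-nodes@team-a-project.iam.gserviceaccount.com:objectViewer \
      gs://artifacts.team-a-project.appspot.com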
GKE Beta/Alpha Features
Binary Authorization (beta)
Application-layer Secrets Encryption (beta)
Node auto-provisioning (beta)
GKE usage metering (beta)
Vertical Pod Autoscaling (beta)
Cloud Run on GKE (beta)
istio-on-gke (beta)
Managed Certs (beta), requires 1.12.6 or higher
Alpha clusters to test the latest k8s greatness
GKE Sandbox
GKE Gotchas
Do not change the NodePort on Services. Make sure that when you apply a Service YAML, you do not update the nodePort the Service is already using, or you will have downtime.
You cannot use a cluster master, node, Pod, or Service IP range that overlaps with 172.17.0.0/16.
While GKE can detect overlap with the cluster master address block, it cannot detect overlap within a shared VPC network.
Make sure your firewall rules allow GCP health checks (example below)
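GCP health checks originate from the documented ranges 130.211.0.0/22 and 35.191.0.0/16; a rule like the following (the network name and target tag are placeholders) keeps them flowing:

    # Allow Google's health checkers to reach the nodes behind your LBs
    gcloud compute firewall-rules create allow-gcp-health-checks \
      --network shared-net \
      --direction INGRESS \
      --action ALLOW \
      --rules tcp \
      --source-ranges 130.211.0.0/22,35.191.0.0/16 \
      --target-tags gke-node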