for Patek.cz
but that's not what this is about ^^
Works as an SRE at F5; previously at Volterra.io, Mirantis, IBM, ...
n(vi)m lover. developer. geek. quad fpv pilot.
all with the passion for the edge thing
On Twitter as @epcim.
development and operation of modern applications and infrastructure
Buzzword
DevOps
"SRE" is actually a job role; it is also defined as software engineering principles and practices applied to infrastructure and operations
pattern
Gene Kim, co-author of “The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win.”
...to be able to deploy and operate both applications and infrastructure as code
A cultural and professional movement, focused on how we build and operate high velocity organizations, born from the experiences of its practitioners.
Application
Logic
Application
Container
Orchestration
Which tools we use
Demo
What managing applications distributed in the cloud looks like
A tool for distributed version control of source code
Originally built for the Linux kernel. (Others: Bzr, Hg, Svn).
https://git-scm.com/book/cs/v2
git clone https://github.com/GoogleContainerTools/kpt-functions-catalog
git show
git diff
git pull -r
git checkout -b new-branch
git cherry-pick dbd4b43
git commit
git rebase origin/main
git rebase -i HEAD~5
A programming language from Google (2007).
High performance and fast development. Powerful standard library.
package main

import "fmt"

func main() {
	fmt.Println("hello world")
}
$ go build hello-world.go
$ ls
hello-world hello-world.go
$ ./hello-world
hello world
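// GotplRenderBuf renders the template file t into the buffer out.
func (p *RenderPlugin) GotplRenderBuf(t string, out *bytes.Buffer) error {
	// read template (os.ReadFile replaces the deprecated ioutil.ReadFile)
	tContent, err := os.ReadFile(t)
	if err != nil {
		return fmt.Errorf("read template failed: %w", err)
	}
	// init: sprig functions plus the custom ones
	fMap := sprig.TxtFuncMap()
	for k, v := range SprigCustomFuncs {
		fMap[k] = v
	}
	// parse; return the error instead of panicking via template.Must
	tpl, err := template.New(t).Funcs(fMap).Parse(string(tContent))
	if err != nil {
		return fmt.Errorf("parse template failed: %w", err)
	}
	// render
	if err := tpl.Execute(out, p.Values); err != nil {
		return fmt.Errorf("render template failed: %w", err)
	}
	return nil
}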
A tool that, through a declarative language (DSL), idempotent configuration, and a CLI, lets you manage remote ...... via APIs
https://registry.terraform.io/browse/providers
variable "aws_region" {
  default = "eu-west-1"
}
variable "domain" {
  default = "my_domain"
}
provider "aws" {
  region = var.aws_region
}
# Note: The bucket name needs to carry the same name as the domain!
# http://stackoverflow.com/a/5048129/2966951
resource "aws_s3_bucket" "site" {
  bucket = var.domain
  acl    = "public-read"
  policy = <<EOF
{
  "Version":"2008-10-17",
  "Statement":[{
    "Sid":"AllowPublicRead",
    "Effect":"Allow",
    "Principal": {"AWS": "*"},
    "Action":["s3:GetObject"],
    "Resource":["arn:aws:s3:::${var.domain}/*"]
  }]
}
EOF
  website {
    index_document = "index.html"
  }
}
# Note: Creating this route53 zone is not enough. The domain's name servers need to point to the NS
# servers of the route53 zone. Otherwise the DNS lookup will fail.
# To verify that the dns lookup succeeds: `dig site @nameserver`
resource "aws_route53_zone" "main" {
  name = var.domain
}
resource "aws_route53_record" "root_domain" {
  zone_id = aws_route53_zone.main.zone_id
  name    = var.domain
  type    = "A"
  alias {
    name                   = aws_cloudfront_distribution.cdn.domain_name
    zone_id                = aws_cloudfront_distribution.cdn.hosted_zone_id
    evaluate_target_health = false
  }
}
resource "aws_cloudfront_distribution" "cdn" {
  origin {
    origin_id   = var.domain
    domain_name = "${var.domain}.s3.amazonaws.com"
  }
  # If using route53 aliases for DNS we need to declare it here too, otherwise we'll get 403s.
  aliases             = [var.domain]
  enabled             = true
  default_root_object = "index.html"
  default_cache_behavior {
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = var.domain
    forwarded_values {
      query_string = true
      cookies {
        forward = "none"
      }
    }
    viewer_protocol_policy = "allow-all"
    min_ttl                = 0
    default_ttl            = 3600
    max_ttl                = 86400
  }
  # The cheapest priceclass
  price_class = "PriceClass_100"
  # This is required to be specified even if it's not used.
  restrictions {
    geo_restriction {
      restriction_type = "none"
      locations        = []
    }
  }
  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
output "s3_website_endpoint" {
  value = aws_s3_bucket.site.website_endpoint
}
output "route53_domain" {
  value = aws_route53_record.root_domain.fqdn
}
output "cdn_domain" {
  value = aws_cloudfront_distribution.cdn.domain_name
}
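The idea behind "declarative" and "idempotent" can be sketched in a few lines of Go: compare the desired state with the actual state, derive the actions needed, and note that a second run after applying yields no actions. This is only an illustration of the concept, not how the tool is implemented; `plan`, `apply`, and the string-encoded resource configs are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// plan compares desired and actual state (resource name -> config) and
// returns the actions needed to make actual converge to desired.
func plan(desired, actual map[string]string) []string {
	var actions []string
	for name, want := range desired {
		got, exists := actual[name]
		switch {
		case !exists:
			actions = append(actions, "create "+name)
		case got != want:
			actions = append(actions, "update "+name)
		}
	}
	for name := range actual {
		if _, ok := desired[name]; !ok {
			actions = append(actions, "delete "+name)
		}
	}
	sort.Strings(actions) // map iteration order is random; sort for stable output
	return actions
}

// apply pretends to call the remote API and returns the new actual state.
func apply(desired map[string]string) map[string]string {
	actual := make(map[string]string, len(desired))
	for name, cfg := range desired {
		actual[name] = cfg
	}
	return actual
}

func main() {
	desired := map[string]string{"bucket": "acl=public-read", "zone": "name=my_domain"}
	actual := map[string]string{"bucket": "acl=private", "old_record": "type=A"}

	fmt.Println(plan(desired, actual)) // [create zone delete old_record update bucket]

	actual = apply(desired)
	fmt.Println(len(plan(desired, actual))) // 0: a second run has nothing to do
}
```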
Also a tool with declarative configuration and a CLI, and it also lets you manage remote "servers, routers".
So-called config management.
https://www.guru99.com/ansible-tutorial.html
(Others: Salt, https://saltproject.io/ ).
# playbook-base.yaml
- name: base
  become: true
  become_method: sudo
  hosts: all
  roles:
    - sshd
    - users
    - system
    - netplan
    - network
    #- hardening
    #- cleanup
  serial: "{{ batch_size|default(10) }}"

- name: sshd configuration file
  template:
    src: sshd_config.metal.j2
    dest: "{{ sshdconfig }}"
    owner: 0
    group: 0
    mode: 0600
    validate: "sshd -T -f %s"
    backup: yes
  vars:
    sshd_allow_users: "{{ (admin_users + autom_users) | join(' ') }} "
  notify:
    - restart sshd
  # when: ansible_virtualization_role != "guest" or ansible_virtualization_type != "docker"
GitLab is a DevOps platform! Hooray.
A server for Git repositories. Web UI. Issue tracking. And, most importantly, CI.
(Others: GitHub https://github.com, Gitea https://gitea.io ).
Declarative Continuous Delivery. For GitOps. We get it...
"infrastructure = code"; a screwdriver that installs that code into Kubernetes.
https://argoproj.github.io/cd/
Originally from SoundCloud (2012).
multi-dimensional data model, operational simplicity, scalable data collection, and a powerful query language
{
  name: 'etcd-service',
  rules: [
    {
      alert: 'EtcdDatabaseSpaceFilling',
      expr: |||
        100 - (etcd_mvcc_db_total_size_in_bytes / etcd_server_quota_backend_bytes) * 100 < 15
      ||| % $._config,
      'for': '10m',
      labels: {
        severity: 'major',
        identifier: '{{ $labels.instance }}',
        group: 'Infrastructure',
        service_name: 'etcd',
        tenant: 'ves-sre',
      },
      annotations: {
        display_name: 'Database Error',
        description: 'Etcd database {{ $labels.instance }} is filling up. Only {{ $value }}% of space is available.',
      },
      docs:: {
        name: |||
          TODO
        |||,
        description: |||
          Etcd database is filling up.
        |||,
        action: |||
          This has to be solved in next business day working hours by L2.
        |||,
      },
    },
  ],
}
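The alert expression is easier to read as arithmetic: it computes the percentage of the etcd quota that is still free and fires when less than 15% remains. A minimal Go sketch of the same calculation follows; `freePercent` and `spaceFilling` are hypothetical names, the 2 GiB quota is an assumed value, and the `for: 10m` hold time is not modelled.

```go
package main

import "fmt"

// freePercent mirrors the expression: 100 - (dbSize / quota) * 100.
func freePercent(dbSizeBytes, quotaBytes float64) float64 {
	return 100 - (dbSizeBytes/quotaBytes)*100
}

// spaceFilling reports whether the alert condition (< 15% free) holds.
func spaceFilling(dbSizeBytes, quotaBytes float64) bool {
	return freePercent(dbSizeBytes, quotaBytes) < 15
}

func main() {
	const gib = 1024 * 1024 * 1024
	const quota = 2 * gib // assumed backend quota for the example

	fmt.Println(spaceFilling(1.0*gib, quota)) // false: half of the quota is free
	fmt.Println(spaceFilling(1.8*gib, quota)) // true: only about 10% is free
}
```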
# Aggregating up requests per second that has a path label:
- record: instance_path:requests:rate5m
  expr: rate(requests_total{job="myjob"}[5m])
- record: path:requests:rate5m
  expr: sum without (instance)(instance_path:requests:rate5m{job="myjob"})

# Calculating a request failure ratio and aggregating up to the job-level failure ratio:
- record: instance_path:request_failures:rate5m
  expr: rate(request_failures_total{job="myjob"}[5m])
- record: instance_path:request_failures_per_requests:ratio_rate5m
  expr: |2
      instance_path:request_failures:rate5m{job="myjob"}
    /
      instance_path:requests:rate5m{job="myjob"}
# Aggregate up numerator and denominator, then divide to get path-level ratio.
- record: path:request_failures_per_requests:ratio_rate5m
  expr: |2
      sum without (instance)(instance_path:request_failures:rate5m{job="myjob"})
    /
      sum without (instance)(instance_path:requests:rate5m{job="myjob"})
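The comment "aggregate up numerator and denominator, then divide" matters because averaging per-instance ratios gives a different, misleading number when instances carry different traffic. A small Go sketch with made-up rates shows the gap; the type and function names are hypothetical, and returning 0 for an empty denominator is a simplification of this sketch.

```go
package main

import "fmt"

// instanceRates holds the two per-instance 5m rates from the rules above.
type instanceRates struct {
	failures float64 // instance_path:request_failures:rate5m
	requests float64 // instance_path:requests:rate5m
}

// pathRatio sums numerators and denominators first, then divides.
func pathRatio(instances []instanceRates) float64 {
	var failures, requests float64
	for _, in := range instances {
		failures += in.failures
		requests += in.requests
	}
	if requests == 0 {
		return 0 // simplification for the sketch
	}
	return failures / requests
}

// meanOfRatios averages per-instance ratios; shown only for contrast.
func meanOfRatios(instances []instanceRates) float64 {
	var sum float64
	for _, in := range instances {
		sum += in.failures / in.requests
	}
	return sum / float64(len(instances))
}

func main() {
	instances := []instanceRates{
		{failures: 1, requests: 2},  // low-traffic instance, 50% failing
		{failures: 1, requests: 98}, // busy instance, about 1% failing
	}
	fmt.Printf("ratio of sums:  %.2f\n", pathRatio(instances))    // 0.02
	fmt.Printf("mean of ratios: %.2f\n", meanOfRatios(instances)) // 0.26
}
```

Overall, 2 of 100 requests fail, so 0.02 is the ratio that describes the path; the mean of ratios lets the nearly idle instance dominate.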
Treasure Data "fluentd --> fluentbit" (2014).
super fast, lightweight, and highly scalable logging and metrics processor
## Fluent source
{{- if has .vector_enable_fluent $enabled }}
[sources.in_fluent]
type = "fluent"
address = "0.0.0.0:24224"
connection_limit = 2000
keepalive.time_secs = 30
receive_buffer_bytes = 1048576
tls.enabled = true
tls.verify_certificate = true
tls.ca_file = "/corp/secrets/identity/client_ca_with_fluent.crt"
tls.crt_file = "/corp/secrets/identity/server.crt"
tls.key_file = "/corp/secrets/identity/server.key"
{{- else }}
[sources.in_fluent]
type = "file"
include = ["/dev/null"]
{{- end }}
[transforms.remap_fluent_throttle_key]
type = "remap"
inputs = ["in_fluent"]
source = '''
  if match(.tag, r'^alert\..*$') ?? false {
    ._throttle_key, err = join([.labels.cluster_name, .labels.alertname], separator: "_")
    if err != null {
      log("Unable to construct throttle key for alert cluster_name=" + to_string!(.labels.cluster_name) + ", alertname="+ to_string!(.labels.alertname) +". Dropping invalid event. " + err, level: "error", rate_limit_secs: 10)
      abort
    }
  } else {
    ._throttle_key, err = join([.cluster_name, .tag], separator: "_")
    if err != null {
      log("Unable to construct throttle key for cluster_name=" + to_string!(.cluster_name) + ", tag="+ to_string!(.tag) +". Dropping invalid event. " + err, level: "error", rate_limit_secs: 10)
      abort
    }
  }
'''
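The remap transform above derives a throttle key per event: alerts are keyed by cluster and alert name, everything else by cluster and tag, and events for which no key can be built are dropped. The same decision logic can be sketched in Go; the `event` type and `throttleKey` function are hypothetical, and treating a missing or empty field as an error is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// event is a hypothetical, flattened view of a log record.
type event struct {
	Tag         string
	ClusterName string
	Labels      map[string]string
}

// throttleKey returns the key used to group events for throttling.
// An error means the event is invalid and should be dropped.
func throttleKey(e event) (string, error) {
	if strings.HasPrefix(e.Tag, "alert.") {
		cluster, alert := e.Labels["cluster_name"], e.Labels["alertname"]
		if cluster == "" || alert == "" {
			return "", errors.New("alert event is missing cluster_name or alertname label")
		}
		return cluster + "_" + alert, nil
	}
	if e.ClusterName == "" || e.Tag == "" {
		return "", errors.New("event is missing cluster_name or tag")
	}
	return e.ClusterName + "_" + e.Tag, nil
}

func main() {
	events := []event{
		{Tag: "alert.etcd", Labels: map[string]string{"cluster_name": "c1", "alertname": "EtcdDatabaseSpaceFilling"}},
		{Tag: "kube.audit", ClusterName: "c1"},
		{Tag: "alert.broken"},
	}
	for _, e := range events {
		key, err := throttleKey(e)
		if err != nil {
			fmt.Println("drop:", err)
			continue
		}
		fmt.Println("key:", key)
	}
}
```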