Data Engineering

Starter Curriculum

Overview

  • Linux Shell
  • Python
  • R

Linux Shell

Linux Data Manipulation 1

join -t: -1 3 -2 4 \
  <(sort -t: -k3,3 /etc/group) \
  <(sort -t: -k4,4 /etc/passwd)

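Join /etc/group to /etc/passwd on the group ID: field 3 of /etc/group and field 4 (the primary group ID) of /etc/passwd. join expects both inputs to be sorted on the join field, and the combined records are written to standard output. The <( ... ) process substitution requires a shell such as bash.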
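As a rough illustration of what the join produces, the sketch below runs the same idea against two small made-up files instead of the real /etc/group and /etc/passwd. The file contents, the tmp directory, and the variable name joined are assumptions for the example only. It sorts into temporary files rather than using process substitution, so it should also run under a plain POSIX sh.

```shell
# Use byte-wise collation so sort and join agree on ordering.
export LC_ALL=C

# Hypothetical sample data standing in for /etc/group and /etc/passwd.
tmp=$(mktemp -d)
printf '%s\n' \
  'root:x:0:root' \
  'staff:x:50:alice,bob' \
  'dev:x:100:alice' > "$tmp/group"
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'alice:x:1000:50:Alice:/home/alice:/bin/bash' \
  'bob:x:1001:100:Bob:/home/bob:/bin/sh' > "$tmp/passwd"

# Sort each file on its join field only (-kN,N), not from field N to end of line.
sort -t: -k3,3 "$tmp/group"  > "$tmp/group.sorted"
sort -t: -k4,4 "$tmp/passwd" > "$tmp/passwd.sorted"

# Join field 3 of the group file to field 4 of the passwd file.
# Each output line starts with the join field (the group ID), followed by
# the remaining group fields and then the remaining passwd fields.
joined=$(join -t: -1 3 -2 4 "$tmp/group.sorted" "$tmp/passwd.sorted")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

Note that the sort here is lexical, so group ID 100 sorts before 50. That is what join expects; a numeric sort (-n) would put the inputs in an order join does not accept.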

Linux Data Manipulation 2

cut -d: -f4 /etc/group \
  | grep -v '^$' \
  | tr -s ',' '\n' \
  | sort -u \
  | paste -sd, -

Take the member list (field 4) of each /etc/group entry, drop the groups that have no members, and build a single comma-separated list of the unique member names.
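The pipeline can be tried on a small made-up group file to see each stage's effect. This is only a sketch: the sample entries and the names group_file and members are assumptions, not part of the deck. It joins the names with paste -s, which does not leave a trailing comma at the end of the list.

```shell
# Hypothetical sample data standing in for /etc/group.
group_file=$(mktemp)
printf '%s\n' \
  'root:x:0:' \
  'staff:x:50:alice,bob' \
  'dev:x:100:bob,carol' > "$group_file"

# 1. cut     keeps only field 4, the comma-separated member list
# 2. grep    drops the empty lines left by groups without members
# 3. tr      puts every member name on its own line
# 4. sort -u sorts the names and removes duplicates
# 5. paste   joins the lines back together with commas
members=$(cut -d: -f4 "$group_file" \
  | grep -v '^$' \
  | tr ',' '\n' \
  | sort -u \
  | paste -sd, -)
printf '%s\n' "$members"

rm -f "$group_file"
```

With this sample, bob belongs to two groups but appears only once in the result.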

By naiveroboticist