Introspecting your machine with osquery

texas linux fest 2025

Slides at https://ian.im/txlf25

Ian Littman / @ian@phpc.social / @ian.im / @iansltx

osquery: query systems with SQL(ite)

installable via apt/yum/tarball

let's get the sha1 of all open files <= 100KiB

SELECT path, sha1 FROM hash WHERE path IN (
  SELECT path FROM process_open_files
  JOIN file USING (path) WHERE file.size <= 1024 * 100
);

Why and when?

Key Tenets

  • Write SQL, not procedural code
    • osquery pulls and parses requisite data
    • SQLite (embedded in osquery) queries data via virtual tables
    • If you know how to write SQL, you can figure out how to use osquery
  • Reliable and performant enough to run on corporate machines and in prod
    • Worker process does the heavy lifting
    • Watchdog process kills/respawns the worker if resource limits are exceeded
    • Heavy queries are denylisted
    • Some tables require WHERE clauses because e.g. hashing files is expensive
  • Dual-licensed open-source GPLv2 + Apache 2.0

270+ tables built in

Hardware

  • acpi_tables
  • battery
  • block_devices
  • chassis_info
  • connected_displays
  • cpu_info
  • cpuid
  • device_firmware
  • fan_speed_sensors
  • hardware_events
  • ibridge_info
  • intel_me_info
  • memory_devices
  • nvram
  • pci_devices
  • power_sensors
  • temperature_sensors
  • usb_devices
  • video_info
  • drivers (Windows)

software*

  • apps (macOS)
  • apt_sources
  • chocolatey_packages
  • deb_packages
  • gatekeeper
  • gatekeeper_approved_apps
  • homebrew_packages
  • package_bom
  • package_install_history
  • package_receipts
  • python_packages
  • yum_sources
  • authenticode (Windows)
  • startup_items
  • launchd
  • chrome_extensions
  • firefox_addons
  • safari_extensions
  • patches (Windows)
  • npm_packages
  • programs (Windows)
  • rpm_packages
  • rpm_package_files
  • windows_update_history

Auth and certs

  • authorized_keys
  • certificates
  • groups
  • keychain_items
  • known_hosts
  • last (logins + logouts)
  • logged_in_users
  • logon_sessions (Windows)
  • password_policy (macOS)
  • security_profile_info (Windows)
  • selinux_*
  • shadow
  • ssh_configs
  • sudoers
  • suid_bin
  • user_events
  • user_groups
  • user_ssh_keys
  • users

Disk encryption

  • disk_encryption
  • bitlocker_info

Networking

  • connectivity
  • dns_cache (Windows)
  • dns_resolvers
  • etc_hosts
  • interface_addresses
  • interface_details
  • interface_ipv6
  • listening_ports
  • process_open_sockets
  • routes
  • wifi_*
  • curl

Firewalls

  • alf
  • alf_exceptions
  • alf_explicit_auths
  • windows_firewall_rules
  • windows_security_center
  • iptables

What's running?

  • crontab
  • scheduled_tasks (Windows)
  • startup_items
  • processes
  • process_*
  • running_apps
  • systemd_units

Logging

  • windows_event_logs
  • syslog_events
  • unified_log
  • asl
  • shell_history

Files (including file integrity monitoring)

  • carves
  • disk_*
  • file
  • file_events
  • hash
  • md_* (software RAID)
  • mdls / mdfind (Spotlight)
  • mounts
  • nfs_shares
  • ntfs_*
  • office_mru
  • process_file_events
  • signature

Containers

  • docker_*
  • lxd_*
sudo ln -s $HOME/.docker/run/docker.sock /var/run/docker.sock

Cloud

  • azure_instance_metadata
  • azure_instance_tags
  • ec2_instance_metadata
  • ec2_instance_tags

configuration

  • plist
  • registry
  • kernel_*

Time

  • time
  • uptime
  • load_average

But wait, with ATC, there's more!

if you need sqlite in your sqlite

But wait, with extensions, there's more!

for example, fleetd adds 30+ tables

Comprehensiveness over consistency.

you might want an abstraction layer on top.

...at least likely join keys tend to be named the same (pid, uid, path)

Nuance #1: Cross joins for users

  • osquery queries as the current user
  • osqueryd generally runs as root
  • Retired: SELECT * FROM chrome_extensions
  • Inspired: SELECT cx.* FROM users CROSS JOIN chrome_extensions cx USING (uid)
  • Some tables work without the cross join, but give incomplete results

 

Bonus tip: Check out the SQLite docs on SELECT and the SQLite tutorial

Nuance #2: Computation effort

  • ORDER/LIMIT tend not to reduce computation effort
  • WHERE, depending on table, does
  • Some tables (e.g. hash) require a WHERE condition
  • For file paths, % == glob *; use %% for recursive subdirectory scans
  • Sometimes you can get a hash more cheaply (e.g. signature table)
  • Sometimes you need subqueries rather than JOINs

Displaying results

  • Provide query as an osqueryi argument to output and quit

  • At the CLI: --json, --csv

  • Inside the shell: .mode line, .mode column

osqueryd: JSON query log to disk

can also push externally over TLS with host identification

Can run scheduled over time with either the entire result or diffs

Threat Hunting && Indicators of Compromise

  • SELECT * FROM processes
    WHERE on_disk = 0
  • SELECT COUNT(*) > 0 FROM files
    WHERE path = "/my/malware"
  • SELECT * FROM process_open_sockets
    WHERE remote_address = ""
  • YARA allows for more file contents searching
  • Use diff logging to catch anomalies

Threat Hunting && Indicators of Compromise

Thanks!

Slides at https://ian.im/txlf25

Questions? Find me here / @ian@phpc.social / @ian.im / @iansltx