Intro to Apache


The web server that revolutionized the internet

LAMP

  • An architecture built from loosely coupled components
  • Linux, Apache, MySQL, PHP/Perl/Python

What's a web server?




It's what makes the internet run
But you already know that
So can you name a few web servers?

HTTP Protocol

HTTP request:
GET /index.html HTTP/1.1
Host: www.example.com

HTTP response:
HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
ETag: "3f80f-1b6-3e1cb03b"
Content-Type: text/html; charset=UTF-8
Content-Length: 131
Accept-Ranges: bytes
Connection: close 

<html>
<head>
  <title>An Example Page</title>
</head>
<body>
  Hello World, this is a very simple HTML document.
</body>
</html>
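
The exchange above can be sketched in a few lines of Python. This is a minimal illustration, not a full HTTP client: `build_request` and `parse_response` are hypothetical helper names, and the sample response is a shortened version of the one on this slide.

```python
def build_request(path, host):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.1 200 OK
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"\r\n"
    b"<html><body>Hello World</body></html>"
)
code, headers, body = parse_response(sample)
print(code, headers["server"])
```

Header names are lowercased here because HTTP header names are case-insensitive.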

Why Apache?

  • It's open source
  • It's fast, secure and reliable
  • It has a modular architecture
  • It handles requests concurrently
  • More than half the internet runs on it.
  • Fun fact: the name Apache is popularly said to come from "a patchy server", a reference to its origins as a set of patches to an earlier web server.

High-Level Architecture

Apache Core

  • Takes care of the configuration
  • Controls the modules
  • Serves the static content
  • Takes care of logging

Apache Core Architecture

Apache Modules

  • Access control: mod_authz_host
  • Authorization/Authentication: mod_auth
  • Dynamic content: e.g. mod_php, mod_fcgi
  • Performance: mod_pagespeed, mod_spdy

LoadModule auth_module modules/mod_auth.so

Architecture of an Apache Module

Overall Architecture

Concurrency

  • For concurrency, Apache uses multi-processing modules, or MPMs
  • Each MPM has a different way of managing HTTP requests
  • The three most common MPMs for Apache are prefork (one process per connection), worker (threads inside a few processes), and event (worker plus a dedicated listener for idle keep-alive connections)
  • Only one MPM can be used at a time
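
As a loose analogy for the worker-style model, here is a small Python sketch where a fixed pool of threads serves incoming requests. It is not how Apache is implemented (Apache is written in C, and prefork uses processes rather than threads); `handle_request` and `serve_all` are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def handle_request(path):
    """Pretend to serve one request and report which thread handled it."""
    return path, threading.current_thread().name


def serve_all(paths, max_workers=4):
    """Hand each request to a fixed pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers,
                            thread_name_prefix="worker") as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(handle_request, paths))


results = serve_all(["/index.html", "/about.html", "/logo.png"])
for path, worker in results:
    print(path, "handled by", worker)
```

The pool size plays a role similar to the limits an MPM puts on how many requests are served at once.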

MPMs

Apache Start and Restart

  • Since Apache binds to port 80, a privileged port, the parent process must be started as root
  • The parent spawns all the worker/child processes that serve requests (as described under MPMs) as a different, less privileged user specified in httpd.conf
  • To restart or reload Apache, you send a signal to the root parent process, and it takes care of the child processes accordingly
  • When Apache is restarted, only the child processes are stopped; the parent process keeps running until you explicitly stop it

Restart

  • apachectl -k restart
  • apachectl -k graceful
  • The init script /etc/init.d/httpd may behave differently depending on the distribution

Stopping

  • Similar to the graceful restart, there is a graceful stop and a normal stop
  • They are invoked via "apachectl -k graceful-stop" and "apachectl -k stop" respectively
  • You can read more about starting, restarting and stopping in the Apache docs; links are listed at the end.
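
Under the hood, each apachectl action corresponds to a Unix signal sent to the parent process, as described in the Apache stopping and restarting docs. The sketch below shows that mapping in Python. The `signal_apache` helper and the default PID file path are assumptions for illustration; the real path is set by the PidFile directive and varies by distribution. Unix only.

```python
import os
import signal

# Signals the Apache parent process responds to;
# `apachectl -k <action>` sends these for you.
ACTION_SIGNALS = {
    "stop": signal.SIGTERM,
    "graceful-stop": signal.SIGWINCH,
    "restart": signal.SIGHUP,
    "graceful": signal.SIGUSR1,
}


def signal_apache(action, pid_file="/var/run/httpd.pid"):
    """Send the signal for `action` to the parent PID stored in pid_file.

    The pid_file default is an assumed location, not a guaranteed one.
    """
    with open(pid_file) as f:
        parent_pid = int(f.read().strip())
    os.kill(parent_pid, ACTION_SIGNALS[action])
    return parent_pid
```

For example, `signal_apache("graceful")` would ask the parent to let its children finish their current requests before replacing them. In practice, prefer apachectl itself.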

Configuration Context



  • Server config
  • Virtual hosts
  • Directory/Location/Files
  • .htaccess files

VirtualHosts

  • Allow serving multiple websites from a single server
  • A single server may have thousands of virtual hosts
  • Virtual hosts are either IP-based or name-based
  • Many configuration directives can be set in the context of a single virtual host
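
Name-based virtual hosting boils down to choosing a site from the Host header of the request. The Python sketch below illustrates that lookup; the host names, document roots, and the `select_vhost` helper are all made up for illustration and are not Apache's actual code.

```python
# Hypothetical virtual hosts, in the order they would appear in the config.
VHOSTS = [
    {"server_name": "www.example.com", "aliases": ["example.com"],
     "docroot": "/var/www/example"},
    {"server_name": "blog.example.org", "aliases": [],
     "docroot": "/var/www/blog"},
]


def select_vhost(host_header, vhosts=VHOSTS):
    """Pick the vhost whose name or alias matches the Host header.

    Falls back to the first vhost, which is how name-based virtual
    hosting treats requests for an unknown host name.
    """
    host = host_header.split(":")[0].strip().lower()  # drop an optional :port
    for vhost in vhosts:
        if host == vhost["server_name"] or host in vhost["aliases"]:
            return vhost
    return vhosts[0]


print(select_vhost("blog.example.org")["docroot"])
```

IP-based virtual hosts skip this lookup and choose the site from the address the connection arrived on.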

Virtual Hosts

  • Name Based
  • IP based

mod_php

  • A simple module for serving PHP pages via Apache
  • Loads a PHP interpreter as an Apache module
  • Has some performance penalties compared to mod_fcgi, but is simple to use if you want to run a small-scale website

Some Other Modules

mod_authz_host for access control

  • specify IP-based rules controlling who can access which sites


mod_rewrite/mod_alias

  • map a URL to a script or another file
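
The idea behind URL mapping can be sketched with regular expressions in Python. The rules and the `rewrite` helper below are hypothetical, and "first match wins" is a simplification: real mod_rewrite rules have flags and conditions that this sketch ignores.

```python
import re

# Hypothetical rules: (pattern, replacement), tried in order.
REWRITE_RULES = [
    (re.compile(r"^/user/(\d+)$"), r"/profile.php?id=\1"),
    (re.compile(r"^/docs/(.*)$"), r"/static/manual/\1"),
]


def rewrite(url_path, rules=REWRITE_RULES):
    """Return the rewritten path, or the original if no rule matches."""
    for pattern, replacement in rules:
        match = pattern.match(url_path)
        if match:
            # expand() substitutes captured groups such as \1.
            return match.expand(replacement)
    return url_path


print(rewrite("/user/42"))
```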


mod_status

  • view information on the server's status




Troubleshooting

  • Two logs: access_log and error_log
  • access_log contains all the HTTP requests processed by Apache
  • error_log contains all the errors encountered while handling requests
  • Logs can be enhanced using various modules as well, e.g. mod_firstbyte
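
When troubleshooting, it often helps to pull fields out of access_log lines. The Python sketch below assumes the Common Log Format; your LogFormat directive may differ, in which case the pattern needs adjusting. `parse_access_line` is a hypothetical helper name.

```python
import re

# Common Log Format: host ident user [time] "request" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')
entry = parse_access_line(line)
print(entry["host"], entry["request"], entry["status"], entry["size"])
```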

Apache vs. Nginx

  • Nginx achieves better performance using an event-driven model
  • It's asynchronous and tries to do non-blocking I/O
  • Each worker process multiplexes many connections using the operating system's event notification mechanism (epoll, kqueue)
  • Apache supports .htaccess files and built-in auth modules
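
The event model can be illustrated with Python's `selectors` module, which picks epoll or kqueue where available. This is a toy sketch, not Nginx's implementation: socket pairs stand in for client connections, and one thread services all of them.

```python
import selectors
import socket

selector = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS

# Three fake "client connections", each a connected socket pair.
connections = [socket.socketpair() for _ in range(3)]
for index, (client_side, server_side) in enumerate(connections):
    server_side.setblocking(False)
    selector.register(server_side, selectors.EVENT_READ, data=index)
    client_side.sendall(f"GET /page{index}".encode())

received = {}
for _ in range(10):  # bounded so the sketch cannot spin forever
    if len(received) == len(connections):
        break
    # One select() call waits on all registered sockets at once.
    for key, _events in selector.select(timeout=1):
        received[key.data] = key.fileobj.recv(1024).decode()

for client_side, server_side in connections:
    selector.unregister(server_side)
    client_side.close()
    server_side.close()
selector.close()
print(received)
```

The point of the design is that no thread or process sits blocked on a single idle connection.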

Sources

  • http://www.shoshin.uwaterloo.ca/~oadragoi/cw/CS746G/a1/apache_conceptual_arch.html
  • http://httpd.apache.org/docs/2.2/invoking.html
  • http://httpd.apache.org/docs/2.2/stopping.html
  • http://aosabook.org/en/nginx.html (Nginx Architecture)
  • Previous year's presentation by deepak.b

Intro to Apache

By Ayush Goyal