Lecture 4:

Signals. Hardware and software interrupts, their nature. Top and bottom halves. Signals and system calls, signal context.

Version: 3

System programming

Education

Lecture plan

  • Interrupts in the processor
  • Signals in the kernel
  • Signals in userspace

Interrupts [1]

Examples of processor interrupts?

  • From a hardware clock
  • Page fault:
    • Write to a Copy-On-Write page
    • Foreign memory access
    • No page in TLB
  • Debugging
  • System calls

1 point

Interrupts [2]

Hardware

Software

For asynchronous I/O with hardware

The same as exceptions, traps

For asynchronous handling of exceptions

Hardware interrupts [1]

Read from a disk?

syscall(read)

File system

Driver

Bus

How to find when the read is finished?

How to do it for each device?

Hardware interrupts [2]

PIC

CPU

Dev 1

Have something to tell the CPU?

Programmable Interrupt Controller

00000

Dev 2

00001

00001

00001

00010

Are generated by hardware to attract CPU's attention

Software interrupts [1]

...
push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
lea    0x1c1(%rip),%rax
mov    %edi,-0x4(%rbp)
mov    %esi,-0x8(%rbp)
mov    -0x8(%rbp),%esi
mov    -0x4(%rbp),%edx
mov    %rax,%rdi
mov    $0x0,%al
callq  100001df2 <_recursive+0x82>
mov    %eax,-0xc(%rbp)
callq  100001440 <_coro_yield>
cmpl   $0x5,-0x4(%rbp)
...

Are generated by CPU to process complex unexpected situations

...
mov    %eax,-0x18(%rbp)
callq  100001660 <_coro_new>
mov    %rax,-0x20(%rbp)
callq  100001440 <_coro_yield>
mov    -0xc(%rbp),%eax
add    $0x20,%rsp
pop    %rbp
retq
...

Software interrupts [2]

Errors (exceptions)

Traps

Handler

If fixed, then return back before the instruction to redo it

  • No page in TLB
  • Write to COW memory
  • Broken hardware
  • Zero division
  • Unknown instruction
  • Unaligned address
  • Forbidden memory

Handler

...
mov    %edi,-0x4(%rbp)
mov    %esi,-0x8(%rbp)
mov    -0x8(%rbp),%esi
mov    -0x4(%rbp),%edx
mov    %rax,%rdi
mov    $0x0,%al
...
...
mov    %edi,-0x4(%rbp)
mov    %esi,-0x8(%rbp)
int    0x03
mov    -0x4(%rbp),%edx
mov    %rax,%rdi
mov    $0x0,%al
...

Return after the instruction, it is not redone

  • Instruction 'int'
  • syscall/sysenter
  • "Lock nop"

Interrupts [1]

Interrupt handling

Save the context to the main memory. Find the function in a handler table.

1 point

  1. How to find a handler function?
  2. How to interrupt the current work?

Interrupts [2] Handling

  1. Context = registers. Save it on a special stack created by the kernel for each core in advance.
     
  2. Find a handler in IDT - Interrupt Descriptor Table.
     
  3. IDT is stored in regular main memory. Address of the IDT beginning is in idtr register.
     
  4. By the address from IDT jump to .text section (machine code) of the handler.
     
  5. Restore the context when the handler is done.

[Diagram: the idtr register points to the IDT in memory, entries 0 to 255. Each entry holds the handler start address. The context is saved on a dedicated stack.]
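The lookup steps above can be imitated in plain userspace C. This is only a sketch: the "IDT" is an ordinary array of function pointers, and every `sketch_` name is hypothetical and not part of any kernel API. A real CPU also saves the registers and switches stacks, which a C function call only approximates.

```c
#include <stddef.h>

#define SKETCH_VECTORS 256

/* Hypothetical handler type: receives the interrupt vector number. */
typedef void (*sketch_handler_t)(int vector);

/* Plays the role of the IDT: a table in regular memory. */
static sketch_handler_t sketch_idt[SKETCH_VECTORS];
static int sketch_last_vector = -1;

/* Example handler: just remembers which vector was fired. */
static void
sketch_on_breakpoint(int vector)
{
	sketch_last_vector = vector;
}

/* Store a handler address in the table. Returns 0 or -1. */
static int
sketch_register(int vector, sketch_handler_t handler)
{
	if (vector < 0 || vector >= SKETCH_VECTORS)
		return -1;
	sketch_idt[vector] = handler;
	return 0;
}

/* Find the handler by index and jump to it. Returns 0 or -1. */
static int
sketch_dispatch(int vector)
{
	if (vector < 0 || vector >= SKETCH_VECTORS ||
	    sketch_idt[vector] == NULL)
		return -1;
	/* A real CPU saves the context here; a C call does it implicitly. */
	sketch_idt[vector](vector);
	return 0;
}
```

Registering `sketch_on_breakpoint` for vector 3 and then calling `sketch_dispatch(3)` runs the handler. Dispatching an empty slot returns -1.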

Interrupts [3] IDT

Index      Purpose
0          Division by zero
1          Single-step execution (debug)
2          NMI - Non-Maskable Interrupt
3          Breakpoint
6          Invalid instruction
8          Double fault
14         Page fault
N/A        Triple fault
32 - 255   User-defined interrupts

Interrupts in kernel [1]

  • Interrupt - IRQ, Interrupt ReQuest
  • Handler - ISR, Interrupt Service Routine
  • Handlers have dedicated stack, context
  • Divided into Top Half and Bottom Half

Top Half

  • Find the reason of the interrupt
  • Save urgent data
  • Schedule Bottom Half
  • Unblock new interrupts

Bottom Half

  • Unpack and check top-half data
  • Deliver it to the kernel/user
  • Wakeup waiting threads, scheduler
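The split between the two halves can be sketched as a single-threaded userspace simulation. It assumes a hypothetical device that delivers one byte per interrupt. All `sketch_` names are made up for illustration, and there is no real concurrency or locking here, unlike in a kernel.

```c
#include <stddef.h>

#define SKETCH_QUEUE_SIZE 16

/* Ring buffer shared by the two halves. */
static unsigned char sketch_queue[SKETCH_QUEUE_SIZE];
static size_t sketch_head = 0; /* Next slot to write (top half). */
static size_t sketch_tail = 0; /* Next slot to read (bottom half). */
static unsigned sketch_sum = 0;

/*
 * Top half: only stash the urgent byte and leave quickly.
 * Returns 0 on success, -1 if the queue is full.
 */
static int
sketch_top_half(unsigned char device_byte)
{
	if (sketch_head - sketch_tail == SKETCH_QUEUE_SIZE)
		return -1;
	sketch_queue[sketch_head++ % SKETCH_QUEUE_SIZE] = device_byte;
	return 0;
}

/*
 * Bottom half: runs later and does the slow processing.
 * Here the "processing" is just summing the bytes.
 * Returns the number of processed items.
 */
static int
sketch_bottom_half(void)
{
	int processed = 0;
	while (sketch_tail != sketch_head) {
		sketch_sum += sketch_queue[sketch_tail++ % SKETCH_QUEUE_SIZE];
		++processed;
	}
	return processed;
}
```

The point of the design is that `sketch_top_half()` does constant work, so new interrupts can be unblocked quickly, while the cost of processing is paid later in `sketch_bottom_half()`.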

Interrupts in kernel [2] API

int
request_threaded_irq(unsigned int irq,
                     irq_handler_t handler,
                     irq_handler_t thread_fn,
                     unsigned long irqflags,
                     const char *devname,
                     void *dev_id);

Line number on PIC

Interrupt handler

Various internal flags

Arbitrary data used by the handler. Usually something related to the device.

Interrupts in kernel [2] Time

static irqreturn_t rtc_interrupt(int irq, void *dev_id)
{
	spin_lock(&rtc_lock);
	rtc_irq_data += 0x100;
	rtc_irq_data &= ~0xff;
	if (is_hpet_enabled())
		rtc_irq_data |= (unsigned long)irq & 0xF0;
	else
		rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0);

	if (rtc_status & RTC_TIMER_ON)
		mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);

	spin_unlock(&rtc_lock);
	wake_up_interruptible(&rtc_wait);
	kill_fasync(&rtc_async_queue, SIGIO, POLL_IN);
	return IRQ_HANDLED;
}

/* ... */

static int __init rtc_init(void)
{
        /* ... */
	if (request_irq(rtc_irq, rtc_interrupt, IRQF_SHARED, "rtc",
			(void *)&rtc_port)) {
		rtc_has_irq = 0;
		printk(KERN_ERR "rtc: cannot register IRQ %d\n", rtc_irq);
		return -EIO;
	}
        /* ... */
}

Interrupt handler signature - line number, and what was saved in request_threaded_irq

Timer ticks with HZ frequency - is used to update time counter

Scheduler wakeup

Handler registration - flag that the line is shared, name "rtc", ...

Interrupts in kernel [3] /proc/interrupts

/proc/interrupts - registered interrupt handlers

           CPU0       CPU1       
  0:         30          0   IO-APIC   2-edge      timer
  1:       7418          0   IO-APIC   1-edge      i8042
  8:          0          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 12:          0       2216   IO-APIC  12-edge      i8042
 14:     129007          0   IO-APIC  14-edge      ata_piix
 15:          0          0   IO-APIC  15-edge      ata_piix
 18:          0          0   IO-APIC  18-fasteoi   vboxvideo
 19:      92901        588   IO-APIC  19-fasteoi   enp0s3
 20:          0      69311   IO-APIC  20-fasteoi   vboxguest
 21:      13110      44644   IO-APIC  21-fasteoi   ahci[0000:00:0d.0], snd_intel8x0
 22:         29          0   IO-APIC  22-fasteoi   ohci_hcd:usb1
NMI:          0          0   Non-maskable interrupts
LOC:    2967493    4647820   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:     340157     261945   Rescheduling interrupts
CAL:      22307      12237   Function call interrupts
TLB:       3463       4360   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        418        418   Machine check polls
HYP:          0          0   Hypervisor callback interrupts
           CPU0       CPU1       
  0:         30          0   IO-APIC   2-edge      timer
  1:       7430          0   IO-APIC   1-edge      i8042
  8:          0          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 12:          0       2240   IO-APIC  12-edge      i8042
 14:     129069          0   IO-APIC  14-edge      ata_piix
 15:          0          0   IO-APIC  15-edge      ata_piix
 18:          0          0   IO-APIC  18-fasteoi   vboxvideo
 19:      92932        588   IO-APIC  19-fasteoi   enp0s3
 20:          0      69584   IO-APIC  20-fasteoi   vboxguest
 21:      13110      44695   IO-APIC  21-fasteoi   ahci[0000:00:0d.0], snd_intel8x0
 22:         29          0   IO-APIC  22-fasteoi   ohci_hcd:usb1
NMI:          0          0   Non-maskable interrupts
LOC:    2970432    4649435   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:     342167     263540   Rescheduling interrupts
CAL:      22316      12237   Function call interrupts
TLB:       3463       4360   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        419        419   Machine check polls
HYP:          0          0   Hypervisor callback interrupts
           CPU0       CPU1       
  0:         30          0   IO-APIC   2-edge      timer
  1:       7442          0   IO-APIC   1-edge      i8042
  8:          0          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 12:          0       2296   IO-APIC  12-edge      i8042
 14:     129125          0   IO-APIC  14-edge      ata_piix
 15:          0          0   IO-APIC  15-edge      ata_piix
 18:          0          0   IO-APIC  18-fasteoi   vboxvideo
 19:      92960        588   IO-APIC  19-fasteoi   enp0s3
 20:          0      70033   IO-APIC  20-fasteoi   vboxguest
 21:      13110      44733   IO-APIC  21-fasteoi   ahci[0000:00:0d.0], snd_intel8x0
 22:         29          0   IO-APIC  22-fasteoi   ohci_hcd:usb1
NMI:          0          0   Non-maskable interrupts
LOC:    2972773    4652064   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:     346096     265997   Rescheduling interrupts
CAL:      22322      12258   Function call interrupts
TLB:       3463       4381   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        419        419   Machine check polls
HYP:          0          0   Hypervisor callback interrupts
           CPU0       CPU1       
  0:         30          0   IO-APIC   2-edge      timer
  1:       7451          0   IO-APIC   1-edge      i8042
  8:          0          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 12:          0       2352   IO-APIC  12-edge      i8042
 14:     129171          0   IO-APIC  14-edge      ata_piix
 15:          0          0   IO-APIC  15-edge      ata_piix
 18:          0          0   IO-APIC  18-fasteoi   vboxvideo
 19:      92995        588   IO-APIC  19-fasteoi   enp0s3
 20:          0      70370   IO-APIC  20-fasteoi   vboxguest
 21:      13110      44769   IO-APIC  21-fasteoi   ahci[0000:00:0d.0], snd_intel8x0
 22:         29          0   IO-APIC  22-fasteoi   ohci_hcd:usb1
NMI:          0          0   Non-maskable interrupts
LOC:    2974684    4654345   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:     349138     268939   Rescheduling interrupts
CAL:      22334      12260   Function call interrupts
TLB:       3463       4383   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        419        419   Machine check polls
HYP:          0          0   Hypervisor callback interrupts

Interrupts in kernel [4] Bottom Half

Solutions to schedule bottom half:

  • softirq
  • tasklet
  • workqueue
$ ps aux | grep softirq
root         7  0.0  0.0      S    mar20   0:00 [ksoftirqd/0]
root        16  0.0  0.0      S    mar20   0:00 [ksoftirqd/1]

Worker threads in the kernel to process softirq and tasklets

Software interrupts - mostly processed in a kernel, no Bottom Half

Hardware interrupts - drivers, have Bottom Half

Questions [1]

What are the two types of interrupts, and when is each used?

Hardware and software. Hardware for asynchronous work with periphery. Software to handle exceptions.

1 point

Questions [2]

Where are hardware interrupts stored before processing?

PIC or APIC - Programmable Interrupt Controller. Only from there they are forwarded to CPU, one by one.

1 point

Questions [3]

Where and how does the kernel find an interrupt handler?

From IDT - Interrupt Descriptor Table. It is stored in the main memory, its beginning is saved in idtr register.

1 point

Signals [1]

Interrupts - kernel only.

Signals - an analogue of interrupts in userspace. Delivery of a signal saves the context of the process and calls a handler.

Signal             Purpose                          Default action
SIGABRT            Abort                            Terminate process
SIGALRM            Timeout                          Terminate process
SIGINT             Interrupted by a user (Ctrl+C)   Terminate process
SIGSEGV            Memory error                     Terminate process
SIGUSR1, SIGUSR2   User signals                     Terminate process
SIGCHLD            A child process is terminated    Ignore
SIGKILL            Forced termination               Terminate process
...                Tens of other signals            ...

Signals [2] signal()

#include <signal.h>

typedef void (*sighandler_t)(int);

sighandler_t signal(int signum, sighandler_t handler);

"Old, but not obsolete"

Signal number

Handler function. Takes a signal number

Signals [3] signal()

static void
on_new_signal(int signum)
{
	switch(signum) {
	case SIGINT:
		printf("caught sigint\n");
		break;
	case SIGUSR1:
		printf("caught usr1\n");
		break;
	case SIGUSR2:
		printf("caught usr2\n");
		break;
	default:
		printf("caught unknown signal\n");
		exit(-1);
	}
}

int
main(void)
{
	printf("my pid: %d\n", (int) getpid());
	signal(SIGINT, on_new_signal);
	signal(SIGUSR1, on_new_signal);
	signal(SIGUSR2, on_new_signal);
	while(true) pause();
	return 0;
}
$> gcc 1_basic_signal.c
$> ./a.out
my pid: 42645
$> kill -s SIGUSR1 42645
^C caught sigint
caught usr1
$> kill -s SIGUSR2 42645
caught usr2
$> kill 42645
Terminated
$>

Signals [4] signal()

signal() - deprecated

  • In early versions each signal delivery reset the handler to the default
  • Undefined behaviour in a multithreaded program
  • Does not allow blocking a signal (which is not the same as ignoring it)

Signals [5] signal

static bool is_sig_received = false;

static void
on_new_signal(int signum)
{
	printf("Received a signal\n");
	is_sig_received = true;
}

static void
interrupt(void)
{
	getchar();
}

int
main(void)
{
        printf("my pid: %d\n", (int) getpid());
	signal(SIGUSR1, on_new_signal);
	printf("Installed a handler\n");
	interrupt();
	while (! is_sig_received) {
		interrupt();
		printf("Wait for a signal\n");
		pause();
	}
	printf("Finish\n");
	return 0;
}
$> gcc 2_pause_hangs.c
$>./a.out
my pid: 42718
Installed a handler
$> kill -s SIGUSR1 42718
$> kill -s SIGUSR1 42769
$> kill 42769
Received a signal
                        <press Enter>
Finish
$>
$> ./a.out
my pid: 42769
Installed a handler
                        <press Enter>
Received a signal
                        <press Enter>
Wait for a signal
                        <press Enter>
                        <press Enter>
Terminated
$>

Signals [6]

Common problems of signals:

  • A signal can interrupt a system call
  • A signal can interrupt a function working with a globally visible state
ssize_t rc;
while ((rc = read(fd, buf, size)) == -1 && errno == EINTR);

A "slow" system call that fails with errno set to EINTR has not really failed: it was interrupted by a signal and can usually be retried.
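The retry loop can be wrapped into a helper so that callers do not repeat it. This is a minimal sketch: the name `sketch_read_retry` is hypothetical, and only `read()` and `errno` come from the standard API.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Same contract as read(), but a call interrupted by a signal
 * (EINTR) is transparently restarted.
 */
static ssize_t
sketch_read_retry(int fd, void *buf, size_t size)
{
	ssize_t rc;
	do {
		rc = read(fd, buf, size);
	} while (rc == -1 && errno == EINTR);
	return rc;
}
```

Any other error is still returned to the caller as -1 with errno set, so the usual error handling stays in place.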

Signals [7]

static int alarm_cnt = 0;

static void
do_malloc(void)
{
	for (int i = 0; i < 10; ++i)
		free(malloc(1024 * 1024 * i));
}

static void
on_new_signal(int signum)
{
	printf("Interrupted %d times\n", ++alarm_cnt);
	do_malloc();
}

int
main(void)
{
	printf("my pid: %d\n", (int) getpid());
	signal(SIGUSR1, on_new_signal);
	while (true)
		do_malloc();
	return 0;
}
$> gcc 3_intr_malloc.c
$>./a.out
my pid: 42888
$> while kill -s SIGUSR1 42888; do sleep 0; done
$>
Interrupted 1 times
Interrupted 2 times
Illegal instruction: 4
$>

Signals [8]

int kill(pid_t pid, int sig);

int raise(int sig);

int pause(void);

unsigned int alarm(unsigned int seconds);

Send a signal to a process

Receiver's PID

Signal number. Not necessarily SIGKILL

Send a signal to self

Stop the thread until a signal is received

Send SIGALRM to self with a delay

After this number of seconds

Signals [9] sigprocmask

int sigprocmask(int how, const sigset_t *set, sigset_t *oldset);

Thread signal mask management. If a signal is present in the mask, it is blocked.

What to do with the mask in the argument?

  • SIG_BLOCK - add it to the thread mask
  • SIG_UNBLOCK - subtract from the thread mask
  • SIG_SETMASK - set as a new mask

Mask to use in 'how'-argument's action

Out-parameter to get the old mask

int sigemptyset(sigset_t *set);
int sigfillset(sigset_t *set);
int sigaddset(sigset_t *set, int signum);
int sigdelset(sigset_t *set, int signum);
int sigismember(const sigset_t *set, int signum);

Filling and checking of the mask

int sigpending(sigset_t *set);

Check what signals have arrived and are pending

Signals [10] sigprocmask

static int cnt = 0;

static void on_new_signal(int signum)
{
	printf("Processed a signal %dth time\n",
               ++cnt);
}

int main(void)
{
	printf("Start\n");
	signal(SIGINT, on_new_signal);
	sigset_t oldm, newm, pendingm;
	sigemptyset(&newm);
	sigaddset(&newm, SIGINT);
	printf("Block SIGINT\n");
	sigprocmask(SIG_BLOCK, &newm, &oldm);
	getchar();

	sigpending(&pendingm);
	if (sigismember(&pendingm, SIGINT))
		printf("SIGINT is pending\n");

	printf("Unblock SIGINT\n");
	sigprocmask(SIG_SETMASK, &oldm, NULL);
	while(cnt < 4)
		pause();
	return 0;
}
$> gcc 4_sigprocmask.c
$>./a.out
Start
Block SIGINT
^C ^C ^C ^C
                             <press Enter>
SIGINT is pending
Unblock SIGINT
Processed a signal 1th time
^C Processed a signal 2th time
^C Processed a signal 3th time
^C Processed a signal 4th time
$>

Sigaction [1]

int sigaction(int signum, const struct sigaction *act,
              struct sigaction *oldact);

struct sigaction {
        void (*sa_handler)(int);
        void (*sa_sigaction)(int, siginfo_t *, void *);
        sigset_t sa_mask;
        int sa_flags;
};

Signal number

Set a signal handler for the whole process

New handler

Out-parameter to get the old handler

Two ways to process a signal

Very detailed information about the signal origins

ucontext_t * - saved context of an interrupted thread

Temporary mask to add to the thread mask during the signal processing. The signum signal is added implicitly by default

Handler behaviour management:

  • SA_NODEFER - don't block this signal during processing
  • SA_ONSTACK - choose another stack for the handler
  • SA_RESETHAND - reset the handler after each signal
  • SA_SIGINFO - use sa_sigaction handler
    ...

Sigaction [2]

static void
on_new_signal(int signum, siginfo_t *info,
              void *context)
{
	printf("Segfault: signum = %d, si_signo = "
               "%d, si_addr = %p, si_code = ",
	       signum, info->si_signo, info->si_addr);
	switch (info->si_code) {
	case SEGV_MAPERR:
		printf("not mapped\n");
		break;
	case SEGV_ACCERR:
		printf("no permission\n");
		break;
	default:
		printf("%d\n", info->si_code);
		break;
	}
	exit(-1);
}

int
main(void)
{
	struct sigaction act;
	act.sa_sigaction = on_new_signal;
	sigemptyset(&act.sa_mask);
	act.sa_flags = SA_SIGINFO;
	sigaction(SIGSEGV, &act, NULL);
	char *ptr = (char *) 10;
	*ptr = 100;
	return 0;
}
$> gcc 5_sigaction_info.c
$>./a.out
Segfault: signum = 11, si_signo = 11, si_addr = 0xa, si_code = not mapped
$>

To use an extended handler - SA_SIGINFO

The pointer, obviously, points at an invalid address

This is SIGSEGV handler which prints the exact place and the reason of the error

Sigaction [3.1]

static int finished = 0;
static int id = 0;

static int is_usr1_blocked(void)
{
	sigset_t old;
	sigprocmask(0, NULL, &old);
	return sigismember(&old, SIGUSR1);
}

static void on_new_signal(int signum)
{
	int my_id = ++id;
	printf("Begin processing %d, "\
               "usr1 block = %d\n", my_id,
	       is_usr1_blocked());
	if (my_id == 1)
		raise(signum);
	printf("End processing %d\n", my_id);
	++finished;
}

Check that SIGUSR1 is blocked

Signal handler sends its own signal again

int main(int argc, char **argv)
{
	struct sigaction act;
	act.sa_handler = on_new_signal;
	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	for (int i = 1; i < argc; ++i) {
		if (strcmp(argv[i], "nodefer") == 0)
			act.sa_flags |= SA_NODEFER;
	}
	sigaction(SIGUSR1, &act, NULL);
	printf("Before raises, usr1 block = %d\n",
               is_usr1_blocked());
	raise(SIGUSR1);
	while (finished != 2)
		sched_yield();
	printf("After raises, usr1 block = %d\n",
               is_usr1_blocked());
	return 0;
}

If the argument nodefer is passed to main, the signal handler is installed with SA_NODEFER - don't block the signal while the handler runs

Sigaction [3.2]

$> gcc 6_sigaction_mask.c
$>./a.out
Before raises, usr1 block = 0
Begin processing 1, usr1 block = 1
End processing 1
Begin processing 2, usr1 block = 1
End processing 2
After raises, usr1 block = 0
$>
$>
$> ./a.out nodefer
Before raises, usr1 block = 0
Begin processing 1, usr1 block = 0
Begin processing 2, usr1 block = 0
End processing 2
End processing 1
After raises, usr1 block = 0
$>

Sigaction [4.1]

static int finished = 0;
static jmp_buf buf;

static int is_usr1_blocked(void)
{
	sigset_t old;
	sigprocmask(0, NULL, &old);
	return sigismember(&old, SIGUSR1);
}

static void on_new_signal(int signum)
{
	printf("Process signal, usr1 block = %d\n",
               is_usr1_blocked());
	++finished;
	longjmp(buf, 1);
}

Jump from the handler instead of return

int main(void)
{
	struct sigaction act;
	act.sa_handler = on_new_signal;
	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	sigaction(SIGUSR1, &act, NULL);
	printf("Before raise, usr1 block = %d\n",
               is_usr1_blocked());
	if (setjmp(buf) == 0)
		raise(SIGUSR1);
	while (finished != 1)
		sched_yield();
	printf("After raise, usr1 block = %d\n",
               is_usr1_blocked());
	return 0;
}

When a jump is done to here, setjmp returns 1. On the first (real) call it is 0, and this is when the signal is sent. So basically 'setjmp' "returns" multiple times

Sigaction [4.2]

Linux:

$> gcc 7_sigaction_jmp.c
$>./a.out
Before raise, usr1 block = 0
Process signal, usr1 block = 1
After raise, usr1 block = 1
$>

Mac:

$> gcc 7_sigaction_jmp.c
$>./a.out
Before raise, usr1 block = 0
Process signal, usr1 block = 1
After raise, usr1 block = 0
$>

Sigaction [5.1]

static int finished = 0;
static sigjmp_buf buf;

static int is_usr1_blocked(void)
{
	sigset_t old;
	sigprocmask(0, NULL, &old);
	return sigismember(&old, SIGUSR1);
}

static void on_new_signal(int signum)
{
	printf("Process signal, usr1 block = %d\n",
               is_usr1_blocked());
	++finished;
	siglongjmp(buf, 1);
}

Jump from the handler instead of return

int
main(void)
{
	struct sigaction act;
	act.sa_handler = on_new_signal;
	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	sigaction(SIGUSR1, &act, NULL);
	printf("Before raise, usr1 block = %d\n",
               is_usr1_blocked());
	if (sigsetjmp(buf, 1) == 0)
		raise(SIGUSR1);
	while (finished != 1)
		sched_yield();
	printf("After raise, usr1 block = %d\n",
               is_usr1_blocked());
	return 0;
}

When a jump lands here, sigsetjmp returns 1 and restores the signal mask that was saved when sigsetjmp was called the first time.

Sigaction [5.2]

Linux:

$> gcc 8_sigaction_sigjmp.c
$>./a.out
Before raise, usr1 block = 0
Process signal, usr1 block = 1
After raise, usr1 block = 0
$>

Mac:

$> gcc 8_sigaction_sigjmp.c
$>./a.out
Before raise, usr1 block = 0
Process signal, usr1 block = 1
After raise, usr1 block = 0
$>

Sigsuspend [1]

int sigsuspend(const sigset_t *mask);

Atomic change of the signal mask and wait for any signal

A temporary mask to wait for a signal with

Sigsuspend [2.1]

static int finished = 0;

static int is_int_blocked(void)
{
	sigset_t old;
	sigprocmask(0, NULL, &old);
	return sigismember(&old, SIGINT);
}

static void on_new_signal(int signum)
{
	printf("Process signal, int block = %d\n",
               is_int_blocked());
	++finished;
}
int main(void)
{
	printf("Before main, int block = %d\n",
               is_int_blocked());
	sigset_t block, old;
	sigemptyset(&block);
	sigaddset(&block, SIGINT);
	sigprocmask(SIG_BLOCK, &block, &old);

	struct sigaction act;
	act.sa_handler = on_new_signal;
	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	sigaction(SIGINT, &act, NULL);
	sigemptyset(&block);
	while (finished != 1)
		sigsuspend(&block);
	sigprocmask(SIG_SETMASK, &old, NULL);
	printf("After main, int block = %d\n",
               is_int_blocked());
	return 0;
}

SIGINT was blocked, but each sigsuspend temporarily unblocks it in an attempt to receive the needed signal

Sigsuspend [2.2]

$> gcc 9_sigsuspend.c
$>./a.out
Before main, int block = 0
^C Process signal, int block = 1
After main, int block = 0
$>

Sigaltstack [1]

int sigaltstack(const stack_t *ss, stack_t *old_ss);

typedef struct {
        void *ss_sp;
        int ss_flags;
        size_t ss_size;
} stack_t;

An alternative stack for the signal handler

New stack

Out parameter, old stack

Stack memory and its size.

  • SIGSTKSZ - advised size
  • MINSIGSTKSZ - minimal possible size

Stack management flags:

  • SS_DISABLE - turn off the alternative stack, use the stack of a thread which received the signal

Sigaltstack [2.1]

static int finished = 0;

#define handle_error() ({printf("error = %s\n", \
                         strerror(errno)); exit(-1); })

static void
on_new_signal(int signum)
{
	int local_var;
	printf("Process signal, stack = %p\n", &local_var);
	++finished;
}

static void
stack_create(int size)
{
	int page_size = getpagesize();
	size = (size + page_size - 1) / page_size * page_size;
	if (size < SIGSTKSZ)
		size = SIGSTKSZ;
	stack_t s;
	s.ss_sp = malloc(size);
	printf("Page size = %d, new stack begin = %p, "
               "stack end = %p\n", page_size, s.ss_sp,
               (void *) ((char *) s.ss_sp + size));
	s.ss_flags = 0;
	s.ss_size = size;
	if (sigaltstack(&s, NULL) != 0)
		handle_error();
}
static void
stack_destroy(void)
{
	stack_t s;
	if (sigaltstack(NULL, &s) != 0)
		handle_error();
	s.ss_flags = SS_DISABLE;
	if (sigaltstack(&s, NULL) != 0)
		handle_error();
	printf("Destroy stack %p\n", s.ss_sp);
	free(s.ss_sp);
}

Stack size should be multiple of page size, and not less than recommended

Where to get the memory from? - not important. From heap, from the current stack, from mmap, from anywhere

Now any signal whose sigaction was installed with the SA_ONSTACK flag is going to be processed on that stack

Turning the stack off is not simple. Passing NULL as the first argument won't help. The old stack has to be fetched explicitly

Set flag SS_DISABLE - it unbinds the old stack from the kernel internal structures, and now it can be freed

Sigaltstack [2.2]

static void
wait_signal(void)
{
	int need = finished + 1;
	raise(SIGUSR1);
	while (finished < need)
		sched_yield();
}

int
main(int argc, char **argv)
{
	struct sigaction act;
	printf("Main, stack = %p\n", &act);
	act.sa_handler = on_new_signal;
	sigemptyset(&act.sa_mask);

	act.sa_flags = SA_ONSTACK;
	stack_create(1024 * 64);
	if (sigaction(SIGUSR1, &act, NULL) != 0)
		handle_error();
	wait_signal();
	stack_destroy();

	act.sa_flags = 0;
	if (sigaction(SIGUSR1, &act, NULL) != 0)
		handle_error();
	wait_signal();
	return 0;
}

Try to handle on the alternative stack

And on a regular. The addresses should be very different

$> gcc 10_sigaltstacks.c
$>./a.out
Main, stack = 0x7ffeea01aa30
Page size = 4096, new stack begin = 0x105bfd000, stack end = 0x105c1d000
Process signal, stack = 0x105c1ca98
Destroy stack 0x105bfd000
Process signal, stack = 0x7ffeea01a438
$>

After the new stack is set, it is clear that the handler uses it

When the stack is dropped, the regular one is used again

Libcoro [1]

With sigaction, sigaltstack, sigsuspend and siglongjmp a coroutine library can be created

But! There is a problem - the first return from the function that called setjmp invalidates the saved jmp_buf. Why? Doesn't that make it impossible to use in our planned library?

Libcoro [2]

static volatile jmp_buf buf;
static volatile bool stop = false;
static volatile int total = 0;

static void
testf2(void)
{
	setjmp(buf);
	if (++total > 10) {
		printf("Bad exit\n");
		exit(-1);
	}
}

static void
testf(void)
{
	testf2();
}

int
main(void)
{
	testf();
	if (! stop) {
		stop = true;
		printf("After first call\n");
		longjmp(buf, 1);
	}
	return 0;
}
$> gcc 11_setjmp_problem.c
$>./a.out
After first call
Segmentation fault
$>

Why?

<Answer is revealed on the lecture>

2 points

Libcoro [3]


[Diagram: the stack holds <ret addr to main> and <ret addr to testf>, the remaining slots are empty.]

Error! There is no return address on the stack!

Libcoro [4]

A possible solution - never do returns. All return addresses are stored manually in a stack of jmp_buf objects

Problems:

  • Memory overhead to store each jmp_buf
  • C++ destructors won't be called
  • Can't use local variables, because the stack is shared by all coroutines
  • Easy to make a mistake

Libcoro [5]

Coroutine creation:

  • Create a stack, initialize it as a "signal handler stack"
  • Send a signal to self, the kernel will throw us on top of the signal handler stack
  • Remember that place via setjmp, jump back from the handler
  • Detach the stack from the signal handler
  • Now you have a stack not used for anything, on top of which you can jump and do whatever you want

Now you can jump back to that stack using longjmp. But you can't return from it, because the return address is no longer there. It means that you can only jump. Where to? This is where a scheduler appears: finished coroutines jump into it, and the user collects them there.
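The idea of "a separate stack that you jump onto and off" can be tried out without the signal trick. The sketch below deliberately swaps in a different technique, the `<ucontext.h>` functions getcontext(), makecontext() and swapcontext(), so it is not how the library in this lecture is built. All `sketch_` names are hypothetical. It runs one coroutine which yields once and records the order of execution.

```c
#include <stdlib.h>
#include <ucontext.h>

#define SKETCH_STACK_SIZE (64 * 1024)

static ucontext_t sketch_main_ctx; /* Plays the role of the scheduler. */
static ucontext_t sketch_coro_ctx;
static int sketch_trace[8];
static int sketch_trace_len = 0;

/* Coroutine body: runs on its own stack. */
static void
sketch_coro_body(void)
{
	sketch_trace[sketch_trace_len++] = 1;
	/* Yield: save this context and jump to the scheduler. */
	swapcontext(&sketch_coro_ctx, &sketch_main_ctx);
	sketch_trace[sketch_trace_len++] = 3;
	/* Returning is fine here: uc_link says where to continue. */
}

/* Create the coroutine, resume it twice. Returns 0 or -1. */
static int
sketch_coro_run(void)
{
	void *stack = malloc(SKETCH_STACK_SIZE);
	if (stack == NULL)
		return -1;
	if (getcontext(&sketch_coro_ctx) != 0) {
		free(stack);
		return -1;
	}
	sketch_coro_ctx.uc_stack.ss_sp = stack;
	sketch_coro_ctx.uc_stack.ss_size = SKETCH_STACK_SIZE;
	sketch_coro_ctx.uc_link = &sketch_main_ctx;
	makecontext(&sketch_coro_ctx, sketch_coro_body, 0);

	/* First jump: the coroutine runs until its yield. */
	swapcontext(&sketch_main_ctx, &sketch_coro_ctx);
	sketch_trace[sketch_trace_len++] = 2;
	/* Second jump: the coroutine continues and finishes. */
	swapcontext(&sketch_main_ctx, &sketch_coro_ctx);
	sketch_trace[sketch_trace_len++] = 4;
	free(stack);
	return 0;
}
```

After `sketch_coro_run()` the trace holds 1, 2, 3, 4: control alternates between the coroutine stack and the original one. Unlike with the setjmp approach, here a finished coroutine may return, because `uc_link` names the context to continue with.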

Libcoro [6]

Libcoro [7.1]

#define coro_count 3
static struct coro *coros[coro_count];

static void
recursive(int deep, int id)
{
	printf("%d: recursive step %d\n", id, deep);
	coro_yield();
	if (deep < 2)
		recursive(deep + 1, id);
	printf("%d: finish recursive step %d\n", id, deep);
}

static int
coro_func(void *ptr)
{
	int id = (int) ptr;
	printf("%d: coro is started\n", id);
	coro_yield();
	recursive(0, id);
	return id;
}

static int
coro_tree_func(void *ptr)
{
	int id = (int) ptr;
	printf("%d: coro line is started\n", id);
	coro_yield();
	if (id > 1) {
		printf("%d: coro line end\n", id);
	} else {
		printf("%d: coro line next\n", id);
		coro_new(coro_tree_func, (void *) (id + 1));
	}
	coro_yield();
	return id;
}

int
main(void)
{
	printf("Start main\n");
	coro_sched_init();
	for (int i = 0; i < coro_count; ++i)
		coros[i] = coro_new(coro_func, (void *) i);
	struct coro *c;
	while ((c = coro_sched_wait()) != NULL) {
		printf("Finished %d\n", coro_status(c));
		coro_delete(c);
	}
	printf("Finished simple\n");

	coro_new(coro_tree_func, (void *) 0);
	while ((c = coro_sched_wait()) != NULL) {
		printf("Finished %d\n", coro_status(c));
		coro_delete(c);
	}
	printf("Finish main\n");
	return 0;
}

Libcoro [7.2]

$> gcc 12_libcoro_example.c 12_libcoro.c
$>./a.out
Start main
2: coro is started
1: coro is started
0: coro is started
2: recursive step 0
1: recursive step 0
0: recursive step 0
2: recursive step 1
1: recursive step 1
0: recursive step 1
2: recursive step 2
1: recursive step 2
0: recursive step 2
2: finish recursive step 2
2: finish recursive step 1
2: finish recursive step 0
Finished 2
1: finish recursive step 2
1: finish recursive step 1
1: finish recursive step 0
Finished 1
0: finish recursive step 2
0: finish recursive step 1
0: finish recursive step 0
Finished 0
Finished simple

0: coro line is started
0: coro line next
1: coro line is started
Finished 0
1: coro line next
2: coro line is started
Finished 1
2: coro line end
Finished 2
Finish main

Thoughts

Is it possible to implement a full featured preemptive scheduler in user-space using the signals hack?

For coroutines - yes, but it is not worth doing. There is no legal and clear way to avoid being interrupted in the middle of a critical section or of a function working with globally visible state. All such functions would need to block signals while they work, which makes 'preemption' useless, because literally any malloc/new would then require signals to be blocked.

For threads - yes, but also not worth doing, for a different reason: it is already implemented in the kernel. The problem of global state and critical sections doesn't exist here, because standard library functions are usually thread-safe and work fine even if the current thread is paused.

Summary

Interrupts are of 2 types: software for exceptions on CPU and hardware for handling events from devices (disks, RAM, time source, etc).

Signals are similar to interrupts, but for processes. Raised by the kernel on certain events. Most of the signals can be intercepted and processed.

sigaction() is new and good, signal() is old and bad. Both let you control signal handling per process.

Most signals can be ignored, or blocked in the threads of your choice via sigprocmask().

You can use your own stack for signal handling and do funny stuff with it like coroutines.

Signal handler is a very dangerous place where it is not safe to do almost anything. The list of safe-to-use standard functions is here: man7.org/linux/man-pages/man7/signal-safety.7.html

Conclusion

Next time:


File system. Virtual FS in the kernel. Files, their types. I/O operations and their kernel schedulers. Page cache. Modes of work with file.

System programming 4

By Vladislav Shpilevoy

Interrupts. Hardware and software, their nature, purpose. Interrupt handling. Purpose of signals, how they work. Signals handling, execution context, longjmp, top and bottom halves. /proc/interrupts. signal, sigaction.
