"A man who dares to waste one hour of time has not discovered the value of life."
Charles Darwin
Virtual-Time Execution
Not about Lorentz transformations:
$$ t' = \gamma \left( t - \frac{v x}{c^2} \right) $$
$$ x' = \gamma \left( x - v t \right) $$
Not about neuroscience:
"Time can warp when our brain receives much more or less input than usual in a three-second span."
Really about developing software with performance in mind and identifying optimisation opportunities with the highest impact.
void execute_query() {
    // todo
}

// your growing code-base
int main(void) {
    parse_request();
    prepare_query();
    lock();
    execute_query();
    unlock();
    prepare_response();
    send_response();
}

int tense_move(const struct timespec * delta);
int tense_move_ns(unsigned long long delta_ns);
// todo: new cache policy will make this 50% faster
void execute_query() {
    ...
}

// your large codebase
int main(void) {
    parse_request();
    prepare_query();
    lock();
    execute_query();
    unlock();
    prepare_response();
    send_response();
}

int tense_scale_percent(int percent);
int tense_scale_clear(void);

int tense_sleep(const struct timespec * sleep);
int tense_sleep_ns(unsigned long long sleep_ns);

int tense_time(struct timespec * now);

Time-scaling is dynamic
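A minimal sketch of how the scaling calls might bracket a region such as execute_query(). The two prototypes are the ones listed above; everything else is an assumption made for illustration: the placeholder definitions (which only record the requested value so the sketch builds without libtense), the helper run_scaled, the convention that 0 means success, and the use of 100 for "no scaling". Which percent value corresponds to "50% faster" is left to the caller, because the listing above does not say.

```c
#include <assert.h>

/* Prototypes as listed above. */
int tense_scale_percent(int percent);
int tense_scale_clear(void);

/* Placeholder definitions so the sketch builds without libtense.
 * They only record the requested scale; this is NOT the real library. */
static int current_percent = 100; /* assumed: 100 means "no scaling" */
int tense_scale_percent(int percent) { current_percent = percent; return 0; }
int tense_scale_clear(void) { current_percent = 100; return 0; }

/* Stand-in for the region of interest; it records the scale it ran under. */
static int observed_percent;
static void execute_query(void) { observed_percent = current_percent; }

/* Hypothetical helper: run fn under a time scale, then clear the scale.
 * Assumes the library calls return 0 on success. */
static int run_scaled(int percent, void (*fn)(void))
{
    assert(fn != 0);
    if (tense_scale_percent(percent) != 0)
        return -1;
    fn();
    return tense_scale_clear();
}
```

Calling run_scaled(percent, execute_query) between lock() and unlock() would confine the scaling to the query itself, which is what makes the scaling dynamic rather than process-wide.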
Interesting applications run more than two threads on more than two cores
Multitasking is the ability to execute a task without waiting for the current one to finish
Processes are tasks. Threads are tasks, too.
Tasks are the leaves of a hierarchy of scheduling queues.
The hierarchy is weighted based on process priorities or group CPU shares.
Execution time from the tasks is summed up the hierarchy.
TLDR: A bit more complex than round-robin.
Each task's real execution time is scaled by the weight of the scheduling entity it belongs to. The scaled value is called "virtual runtime".
The next task to run is picked by traversing the hierarchy of scheduling entities and choosing the one with the lowest virtual runtime at each level.
There is a separate hierarchy for each CPU.
Tasks can migrate between hierarchies.
TLDR: A lot more complex than round-robin.
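The weighting and the pick-next rule can be sketched for a single, flat queue. This is a simplified illustration, not the kernel's code: the hierarchy, the per-CPU queues and migration are all left out, and the names struct task, charge, pick_next and the constant BASE_WEIGHT are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BASE_WEIGHT 1024u /* assumed weight of a default-priority task */

struct task {
    const char *name;
    uint32_t weight;      /* larger weight means a larger CPU share */
    uint64_t vruntime_ns; /* weighted execution time accumulated so far */
};

/* Charge delta_ns of real execution time to a task, scaled by its weight:
 * a heavier task accumulates virtual runtime more slowly. */
static void charge(struct task *t, uint64_t delta_ns)
{
    assert(t->weight > 0);
    t->vruntime_ns += delta_ns * BASE_WEIGHT / t->weight;
}

/* Pick the task with the lowest virtual runtime (linear scan for clarity). */
static struct task *pick_next(struct task *tasks, size_t n)
{
    assert(n > 0);
    struct task *best = &tasks[0];
    for (size_t i = 1; i < n; i++)
        if (tasks[i].vruntime_ns < best->vruntime_ns)
            best = &tasks[i];
    return best;
}
```

After the same amount of real execution time, a task with twice the weight has half the virtual runtime, so it is picked again sooner.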
Priority (above) vs. Virtual runtime (below)
Dynamic linking tricks
libtense
tense file in debugfs
tense.ko
Patches to the Linux kernel
Static time dilation and two processes:
The problem is solved by scaling the sleep duration by the time dilation of the other process before setting the wake-up timer.
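For the static two-process case, the scaling step might look like the sketch below. It rests on an assumed meaning of the dilation value, which the text above does not define: a percent p is taken to mean that the other process's virtual clock advances p ns for every 100 ns of real execution. The function name real_timer_ns is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Real timer duration needed for a sleep of virtual_sleep_ns while the only
 * other process runs with time dilation other_percent.
 * ASSUMPTION: other_percent = p means p virtual ns per 100 real ns. */
static uint64_t real_timer_ns(uint64_t virtual_sleep_ns, unsigned other_percent)
{
    assert(other_percent > 0);
    return virtual_sleep_ns * 100u / other_percent;
}
```

Under that assumption, a 1 ms virtual sleep next to a process scaled to 50 needs a 2 ms real timer, because the other process must execute for 2 ms of real time before 1 ms of virtual time has passed.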
Dynamic time dilation and N processes:
Sleep until the wake-up time falls within the next virtual tick (now < wake-up time < now + virtual tick), then set a timer as above, assuming that the next process to run is the only other process and that it will not change its time dilation. If either assumption is violated, restart the timer.
Experimental setup
Producer thread:
1) sample sleep time from an exponential distribution
2) nanosleep (which is replaced by tense_sleep_ns dynamically)
3) put a job in a shared ring buffer
Consumer thread:
1) busy-wait for jobs while the ring buffer is empty
2) serve a job by calling a deterministic workload (loop)
3) measure waiting and service times
Analytical mean waiting time (Pollaczek-Khintchine formula):
$$ E(W) = \frac{\rho\mu}{2(1-\rho)} \left( 1 + \frac{\sigma^2}{\mu^2} \right) $$
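The formula can be evaluated directly to get the reference value that measured waiting times are compared against. The sketch reads \rho as the utilisation, \mu as the mean service time and \sigma as the standard deviation of the service time. That reading is an assumption, since the symbols are not defined above, but it is the one under which the expression matches the standard Pollaczek-Khintchine result. The function name pk_mean_wait is hypothetical.

```c
#include <assert.h>

/* Mean waiting time in an M/G/1 queue (Pollaczek-Khintchine).
 * rho:   utilisation, must be in [0, 1)
 * mu:    mean service time (assumed reading of the symbol)
 * sigma: standard deviation of the service time */
static double pk_mean_wait(double rho, double mu, double sigma)
{
    assert(rho >= 0.0 && rho < 1.0 && mu > 0.0);
    double cv2 = (sigma * sigma) / (mu * mu); /* squared coefficient of variation */
    return rho * mu / (2.0 * (1.0 - rho)) * (1.0 + cv2);
}
```

For the deterministic workload used by the consumer, sigma is 0 and the second factor becomes 1, so the expected wait is half of what exponentially distributed service times with the same mean would give.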
We need to synchronise multiple timelines on multi-core.
This is quite a dramatic change to the uniprocessor algorithm.
There are some tricky details - catch-up time, deadlocks, migrations.
The goal is to evaluate Tense with a large application.
We choose the dedup workload because it is CPU-bound. We apply a virtual speed-up to a compression function that can also be made genuinely faster in real time by changing the algorithm.
We can accurately predict the relative speed-up, but the absolute execution time is overestimated by 10% in both cases.
Avoid running a custom kernel.
Fix known bugs in the SMP algorithm and discover new ones.
Improve user-space features.