Streams à la carte
Extensible Pipelines with Object Algebras
Aggelos Biboudis¹, Nick Palladinos², George Fourtounis¹, Yannis Smaragdakis¹
University of Athens¹
Nessos Information Technologies²
29th European Conference on Object-Oriented Programming (ECOOP 2015)
Stream Libraries
- functional-inspired pipelines
- lazy
- fixed behavior and operators, e.g.:
- C# (LINQ), F# (Seq), and Scala (Views) implement Pull-streams
- Java 8 implements Push-streams
- Java 8 doesn't accept custom operators
Why un-fix the behavior?
- operators naturally push or pull ⇒ variable performance
- to mix-in behaviors e.g.:
- log with push
- fuse with pull
- blocking or not with push or pull
avoiding other pathological cases
Iterator<Long> iterator = Stream
.of(v)
.flatMap(x -> Stream.iterate(0L, i -> i + 2).map(y -> x * y))
.iterator();
iterator.hasNext(); // Out-of-memory :-(
Expression problem
Object Algebras: A design pattern to the rescue
An abstract factory
interface ExpFactory {
Exp lit(int x);
Exp add(Exp e1, Exp e2);
}
A generic factory
interface ExpFactory<Exp> {
Exp lit(int x);
Exp add(Exp e1, Exp e2);
}
An expression
<Exp> Exp mkAnExp(ExpFactory<Exp> f) {
return f.add(f.lit(1),
f.add(f.lit(2), f.lit(3)));
}
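The generic factory above can be given several implementations, each interpreting the same expression differently. A minimal sketch, assuming two illustrative factories (`EvalFactory` and `PrintFactory` are hypothetical names, not taken from the talk):

```java
// Illustrative sketch: ExpAlgebraDemo, EvalFactory and PrintFactory are hypothetical names.
class ExpAlgebraDemo {
    interface ExpFactory<Exp> {
        Exp lit(int x);
        Exp add(Exp e1, Exp e2);
    }

    // Interprets an expression as its integer value.
    static class EvalFactory implements ExpFactory<Integer> {
        public Integer lit(int x) { return x; }
        public Integer add(Integer e1, Integer e2) { return e1 + e2; }
    }

    // Interprets the same expression as a string.
    static class PrintFactory implements ExpFactory<String> {
        public String lit(int x) { return Integer.toString(x); }
        public String add(String e1, String e2) { return "(" + e1 + " + " + e2 + ")"; }
    }

    // The expression is written once, against the abstract factory.
    static <Exp> Exp mkAnExp(ExpFactory<Exp> f) {
        return f.add(f.lit(1), f.add(f.lit(2), f.lit(3)));
    }

    public static void main(String[] args) {
        System.out.println(mkAnExp(new EvalFactory()));  // prints 6
        System.out.println(mkAnExp(new PrintFactory())); // prints (1 + (2 + 3))
    }
}
```

The caller picks the interpretation by choosing which factory to pass; `mkAnExp` itself never changes.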
Algebraic Signatures
signature Exp
add : Exp × Exp → Exp
lit : Int → Exp
in the Object algebras realm
- interfaces are named algebras
- implementations are named factories
- new cases (by extending the algebra)
- new functions (by implementing the algebra)
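Both directions of extension can be sketched on the expression example. The following assumes a hypothetical new case `mul` and a hypothetical new function `DepthFactory`; neither name comes from the talk.

```java
// Illustrative sketch: all names except ExpFactory, lit and add are hypothetical.
class ExpExtensionDemo {
    interface ExpFactory<Exp> {
        Exp lit(int x);
        Exp add(Exp e1, Exp e2);
    }

    // New case: extend the algebra interface with one more constructor.
    interface MulExpFactory<Exp> extends ExpFactory<Exp> {
        Exp mul(Exp e1, Exp e2);
    }

    // Existing function (evaluation) over the original cases.
    static class EvalFactory implements ExpFactory<Integer> {
        public Integer lit(int x) { return x; }
        public Integer add(Integer e1, Integer e2) { return e1 + e2; }
    }

    // Evaluation of the extended algebra reuses the original implementation unchanged.
    static class MulEvalFactory extends EvalFactory implements MulExpFactory<Integer> {
        public Integer mul(Integer e1, Integer e2) { return e1 * e2; }
    }

    // New function: another implementation of the same interface (tree depth).
    static class DepthFactory implements MulExpFactory<Integer> {
        public Integer lit(int x) { return 1; }
        public Integer add(Integer e1, Integer e2) { return 1 + Math.max(e1, e2); }
        public Integer mul(Integer e1, Integer e2) { return 1 + Math.max(e1, e2); }
    }

    // 2 * (3 + 4), built against the extended algebra.
    static <Exp> Exp mkMulExp(MulExpFactory<Exp> f) {
        return f.mul(f.lit(2), f.add(f.lit(3), f.lit(4)));
    }
}
```

Neither extension required editing `ExpFactory` or `EvalFactory`, which is the property the expression problem asks for.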
we propose
- A library
- Inspired by Object Algebras
- Provide extensible streams with:
- Pluggable operators
- Pluggable behaviors
- Mixed-in behaviors
- Affect performance (in a good way)
What is the object algebra of Streams?
interface StreamAlg<C<_>> {
<T> C<T> source(T[] array);
<T, R> C<R> map(Function<T, R> f, C<T> s);
<T, R> C<R> flatMap(Function<T, C<R>> f, C<T> s);
<T> C<T> filter(Predicate<T> f, C<T> s);
}
(for intermediate operators)
What is the object algebra of Streams?
interface ExecStreamAlg<E<_>, C<_>> extends StreamAlg<C> {
<T> E<Long> count(C<T> s);
<T> E<T> reduce(T identity, BinaryOperator<T> acc, C<T> s);
}
(for terminal operators)
How do you extend streams?
Add new operators (by extending the algebra)
interface TakeStreamAlg<C<_>> extends StreamAlg<C> {
<T> C<T> take(int n, C<T> s);
}
Add new behavior (by implementing the algebra)
class PushFactory implements StreamAlg<Push>
Let's use a stream
PushFactory alg = new PushFactory();
int sum = alg.sum(
alg.map(x -> x * x,
alg.filter(x -> x % 2 == 0,
alg.source(v)))).value;
Streams à la carte
<E, C> E<Long> cart(ExecStreamAlg<E, C> alg) {
return alg.reduce(0L, Long::sum,
alg.flatMap(x ->
alg.map(y -> x * y, alg.source(v2)),
alg.source(v1)));
}
Declaring streams: reducing a Cartesian product
cart(new ExecPushFactory()).value;
cart(new ExecPullFactory()).value;
cart(new ExecFusedPullFactory()).value;
cart(new LogFactory<>(new ExecPushFactory())).value;
cart(new ExecFutureFactory<>(new ExecPushFactory())).get();
cart(new ExecFutureFactory<>(new ExecPullFactory())).get();
Using streams with various factories
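A factory such as `LogFactory` can be read as a decorator over another factory. The sketch below is an assumption about how such a mix-in could look, not the library's code: the algebra is reduced to `source` and `map`, and `PushMu`, `prj`, `invoke` and the `log` field are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Illustrative sketch of a logging mix-in; not the library's implementation.
class LogMixinDemo {
    // App<C, T> stands in for the higher-kinded application C<T>.
    interface App<C, T> {}

    // Reduced algebra (source and map only) to keep the sketch short.
    interface StreamAlg<C> {
        <T> App<C, T> source(T[] array);
        <T, R> App<C, R> map(Function<T, R> f, App<C, T> s);
    }

    static final class PushMu {} // marker type identifying the Push carrier
    interface Push<T> extends App<PushMu, T> { void invoke(Consumer<T> k); }
    static <T> Push<T> prj(App<PushMu, T> app) { return (Push<T>) app; }

    static class PushFactory implements StreamAlg<PushMu> {
        public <T> App<PushMu, T> source(T[] array) {
            Push<T> p = k -> { for (T x : array) k.accept(x); };
            return p;
        }
        public <T, R> App<PushMu, R> map(Function<T, R> f, App<PushMu, T> s) {
            Push<R> p = k -> prj(s).invoke(x -> k.accept(f.apply(x)));
            return p;
        }
    }

    // Decorator: valid for any carrier C because it only delegates to the wrapped algebra.
    static class LogFactory<C> implements StreamAlg<C> {
        final StreamAlg<C> alg;
        final List<String> log = new ArrayList<>();
        LogFactory(StreamAlg<C> alg) { this.alg = alg; }

        public <T> App<C, T> source(T[] array) { return alg.source(array); }
        public <T, R> App<C, R> map(Function<T, R> f, App<C, T> s) {
            // Wrap the user function so every application is recorded.
            return alg.map(x -> {
                R r = f.apply(x);
                log.add("map: " + x + " -> " + r);
                return r;
            }, s);
        }
    }
}
```

Because `LogFactory` never inspects the carrier, the same decorator can wrap a push or a pull factory without change.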
Push
Pull
<T> Pull<T> source(T[] array) {
return new Pull<T>() {
int cursor = 0;
public boolean hasNext() {
return cursor != array.length;
}
public T next() {
if (cursor >= array.length)
throw new NoSuchElementException();
return array[cursor++];
}
};
}
<T, R> Pull<R> map(Function<T, R> f, Pull<T> s) {
return new Pull<R>() {
// each call delegates to the upstream s
public boolean hasNext() { return s.hasNext(); }
public R next() { return f.apply(s.next()); }
};
}
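A runnable version of the pull operators can be sketched by modelling `Pull` directly as `java.util.Iterator`. That modelling, and the `sum` helper, are assumptions made for illustration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Illustrative sketch: Pull is modelled as java.util.Iterator (an assumption).
class PullDemo {
    static <T> Iterator<T> source(T[] array) {
        return new Iterator<T>() {
            int cursor = 0;
            public boolean hasNext() { return cursor < array.length; }
            public T next() {
                if (cursor >= array.length) throw new NoSuchElementException();
                return array[cursor++];
            }
        };
    }

    // The consumer drives the computation: each next() pulls one element from s.
    static <T, R> Iterator<R> map(Function<T, R> f, Iterator<T> s) {
        return new Iterator<R>() {
            public boolean hasNext() { return s.hasNext(); }
            public R next() { return f.apply(s.next()); }
        };
    }

    // Hypothetical terminal operator that drains the iterator.
    static long sum(Iterator<Long> s) {
        long total = 0;
        while (s.hasNext()) total += s.next();
        return total;
    }
}
```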
<T> Push<T> source(T[] array) {
return k -> {
for (int i = 0; i < array.length; i++) {
k(array[i]);
}
};
}
<T, R> Push<R> map(Function<T, R> f, Push<T> s) {
return k -> s(i -> k(f(i)));
}
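The push operators above use function-call shorthand. In plain Java they could be sketched as follows, assuming `Push<T>` is a functional interface of shape `(T -> void) -> void`; the method name `invoke` and the `filter` and `sum` helpers are hypothetical.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch: the shape of Push and the name `invoke` are assumptions.
class PushDemo {
    interface Push<T> {
        void invoke(Consumer<T> k);
    }

    // The producer drives the computation: it loops and calls the continuation k.
    static <T> Push<T> source(T[] array) {
        return k -> {
            for (int i = 0; i < array.length; i++) k.accept(array[i]);
        };
    }

    static <T, R> Push<R> map(Function<T, R> f, Push<T> s) {
        return k -> s.invoke(x -> k.accept(f.apply(x)));
    }

    static <T> Push<T> filter(Predicate<T> p, Push<T> s) {
        return k -> s.invoke(x -> { if (p.test(x)) k.accept(x); });
    }

    // Hypothetical terminal operator: runs the pipeline and accumulates.
    static long sum(Push<Long> s) {
        long[] total = {0L};
        s.invoke(x -> total[0] += x);
        return total[0];
    }
}
```

With these definitions, the sum of the squares of the even elements of `{1, 2, 3, 4}` is computed as `sum(map(x -> x * x, filter(x -> x % 2 == 0, source(v))))`.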
object algebras are for construction
an algebra that fuses maps and filters
sometimes we need fully fledged pull
our pathological case from earlier with large nested stream
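To illustrate why a fully pull-based `flatMap` handles a nested infinite stream, here is a minimal sketch that again models `Pull` as `java.util.Iterator`. It is an assumption about the general technique, not the library's implementation, and `iterate` is a hand-written helper.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Illustrative sketch of a lazy, pull-based flatMap over iterators.
class LazyFlatMapDemo {
    // Infinite pull stream: seed, f(seed), f(f(seed)), ...
    static <T> Iterator<T> iterate(T seed, UnaryOperator<T> f) {
        return new Iterator<T>() {
            T current = seed;
            public boolean hasNext() { return true; }
            public T next() { T r = current; current = f.apply(current); return r; }
        };
    }

    static <T, R> Iterator<R> map(Function<T, R> f, Iterator<T> s) {
        return new Iterator<R>() {
            public boolean hasNext() { return s.hasNext(); }
            public R next() { return f.apply(s.next()); }
        };
    }

    // Advances the inner stream one element at a time, so it is never buffered.
    static <T, R> Iterator<R> flatMap(Function<T, Iterator<R>> f, Iterator<T> s) {
        return new Iterator<R>() {
            Iterator<R> inner = null;
            public boolean hasNext() {
                // Move to the next non-empty inner stream, if any.
                while (inner == null || !inner.hasNext()) {
                    if (!s.hasNext()) return false;
                    inner = f.apply(s.next());
                }
                return true;
            }
            public R next() {
                if (!hasNext()) throw new NoSuchElementException();
                return inner.next();
            }
        };
    }
}
```

Here `hasNext()` returns after inspecting a single inner element, even though the inner stream never ends.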
How did we encode higher-kinded types?
Clever technique, already used in Java and C# libraries
- Gronau: HighJ
- Magi
Also recently presented in an OCaml publication
How did we encode higher-kinded types?
interface StreamAlg<C> {
<T> App<C,T> source(T[] array);
<T, R> App<C,R> map(Function<T,R> f, App<C,T> s);
<T, R> App<C,R> flatMap(Function<T, App<C,R>> f, App<C,T> s);
<T> App<C,T> filter(Predicate<T> f, App<C,T> s);
}
interface StreamAlg<C<_>> {
<T> C<T> source(T[] array);
<T, R> C<R> map(Function<T, R> f, C<T> s);
<T, R> C<R> flatMap(Function<T, C<R>> f, C<T> s);
<T> C<T> filter(Predicate<T> f, C<T> s);
}
interface App<C, T> {}
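The encoding can be exercised end to end with a small sketch. The marker type `PushMu`, the projection `prj`, the method name `invoke`, and the helpers `evenSquares` and `sum` are all hypothetical names chosen for illustration; `flatMap` is omitted for brevity.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch of the App<C, T> encoding; names other than App and StreamAlg are hypothetical.
class HktDemo {
    // App<C, T> stands for the application C<T>, which Java cannot express directly.
    interface App<C, T> {}

    interface StreamAlg<C> {
        <T> App<C, T> source(T[] array);
        <T, R> App<C, R> map(Function<T, R> f, App<C, T> s);
        <T> App<C, T> filter(Predicate<T> p, App<C, T> s);
    }

    // Marker type that identifies the Push carrier at the type level.
    static final class PushMu {}

    interface Push<T> extends App<PushMu, T> {
        void invoke(Consumer<T> k);
    }

    // Downcast that is safe by convention: only Push implements App<PushMu, T>.
    static <T> Push<T> prj(App<PushMu, T> app) { return (Push<T>) app; }

    static class PushFactory implements StreamAlg<PushMu> {
        public <T> App<PushMu, T> source(T[] array) {
            Push<T> p = k -> { for (T x : array) k.accept(x); };
            return p;
        }
        public <T, R> App<PushMu, R> map(Function<T, R> f, App<PushMu, T> s) {
            Push<R> p = k -> prj(s).invoke(x -> k.accept(f.apply(x)));
            return p;
        }
        public <T> App<PushMu, T> filter(Predicate<T> pred, App<PushMu, T> s) {
            Push<T> p = k -> prj(s).invoke(x -> { if (pred.test(x)) k.accept(x); });
            return p;
        }
    }

    // A pipeline written once, for any carrier C.
    static <C> App<C, Integer> evenSquares(StreamAlg<C> alg, Integer[] v) {
        App<C, Integer> src = alg.source(v);
        App<C, Integer> evens = alg.filter(x -> x % 2 == 0, src);
        return alg.map(x -> x * x, evens);
    }

    // Running the pipeline requires projecting back to the concrete carrier.
    static int sum(App<PushMu, Integer> s) {
        int[] total = {0};
        prj(s).invoke(x -> total[0] += x);
        return total[0];
    }
}
```

The cost of the encoding is the projection in `prj`: the type checker cannot prove it safe, so the marker type must be used by exactly one carrier.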
Types
- Id (a type level X => X)
- Push (T -> Unit)->Unit
- Pull (extends Iterator)
- Future (extends FutureTask)
To sum up
- A library implementation
- Inspired by Object Algebras
- Extensible operators
- Pluggable behaviors
- Mixed-in behaviors
- Performance is still there
Thank you
This deck: http://slides.com/biboudis/streamalg-ecoop15
The code: http://github.com/biboudis/streamalg