On Parallelizing Streaming Algorithms
Calvin Lab Auditorium
We study the complexity of parallelizing streaming algorithms (or, equivalently, branching programs). If $M(f)$ denotes the minimum average memory required to compute a function $f(x_1, x_2, \ldots, x_n)$, how much memory is required to compute $f$ on $k$ independent streams that arrive in parallel? We show that when the inputs (updates) are sampled independently from some domain and $M(f) = \Omega(n)$, then computing the value of $f$ on $k$ streams requires average memory at least $\Omega(k\sqrt{n\,M(f)})$.
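As a toy illustration of the setting (not the talk's construction), the sketch below processes $k$ independent streams in parallel, keeping separate state per stream. The choice of function (parity per stream) and the one-bit-per-stream memory accounting are placeholder assumptions for illustration only:

```python
import random

def run_parallel_streams(k: int, n: int) -> list[int]:
    """Toy model: k independent streams of n random updates arrive in
    parallel, and we compute a function of each stream.

    The persistent state is one parity bit per stream, so the memory
    footprint grows linearly in k -- the qualitative behavior the
    abstract's lower bound says is unavoidable for hard functions.
    """
    state = [0] * k  # one bit of memory per stream (illustrative f = parity)
    for _ in range(n):        # one round of updates, one update per stream
        for i in range(k):
            update = random.randint(0, 1)  # inputs sampled independently
            state[i] ^= update             # fold the update into stream i's state
    return state

if __name__ == "__main__":
    print(run_parallel_streams(k=4, n=100))
```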
Our results are obtained by defining new ways to measure the information complexity of streaming algorithms. We define two such measures: the \emph{transitional} and \emph{cumulative} information content. We prove that any streaming algorithm with transitional information content $I$ can be simulated using average memory $O(\sqrt{n(I+1)})$. On the other hand, if every algorithm with cumulative information content $I$ can be simulated with average memory $O(I+1)$, then computing $f$ on $k$ streams requires average memory at least $\Omega(k(M(f)-1))$.
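To see how a simulation result of this shape could drive a $k$-stream lower bound, here is a rough sketch; the inversion step and the additivity of information content across independent streams are my assumptions, not claims made in the abstract:

```latex
% Hedged sketch (assumptions: the simulation bound is tight enough to
% invert, and information content adds across independent streams).
\[
  M(f) \;\le\; O\!\left(\sqrt{n\,(I+1)}\right)
  \quad\Longrightarrow\quad
  I \;\ge\; \Omega\!\left(\frac{M(f)^2}{n}\right) - 1 .
\]
% If information content adds across the k independent streams, any
% k-stream algorithm carries information content at least k(M(f)^2/n - 1);
% for M(f) = \Omega(n) this is \Omega(k \cdot M(f)), consistent with the
% stated \Omega(k \sqrt{n M(f)}) bound, since both are \Theta(kn) there.
```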
Joint work with Anup Rao.