Efficient Processing of Massive Data Streams for Mining and Monitoring
Mirek Riedewald (Cornell University), firstname.lastname@example.org,
Johannes Gehrke (Cornell University), email@example.com,
Alan Demers (Cornell University), firstname.lastname@example.org,
Abhinandan Das (Cornell University), email@example.com, and
Alin Dobra (Cornell University), firstname.lastname@example.org
High-speed data streams pose a serious challenge for data management because the traditional DBMS paradigm of set-oriented processing of disk-resident records does not apply. Blocking operators and operators whose state is unbounded on infinite input are especially problematic, because their memory footprint can grow without bound. At Cornell University we are designing a system for distributed data stream mining and monitoring. In this talk, we first give an overview of some open research challenges in processing data streams and then describe algorithms for the approximate computation of set-valued query results.