# Event № 236

Type: Lecture

Name: Colloquium

Title: Learning patterns in Big data from small data using core-sets

Speaker: Dan Feldman

Place:
Taub Building, Floor 3, room 337, Technion

*Abstract:*

When we need to solve an optimization problem, we usually use the best available algorithm or software, or try to improve it. In recent years we have started exploring a different approach: instead of improving the algorithm, reduce the input data and run the existing algorithm on the reduced data to obtain the desired output much faster. A core-set for a given problem is a semantic compression of its input, in the sense that a solution for the problem with the (small) core-set as input yields a provably approximate solution to the problem with the original (Big) Data. A core-set can usually be computed in one pass over a streaming input, using a manageable amount of memory, and in parallel. For real-time performance we use Hadoop, clouds, and GPUs.

In this talk I will describe how we applied this magical paradigm to obtain algorithmic achievements with performance guarantees in iDiary: a system that combines sensor networks, robotics, differential privacy, and text mining. It turns large signals collected from smartphones or robots into maps and textual descriptions of their trajectories. The system features a user interface similar to Google Search that allows users to type text queries on their activities (e.g., "Where did I have dinner last time I visited Paris?") and receive textual answers based on their signals.

*Short Bio:*

Dan Feldman is a postdoc at MIT in the Distributed Robotics Lab, where he develops systems for handling streaming Big Data from sensors, smartphones, images, and robots. He received his Ph.D. from Tel-Aviv University in 2010, under the supervision of Prof. Micha Sharir and Prof. Amos Fiat. He was then a postdoc at the Center for the Mathematics of Information at Caltech for a year and a half, where he started to reduce the gap between theoretical computational geometry and practical machine learning. He specializes in developing software for scalable data compression, based on core-set constructions with provable guarantees. In recent years his core-sets have been implemented in start-ups, banks, supermarkets, and internet search companies, to name just a few. When he is not working, Dan is building robots with his very own core-sets, Ariel and Eleanor.
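
To make the core-set idea from the abstract concrete, here is a minimal toy sketch in Python. It is not taken from the talk: it uses a well-known special case (the sum of squared distances from a set of 1-D points to a query point), where three numbers summarize the input exactly. The constructions discussed in the talk are approximate and far more general. The names `MeanSummary` and `exact_cost` are hypothetical and chosen for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass
class MeanSummary:
    """Hypothetical three-number summary of a set of 1-D points."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, p: float) -> None:
        # One-pass (streaming) update: constant memory per summary.
        self.n += 1
        self.total += p
        self.total_sq += p * p

    def merge(self, other: "MeanSummary") -> "MeanSummary":
        # Summaries built on separate machines can simply be added.
        return MeanSummary(self.n + other.n,
                           self.total + other.total,
                           self.total_sq + other.total_sq)

    def cost(self, x: float) -> float:
        # sum (p - x)^2 = sum p^2 - 2 x sum p + n x^2
        return self.total_sq - 2.0 * x * self.total + self.n * x * x


def exact_cost(points, x):
    """Cost computed on the full input, for comparison."""
    return sum((p - x) ** 2 for p in points)


random.seed(0)
points = [random.uniform(-10, 10) for _ in range(10_000)]

# Simulate two parallel workers, each streaming over half of the data.
left, right = MeanSummary(), MeanSummary()
for p in points[:5000]:
    left.add(p)
for p in points[5000:]:
    right.add(p)
summary = left.merge(right)

query = 3.0
print(summary.cost(query), exact_cost(points, query))
```

The two printed values agree up to floating-point rounding, for any query point, although the summary stores only three numbers. In this toy case the compression is exact; for harder problems a core-set typically trades a small, provable approximation error for a similar reduction in size.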

SubmittedBy:
Hadas Heier, heier@cs.technion.ac.il

EventLink: Event № 236