Say you go fishing in a pond. There are many different types of fish, and you catch a bounty. At the end of the day, how can you quickly estimate the fish types that appear only a few times in your catch? Suppose now that your colleague goes fishing as well, and she also has a bountiful catch. At the end of the day, how do you quickly check whether the fish types the two of you have caught are similar or dissimilar?
When you mine stream data from sensors, Internet routers, and other sources, the problems you face are not unlike the fishing examples above. I will present solutions to these problems and discuss open problems that arise in dealing with data streams. The problems become hard when you throw some of the catch back into the pond!