Rackspace needed a metrics system that could ingest 30 million signals generated by the Cloud Monitoring system. It had to offer custom data retention levels while still serving graphs to customers in real time. Gary and his team built a distributed system of shared-nothing nodes on top of Cassandra that splits the work into three responsibilities: ingesting data, processing rollups, and serving data points for reads. Depending on the need, a node can easily be reconfigured to support all or some of those functions. In this session you will learn about techniques for scheduling rollups while maintaining numerical accuracy, how to handle non-numerical data points, how to use open-source technology (Apache Cassandra, Scribe, Thrift, and Node.js) to deliver results relatively quickly, and much more.
Apache Cassandra Committer and Systems Architect, Rackspace Hosting
An Apache Cassandra committer and PMC member, Gary Dusbabek is a lifelong programmer specializing in distributed systems. His past experience includes working with large-scale text and image indexes in the newspaper industry and with high-volume advertisement booking software. His recent work at Rackspace includes working on Cassandra full time and serving as a founding member of the Cloud Monitoring team. Gary currently works on the Rackspace Service Registry.