Big Data
Scaling Time Series Databases
We collect a lot of metrics about our production systems using Graphite time series databases. To improve Graphite's performance and reduce the load on our SAN, we purpose-built and tuned some very fast dedicated hardware for our Graphite databases.
16 minute read
Our Top 10 Big Data News Sources
Keeping on top of an area of technology that moves as rapidly as the big data ecosystem is hard. Our data tribe share some of their resources for keeping up to date.
5 minute read
Big Data Spain or how I used my Tech Ninja Fund
We sent Software Engineer Iker Gomez to the Big Data Spain conference in Madrid to learn more about big data technologies and real-time processing.
9 minute read
Kafka Cluster Sizing
We’re starting to use Kafka for a number of projects. We can start off on virtual machines on our shared VMware cluster, but we expect the disk IO to soon reach levels that will make it unsuitable for running on our shared storage. This post looks at some techniques for sizing up a physical Kafka cluster.
5 minute read
When Hadoop tools disagree with each other
We recently saw an 8-year spike on one of our graphs. It caused much amusement when it was tweeted out, but there’s actually a good story behind this apparent 8-year lag in data processing.
6 minute read