About me
Geek of all trades, just another Perl hacker, tech enthusiast, Mac user, Vim lover, open source sustainer, Usenet lurker. AKA larsen on Twitter.
My book
Category Archives: Links
links for 2011-09-22
September 22, 2011 – 4:02 pm
links for 2011-09-18
September 18, 2011 – 4:02 pm
- The scikits.timeseries module provides classes and functions for manipulating, reporting, and plotting time series of various frequencies. The focus is on convenient data access and manipulation while leveraging the existing mathematical functionality in numpy and scipy.
links for 2011-09-17
September 17, 2011 – 4:01 pm
- At Cloudkick we track a ton of metrics about our customers' servers, and it's quite a challenge to store such massive amounts of data. Early on, we made the decision to avoid using tools like RRDTool, so we could provide a more holistic look at infrastructure.
- I'm writing this up because there's always quite a bit of discussion on both the Cassandra and Hector mailing lists about indexes and the best ways to use them.
- When building a Cassandra cluster, the “key” question (sorry, that’s weak) is whether to use the RandomPartitioner (RP) or the OrderPreservingPartitioner (OPP). These control how your data is distributed over your nodes. Once you have chosen your partitioner, you cannot change it without wiping your data, so think carefully!
  For Cassandra newbies, like me and my team of HBasers wanting to try a quick port of our project (more on why in another post), nailing the exact issues is quite daunting. So here is a quick summary.
- Based on Ronald Mathies’ intro articles to Cassandra and a few other resources I’ve been gathering, I thought I should put together a detailed guide to getting started with Cassandra.
- This article discusses running MapReduce jobs using the Apache tools Pig and Hive. Before we can process the data, we need to upload the files to be processed to HDFS or S3. We recommend uploading to HDFS and keeping the important files in S3 for backup. S3 is easily accessible from the command line using tools like s3cmd. HDFS is a failover cluster filesystem that gives your data enough protection against instance failures.
- Cube is an open-source system for visualizing time series data, built on MongoDB, Node and D3. If you send Cube timestamped events (with optional structured data), you can easily build realtime visualizations of aggregate metrics for internal dashboards.
- Here at Flickr, we’re pretty nerdy. We like to measure stuff. We love measuring stuff. The more stuff we can measure, the better our understanding of how different parts of the website work with each other gets. There are two types of measurement we especially like to do – counting and timing. These exciting activities help us to know what is happening when things break – if a page is taking a long time to load, where is that time being spent and what task have we started to do more of.
links for 2010-11-20
November 20, 2010 – 4:01 pm
- At the root of this the real problem remains. Programmers should read code. Lots of it. They should be actively seeking to improve their code reading skills by examining what others have managed to do. They should scan existing code bases, and spend some time thinking about what works and what doesn’t. What is readable and what is horrible. They shouldn’t need a tour guide to examine an algorithm, nor should they rely on some hastily written summary of the internals.
links for 2010-11-19
November 19, 2010 – 4:02 pm
- Munin is a networked resource monitoring tool that can help analyze resource trends and "what just happened to kill our performance?" problems.
links for 2010-10-06
October 6, 2010 – 4:02 pm
