I started hearing about Cassandra recently. Cassandra is an open source distributed database management system designed to handle very large amounts of data spread out across many commodity servers while providing a highly available service with no single point of failure. The short list of adopters is impressive: Facebook, Digg, Twitter, Rackspace, Reddit, etc.
The Digg staff in particular have been blogging about their progress with this new database technology and their reasons for adopting it. Basically, the penalties for using MySQL had become burdensome given the large amount of data Digg handles and the difficulty of scaling MySQL to meet the demand.
Our primary motivation for moving away from MySQL is the increasing difficulty of building a high performance, write intensive, application on a data set that is growing quickly, with no end in sight. This growth has forced us into horizontal and vertical partitioning strategies that have eliminated most of the value of a relational database, while still incurring all the overhead.
In September 2009, Digg evaluated Cassandra, and the evaluation was very successful. On the heels of that success, they are replacing most of their infrastructure components, moving away from LAMP and towards NoSQL. Soon, Digg will unveil its overhaul of the site, presumably running on the new platform they’ve built.
NPR features The Jobs Of Yesteryear: Obsolete Occupations
As computers and automated systems increasingly take the jobs humans once held, entire professions are now extinct.
A friend recently pointed me at a project called mcollective. The Marionette Collective, aka mcollective, is a framework for building server orchestration or parallel job execution systems. An introduction to mcollective is on the project’s wiki page.
We’ve attempted to think out of the box a bit designing this system by not relying on central inventories and tools like SSH, we’re not simply a fancy SSH “for loop”.
If you’re a system administrator and have lots of systems to manage you will immediately see how this software could be useful. If you’ve ever used tools like shmux for parallelized execution of commands via ssh, you’ll note the benefit of being able to use discovered meta-data with mcollective. That means you can break from using hostnames or a centralized document as the ultimate source of truth about your environment when running a parallelized operation. With tools like Facter the information lives on the servers to which it is applicable. You’ll always be able to rely on the most up-to-date information about your environment.
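To make the idea concrete, here is a minimal sketch of fact-based targeting. It is not mcollective's API; the Node class, the discover function, and the fact names are all hypothetical, made up only to show the difference between keeping a central host list and letting each server answer from its own facts.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical server that knows its own facts (the kind of data Facter reports)."""
    hostname: str
    facts: dict = field(default_factory=dict)

    def matches(self, fact_filter: dict) -> bool:
        # The node itself decides whether a request applies to it,
        # so no central inventory has to be kept up to date.
        return all(self.facts.get(key) == value for key, value in fact_filter.items())


def discover(nodes, fact_filter):
    """Return the hostnames of the nodes whose own facts satisfy the filter."""
    return [node.hostname for node in nodes if node.matches(fact_filter)]


# Illustrative data only; these hosts and facts are assumptions.
nodes = [
    Node("web1", {"operatingsystem": "CentOS", "role": "web"}),
    Node("web2", {"operatingsystem": "Debian", "role": "web"}),
    Node("db1", {"operatingsystem": "CentOS", "role": "db"}),
]

# Target "every CentOS web server" instead of naming hosts explicitly.
targets = discover(nodes, {"operatingsystem": "CentOS", "role": "web"})
print(targets)
```

In a real deployment the filter would travel over the network and each server would reply for itself; the list comprehension here just stands in for that broadcast.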
CHART OF THE DAY via Business Insider – Silicon Alley Insider.
…Its profits are still being generated by the same engines that have driven Microsoft for years: Office, Windows, and its server division. (Meanwhile, its entertainment and devices division is only recently profitable again, and its online division is a money pit.)
The first thing I thought about after seeing the new Apple iPad is what a great interface it could be for music creation. It’s large enough that it could be a perfect virtual mixing board, drum machine or remote control over MIDI/OSC. I could see Apple producing an iPad Logic application to allow for some automation or remote control. The initial reaction from the tech blogs has been lackluster, but I don’t think they’re thinking big enough about this product. Everyone is thinking of it in terms of the iPhone because it’s an easy reference point for what is possible. When you really think about all the types of applications available for the iPhone and the creativity that platform has driven, I’m sure it’s only a matter of time before we see some game-changing applications on the iPad.
I’m looking forward to trying one out at the local Apple store.