News & Blogs
October 1, 2012 – Facebook knows something about big data — it collects more data and has built more tools than almost anybody else. Here, Facebook’s Jay Parikh and Accel Partners’ Ping Li talk about what lessons big data startups can take from Facebook to build businesses that can succeed.
The New York Times
October 1, 2012 – Leandre Nsabi, a senior at Rainier Beach High School here, received some bluntly practical advice from an instructor recently. “My teacher said there’s a lot of money to be made in computer science,” Leandre said. “It could be really helpful in the future.”
Silicon Valley Research Group
October 1, 2012 – In an era where Big Data is the new big thing, it was refreshing to hear some interesting thoughts during a recent briefing by Gartner and Dell (see link below) from Praveen Asthana, VP of Enterprise Strategy and Marketing at Dell, and Cameron Haight, VP at Gartner Research.
Search Data Management
September 28, 2012 – Hybrids are all the rage in automotive circles, but the term is also gaining currency in data warehousing. A new style of hybrid or logical infrastructure, combining traditional enterprise data warehouses with emerging big data technologies, is being eyed to optimize how organizations process, manage and gain insights from their burgeoning stockpiles of both structured and unstructured data.
The Wall Street Journal
September 28, 2012 – In the pantheon of Next Big Thing trends, the concept of "smart cities" is one of the trendiest. The idea is that by harvesting the incredible amount of data "exhaust" that every one of us generates as we traverse a city, planners can optimize services in the city to make them more efficient, cleaner and cheaper. But there is a fear that such top-down programs may threaten the very vitality that attracts people to cities in the first place.
September 27, 2012 – Big Data doesn’t only create new opportunities for enterprises and organizations; it opens up new horizons to artists, designers, journalists, storytellers, and practitioners of new hybrid forms of communication. Data visualizations abound on the web these days, but there’s a big difference between pie charts or simple heat maps and the work of the emerging masters of the form. Visualized is a two-day conference on November 8th and 9th at the Times Center in New York City, featuring a wide array of data storytellers. Speakers include “Cyborg Anthropologist” Amber Case, Google’s Alexander Chen, data journalist Simon Rogers, Datakind’s Jake Porway, and Sven Ehmann of Gestalten Publishing, known for Data Flow, the company’s series of handsome hardbound data visualization compendiums.
September 27, 2012 – Until the NoSQL wave hit a few years ago, the least fun part of a project was dealing with its database. Now there are new technologies to keep the adventuresome developer busy. The catch is, most of these post-relational databases, such as MongoDB, Cassandra, and Riak, are designed to handle simple data. However, the most interesting applications deal with a complex, connected world.
September 26, 2012 – It seems like websites, magazines, and TV shows all over the place are building top ten lists (or top-k lists) these days. The top ten science fiction movies of all time, the best places to live, etc. Top-ten lists are not only a lot of fun because of our seemingly primal need to create categories and hierarchies — they can actually be a useful way to analyze your data. A lot of times, the most interesting records in your data set are the ones with the most extreme values. It’s mind-expanding to think about building a top ten from billions and trillions of records, but it’s also a remarkable achievement for those in the list. Here’s a design pattern you can use to develop a MapReduce job that produces a top-ten list from your data.
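As a rough illustration of the idea, here is a minimal sketch in plain Python rather than Hadoop. It assumes one common way to structure a top-ten job, which may differ from the pattern the linked post describes: each mapper keeps only the ten best records from its own input split, and a single reducer merges those short lists into the final ten. The `Record` shape and the function names `map_top_k`, `reduce_top_k`, and `top_k` are hypothetical, chosen only for this example.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical record shape for this sketch: (record id, score to rank by).
Record = Tuple[str, int]


def map_top_k(split: Iterable[Record], k: int = 10) -> List[Record]:
    """Mapper step: scan one input split and keep only its k highest scores."""
    return heapq.nlargest(k, split, key=lambda rec: rec[1])


def reduce_top_k(local_tops: Iterable[List[Record]], k: int = 10) -> List[Record]:
    """Reducer step: merge every mapper's local top-k into the overall top-k."""
    merged = (rec for top in local_tops for rec in top)
    return heapq.nlargest(k, merged, key=lambda rec: rec[1])


def top_k(splits: Iterable[Iterable[Record]], k: int = 10) -> List[Record]:
    """Simulate the whole job in one process: map each split, then reduce once."""
    return reduce_top_k((map_top_k(split, k) for split in splits), k)
```

The point of the design is that each mapper passes at most k records to the reducer, so the reducer handles k times the number of splits instead of the full data set.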
September 25, 2012 – Here’s what we know. Data volumes are growing exponentially – a.k.a. Big Data. And we need hardware – lots of hardware – to store it, crunch it and deliver all that data to hungry end-users – both business users and consumers. The problem, as pointed out in a New York Times piece published over the weekend, is that all that hardware requires significant power to operate, including industrial cooling equipment and back-up generators that spew diesel exhaust and other pollutants into the atmosphere. Meanwhile, datacenter operators that prize continuous uptime over energy efficiency routinely run their operations at full power even while many servers sit idle or significantly underutilized.
A diary of a young data scientist
September 25, 2012 – Some months ago, I wrote a post dedicated to new Data Scientists, giving my personal recommendations of several books that are pure gold and of great tools like Python, R, and Apache Hadoop. Today is a new day for this kind of professional, because the Harvard Business Review (HBR) has published a great article about the Data Scientist, written by Thomas H. Davenport and D.J. Patil. I think both did an incredible job; believe me, you should read it, and you will not regret it. So, I want to dedicate these lines to the rising number of jobs with a shining title: "Data Scientist".