Delve Into the Deep Blue Sea of Oceanic Data with Marinexplore

It’s widely known that most of the Earth is covered in water; the ocean alone covers 71% of the planet’s surface, to be exact. The ocean contains fathoms of data, and with over 90% of the ocean still unexplored, processing and analyzing that data is the very model of a Big Data problem. Marinexplore is a new open data collaboration platform and community containing 463,447,500 oceanographic measurements collected from 23,422 sensors.

Read more »

The Answer Isn’t Less Data, It’s More Data Science

Photo by Sarah Joy via Flickr. (CC BY-SA 2.0)

How much data is too much? Depending on who’s answering, the answer may be “there’s never enough.” Many don’t share that perspective, however, and are instead overwhelmed by the amount of data available at their fingertips. It’s a growing concern for consumers of online media, who gorge themselves on the endless buffet of information served through social media, smartphones, and news aggregators.

Read more »

Pivotal Labs Empowers Data-Driven Enterprises to Become Agile and Collaborative

Image by Sticker Giant via Flickr. (CC BY 2.0)

Data is Big, the predictive enterprise is the way of the future, and data scientists are in high demand: you can’t glance at technology news sites in 2012 without being aware of these developments. But there’s another challenge facing organizations as they deal with the influx of data, one which receives less attention: a lack of the custom applications, skills, and development methodologies necessary to tap into its value.

Read more »

Facebook’s Data Team Challenges Fears of an Online Echo Chamber

Image by Facebook's Data Team.

Online pundits and media critics warn that as social media increasingly becomes a dominant source of news, and aggregators like Google News develop algorithms to surface stories that are presumably more interesting to users, we’re participating in an echo chamber where self-selected social groups and online habits reinforce our existing beliefs.

Read more »

Data Science Meets CSI

Photo by Jeffrey Beall via Flickr. (CC BY-SA 2.0)

I wouldn’t hold my breath for CSI: Palo Alto quite yet, but as Jon Bruner at O’Reilly Radar observes, data scientists could serve the public good as data-diving amateur sleuths. Bruner proposes this after reading the disturbing story of Javier Reveron, who went missing in 2004, was reported to a missing persons database in 2010, and whose long-dead body was only identified by authorities two weeks ago.

Read more »

Can Anyone Become a Data Scientist? Oxdata Believes So

Visualization for Popular Science magazine by Jer Thorp via Flickr. (CC BY 2.0)

Data science is a sophisticated and complex discipline, but since it’s still an emerging field, its practitioners come from a wide variety of backgrounds. Typically, though, a background in working with large data sets in a research setting is advantageous. This is why you may find yourself mingling with a former physicist or immunologist at the next data hackathon you attend.

Read more »

Big Business Knows What You’re Buying and Has an eScore You Can’t Know

Just when you think you have a handle on your credit score, number of Facebook friends, LinkedIn connections, Twitter followers, and most recently your Klout score, there’s now an eScore to worry about. You probably have a pretty good idea of where you stand with each rating: I’m satisfied with my average standing in every category (378 LinkedIn connections, mid-50s Klout score, median credit score, and I went rogue and quit Facebook in 2011).

Read more »

Researchers Encode Entire Book Into DNA

As Big Data grows and storage moves to the cloud, it’s easy to forget that all that information still takes up physical space, even if that space is a server farm half a world away. But as data grows bigger, storage is becoming smaller.

Read more »

Delivering the Predictive Enterprise

Did you know that it takes at least two data scientists to determine the validity of a three-dimensional bar graph? That was my first takeaway from a recent Harris/SAP poll asking what percentage of enterprise Big Data solutions deployed data warehousing versus the cloud.

Read more »

Hadoop and Disparate Data Stores

Through our experience working with customers on Big Data platforms, we’ve noticed that there are fundamentally two types of Hadoop users. The first are “Hadoop-centric” users, who are building platforms entirely on Hadoop and no longer want to use relational database technologies for analytics; these tend to be the early adopters of Hadoop. The second are users who treat Hadoop as an augmentation to existing systems and are focused on integrating it with existing analytical databases and workflows; these tend to be later adopters who are still building their Hadoop skills internally.

Read more »