Data visualization, by definition, involves making a two- or three-dimensional picture of data, so when the data being visualized inherently has many more dimensions than two or three, a big component of data visualization is dimensionality reduction. Dimensionality reduction is also often the first step in a big-data machine-learning pipeline, because most machine-learning algorithms suffer from the Curse of Dimensionality: more dimensions in the input mean you need exponentially more training data to create a good model. Datacratic’s products operate on billions of data points (big data) in tens of thousands of dimensions (big problem). In this post, we show off a proof of concept for interactively visualizing this kind of data in a browser, in 3D. (Of course, the images on the screen are two-dimensional, but we use motion and perspective to evoke a third dimension.)
Video and slides from my talk at the kickoff of Big Data Week Montreal 2014.
The data journalists over at FiveThirtyEight recently posted a controversial article entitled The Hidden Value of the NBA Steal whose central thesis – that NBA players good at something other than scoring can be as valuable to their teams as high scorers – is a great analogy for explaining the value of multivariate audience data modelling using first- and third-party data.
If you’ve ever been browsing the web and been annoyed by those One Weird Trick ads, or by ads for that product you looked at online last month and then bought offline, you’ve probably given a thought to blocking ads altogether. The response to this idea, from people who run websites for a living, ranges from “it’s unethical” to “it’s stealing!”. According to them, the reason you get to use a website without paying for it yourself is that in exchange you see ads and website owners get paid by the advertisers. That’s a polite summary of the great Ad-Blocking Debate, which has been going on since the early days of the commercial web. I’m not going to take sides here; rather I’ll propose a compromise enabled by a recent development in online advertising technology. I’m going to describe a “weird trick,” if you will: how to block ads using the same system that powers those ads that follow you around, all the while ensuring that the websites you frequent have nothing to complain about.
Nicolas Kruchten is a Staff Software Engineer at Datacratic in Montréal, Québec, Canada.
His interests include data visualization, robotics, economics and making the world a better place.