Alec Noller 01/01/14
0 replies

DRM and W3C Standards: Will the Web Stay Open?

A recent article by Danny O'Brien at the Electronic Frontier Foundation reported that the proposed Encrypted Media Extensions (EME), which focus on protecting video content, could be incorporated into W3C's HTML5.1 standard.

John Cook 01/01/14
0 replies

Sensitive Dependence on Initial Conditions

The following problem illustrates how the smallest changes to a problem can have large consequences. As explained at the end of the post, this problem is a little artificial, but it illustrates difficulties that come up in realistic problems.

Rob Galanakis 12/31/13
3 replies

TDD via Tic-Tac-Toe

I’ve tried out lots of different subject matter for teaching TDD, but my favorite has been Tic-Tac-Toe (or whatever your regional variation of it is). It has these benefits:

Chase Seibert 12/31/13
10 replies

Development on a Mac versus Linux

I love the Mac computing experience. Even though I use a Mac as my home laptop, I prefer a Linux machine for work. Here are the key differences between developing on a Mac and on Linux.

Vlad Mihalcea 12/31/13
2 replies

NoSQL is Not Just About Big Data

After publishing a small experiment with MongoDB, the author was challenged by the jOOQ team to match his results against Oracle. He will explore the specifics of that challenge in a later post; in this one, he discusses a number of Small Data use cases in which MongoDB was the right tool for the job.

Alec Noller 12/31/13
0 replies

Lark: A "RESTy" Interface for Redis

Redis users might be interested in Lark, a new Python library designed to transform an HTTP request into a Redis command and provide a "RESTy" interface. Features include automatic JSON serialization and deserialization for Redis values, adapters for Flask and Django, and more.

Alec Noller 12/31/13
1 reply

Are You Really a Data Scientist?

According to this recent post, you're not a data scientist just because you work with Hadoop a bit, know some Python, and have some chops when it comes to databases. The author argues that it takes more than that, and in this article he provides some resources to help you get there.

Gareth Rushgrove 12/30/13
0 replies

Making the Web Secure, One Unit Test at a Time

Writing automated tests for your code is one of those things that, once you have gotten into it, you never want to see code without tests ever again. Why write pages and pages of documentation about how something should work when you can write tests to show exactly how something does work?

Lukas Eder 12/30/13
0 replies

The Great SQL Implementation Comparison Page

Fortunately, we have SQL standards. Or do we? It’s a well-known secret (or cynical joke) that the SQL standard is yet another SQL dialect among peers.

Joshua Gross 12/30/13
24 replies

Top Posts of 2013: Please stop using Twitter Bootstrap

Let’s be honest: a great many of us are tired of seeing the same old Twitter Bootstrap theme again and again. Twitter Bootstrap’s success has turned it into the Times New Roman of design.

John Sonmez 12/30/13
20 replies

Top Posts of 2013: There Are Only 2 Roles of Code

All code can be classified into two distinct roles: code that does work (algorithms) and code that coordinates work (coordinators). I would say that 90% of the code I have written does not divide cleanly into algorithms and coordinators.
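To make the algorithm/coordinator split concrete, here is a minimal Python sketch. It is not taken from the post: the function names `word_counts` and `report_word_counts` are hypothetical, chosen only to show one function that does the work and one that wires inputs and outputs around it.

```python
def word_counts(lines):
    """Algorithm: pure computation on its input, no I/O (hypothetical example)."""
    counts = {}
    for line in lines:
        for word in line.split():
            key = word.lower()
            counts[key] = counts.get(key, 0) + 1
    return counts


def report_word_counts(read_lines, write_line):
    """Coordinator: connects a source, the algorithm, and a sink.

    read_lines and write_line are injected callables, so this function
    contains no counting logic of its own.
    """
    counts = word_counts(read_lines())
    for word in sorted(counts):
        write_line(f"{word}: {counts[word]}")
```

Because the coordinator only passes data along, the algorithm can be tested with plain lists, and the coordinator can be tested with stand-in callables such as a lambda and `list.append`.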

Mikio Braun 12/30/13
0 replies

Top Posts of 2013: Big Data Beyond MapReduce: Google's Big Data Papers

Mainstream Big Data is all about MapReduce, but when looking at real-time data, limitations of that approach are starting to show. In this post, I’ll review Google’s most important Big Data publications and discuss where they are.

Lukas Eder 12/30/13
0 replies

MongoDB “Lightning Fast Aggregation” Challenged with Oracle

What does “Scale” even mean in the context of databases? When talking about scaling, people have jumped to the vendor-induced conclusion that SQL doesn’t scale, while NoSQL scales. In this article, the author takes a look at database scalability by comparing Oracle benchmarks to MongoDB.

Arthur Charpentier 12/30/13
0 replies

100 Blogs Worth Reading: R, Probability, Data Analysis and Visualization, and More

For the 100th installment of his collections of data science-related links, Arthur Charpentier has instead decided to provide a list of 100 blogs worth reading. Topics covered include statistics, probability, R, data analysis, graphs, maps, visualization, sciences, economics, and more.

Ayende Rahien 12/30/13
0 replies

Reducing the Cost of Writing to Disk

So, we found out that the major cost of random writes in our tests was actually writing to disk. Writing 500K sequential items resulted in about 300 MB being written. Writing 500K random items resulted in over 2.3 GB being written. So the obvious thing to do would be to use compression
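As a rough illustration of the idea, the sketch below compresses a buffer before writing it, so fewer bytes reach the disk. It is not the author's implementation: it uses Python's standard `zlib` module as a stand-in, and the helper names `write_compressed` and `read_compressed` are hypothetical. Actual savings depend on the data; already-random bytes may not shrink at all.

```python
import zlib


def write_compressed(path, payload, level=6):
    """Compress payload with zlib, write it to path, return bytes written."""
    compressed = zlib.compress(payload, level)
    with open(path, "wb") as f:
        f.write(compressed)
    return len(compressed)


def read_compressed(path):
    """Read a file produced by write_compressed and return the original bytes."""
    with open(path, "rb") as f:
        return zlib.decompress(f.read())
```

The trade-off is extra CPU time on every write and read in exchange for less I/O, which tends to pay off when, as in the measurements above, the disk write is the dominant cost.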