Matthew Dubins, 11/06/13
9219 views
0 replies

Topic Modeling in Python and R: A Rather Nosy Analysis of the Enron Email Corpus

What better corpus to do topic modeling on than the Enron email dataset? The author tried to do all of the processing and analysis in R, but it was just too difficult and took too much time. So he dusted off his Python skills and did the bulk of the data processing/preparation in Python, and the text mining in R.

Michal Bachman, 11/06/13
9863 views
0 replies

Modeling Data in Neo4j: Bidirectional Relationships

In this article - the first in the "Modelling Data in Neo4j for Beginners" series - we look at a common mistake made when modeling bidirectional relationships.

Patrik Antonsson, 11/06/13
5862 views
0 replies

Creating Development Environments with Vagrant

I've been using Vagrant for a couple of years now and this is a good book for beginners. The book goes through most of the things you need to know to get your environment up and running. The chapters...

Sasha Goldshtein, 11/06/13
5557 views
0 replies

Modern Garbage Collection in Theory and Practice

Last week I delivered a very interesting session on modern garbage collection. I was invited to give a talk on garbage collection theory and its practical applications in modern managed languages. The slides from the talk are below – they are quite detailed.

A. Jesse Jiryu Davis, 11/05/13
4105 views
1 reply

Day Of The Thread

If you think you’ve found a bug in Python, what’s next? I'll guide you through the process of submitting a patch, so you can avoid its pitfalls and find the shortest route to becoming a Python contributor!

Chase Seibert, 11/05/13
2786 views
0 replies

Hacking Django Runserver to Run Multiple Django Instances

Recently at work we’ve been on a “servicifying” kick, meaning we’re slowly converting our monolithic Django app into separate services. To start, this just means breaking up the existing runtime into pieces. Instead of one logical web process, we now have separate ones for the web app, admin, login, APIs, etc.

Mark Needham, 11/05/13
4839 views
0 replies

Neo4j: A First Attempt at Retail Product Substitution

One problem for online retailers is working out whether there is a suitable substitute product if an ordered item isn’t in stock. Since this problem brings together three types of data – order history, stock levels and products – it should be a nice fit for Neo4j, so the author ‘graphed up’ a quick example.

Darshan Bobra, 11/04/13
9276 views
0 replies

Understanding the Concept of Functional Programming

In functional programming, programs are executed by evaluating expressions, in contrast with imperative programming where programs are composed of statements which change global state when executed. This article goes into detail on the concept of functional programming and why one might use it.

Abraham Otero, 11/04/13
18727 views
3 replies

Quality Levels: the Hole in Software Methodologies

Not all software we develop requires the same level of quality. Developing software that will run only once and will never need to be changed is not the same as developing software that is expected to be used for years.

Doug Turnbull, 11/04/13
2798 views
0 replies

Async Solr Queries in Python

Being able to parallelize work is helpful with scripts that index documents into Solr. Unfortunately, Python isn’t exactly JavaScript or Go when it comes to doing asynchronous programming. But the gevent coroutine library can help us a bit with that.

Seth Proctor, 11/04/13
3362 views
0 replies

ZFS Support in Blackbirds 2.0

You've probably read about the new features in Blackbirds Release 2.0. The big-ticket items include geo-distribution, automation, and Java stored procedures. In addition to these awesome new features, NuoDB slipped in support for ZFS, specifically Native ZFS on Linux.

Alec Noller, 11/03/13
5228 views
0 replies

The Best of the Week (Oct. 25): NoSQL Zone

Make sure you didn't miss anything with this list of the Best of the Week in the NoSQL Zone. This week's best include a how-to for using MongoDB as a pure in-memory database, a tutorial to help you make major speed increases in MongoDB, and a discussion of labels, indexes, and so on in Neo4j 2.0.

Zac Gery, 11/01/13
9100 views
0 replies

Searching For Nails: A Hammer's Story

The truth is, struggling projects far outnumber their successful counterparts. Why is this? Because sometimes developers are too smart for their own good.

Mitch Pronschinske, 11/01/13
9989 views
0 replies

Amazon Previews JavaScript SDK

Recognizing the swelling popularity of JavaScript, Amazon has finally released a developer preview SDK that can populate S3 buckets, manage SQS message queues, create, populate, and query DynamoDB tables, and much more!

Ian Mitchell, 11/01/13
13188 views
0 replies

Scrumban - or How to get Leaner by Sprinting Less

In this article we consider a hybrid agile approach known as Scrumban, which can potentially address both project work and business-as-usual (BAU) work. Scrumban is becoming increasingly popular and has significant ramifications for project scalability.