Paper review: “Learning to Represent Programs with Graphs”

A review of the recent ML-on-Code paper from Microsoft Research.

Deduplicating files in Public Git Archive

We describe how we ran apollo on PGA, in order to find communities of duplicate files.

Paper review: “Lessons from Building Static Analysis Tools at Google”

Review of a recent scientific paper by Google on the experience of building large-scale static analysis tools.

Machine Learning on Git: introducing Hercules v4

Hercules is an open source project started in late 2016 with the goal of speeding up the collection of line burndown statistics from Git repositories. It has since grown into a general-purpose Git repository mining framework with several use cases: ownership through time, file and people embeddings, structural hotness, and even comment sentiment estimation. This post presents the latest "v4" release of Hercules and gives some insight into how Git works.

Announcing Public Git Archive

Announcing Public Git Archive, the largest dataset of Git repositories in the world.

Detecting licenses in code with Go and ML

Detecting the license of an open source project is harder than it seems. We have created go-license-detector, a Go library and command line application, to solve that task.

Calling C functions from BigQuery with Web Assembly

As part of our experiments at source{d}, we decided to try running a C library on BigQuery. Read this blog post to see how WebAssembly came to the rescue, and what other improvements we had to make to achieve decent performance.

Measuring code sentiment in a Git repository

This is the transcript of our MLonCode talk at GopherCon Russia. The idea is to combine the technologies we've developed to solve a toy problem: finding funny comments.

Why did I join source{d}? - Francesc Campoy

The first post in a series on why our employees joined source{d}. This one is by Francesc Campoy.

source{d} does FOSDEM 2018

Almost every source{d} employee has just come back from FOSDEM 2018, and we have so much to tell you!
