Source Code Identifier Embeddings

‘Embed and conquer’, they say. Anything that has a context can be embedded: word2vec, node2vec, product2vec… id2vec! We take source code identifiers, use their scope in the Abstract Syntax Tree as the context, and find that ‘send’ is to ‘receive’ as ‘push’ is to ‘pop’.

Keep reading

enry: detecting languages

Announcing enry, a faster implementation of github/linguist in Go for programming language detection.

Keep reading

source{d} tech talks, frontend series

Once every few months, source{d} organizes small conferences around very specific topics. On June 24th, the topic was frontend and the talks were hosted in our Madrid office. Almost 50 people joined us for a day full of things to learn about frontend technologies.

Keep reading

Analyzing GitHub, how developers change programming languages over time

This post is inspired by “The eigenvector of why we moved from language X to language Y”, by Erik Bernhardsson. Based on GitHub repositories, we build our own transition matrix after solving the flow optimization problem. The results reflect the history of programming language competition in the open source world.

Keep reading

Announcing Babelfish

Announcing Babelfish, the project we are developing to build representations of source code.

Keep reading

source{d} tech talks, Moscow 2017

On June 3, 2017, source{d} dedicated its regular source{d} tech talks to Machine Learning, and we chose to host the event in Moscow, Russia. For this conference, we invited speakers from Russia and abroad and gathered about 80 neural network aficionados in a former industrial area of the city. Here is a brief recap of the day.

Keep reading

Daily commit activity on GitHub

This post was inspired by ‘What programming languages are used late at night?’ by Stack Overflow. We take our commits dataset, combine it with the repository languages dataset, and plot circular histograms for the PST/PDT time zones with Python.

Keep reading

Kallax: Why we built yet another ORM for Go

We are releasing the first stable version of Kallax, our typesafe, fast PostgreSQL ORM with support for JSON operators, arrays and slices.

Keep reading

Using Docker & CoreOS For GPU Based Deep Learning

A GPGPU computing environment can be set up nicely inside a Docker container using Container Linux by CoreOS. Our way of setting up deep learning is efficient and benefits both devops and data scientists.

Keep reading

Comparing Git trees in Go

If you use Git, you probably compare commits on a daily basis. This blog post explains, in an intuitive way, the data structures and algorithms involved in such a task. After reading it, you will have a good understanding of how to use prefix trees and Merkle trees, and a good intuition for solving similar problems whenever they come up.

Keep reading