Welcome to source{d} bi-weekly, a newsletter with the latest news, resources and events related to Code as Data and Machine Learning on Code. Sign up for source{d} bi-weekly newsletter.

Unlocking Engineering Observability with advanced IT analytics

Last week, we hosted a session on the newly released source{d} Enterprise Edition (EE) with source{d} co-founder and CEO, Eiso Kant. Learn more and watch the recap video.

source{d} News

Identifying collaborators in large codebases [Blog]
by Waren Long, Vadim Markovtsev, Hugo Mougard, Egor Bulychev, Jan Hula

At the recent Machine Learning for Software Engineering workshop in Montreal, we had a team of engineers present their research about how to gain a better understanding of the collaboration that goes on in the software development process.

source{d} delivers Enterprise Edition of SDLC platform [Article]
by Mike Vizard

source{d} has made available an enterprise edition of its software development lifecycle (SDLC) platform that includes visualization and analytics tools along with additional management capabilities. A free community edition of the core source{d} platform, which is stripped of most of the enterprise-class tools, is also now available in beta.

Mining software development history: approaches and challenges [Slides]
by Vadim Markovtsev

Software development history, typically represented as a Version Control System log, is a rich source of insights into how the project evolved as well as how its developers work. What’s probably more important is that events from the past can predict the future.

Community News

AI Makes New Scientific Discoveries by Analyzing Old Research Papers [Article]
by Kimberley Mok

Artificial intelligence could potentially be used to automate new scientific discoveries, as researchers from the U.S. Department of Energy’s Lawrence Berkeley National Laboratory recently found out when they let an unsupervised AI loose to analyze millions of old scientific papers.

10 tips for reviewing code you don’t like [Blog]
by David Lloyd

As a frequent contributor to open source projects (both within and beyond Red Hat), I find one of the most common time-wasters is dealing with code reviews of my submitted code that are negative or obstructive and yet essentially subjective or argumentative in nature.

Revisiting GNN: All We Have is Low-Pass Filters [Research Paper]
by Hoang NT, Takanori Maehara

Graph neural networks have become one of the most important techniques to solve machine learning problems on graph-structured data. In this paper, the authors develop a theoretical framework based on graph signal processing for analyzing graph neural networks. The results indicate that graph neural networks only perform low-pass filtering on feature vectors and do not have the non-linear manifold learning property. The paper further investigates their resilience to feature noise and proposes some insights on GCN-based graph neural network design.

Program Understanding, Synthesis, and Verification with GNN [Video]
by Alex Polozov

The aims of this workshop are to bring together researchers to dive deeply into some of the most promising methods which are under active exploration today, discuss how we can design new and better benchmarks, identify impactful application domains, encourage discussion and foster collaboration.

Visualizing and Measuring the Geometry of BERT [Research Paper]
by Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, Martin Wattenberg

Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT.

Events

July 26th: source{d} paper reading club (Online)

August 9th: source{d} paper reading club (Online)

October 9-11th: DevFest (Nantes, France)

Featured Community Member

Foutse Khomh is an associate professor at Polytechnique Montréal (Canada), where he leads the SWAT Lab on software analytics and cloud engineering research. He is also FRQ-IVADO Research Chair on Software Quality Assurance for Machine Learning Applications. Make sure to follow Foutse on Twitter @SWATLab or visit his website to stay up to date with his latest publications.