Earlier this month, we announced the release of a new format analyzer powered by Machine Learning in source{d} Lookout, our brand new assisted code review framework. source{d} Lookout is our first step towards a full suite of Machine Learning on Code applications. It’s a framework to develop and deploy new code analyzers for assisted code review on GitHub pull requests. Analyzers benefit from language-agnostic representations of source code, Universal Abstract Syntax Trees (UASTs), which are available in source{d} Engine and avoid the need for multiple parsing steps.

For more information on source{d} Lookout and the underlying architecture, you can watch the video recording below from our last source{d} Online meetup.

Available Analyzers

This is the list of the known implemented analyzers for source{d} Lookout:

While there are only a handful of analyzers available at the moment, we’re actively working on adding more, and we invite developers to create analyzers for their own use cases. Here is a quick tutorial on how to create your own source{d} Lookout analyzer.

Implementing Your Own Analyzer

For a brief description of what an analyzer is, you can read the source{d} Lookout Analyzers documentation.

Please refer to the official Protocol Buffers documentation to learn how to get started with Protocol Buffers.

To implement your own analyzer you only need to create a gRPC service implementing the Analyzer service interface:

service Analyzer {
 rpc NotifyReviewEvent (ReviewEvent) returns (EventResponse);
 rpc NotifyPushEvent (PushEvent) returns (EventResponse);
}
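As a rough illustration of the shape of such a service, here is a minimal Python sketch with one handler per RPC. It deliberately avoids gRPC and the generated code: `Comment`, `ReviewEvent`, `PushEvent`, `EventResponse`, and all of their fields are hypothetical stand-ins, not the real message definitions. In a real analyzer these classes would come from the code generated from the .proto definitions, and each handler would also receive a gRPC context argument.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for the generated message classes.
# The real messages have different, richer fields.
@dataclass
class Comment:
    file: str
    line: int
    text: str


@dataclass
class ReviewEvent:
    base_hash: str
    head_hash: str


@dataclass
class PushEvent:
    base_hash: str
    head_hash: str


@dataclass
class EventResponse:
    analyzer_version: str
    comments: List[Comment] = field(default_factory=list)


class ExampleAnalyzer:
    """Mirrors the two RPCs declared in the Analyzer service."""

    version = "example-analyzer v0.1"

    def notify_review_event(self, request: ReviewEvent) -> EventResponse:
        # Handle a review event and answer with a single global comment.
        text = f"Reviewed changes {request.base_hash}..{request.head_hash}"
        return EventResponse(self.version, [Comment(file="", line=0, text=text)])

    def notify_push_event(self, request: PushEvent) -> EventResponse:
        # This sketch does not analyze pushes, so it returns no comments.
        return EventResponse(self.version)


if __name__ == "__main__":
    response = ExampleAnalyzer().notify_review_event(ReviewEvent("abc123", "def456"))
    for comment in response.comments:
        print(comment.text)
```

The point of the sketch is only the contract: each event type maps to one handler, and every handler returns an `EventResponse`, even when it has nothing to say.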

You can create a new analyzer in any language that supports Protocol Buffers by generating code from the .proto definitions. The resulting code will provide data access classes, with accessors for each field, as well as methods to serialize/parse the message structures to/from bytes.

Caveats

All the analyzers should consider the caveats described by the SDK.

Fetching Changes, UASTs or Languages from DataService

source{d} Lookout will take care of dealing with Git repositories, UAST extraction, programming language detection, etc. Your analyzer will be able to use the DataService to query all this data.

You can read more about it in the source{d} Lookout Server section.
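To make the division of labor concrete, the following Python sketch shows an analyzer that asks a data service for changed files, with language and UAST already attached, and then runs a toy analysis on them. Everything here is an assumption made for illustration: `FakeDataService`, `get_changes`, `File`, `Node`, and the role names are hypothetical in-memory stand-ins, not the real DataService API or UAST schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Optional


@dataclass
class Node:
    # Hypothetical, heavily simplified UAST node.
    roles: List[str]
    children: List["Node"]


@dataclass
class File:
    path: str
    language: str
    uast: Optional[Node]


class FakeDataService:
    """In-memory stand-in for the DataService (hypothetical API)."""

    def __init__(self, files: Iterable[File]):
        self._files = list(files)

    def get_changes(self, want_language: bool = True,
                    want_uast: bool = True) -> Iterator[File]:
        # Stream changed files, attaching only the data the caller asked for.
        for f in self._files:
            yield File(f.path,
                       f.language if want_language else "",
                       f.uast if want_uast else None)


def count_functions(node: Optional[Node]) -> int:
    # Recursively count nodes tagged as function declarations.
    if node is None:
        return 0
    own = 1 if "Function" in node.roles and "Declaration" in node.roles else 0
    return own + sum(count_functions(child) for child in node.children)


def summarize(data: FakeDataService) -> Dict[str, str]:
    # The analyzer never touches Git or a parser; it only consumes the stream.
    return {f.path: f"{f.language}: {count_functions(f.uast)} function(s)"
            for f in data.get_changes()}


if __name__ == "__main__":
    tree = Node(["File"], [
        Node(["Function", "Declaration"], []),
        Node(["Function", "Declaration"], [Node(["Statement"], [])]),
    ])
    data = FakeDataService([File("main.go", "Go", tree),
                            File("README.md", "Markdown", None)])
    print(summarize(data))
```

Files without a UAST, such as the Markdown file in the usage example, simply count as zero functions, so the analyzer does not need special handling for unparsed content.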

How to Test an Analyzer Locally

Please refer to the lookout-sdk docs to see how to test an analyzer locally without accessing GitHub at all.

Using Pre-generated Code from the SDK

If you're creating your analyzer in Go or Python, you'll find pre-generated libraries in the lookout-sdk repository. The SDK libraries also come with helpers to deal with gRPC caveats.

The lookout-sdk repository contains a quickstart example, implemented in both Go and Python, of an analyzer that detects the language and counts the functions in every file.

Learn More about source{d} Lookout and MLonCode: