Top 10 GitHub Repositories for Machine Learning

To help you find the best among thousands of projects, we present a list of 10 of the most popular GitHub repositories for machine learning.

GitHub, a collaboration platform for developers, is by far the most popular platform of its kind. Its popularity has grown manifold in the last few years, owing in part to developments in the domain of Artificial Intelligence. Last year, GitHub set new records, both by hosting more than 100 million repositories and in the size of its user base.

Since its launch in 2008, GitHub has been of immense significance to developers, enabling them to collaborate with greater ease and efficiency. Over the years, it has hosted many prominent repositories, including those from tech giants like Microsoft.

In this article, I present a list of some of the best GitHub repositories for machine learning.


1. BERT

Bidirectional Encoder Representations from Transformers, abbreviated as BERT and developed by Google, is a method of pre-training language representations. BERT is a deeply bidirectional system that has significantly advanced the field of Natural Language Processing (NLP). Using BERT, developers have achieved state-of-the-art results on 11 NLP tasks. The repository includes TensorFlow code and a number of pre-trained BERT models.
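
To make "bidirectional" concrete, the minimal sketch below hides one token in a sentence and collects the words on both sides of it, which is the kind of two-sided context a bidirectional model can draw on when predicting the hidden word. The function name `mask_token` and the example sentence are illustrative assumptions; this is not code from the BERT repository.

```python
def mask_token(tokens, index, mask="[MASK]"):
    """Hide tokens[index]; return (masked sequence, left context, right context)."""
    if not 0 <= index < len(tokens):
        raise IndexError("index out of range")
    masked = tokens[:index] + [mask] + tokens[index + 1:]
    left = tokens[:index]        # words before the hidden token
    right = tokens[index + 1:]   # words after the hidden token
    return masked, left, right


# Hypothetical example sentence, split on whitespace for simplicity.
tokens = "the bank raised interest rates".split()
masked, left, right = mask_token(tokens, 1)
print(" ".join(masked))
```

A left-to-right model would see only `left` when guessing the hidden word; a bidirectional one can use `right` as well.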

2. DeepCreamPy

DeepCreamPy is an AI-based tool for digital image editing. It works by replacing the censored parts of an image with plausible reconstructions. Most importantly, the reconstruction itself is done automatically by the model. It not only works on images of all sizes and shapes but also supports mosaic de-censoring.

3. Horizon

Horizon is an end-to-end platform for applied reinforcement learning (applied RL). Built in Python, it uses PyTorch for modelling and training and Caffe2 for model serving. Its best-known user is Facebook, which developed the platform. The algorithms Horizon supports include Soft Actor-Critic (SAC), DDPG and DQN.
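
As a rough illustration of what an algorithm such as DQN computes during training, the sketch below builds the standard one-step target: the reward plus the discounted best Q-value of the next state. The function name `dqn_target` and the numbers are assumptions made for this example; it is plain Python, not Horizon's API.

```python
def dqn_target(reward, next_q_values, gamma=0.99, terminal=False):
    """Return reward + gamma * max(next-state Q-values), or just reward if terminal."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical transition: reward 1.0, three possible next actions.
target = dqn_target(1.0, [0.5, 2.0, 1.5], gamma=0.9)
print(target)  # approximately 2.8
```

The network's prediction for the action actually taken is then pushed towards this target.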


4. TRFL

Pronounced "truffle", TRFL is a TensorFlow-based library of building blocks that developers can combine to program reinforcement learning (RL) agents. The library is flexible and works with either build of TensorFlow: CPU or GPU.
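
To give a feel for what a reinforcement learning "building block" looks like, here is a small, reusable function that computes discounted returns from a list of rewards. It is a plain-Python sketch with an assumed name (`discounted_returns`), not a function taken from the library.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step, working backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


print(discounted_returns([1.0, 0.0, 2.0], gamma=0.5))  # [1.5, 1.0, 2.0]
```

An agent is typically assembled from several such pieces, for example a return calculation, a loss and an exploration rule.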

5. DeOldify

Developed by Jason Antic, this project does what its name suggests: it "de-oldifies" old images to make them look new. Based on deep learning, it lets developers colourize black and white images and apply certain other forms of restoration.

6. AdaNet

AdaNet is a lightweight framework based on TensorFlow. Building on AutoML efforts, it automates the learning of complex models, so that they can be built with minimal intervention on the part of the developer. By automating the process, the framework strives for greater usability, speed and flexibility, as well as security.
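
The idea of automating model building can be sketched, in a very simplified form, as choosing among candidate models by trading accuracy off against complexity. The candidate names, losses, complexity scores and the `penalty` weight below are all made up for illustration, and the scoring rule is an assumption rather than AdaNet's actual objective or API.

```python
def pick_candidate(candidates, penalty=0.1):
    """Return the name of the candidate with the lowest loss + penalty * complexity."""
    def objective(candidate):
        _name, loss, complexity = candidate
        return loss + penalty * complexity

    return min(candidates, key=objective)[0]


# (name, validation loss, complexity score): hypothetical values.
candidates = [
    ("small", 0.40, 1),
    ("medium", 0.25, 2),
    ("large", 0.22, 4),
]
best = pick_candidate(candidates)
print(best)  # medium
```

Here the largest model has the lowest loss, but the complexity penalty makes the medium one the better overall choice.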

7. Graph Nets

True to its name, the library works with graphs: it accepts graph-structured inputs and returns graphs as outputs. It was released in October last year by London-based DeepMind, which is owned by Google. Because data is represented as graphs, the library brings an understanding of entities, relations and rules into machine learning. Its scope is widened further by the fact that it can be used with TensorFlow.
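
A "graph in, graph out" computation can be illustrated with a single round of message passing: each edge is updated from the two nodes it connects, and each node is then updated from the edges arriving at it. The sketch below uses plain Python with simple sums as the update rules; the function name and the toy graph are assumptions for this example, and it does not use the Graph Nets API.

```python
def message_passing_step(nodes, edges):
    """One graph-to-graph update.

    nodes: dict mapping node id -> float feature
    edges: list of (sender, receiver, float feature)
    Returns (new_nodes, new_edges) with the same structure as the input.
    """
    # Update each edge from its own feature and its two endpoint nodes.
    new_edges = [(s, r, e + nodes[s] + nodes[r]) for s, r, e in edges]
    # Sum the updated features of the edges arriving at each node.
    incoming = {n: 0.0 for n in nodes}
    for _sender, receiver, feature in new_edges:
        incoming[receiver] += feature
    # Update each node from its own feature and its aggregated incoming edges.
    new_nodes = {n: nodes[n] + incoming[n] for n in nodes}
    return new_nodes, new_edges


# Hypothetical three-node chain: a -> b -> c.
nodes = {"a": 1.0, "b": 2.0, "c": 3.0}
edges = [("a", "b", 0.5), ("b", "c", 0.5)]
new_nodes, new_edges = message_passing_step(nodes, edges)
print(new_nodes)  # {'a': 1.0, 'b': 5.5, 'c': 8.5}
```

In a learned model the sums would be replaced by small neural networks, but the shape of the computation stays the same.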

8. MAME RL Algorithm Toolkit

This repository is both a Python library and an algorithm training toolkit. It allows a reinforcement learning algorithm to be trained on most, if not all, arcade games. The toolkit lets the algorithm step through the gameplay much as a human player would experience it. In doing so, the toolkit, which runs on Linux and requires Python 3.6+, also lets the algorithm interact with the game by sending it actions and receiving the results.
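
The interaction described above follows the usual agent-and-environment loop: the algorithm picks an action, the game advances one step, and the result is fed back. The sketch below shows that loop with a toy stand-in for a game; `ToyGame`, `run_episode` and the trivial policy are hypothetical and are not part of the toolkit's API.

```python
class ToyGame:
    """Hypothetical stand-in for an emulated game; not the toolkit's API."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def step(self, action):
        """Apply an action (0 = stay, 1 = move right); return (observation, reward, done)."""
        if action == 1:
            self.position += 1
        done = self.position >= self.length
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(game, policy, max_steps=100):
    """Feed the policy's actions to the game one step at a time and total the rewards."""
    observation, total, steps = game.position, 0.0, 0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, done = game.step(action)
        total += reward
        steps += 1
        if done:
            break
    return total, steps


# A trivial policy that always moves right.
total, steps = run_episode(ToyGame(), lambda observation: 1)
print(total, steps)  # 1.0 5
```

A learning algorithm would replace the fixed policy and use the rewards to improve its choices over many episodes.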

9. PocketFlow

This repository makes deep learning models more compact and faster. Broadly speaking, this open source framework has been developed with the aim of improving the overall efficiency of AI development. It can be regarded as an easy-to-use toolkit that lowers the risk of errors and improves developer productivity.
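
One common way to make a model more compact is to store its weights with fewer bits. The sketch below shows simple uniform quantization of a handful of weights to 8-bit integer codes and back. The function names and sample weights are assumptions for illustration; this is not PocketFlow's API, and real toolkits use more refined schemes.

```python
def quantize(weights, bits=8):
    """Map float weights to integer codes on a uniform grid; return (codes, scale, offset)."""
    levels = 2 ** bits - 1
    low, high = min(weights), max(weights)
    scale = (high - low) / levels if high > low else 1.0
    codes = [round((w - low) / scale) for w in weights]
    return codes, scale, low


def dequantize(codes, scale, offset):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + offset for c in codes]


# Hypothetical weights from a single layer.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, offset = quantize(weights, bits=8)
restored = dequantize(codes, scale, offset)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

Each weight now needs one byte instead of four, at the cost of a rounding error of at most half a grid step.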

10. maskrcnn-benchmark

Built on PyTorch 1.0, this repository provides fast implementations of Faster R-CNN and Mask R-CNN. It aims to supply the fundamental building blocks for segmentation and detection models and was released under the MIT license. One of its most distinguishing features is that it is highly user-friendly and, consequently, has made life much easier for developers.
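
A typical example of a fundamental block in detection models is intersection over union (IoU), the overlap score used to compare a predicted box with a reference box. The sketch below is a plain-Python version with an assumed function name and box format; it is not taken from the repository.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU is 1/7.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A score of 1.0 means the boxes coincide and 0.0 means they do not overlap at all.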

Each of these repositories has, in its own way, brought significant changes to the domain of machine learning and allowed developers to solve problems that hitherto seemed unsolvable. The growing popularity of GitHub and the steady rise in the number of innovative projects like these are evidence enough of the bright future that awaits the realm of Artificial Intelligence.

Prateek Arora

Contributing Editor at Wimoxez. Apart from this, I'm big into books and love reading across different niches.
