Overview
- Understand what Google Colab is
- Get a list of the top alternatives to Google Colab
- By no means is this list exhaustive. Feel free to add more in the comments below.
Introduction
For anyone who has storage issues or cannot afford a system that meets their requirements for data science work, Google Colab has been a blessing.
Working with Colab has opened up so many avenues for me that I thought weren’t possible before. We are no longer restricted by the poor computational power of our own machines, and free GPUs are at our fingertips.
But as a data scientist, it is important to know the alternatives available for any particular tool. So in this article, we will be exploring some of the widely used alternatives to Google Colab.
Table of Contents
1- What is Google Colaboratory?
2- Alternatives of Google Colab
- Amazon SageMaker
- CoCalc
- Kaggle Kernel
- Binder
3- Other Alternatives
1- What is Google Colaboratory?
Google Colaboratory, or Google Colab, is a free, cloud-based Jupyter Notebook environment from Google, so you don’t have to pay anything to use it. One of the best things about Colab is that you don’t need to install anything beforehand: many Data Science and Machine Learning libraries, such as Pandas, NumPy, TensorFlow, Keras, and OpenCV, come pre-installed.
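If you want to confirm which of these libraries are available in a given notebook environment (Colab or any of the alternatives below), a minimal sketch like the following may help. The helper name check_installed is my own, and the import names (for example cv2 for OpenCV) are the usual ones rather than anything specific to Colab.

```python
import importlib.util

# Libraries the article lists as pre-installed on Colab, mapped to the
# module names they are normally imported under (OpenCV imports as cv2).
LIBRARIES = {
    "Pandas": "pandas",
    "NumPy": "numpy",
    "TensorFlow": "tensorflow",
    "Keras": "keras",
    "OpenCV": "cv2",
}


def check_installed(libraries):
    """Return {display name: True/False} without importing the packages."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in libraries.items()
    }


if __name__ == "__main__":
    for name, available in check_installed(LIBRARIES).items():
        print(f"{name}: {'installed' if available else 'missing'}")
```

Running this in the first cell of a new notebook gives a quick picture of what you would still need to install yourself.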
The notebooks you create are saved on your Google Drive. So Colab also leverages the collaboration features of Google Docs, where you can share your notebook with multiple people easily and all of you can work on the same notebook at the same time without any issue.
Google also provides free use of an NVIDIA Tesla K80 GPU. If you connect Colab to Google Drive, you get up to 15 GB of disk space for storing your datasets. You can run an interactive Colab notebook session for up to 12 hours, which is enough for a beginner. Colab also offers TPUs, Google’s own custom-built chips, as an accelerator option.
One more thing to keep in mind is that any dataset you upload to the Colab notebook is deleted once the session ends.
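One common way around this is to copy anything you want to keep into Google Drive before the session ends. Here is a rough sketch of that idea, not an official recipe. The helper names are my own, the "MyDrive" folder is where Colab usually exposes your Drive after mounting (an assumption worth checking in your own session), and the file and folder names in the usage part are hypothetical.

```python
import shutil
from pathlib import Path


def mount_drive_if_on_colab(mount_point="/content/drive"):
    """Mount Google Drive when running on Colab; return None elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return None
    drive.mount(mount_point)  # asks you to authorize access to your Drive
    # Assumption: Colab exposes your files under "MyDrive" after mounting.
    return Path(mount_point) / "MyDrive"


def persist_file(src, dest_dir):
    """Copy src into dest_dir (created if needed) and return the new path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir))


if __name__ == "__main__":
    drive_root = mount_drive_if_on_colab()
    if drive_root is not None:
        # "train.csv" and "datasets" are hypothetical names for illustration.
        persist_file("/content/train.csv", drive_root / "datasets")
```

Outside Colab the mount step is skipped, so persist_file can still be used to copy results to any folder that outlives the session.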
If these limits are a problem, you can upgrade to the Pro version, which is said to give you access to faster GPUs (NVIDIA Tesla T4 or P100), runtimes of up to 24 hours, and more RAM.
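Since the GPU you get can differ between sessions and plans, it can be useful to check which one has actually been assigned. The sketch below assumes the NVIDIA command-line tool nvidia-smi is present on the machine (it normally is on GPU runtimes) and simply returns an empty list when it is not. The function name list_gpus is my own.

```python
import shutil
import subprocess


def list_gpus():
    """Return the names of visible NVIDIA GPUs, or [] if none are found."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver tools on this machine
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        return []  # the tool exists but could not query a GPU
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    print(gpus if gpus else "No NVIDIA GPU visible in this session")
```

The same check works on the other platforms in this article that offer GPUs, so you can compare what each one gives you.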
Alternatives of Google Colab
In the following section, we discuss four of the top alternatives to Google Colab.
1. Amazon SageMaker
Amazon SageMaker is also a cloud-based Machine Learning platform, launched by Amazon in November 2017. It provides hosted Jupyter notebooks that require no setup. But it is not free: you have to pay for its services, though there is a free trial for the initial two months.
“Using SageMaker Studio is free, you only pay for the AWS services that you use within Studio.”
Pros:
- Along with frameworks like TensorFlow, scikit-learn, PyTorch, and XGBoost, which Google Colab also provides, SageMaker offers MXNet, Chainer, and SparkML too.
- It offers features such as Amazon SageMaker Ground Truth, Amazon Augmented AI, Amazon SageMaker Studio Notebooks, Preprocessing, Amazon SageMaker Experiments, and many more.
Cons:
- If you train your model using SageMaker’s built-in algorithms, you cannot deploy it outside SageMaker. This is also the case for Google’s AutoML, though all models trained on ML-engine (including those using Google’s TensorFlow-hub modules) can be deployed anywhere.
- Automatic hyperparameter optimization works better in Colab, in terms of both the results produced and the time taken.
- New versions of TensorFlow reach SageMaker weeks after they reach Colab.
Here is the guide on how to use SageMaker and its features.
2. CoCalc
CoCalc, or Collaborative Calculation, is a web-based cloud computing (SaaS) and course management platform for computational mathematics. It is open-source software hosted by SageMath Inc. The creator and lead developer of CoCalc is William Stein, a former professor of mathematics at the University of Washington. Along with Jupyter notebooks, it supports editing Sage worksheets and LaTeX documents.
Pros:
- It offers real-time collaboration, which means you can share your notebook with others and you all can edit it at the same time.
- On CoCalc’s free plan, sessions shut down after 30 minutes of inactivity, though they can run for up to 24 hours, which is twice the time offered by Colab.
- It has a history recording feature that records all of your changes to the notebook in fine detail and allows you to browse those changes using an intuitive slider control.
- Languages offered: Python, Sage, R, Octave, and many more.
Cons:
- The service is not entirely free. There is a free plan, but it is a Trial Project with certain restrictions: most notably, your project runs with lower hosting quality and has no internet access to download data from other servers.
- GPUs are not available in either the free plan or the upgraded version.
You can get started with CoCalc from here.
3. Kaggle Kernel
Kaggle is popular for its Data Science competitions, but it also provides free Kernels, or Notebooks, for performing all kinds of Machine Learning and Data Science tasks independent of the competitions. Kaggle Kernels is a free platform for running Jupyter notebooks in the browser. Both Colab and Kaggle are Google products and have many similarities.
Kaggle has updated its kernels to have more computation power and memory. A kernel now gets 20 GB of dataset storage, 5 GB of disk space, 9 hours of run time, and 4 CPUs with 16 GB of RAM; when the GPU is turned on, it gets 2 CPU cores with 13 GB of RAM.
Pros:
- Kaggle provides free access to NVIDIA Tesla P100 GPUs in kernels. One benchmark shows that enabling a GPU for your kernel results in a 12.5X speedup when training a deep learning model.
- It supports two of the main languages in the field of Data Science: R and Python.
- Most Jupyter Notebook keyboard shortcuts work almost the same way in Kaggle Kernels, which makes it easier for someone used to Jupyter Notebooks to work in Kaggle.
- Kaggle has a large community where you can get support, learn, and validate your data science skills.
Cons:
- In general, Kaggle has a lag while running and is slower than Colab.
- Kaggle typically limits kernel running time to 9 hours, with a timeout after 1 hour of inactivity.
- A major drawback of both Kaggle and Colab is that notebooks cannot be downloaded in other useful formats.
4. Binder
Binder is powered by BinderHub, which is an open-source tool that deploys the Binder service in the cloud. Binder allows you to create custom computing environments that can be shared and used by many remote users. It allows you to input the URL of any public Git repository, and it will open that repository within the native Jupyter Notebook interface. You can run any notebooks in the repository, though any changes you make will not be saved back to the repository.
It can be helpful when you have a repository full of Jupyter Notebooks. There is, however, a limit of 100 users per repository (which is enough, I guess).
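To make the "input the URL of a public Git repository" step concrete, here is a small sketch that builds a launch link for a GitHub repository. It assumes you are using the public mybinder.org deployment and its /v2/gh/ URL scheme; a self-hosted BinderHub would have a different host. The function name binder_launch_url is my own, and binder-examples/requirements is used only as a sample public repository.

```python
from urllib.parse import quote


def binder_launch_url(owner, repo, ref="HEAD"):
    """Build a mybinder.org launch URL for a public GitHub repository.

    ref can be a branch, tag, or commit; slashes in it are URL-encoded.
    """
    return "https://mybinder.org/v2/gh/{}/{}/{}".format(
        quote(owner, safe=""), quote(repo, safe=""), quote(ref, safe="")
    )


if __name__ == "__main__":
    print(binder_launch_url("binder-examples", "requirements"))
```

Opening the printed link in a browser asks Binder to build the environment and start a temporary Jupyter session for that repository.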
Pros:
- Languages supported: Python, R, and Julia.
- Since it is an open-source project, it is free.
- Binder can run your notebooks directly from GitHub.
Cons:
- Collaboration with others is not available.
- Sessions will shut down after 20 minutes of inactivity, though they can run for 12 hours or longer.
- It is not suitable for working with large datasets.
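Session length is one of the easiest things to compare across these platforms, so here is a small sketch that collects the limits quoted in this article and filters them by how long your job needs to run. The figures are only the ones mentioned above (they may have changed since), SageMaker is left out because no session limit is given for it, and the class and function names are my own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionLimits:
    platform: str
    max_hours: float               # longest session quoted in this article
    idle_minutes: Optional[float]  # None where the article gives no figure


# Figures as quoted in this article; Binder's 12 hours is a lower bound
# ("12 hours or longer").
LIMITS = [
    SessionLimits("Google Colab (free)", 12, None),
    SessionLimits("Google Colab Pro", 24, None),
    SessionLimits("CoCalc (free plan)", 24, 30),
    SessionLimits("Kaggle Kernels", 9, 60),
    SessionLimits("Binder", 12, 20),
]


def platforms_for(job_hours, limits=LIMITS):
    """Return the platforms whose quoted session length covers job_hours."""
    return [entry.platform for entry in limits if entry.max_hours >= job_hours]


if __name__ == "__main__":
    print(platforms_for(10))
    print(platforms_for(20))
```

For a 10-hour job this leaves out only Kaggle Kernels, while a 20-hour job narrows the list to Colab Pro and CoCalc. Remember that the inactivity timeouts still apply, so a long job needs an active session.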
Other alternatives
Some other alternatives that I didn’t cover in this article are:
1- Saturn Cloud
3- Datalore
Do check them out.
End Notes
The purpose of this article was just to give an idea of the possible alternatives to Google Colaboratory. The final decision about which one you prefer, according to your needs, is up to you. I hope you will explore all of these platforms and identify the pros and cons for your line of work.
Also, do let me know which platform you prefer or use, and why.