Python for Data Science using Anaconda

A few years ago, I had a complete Python data analysis environment on my MacBook. I used Python with virtual environments (virtualenv), and of course one of my virtualenvs had a complete set of data science libraries, which I used whenever I wanted to do data analysis. I also had experience using docker-machine to make my data analysis more productive and reproducible, but it was too heavy for my laptop, so I kept using virtualenv. I had heard about Anaconda (or conda), a platform that bundles data science libraries into one package, but I had never tried it. It was hard for me to move on.

After I started my PhD at UQ in Australia, I got a new computer in my office, and it runs Windows. I wanted to start working with data using Python on that computer, so I tried to remember all the steps I would have to go through before I could do any data analysis: installing Python, installing pip, installing virtualenv, installing virtualenvwrapper, installing all the data science libraries into one of my virtualenvs, and only then starting to work!

As a lazy guy, I did not want to do that. I went to the Anaconda website, decided to give Anaconda a try, and ta-da!

Just download the installer that matches your operating system, install it, and then launch Anaconda Navigator.
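If you want to confirm that the installation really gives you the libraries without any manual setup, a quick check like the one below can help. This is only a minimal sketch: the list of library names is my own pick for illustration, not an official list of what Anaconda ships, and `check_libraries` is a hypothetical helper name. It uses only the standard library's `importlib.util.find_spec`, which looks a module up without importing it.

```python
import importlib.util

# Hypothetical list for illustration only; adjust it to the libraries you use.
# Note that scikit-learn is imported under the module name "sklearn".
LIBRARIES = ["numpy", "pandas", "scipy", "matplotlib", "sklearn"]


def check_libraries(names):
    """Return a dict mapping each top-level module name to True if it is installed."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one line per library so missing ones are easy to spot.
    for name, available in check_libraries(LIBRARIES).items():
        status = "found" if available else "MISSING"
        print(f"{name:12s} {status}")
```

Run it with the Python interpreter that came with Anaconda. Anything reported as MISSING is a library you would still have to install yourself.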


I was surprised that I could even run RStudio from Anaconda Navigator. If you enjoy using Jupyter (IPython Notebook), just press Launch for Jupyter. It also comes with one of the most beautiful IDEs for data analysis in Python, Spyder. It is really cool! Previously, when I used R, I always used RStudio as my IDE for data analysis. If you want to move to Python, you can use a similar IDE, Spyder.

If you do not want to use Anaconda and would like to know which data science libraries you need to install manually, please visit this link. The picture below shows some Python libraries for data science that may be useful for you: