Plot multiple lines in one chart with different styles in Python matplotlib

Sometimes we need to plot multiple lines in one chart using different styles such as dots, solid lines, or dashes, and perhaps different colours as well. This is quite easy to do with basic Python plotting using the matplotlib library.

We start with the simplest case, a single line:


import matplotlib.pyplot as plt
# with a single list, matplotlib uses the values as y and the indices 0..3 as x
plt.plot([1,2,3,4])

# optionally label the axes
plt.xlabel('This is X label')
plt.ylabel('This is Y label')
plt.show()

 

Next, let's plot several lines with different colours and different styles.


import numpy as np
import matplotlib.pyplot as plt

# evenly sampled time at 200ms intervals
t = np.arange(0., 5., 0.2)

# red dashes, blue squares and green triangles
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()
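
The short format strings above pack a colour, a marker, and a line style into one argument. As a rough sketch of the same plot written with explicit keyword arguments (the variable names fig and ax are my own choice, not part of the original example):

```python
import numpy as np
import matplotlib.pyplot as plt

# evenly sampled time at 200ms intervals
t = np.arange(0., 5., 0.2)

fig, ax = plt.subplots()

# 'r--' : red colour, dashed line, no marker
ax.plot(t, t, color='r', linestyle='--')

# 'bs' : blue colour, square markers, no connecting line
ax.plot(t, t**2, color='b', marker='s', linestyle='none')

# 'g^' : green colour, triangle-up markers, no connecting line
ax.plot(t, t**3, color='g', marker='^', linestyle='none')
```

Call plt.show() afterwards to display the figure. The keyword form is longer, but it can be easier to read once a chart has many lines.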

With only three lines this is still easy. But what if there are many lines, for example six?


import matplotlib.pyplot as plt
import numpy as np

x=np.arange(6)

fig=plt.figure()
ax=fig.add_subplot(111)

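# each call sets the colour (c), marker, line style (ls) and legend label
ax.plot(x,x,c='b',marker="^",ls='--',label='Greedy',fillstyle='none')
# marker=(8,2,0) means (numsides, style, angle): an 8-pointed asterisk
ax.plot(x,x+1,c='g',marker=(8,2,0),ls='--',label='Greedy Heuristic')
ax.plot(x,(x+1)**2,c='k',ls='-',label='Random')
ax.plot(x,(x-1)**2,c='r',marker="v",ls='-',label='GMC')
ax.plot(x,x**2-1,c='m',marker="o",ls='--',label='KSTW',fillstyle='none')
ax.plot(x,x-1,c='k',marker="+",ls=':',label='DGYC')

# loc=2 places the legend in the upper left corner
plt.legend(loc=2)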
plt.show()

Now, we can plot multiple lines with multiple styles in one chart.

These are some resources from the matplotlib documentation that may be useful:

  1. Marker types of matplotlib https://matplotlib.org/examples/lines_bars_and_markers/marker_reference.html
  2. Line styles matplotlib https://matplotlib.org/1.3.1/examples/pylab_examples/line_styles.html
  3. Matplotlib marker explanation https://matplotlib.org/api/markers_api.html

In this experiment, we defined each line manually, which becomes tedious when we want to generate a line chart from a dataset. In the next experiment, we use a real Excel dataset and plot the data as a line chart with different markers, without defining each line one by one. Check it here: https://pydatascience.org/2017/12/05/read-the-data-and-plotting-with-multiple-markers/
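
One possible way to avoid defining every line by hand is to cycle through lists of markers and line styles in a loop. The sketch below is only an illustration and is not the code from the linked post: the function name plot_with_cycled_styles, the default marker and line style lists, and the toy series are all my own assumptions.

```python
from itertools import cycle

import numpy as np
import matplotlib.pyplot as plt


def plot_with_cycled_styles(x, series,
                            markers=("^", "v", "o", "s", "+", "x"),
                            linestyles=("-", "--", ":")):
    """Plot each entry of `series` (a dict of label -> y values) on one axes.

    Markers and line styles are taken in turn from the given sequences and
    start over when exhausted. Colours come from matplotlib's default cycle.
    """
    fig, ax = plt.subplots()
    marker_cycle = cycle(markers)
    linestyle_cycle = cycle(linestyles)
    for label, y in series.items():
        ax.plot(x, y, marker=next(marker_cycle),
                ls=next(linestyle_cycle), label=label)
    ax.legend(loc="upper left")
    return fig, ax


# toy data standing in for the columns of a real dataset
x = np.arange(6)
series = {
    "x": x,
    "x + 1": x + 1,
    "(x + 1)^2": (x + 1) ** 2,
    "x^2 - 1": x ** 2 - 1,
}
fig, ax = plot_with_cycled_styles(x, series)
```

Call plt.show() to display the result. With a real dataset, the dict could be built from the columns of a table instead of being written out by hand.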

*Some parts of the code were taken from StackOverflow.

Python for Data Science using Anaconda

A few years ago, I had a complete Python data analysis environment set up on my MacBook. I used Python with virtual environments (virtualenv), and one of my virtualenvs had a complete set of data science libraries, which I used whenever I wanted to do data analysis. I also tried docker-machine to make my data analysis more productive and reproducible, but it was too heavy for my laptop, so I kept using virtualenv. I had heard about Anaconda (conda), a platform that bundles the data science libraries into one package, but I had never tried it. It was hard for me to move on.

After I started my PhD at UQ, Australia, I got a new computer in my office, and it runs Windows. I wanted to start working with data using Python on that computer, so I tried to remember all the steps I would need: installing Python, installing pip, installing virtualenv, installing virtualenvwrapper, installing all the data science libraries into one of my virtualenvs, and then start working!

As a lazy guy, I did not want to do all that. I went to the Anaconda website, decided to try Anaconda, and ta-da!

Just go to https://www.anaconda.com/download/, download the installer that matches your operating system, install it, and then launch Anaconda Navigator.

 

I was surprised that I can even run RStudio from Anaconda Navigator. If you enjoy using Jupyter (IPython Notebook), just press Launch for Jupyter. It also includes Spyder, a beautiful IDE for data analysis in Python. It is really cool! Previously, when I used R, I always used RStudio as my IDE for data analysis; if you want to move to Python, Spyder is a similar IDE.

If you do not want to use Anaconda and want to know which data science libraries you need to install manually, please visit this link. The picture below describes some Python libraries for data science that may be useful for you:

Python for Data Science

I have spent two years processing and manipulating data using R, and I mostly use this language for my research projects. I had only heard of Python and never tried it for my work. But now, after using Python, I have really fallen in love with this language. Python is very simple and is widely regarded as one of the easiest languages to learn. The reason I previously used R is that it is supported by tons of libraries for scientific analysis, all of them open source. Now, with the popularity of Python, I can easily find all the libraries I need in Python, and they are open source as well.

There are core libraries that you must know when you start doing data analytics with Python:

  1. NumPy stands for Numerical Python. Python differs from R: R was designed for scientists, while Python is a general-purpose programming language. That is why Python needs a library to handle numerical work such as multi-dimensional arrays and matrices. Repo project link: https://github.com/numpy/numpy
  2. SciPy is a library for scientific computing; it handles statistics, linear algebra, optimization, and so on. Repo project link: https://github.com/scipy/scipy
  3. Pandas: if you have experience with R, it is very similar to R's data frames. Using a DataFrame, we can easily manipulate, aggregate, and analyse our dataset. The data is shown in a table similar to an Excel spreadsheet or a data frame in R, and it is convenient to access the data by columns, rows, and so on. Repo project link: https://github.com/pandas-dev/pandas
  4. Matplotlib: plotting is very important for data analysis. Why do we need plotting? The simple answer is that it makes the data easier for anyone to understand; a picture is worth a thousand words. To generate visualizations from a dataset, we need data visualization tools. In Excel it is very easy: just select the table you want to plot and choose the chart type, such as a bar chart or line chart. In R, the most popular plotting tool is ggplot; you can use the standard ‘plot’ function, but for more advanced and more beautiful figures you need ggplot. How about Python? Matplotlib is the basic library for visualization in Python. Repo project link: https://github.com/matplotlib/matplotlib
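
To make the roles of NumPy and Pandas a little more concrete, here is a minimal sketch. The column names and values are made up for illustration only.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic is applied to the whole array at once
values = np.array([1.0, 2.0, 3.0, 4.0])
squares = values ** 2

# Pandas: a DataFrame is a table with named columns, like a spreadsheet
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": values,
    "square": squares,
})

# aggregate: mean of the "square" column for each group
means = df.groupby("group")["square"].mean()
print(means)
```

Here means holds one value per group (2.5 for group a and 12.5 for group b), much like an aggregation over a data frame in R.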

Those are the core libraries you need when you start using Python for data analytics. There are tons of other Python libraries out there; here are some that may be useful for you:

  1. SciKit-Learn, for when you want to apply machine learning; you have to understand this one.
  2. Scrapy, to scrape data from the Web when you want to gather data from websites for your analysis. For instance, collecting tweets from Twitter.
  3. NLTK, if you want to do natural language processing.
  4. Theano, TensorFlow, Keras: when you are not satisfied with NumPy performance, or want to apply neural network algorithms or do deep learning, you have to understand these libraries.
  5. Interactive visualization tools: matplotlib is a basic plotting tool and it is enough for me as a researcher, especially for publications, but when we want more polished statistical plots or more dynamic, interactive plotting, we can use Seaborn, Plotly, or Bokeh.

[Image: Python environment]

If you do not want to think too much about how to install all of those libraries, just try Anaconda. It is really cool.
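
Whichever way you install them, a quick check like the sketch below can show which of the core libraries are available in the current environment. The function name check_libraries and the list CORE_LIBRARIES are my own; the check uses only the Python standard library.

```python
from importlib.util import find_spec

# import names of the core libraries listed in this post
CORE_LIBRARIES = ["numpy", "scipy", "pandas", "matplotlib"]


def check_libraries(names):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_libraries(CORE_LIBRARIES).items():
        print(name, "installed" if available else "missing")
```

In an Anaconda environment all four would normally be reported as installed, while a fresh plain Python installation would likely report them as missing.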

See ya next time

Brisbane, 24 November 2017

Python for Data Science Cheat Sheet

Below are Python for Data Science cheat sheets that are quite helpful for refreshing our memory, and that can serve as reading material for those who are just starting to use Python for data analysis, data mining, or data science.

 

Python Basic for Data Science

Here is the cheat sheet:
Python basic

A good-quality PDF file can be downloaded here

 

Python NumPy Cheat Sheet

Here is the cheat sheet image:

Numpy Basic

A good-quality PDF file can be downloaded here

 

Python Pandas Cheat Sheet

Here is the cheat sheet image:

Pandas Basic

A good-quality PDF file can be downloaded here

 

Python Bokeh Interactive Visualization Cheat Sheet

Here is the cheat sheet image:

Bokeh

A good-quality PDF file can be downloaded here

 

I got the cheat sheets above from DataCamp.

Hope this is useful.