Suppose we have Excel data and want to plot it as a line chart with a different marker on each line. Why markers? Imagine we have plotted a chart with several lines distinguished only by colour, but we only have black ink: after printing, every line comes out black. That’s why we need markers.
For instance, our data, shown in the table above, records algorithm performance versus k. I want to plot this data as a line chart. In the previous experiment we already saw how to plot a line chart with multiple lines and multiple styles. However, there we declared the style of each line statically, and declaring every line one by one gets tedious as the number of lines grows.
Let’s start coding.
The first step is to load the Excel data into a pandas DataFrame.
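import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt  # needed later for plt.show()

# open the workbook and read Sheet2 into a DataFrame
xl = pd.ExcelFile("Experiment_results.xlsx")
df = xl.parse("Sheet2", header=1, index_col=0)
df.head()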
Loading Excel data into a DataFrame is easy, and a few parameters are especially useful: the sheet name, header, and index_col. In this experiment I pass "Sheet2" because that is where my data lives. I set header to 1, which tells pandas to take the column names from the second row of the sheet (rows are counted from 0); use 0 if the names are in the first row, or None if the sheet has no header row at all. I also set index_col to 0, which means the first column of the Excel sheet becomes the index of the DataFrame. Now we have the DataFrame shown in the table above.
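To see what header and index_col actually do, here is a minimal sketch on in-memory data. It assumes pd.read_csv as a stand-in for the Excel file so it runs without the spreadsheet; read_csv accepts the same header and index_col parameters. The sample rows and the names raw and df_demo are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sheet contents: a title row, then the real header row, then data.
raw = (
    "Algorithm performance,,\n"
    "k,algo_a,algo_b\n"
    "1,0.50,0.40\n"
    "2,0.60,0.55\n"
)

# header=1    -> column names come from the second row (0-based index 1);
#                the title row above it is skipped.
# index_col=0 -> the first column ("k") becomes the DataFrame index.
df_demo = pd.read_csv(io.StringIO(raw), header=1, index_col=0)
print(df_demo)
```

With header=0 the title row would be read as the column names instead, which is usually not what you want for a sheet laid out like this.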
The second step is to set the markers. As I said in the previous experiment, matplotlib supports a lot of markers, and of course I don’t want to assign them one by one manually. Let’s look at the code below:
# create valid markers from mpl.markers
# (MarkerStyle.markers maps each marker code to a descriptive name)
valid_markers = [
    code
    for code, name in mpl.markers.MarkerStyle.markers.items()
    if name != 'nothing'
    and not name.startswith('tick')
    and not name.startswith('caret')
]
# valid_markers = mpl.markers.MarkerStyle.filled_markers
# pick one distinct marker per column
markers = np.random.choice(valid_markers, df.shape[1], replace=False)
Now the ‘markers’ variable holds a list of markers, chosen at random and without repetition, one for each column (df.shape[1] is the number of columns). Because replace=False, this only works while there are at least as many valid markers as columns. Let’s start plotting the data.
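Random picks change on every run, so the same line can get a different marker each time the figure is regenerated. A minimal sketch of a reproducible variant follows, assuming the filled markers from the commented-out line are enough for your chart; pick_markers and its seed parameter are hypothetical names, not part of matplotlib.

```python
import numpy as np
from matplotlib.markers import MarkerStyle


def pick_markers(n_lines, seed=0):
    """Return n_lines marker codes (hypothetical helper for illustration)."""
    pool = list(MarkerStyle.filled_markers)
    rng = np.random.default_rng(seed)  # fixed seed -> same picks every run
    # Sample without replacement while the pool is large enough;
    # otherwise allow repeats so the call does not raise.
    replace = n_lines > len(pool)
    return rng.choice(pool, size=n_lines, replace=replace).tolist()


markers_demo = pick_markers(4)
print(markers_demo)
```

Calling pick_markers(df.shape[1]) would then take the place of the np.random.choice line.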
ax = df.plot(kind='line')
for i, line in enumerate(ax.get_lines()):
    line.set_marker(markers[i])
# adding legend
ax.legend(ax.get_lines(), df.columns, loc='best')
plt.show()
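If you want to try the idea without the Excel file, the sketch below runs end to end on made-up numbers. It is a variation rather than the code above: it cycles through a fixed marker list instead of choosing randomly, and it draws every line in black to mimic a black-and-white print. The DataFrame values and the name df_demo are assumptions for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display

from itertools import cycle

import pandas as pd

# Made-up numbers standing in for the Excel sheet.
df_demo = pd.DataFrame(
    {
        "algo_a": [0.50, 0.60, 0.65],
        "algo_b": [0.40, 0.55, 0.70],
        "algo_c": [0.30, 0.45, 0.50],
    },
    index=pd.Index([1, 2, 3], name="k"),
)

ax = df_demo.plot(kind="line")

# zip stops at the last line, so cycling is safe for any number of columns.
for line, marker in zip(ax.get_lines(), cycle(["o", "s", "^", "D", "v", "x"])):
    line.set_marker(marker)
    line.set_color("black")  # mimic a black-and-white print

# Rebuild the legend so it shows the markers that were just set.
ax.legend(loc="best")

used = [line.get_marker() for line in ax.get_lines()]
print(used)
```

The legend has to be rebuilt after the loop because df.plot draws its legend before the markers are assigned.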
Taraaaa! It’s easy, right?