Profile plot

Profile histograms are used to display the mean value of Y and its error for each bin in X. (definition from ROOT: TProfile Class Reference)

Matplotlib does not provide a built-in profile plot, but one can be built from the 2D histogram of the data.

How to build a profile plot with matplotlib

In [1]:
import numpy as np
import matplotlib.pyplot as plt

# Some random data

N = 10000
x = np.random.rand(N)*2*np.pi # flat distribution between 0 and 2pi
y = np.sin(x) + np.random.normal(0, 0.1, N) +  np.random.normal(0, 0.3, N)*x/5 # sin wave + some spread which increases with x 

plt.hist2d(x,y, 50, cmap='GnBu')
plt.colorbar()
plt.show()

The hist2d function returns four objects: the bin contents (a 2D array), the x edges (1D), the y edges (1D) and the image. The profile is built from the x edges, which delimit the vertical slices of the distribution.
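As a quick check of what these arrays look like, here is a minimal sketch using np.histogram2d, which returns the same counts and edges without drawing anything. The names x_demo, y_demo, h_demo, xe_demo and ye_demo are illustrative only and are not used elsewhere in this notebook.

```python
import numpy as np

# Small illustrative sample (hypothetical data, fixed seed for reproducibility)
rng = np.random.default_rng(0)
x_demo = rng.uniform(0, 2 * np.pi, 1000)
y_demo = np.sin(x_demo) + rng.normal(0, 0.1, 1000)

# Counts and bin edges for 50 x 50 bins, no plotting involved
h_demo, xe_demo, ye_demo = np.histogram2d(x_demo, y_demo, 50)

print(h_demo.shape)   # (50, 50): one entry per (x bin, y bin)
print(xe_demo.size)   # 51: n bins are delimited by n + 1 edges
print(ye_demo.size)   # 51
```

Since n bins need n + 1 edges, the loops below run over xe[:-1] and use xe[i] and xe[i+1] as the limits of slice i.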

For example, let's overlay the profile on the 2D histogram:

In [2]:
h, xe, ye, im = plt.hist2d(x,y, 50, cmap='GnBu')

# bin width
xbinw = xe[1]-xe[0]

# getting the mean and RMS (here: standard deviation) of each vertical slice of the 2D distribution
x_slice_mean, x_slice_rms = [], []
for i in range(len(xe)-1):
    in_slice = (x>xe[i]) & (x<=xe[i+1]) # points falling in the i-th x bin
    x_slice_mean.append( y[in_slice].mean())
    x_slice_rms.append( y[in_slice].std())
    
x_slice_mean = np.array(x_slice_mean)
x_slice_rms = np.array(x_slice_rms)
    
plt.errorbar(xe[:-1]+ xbinw/2, x_slice_mean, x_slice_rms,fmt='_', ecolor='k', color='k')

plt.colorbar()
plt.show()

Defining a convenience function

Function to create the profile quantities

In [3]:
def compute_profile(x, y, nbin=(100,100)):
    
    # use numpy's 2D histogram to get the bin edges without plotting
    h, xe, ye = np.histogram2d(x,y,nbin)
    
    # bin width
    xbinw = xe[1]-xe[0]

    # getting the mean and RMS values of each vertical slice of the 2D distribution
    # the x values are also recomputed, because empty slices are skipped
    x_array      = []
    x_slice_mean = []
    x_slice_rms  = []
    for i in range(xe.size-1):
        yvals = y[ (x>xe[i]) & (x<=xe[i+1]) ]
        if yvals.size>0: # do not fill the quantities for empty slices
            x_array.append(xe[i]+ xbinw/2)
            x_slice_mean.append( yvals.mean())
            x_slice_rms.append( yvals.std())
    x_array = np.array(x_array)
    x_slice_mean = np.array(x_slice_mean)
    x_slice_rms = np.array(x_slice_rms)

    return x_array, x_slice_mean, x_slice_rms
In [4]:
#compute the profile
p_x, p_mean, p_rms = compute_profile(x,y,(30,30))

plt.errorbar(p_x, p_mean, p_rms,fmt='_', ecolor='r', color='r')
plt.show()
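The error bars above show the spread (standard deviation) of y in each slice. The ROOT definition quoted at the top refers to the error on the mean, which is the spread divided by the square root of the number of entries in the slice. The following is a minimal sketch of that variant, not part of the original notebook: compute_profile_sem is a hypothetical helper name, and the binning is done with np.digitize and np.bincount instead of an explicit loop.

```python
import numpy as np

def compute_profile_sem(x, y, nbin=30):
    """Hypothetical helper: bin centres, mean of y and error on the mean per x bin."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), nbin + 1)

    # bin index of each point (0 .. nbin-1); the clip puts x == x.max() in the last bin
    idx = np.clip(np.digitize(x, edges) - 1, 0, nbin - 1)

    # per-bin number of entries, sum of y and sum of y squared
    n = np.bincount(idx, minlength=nbin)
    sum_y = np.bincount(idx, weights=y, minlength=nbin)
    sum_y2 = np.bincount(idx, weights=y * y, minlength=nbin)

    filled = n > 0  # skip empty slices, as compute_profile does
    mean = sum_y[filled] / n[filled]
    # variance from the sums; clamp tiny negative values caused by rounding
    var = np.maximum(sum_y2[filled] / n[filled] - mean**2, 0.0)
    sem = np.sqrt(var / n[filled])  # standard error of the mean

    centres = 0.5 * (edges[:-1] + edges[1:])[filled]
    return centres, mean, sem
```

The three returned arrays can be passed to plt.errorbar in the same way as the output of compute_profile. With many entries per slice these error bars are much shorter than the spread, so which one to draw depends on whether the plot should show the width of the distribution or the precision of the mean.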

Example with some correlation data from the lab

In [5]:
# reading the data
x, y = np.loadtxt('profile_paw/correl.dat', unpack=True)

# zoom on x and y
zoom = (x>6000) & (x<16000) & (y>10000) & (y<19000)

# computing the profile
p_x, p_mean, p_rms = compute_profile(x[zoom],y[zoom],(20,20))
# linear fit
pars, cov = np.polyfit(p_x, p_mean, 1, w=1/p_rms, cov='unscaled')

## PLOT
plt.subplots(dpi=100)
plt.scatter(x[zoom],y[zoom], marker='.', c='gray', s=0.01)
plt.errorbar(p_x, p_mean, p_rms,fmt='_', ecolor='r', color='r', label='profile')
plt.plot(p_x, np.polyval(pars, p_x), '--',color='lime', lw=2, label = 'fit with Ax + B (A={:.1f}, B={:.1f})'.format(pars[0],pars[1]))
plt.legend()
plt.show()

Other methods

Source: Stack Overflow Plotting profile hitstograms in python

  • Seaborn library and the regplot function (example)
  • Numpy digitize and a Pandas Dataframe object (example)
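As an illustration of the second approach, here is a minimal sketch combining np.digitize with a pandas groupby. It is not taken from the linked example; the data and the names x_demo, y_demo, edges, df and profile are assumptions made for this sketch.

```python
import numpy as np
import pandas as pd

# Illustrative data (hypothetical, fixed seed)
rng = np.random.default_rng(1)
x_demo = rng.uniform(0, 2 * np.pi, 5000)
y_demo = np.sin(x_demo) + rng.normal(0, 0.1, 5000)

edges = np.linspace(0, 2 * np.pi, 21)  # 21 edges, i.e. 20 bins
df = pd.DataFrame({"x": x_demo, "y": y_demo})

# np.digitize returns i such that edges[i-1] <= x < edges[i]
df["bin"] = np.digitize(df["x"].to_numpy(), edges)

# one row per populated bin; note that pandas std uses ddof=1, numpy's default is ddof=0
profile = df.groupby("bin")["y"].agg(["mean", "std", "count"])

# bin centres computed from the edges on each side of the bin
i = profile.index.to_numpy()
profile["x"] = 0.5 * (edges[i - 1] + edges[i])
```

The resulting table can be drawn with plt.errorbar(profile["x"], profile["mean"], profile["std"], fmt='_'). Empty bins simply do not appear in the groupby result, which plays the same role as the yvals.size>0 test in compute_profile.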