Sometimes it is necessary to save analysis results in the form of histograms. This page describes how to do it using 3 file formats: text files, NumPy files (npy/npz) and ROOT files.
Basically, a histogram can be represented by 2 arrays: the histogram content and the binning. More complicated considerations can be made about the errors, but here they are neglected and a simple sqrt(N) error per bin is taken as implicit. No dedicated error array is treated.
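As a minimal sketch of the two-array representation, the snippet below builds a histogram with NumPy and derives the implicit sqrt(N) errors on the fly. The fixed random seed is an assumption added only to make the example reproducible.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed: assumption, for reproducibility only
data = rng.normal(10, 2, 1000)

# content (N values) and binning (N+1 bin edges)
h, bins = np.histogram(data, bins=20)

# implicit Poisson errors: sqrt(N) per bin, computed when needed and never saved
errors = np.sqrt(h)

print(len(h), len(bins))
```

The binning array always has one element more than the content array, which matters for the text format described next.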
Saving histograms in text files is not really useful and not recommended. It is not efficient, and things can get very complicated if several histograms have to be saved. Use it only if you want to inspect the histogram content in a text file for some reason, or if you have to avoid any Python and/or ROOT related object.
import matplotlib.pyplot as plt
import numpy as np
import mplhep as hep # <-- nice histogram plotting
# histo creation
# data & plot
data = np.random.normal(10,2,1000)
h, bins = np.histogram(data, bins=20)
hep.histplot(h,bins)
plt.show()
# saving to a text file. Both arrays must have the same size, so the last bin edge is dropped
np.savetxt('histos.txt', (h, bins[:-1]) )
The resulting file contains the 2 arrays in separate rows, which is the natural format for a later import with loadtxt.
%cat histos.txt
If you want one array per column, stack the 2 arrays along axis 1 (this is the transpose of the row layout)
stack = np.stack((h, bins[:-1]), axis=1) # use axis=0 to stack along rows and axis=1 for columns
np.savetxt('histos_stacked.txt', stack)
%cat histos_stacked.txt
H = np.loadtxt('histos.txt')
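# the saved edges are the left bin edges, so each value extends to the right ('steps-post')
plt.plot(H[1], H[0], drawstyle='steps-post')
plt.show()
# note that hep.histplot cannot be used because it requires the bins array to have N+1 elements for N content values.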
If the histo arrays are saved in columns ('histos_stacked.txt'), the unpack=True option is necessary
H = np.loadtxt('histos_stacked.txt', unpack=True)
plt.plot(H[1], H[0], drawstyle='steps-post') # left bin edges: each value extends to the right
plt.show()
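Since the text files above store only the N left edges, the upper edge of the last bin is lost. A minimal sketch of how it could be rebuilt after loading is given below, under the assumption that the binning is uniform. The temporary directory and the seed are illustrative choices, not part of the original recipe.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed: assumption, for reproducibility only
h, bins = np.histogram(rng.normal(10, 2, 1000), bins=20)

# write to a temporary directory to avoid touching the working area
path = os.path.join(tempfile.mkdtemp(), 'histos.txt')
np.savetxt(path, (h, bins[:-1]))

# one array per row: content first, left bin edges second
content, left_edges = np.loadtxt(path)

# assumption: uniform binning, so the missing upper edge lies one bin width
# beyond the last left edge
width = left_edges[1] - left_edges[0]
edges = np.append(left_edges, left_edges[-1] + width)
```

With the N+1 edges restored, the pair (content, edges) has the shape expected by hep.histplot. For non-uniform binning this trick does not apply, and the last edge would have to be saved explicitly.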
Numpy files (usually with the npy extension for a single array, or npz for an archive of several arrays, optionally compressed) are faster to read/write.
data = np.random.normal(10,2,1000)
h, bins = np.histogram(data, bins=20)
hep.histplot(h,bins)
plt.show()
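Use savez to pack several arrays in a single npz archive (savez_compressed also compresses it). The arrays can be named with keyword arguments: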
np.savez('histos',histo=h, binning=bins)
%ls -hrlt histos.*
The arrays are accessed like a dictionary (key-value)
data = np.load('histos.npz')
# printout the arrays contained in the compressed archive
print(data.files)
hep.histplot(data['histo'], data['binning'])
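To make the difference between the plain and the compressed archive explicit, here is a small sketch that writes the same histogram with np.savez and np.savez_compressed and reads it back. The file names and the temporary directory are illustrative assumptions.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed: assumption, for reproducibility only
h, bins = np.histogram(rng.normal(10, 2, 1000), bins=20)

tmpdir = tempfile.mkdtemp()
plain = os.path.join(tmpdir, 'histos.npz')
packed = os.path.join(tmpdir, 'histos_compressed.npz')

np.savez(plain, histo=h, binning=bins)              # archive without compression
np.savez_compressed(packed, histo=h, binning=bins)  # same content, compressed

# both archives are read with np.load, and the arrays keep their names
with np.load(packed) as archive:
    h_back = archive['histo']
    bins_back = archive['binning']

print(os.path.getsize(plain), os.path.getsize(packed))
```

For arrays as small as a single histogram the size difference is likely to be marginal. Compression is more interesting when many histograms are packed together.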
Several histos can be packed as needed. For instance, these are 10 histos with the same binning (total = 11 arrays)
histolist=[]
for i in range(10):
    data = np.random.normal(10,2,1000)
    # a fixed range guarantees that all the histograms share the same binning
    h, bins = np.histogram(data, bins=20, range=(2,18))
    histolist.append(h)
# a total of 11 arrays is saved: the binning is stored under the given name, the histograms take the default names arr_0, arr_1, ...
np.savez('manyhistos.npz', *histolist, binning=bins)
Reopen and plot data
data = np.load('manyhistos.npz')
print(data.files)
fig, ax = plt.subplots()
for i in range(10):
    hep.histplot(data[f'arr_{i}'], data['binning'], label=f'arr_{i}')
plt.legend()
plt.show()
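The default names arr_0, arr_1, ... are not very descriptive. A possible alternative, sketched here, is to collect the histograms in a dictionary and pass it to np.savez as keyword arguments, so that each array gets an explicit name. The names histo0, histo1, ..., the file name and the fixed range are assumptions made for the illustration.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed: assumption, for reproducibility only

histos = {}
for i in range(10):
    # fixed range (assumption) so that all the histograms share the same binning
    h, bins = np.histogram(rng.normal(10, 2, 1000), bins=20, range=(2, 18))
    histos[f'histo{i}'] = h  # hypothetical naming scheme

path = os.path.join(tempfile.mkdtemp(), 'manyhistos_named.npz')
np.savez(path, binning=bins, **histos)  # every array is saved under its own name

with np.load(path) as archive:
    names = sorted(archive.files)
    first = archive['histo0']

print(names)
```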
data = np.random.normal(10,2,1000)
h, bins = np.histogram(data, bins=20)
import uproot
# opening a file for writing (uproot.open is read-only)
outfile = uproot.recreate('histos.root')
outfile['myhisto'] = (h, bins)
Opening with ROOT:
root -l histos.root
root [0] myhisto->Draw()
For several histos:
outfile = uproot.recreate('manyhistos.root')
for i in range(10):
    data = np.random.normal(10,2,100000)
    h, bins = np.histogram(data, bins=20)
    # names will be myhisto0, myhisto1, ...
    outfile[f'myhisto{i}'] = (h, bins)
Opening and plotting with ROOT
root -l -b -q open_multi_histo.C
Macro content:
void open_multi_histo(){
  TFile * f = new TFile("manyhistos.root");
  TCanvas * c = new TCanvas();
  c->Divide(2,5);
  // TH1 base class pointer: valid whatever concrete type (TH1F, TH1D, ...) was written in the file
  TH1 * h[10] ;
  for (int i=0;i<10;i++) {
    h[i] = (TH1 *) f->Get(Form("myhisto%d", i));
    c->cd(i+1);
    h[i]->Draw();
  }
  c->SaveAs("many.png");
  return;
}
inputfile = uproot.open('manyhistos.root')
print(inputfile.keys())
for i in range(10):
    hep.histplot(inputfile[f'myhisto{i}'], label=f'myhisto{i}')
plt.legend()
plt.show()