[Shorts-2] Papermill: Adding Parameters to Python Notebooks and Executing Them Like a Function
Python notebooks are great when you are experimenting or ideating: you can quickly test your ideas. And before you realize it, you end up writing the entire code in a notebook. The biggest pain point is then converting code spread over 40-50 cells into a Python function so that you can run it in a loop over multiple inputs. This is where Papermill helps.
Let us understand this with a sample script.
Problem
If we want to run the entire script for multiple files, say names = ["def.csv","ghi.csv","abc.csv"],
- We will have to push all the code into a function with name as the argument, OR Restart & Run the notebook, changing the variable name for every file.
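To make the first option concrete, here is a minimal sketch of what "pushing the code into a function" could look like. The function names and the body (counting the data rows of a CSV file) are hypothetical stand-ins for the notebook's real cells, which are not shown here.

```python
import csv


def summarize_csv(name):
    """Hypothetical stand-in for the notebook's cells: count data rows in one CSV."""
    with open(name, newline="") as fh:
        rows = list(csv.reader(fh))
    # Assumption: the first row of each file is a header.
    return {"file": name, "rows": max(len(rows) - 1, 0)}


def run_all(names):
    """Call the wrapped notebook code once per file name."""
    return [summarize_csv(name) for name in names]


# Usage, assuming these files exist in the working directory:
# run_all(["def.csv", "ghi.csv", "abc.csv"])
```

This works, but every cell has to be moved and re-indented by hand, which is exactly the chore the rest of this post tries to avoid.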
Papermill Solution
- Papermill asks you to tag the cell that holds the variables you want to treat as parameters. Give your variables' cell the tag parameters.
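The tag is normally added through the notebook UI, but it is stored as plain JSON in the cell's metadata. The sketch below shows where it ends up, using only the standard library; the helper name tag_as_parameters and the tiny in-memory notebook are illustrative assumptions, not part of Papermill.

```python
import json


def tag_as_parameters(notebook, cell_index):
    """Add the 'parameters' tag to one cell of a notebook given as a dict."""
    cell = notebook["cells"][cell_index]
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if "parameters" not in tags:  # keep the operation idempotent
        tags.append("parameters")
    return notebook


# A tiny in-memory notebook (hypothetical content) with one variables cell.
demo = {
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "source": ['name = "abc.csv"'],
            "outputs": [],
            "execution_count": None,
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

tag_as_parameters(demo, 0)
print(json.dumps(demo["cells"][0]["metadata"]))  # {"tags": ["parameters"]}
```

The value assigned in the tagged cell (here "abc.csv") acts as the default; Papermill overrides it at execution time.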
Now, use the following code to execute the notebook with different arguments:
import papermill as pm

names = ["abc.csv", "bcd.csv", "efg.csv"]
for name in names:
    pm.execute_notebook(
        'papermill-in.ipynb',        # input notebook
        f'out_pm_{name}.ipynb',      # output notebook, one per file
        parameters=dict(name=name),  # value injected for the tagged variable
    )
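The code above executes the notebook once per file, injecting the parameters each time. You can look at the injected parameters in the output notebooks, for example in out_pm_bcd.csv.ipynb, where Papermill adds a new cell right after the tagged one that overrides name.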
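If you would rather check the injected values without opening each output notebook, a small sketch like the following can help. It relies on Papermill marking the cell it adds with the tag injected-parameters; the helper names are my own, and the notebook file is read as plain JSON.

```python
import json


def injected_parameter_source(notebook):
    """Return the source of the cell tagged 'injected-parameters', or None."""
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if "injected-parameters" in tags:
            source = cell.get("source", "")
            # Notebook JSON stores source either as a string or a list of lines.
            return "".join(source) if isinstance(source, list) else source
    return None


def read_injected_parameters(path):
    """Load an executed notebook from disk and return its injected cell source."""
    with open(path, encoding="utf-8") as fh:
        return injected_parameter_source(json.load(fh))


# Usage, assuming the loop above has already produced this file:
# print(read_injected_parameters("out_pm_bcd.csv.ipynb"))
```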