[Shorts-2] Papermill: Adding Parameters to Python Notebooks and Executing Them Like a Function
Python notebooks are great when you are experimenting or ideating: you can test your ideas quickly. Before you realize it, you end up writing the entire code in a notebook. The biggest pain point is then converting code spread across 40-50 cells into a Python function so that you can run it in a loop. This is where Papermill helps.
Let us understand this with a sample script:
![](http://gitlostmurali.com//assets/images/papermill_samplecode.png)
Problem
If we want to run the entire script with multiple names = ["def.csv","ghi.csv","abc.csv"],
- We will have to push all the code into a function with name as the argument, OR
- Restart & Run the notebook while we change the variable name for every file.
Papermill Solution
- Papermill asks you to mark the cell that holds the variables you want to treat as parameters by giving it the tag parameters. You can tag your variables' cell in the following way:
![](http://gitlostmurali.com//assets/images/papermill-tags.png)
![](http://gitlostmurali.com//assets/images/papermill-tag1.png)
![](http://gitlostmurali.com//assets/images/papermill-tag2.png)
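The tag added through the UI above is stored in the notebook file itself, as a `tags` list in the cell's metadata. As a rough sketch of what that looks like, the snippet below adds the `parameters` tag to a cell using only the standard `json` module. The helper name `tag_parameters_cell`, the cell index, and the in-memory notebook are made up for illustration; the Jupyter UI shown in the screenshots remains the usual way to do this.

```python
import json


def tag_parameters_cell(notebook: dict, cell_index: int = 0) -> dict:
    """Add the 'parameters' tag to one code cell of a notebook dict (nbformat v4 layout)."""
    cell = notebook["cells"][cell_index]
    if cell["cell_type"] != "code":
        raise ValueError("only code cells can hold parameters")
    # Tags live under cell["metadata"]["tags"]; create the list if it is missing.
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if "parameters" not in tags:
        tags.append("parameters")
    return notebook


# Hypothetical minimal notebook: a single cell holding the default value of `name`.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": 'name = "abc.csv"',
        }
    ],
}

tagged = tag_parameters_cell(notebook)
print(json.dumps(tagged["cells"][0]["metadata"]))  # {"tags": ["parameters"]}
```

The value assigned in the tagged cell acts as the default: it is what the notebook uses when it is run by hand, without Papermill.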
Now, use the following code to execute the notebook with different arguments:
import papermill as pm

names = ["abc.csv", "bcd.csv", "efg.csv"]
for name in names:
    pm.execute_notebook(
        'papermill-in.ipynb',        # input notebook
        f'out_pm_{name}.ipynb',      # output notebook
        parameters=dict(name=name),  # parameters
    )
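The code above executes the notebook once per file, injecting the parameters each time. You can look at the injected parameters in the output notebooks, where Papermill adds them in a new cell tagged injected-parameters. For example, in out_pm_bcd.csv.ipynb: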
![](http://gitlostmurali.com//assets/images/pm-injectedparams.png)
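The same check can be done without opening the notebook. Papermill places the injected values in a new cell tagged `injected-parameters`, right after the `parameters` cell, so a minimal sketch can look for that tag with only the `json` module. The function names `load_notebook` and `find_injected_parameters` are hypothetical, and the sample dict is a hand-written stand-in that mimics the relevant part of an output notebook, not a real Papermill output.

```python
import json


def load_notebook(path: str) -> dict:
    """Read an .ipynb file (plain JSON) into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_injected_parameters(notebook: dict) -> list:
    """Return the source text of every cell tagged 'injected-parameters'."""
    found = []
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if "injected-parameters" in tags:
            source = cell.get("source", "")
            # The notebook format stores source as a string or as a list of lines.
            if isinstance(source, list):
                source = "".join(source)
            found.append(source)
    return found


# Hand-written stand-in for an output notebook (assumption, for illustration only).
sample_output = {
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "source": 'name = "abc.csv"',
        },
        {
            "cell_type": "code",
            "metadata": {"tags": ["injected-parameters"]},
            "source": ["# Parameters\n", 'name = "bcd.csv"\n'],
        },
    ]
}

print(find_injected_parameters(sample_output))
```

For a real run, pass the result of `load_notebook("out_pm_bcd.csv.ipynb")` to `find_injected_parameters` instead of the sample dict.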