Convert IPYNB to PDF in Python: A Programmatic Guide with nbconvert
Beyond the command line: convert Jupyter Notebooks to PDF directly in Python using nbconvert's API. Includes batch conversion, custom templates, and embedding in Flask/Django apps.
The jupyter nbconvert command-line tool is great for one-off conversions, but sometimes you need to convert notebooks programmatically — batch processing, embedded in a web app, or as part of a larger Python pipeline. nbconvert exposes a full Python API for exactly this.
This guide covers the practical patterns: single conversion, batch conversion, custom templates, execution, and integration with Flask/Django.
Setup
pip install nbconvert nbformat
# For webpdf (no LaTeX needed)
pip install "nbconvert[webpdf]"
playwright install chromium
You'll also want nbformat for reading notebook files directly.
Pattern 1 — Single Notebook Conversion
The simplest programmatic conversion:
import nbformat
from nbconvert import PDFExporter

# Load the notebook
with open("notebook.ipynb", encoding="utf-8") as f:
    nb = nbformat.read(f, as_version=4)

# Configure the exporter
pdf_exporter = PDFExporter()
# latex_command is a list of arguments, not a single string
pdf_exporter.latex_command = ["xelatex", "{filename}", "-quiet"]  # Unicode-safe

# Convert
body, resources = pdf_exporter.from_notebook_node(nb)

# Save
with open("notebook.pdf", "wb") as f:
    f.write(body)
The body is the PDF as bytes; resources contains metadata and any side files (like embedded images extracted separately).
Pattern 2 — Using webpdf (No LaTeX)
If you don't have LaTeX installed, use WebPDFExporter:
import nbformat
from nbconvert import WebPDFExporter

with open("notebook.ipynb", encoding="utf-8") as f:
    nb = nbformat.read(f, as_version=4)

exporter = WebPDFExporter()
body, resources = exporter.from_notebook_node(nb)

with open("notebook.pdf", "wb") as f:
    f.write(body)
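This uses headless Chromium (driven by Playwright) under the hood. It is slower than a plain HTML export because a browser has to start and render the page, but it typically finishes sooner than a LaTeX build.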
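The choice between the two exporters can also be made at runtime. The following is a minimal sketch, assuming that finding xelatex on PATH is a good enough signal that the LaTeX route will work. The helper names pick_exporter_name and load_exporter are hypothetical; get_exporter and the registered names "pdf" and "webpdf" come from nbconvert itself.

```python
import shutil


def pick_exporter_name(which=shutil.which):
    """Return "pdf" if xelatex is on PATH, else "webpdf".

    `which` is injectable so the decision can be tested without LaTeX.
    """
    return "pdf" if which("xelatex") else "webpdf"


def load_exporter(name=None):
    """Instantiate the exporter registered under `name` (needs nbconvert)."""
    # Imported lazily so the PATH check above works without nbconvert installed.
    from nbconvert.exporters import get_exporter

    exporter_class = get_exporter(name or pick_exporter_name())
    return exporter_class()
```

If you would rather default to the browser route, swap the two return values or pass "webpdf" to load_exporter explicitly.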
Pattern 3 — Executing the Notebook During Conversion
If the notebook doesn't have outputs saved, execute it as part of conversion:
import nbformat
from nbconvert import PDFExporter
from nbconvert.preprocessors import ExecutePreprocessor

with open("notebook.ipynb", encoding="utf-8") as f:
    nb = nbformat.read(f, as_version=4)

# Execute first
executor = ExecutePreprocessor(timeout=600, kernel_name="python3")
executor.preprocess(nb, {"metadata": {"path": "."}})

# Then convert
exporter = PDFExporter()
body, resources = exporter.from_notebook_node(nb)

with open("notebook.pdf", "wb") as f:
    f.write(body)
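Set timeout explicitly so that a cell that never finishes (an infinite loop, for example) fails the conversion instead of stalling it. The value is in seconds and applies to each cell.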
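Whether a notebook needs executing can be estimated from the file itself, since a notebook is JSON. The sketch below is a heuristic, not an nbconvert feature: it assumes the nbformat 4 layout and treats a non-empty code cell with no execution_count as "never run". The function names are hypothetical.

```python
import json


def needs_execution(nb):
    """Return True if any non-empty code cell has never been run.

    `nb` is the notebook as a plain dict (nbformat 4 JSON layout).
    """
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if text.strip() and cell.get("execution_count") is None:
            return True
    return False


def load_notebook_json(path):
    """Read a .ipynb file as a plain dict without needing nbformat."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A pipeline could call needs_execution(load_notebook_json("notebook.ipynb")) and run ExecutePreprocessor only when it returns True, which avoids re-running notebooks that already carry their outputs.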
Pattern 4 — Batch Conversion
Convert every notebook in a directory:
import nbformat
from nbconvert import WebPDFExporter
from pathlib import Path

exporter = WebPDFExporter()
for nb_path in Path("notebooks").glob("*.ipynb"):
    try:
        # Read inside the try block so a malformed file is also caught
        with open(nb_path, encoding="utf-8") as f:
            nb = nbformat.read(f, as_version=4)
        body, _ = exporter.from_notebook_node(nb)
        out_path = nb_path.with_suffix(".pdf")
        with open(out_path, "wb") as f:
            f.write(body)
        print(f"Converted: {nb_path.name}")
    except Exception as e:
        print(f"Failed: {nb_path.name}: {e}")
Wrap each conversion in a try/except so one bad notebook doesn't kill the batch.
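For anything beyond a quick script, it can help to separate the directory walk from the conversion itself and to collect failures rather than print them. The sketch below assumes a hypothetical convert_one(nb_path, pdf_path) callable that you supply, for example a thin wrapper around WebPDFExporter. Unlike the loop above, it recurses into subdirectories and skips Jupyter's .ipynb_checkpoints folders.

```python
from pathlib import Path


def convert_batch(src_dir, convert_one):
    """Run `convert_one(nb_path, pdf_path)` for each notebook under `src_dir`.

    Returns (converted, failed): a list of PDF paths and a dict mapping
    each failing notebook path to its error message.
    """
    converted, failed = [], {}
    for nb_path in sorted(Path(src_dir).rglob("*.ipynb")):
        # Jupyter keeps autosave copies here; converting them duplicates work.
        if ".ipynb_checkpoints" in nb_path.parts:
            continue
        pdf_path = nb_path.with_suffix(".pdf")
        try:
            convert_one(nb_path, pdf_path)
        except Exception as exc:  # keep going; report at the end
            failed[nb_path] = str(exc)
        else:
            converted.append(pdf_path)
    return converted, failed
```

Returning the failures lets the caller decide what to do with them, such as retrying, logging, or exiting with a non-zero status.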
Pattern 5 — Custom Templates
nbconvert uses Jinja2 templates. You can customize them to change the PDF's appearance:
import nbformat
from nbconvert import WebPDFExporter

with open("notebook.ipynb", encoding="utf-8") as f:
    nb = nbformat.read(f, as_version=4)

# Use a built-in HTML template with the browser-based exporter
exporter = WebPDFExporter(template_name="classic")
# Or point to a custom template file
# (nbconvert 6 and later expect Jinja2 files, usually ending in .j2)
# exporter.template_file = "my_template.tpl"
body, resources = exporter.from_notebook_node(nb)

with open("notebook.pdf", "wb") as f:
    f.write(body)
Built-in HTML templates, which WebPDFExporter can use, include classic, lab, and basic. PDFExporter renders through LaTeX and uses the latex template instead, so an HTML template such as classic will not work with it. You can write your own template by extending one of these.
Pattern 6 — Excluding Specific Cells
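Use cell tags to exclude cells from the PDF. Tag a cell with remove_cell in the notebook editor, then tell TagRemovePreprocessor which tags to act on:
import nbformat
from nbconvert import PDFExporter
from traitlets.config import Config

# The tag name is a convention, not a default: cells are only removed
# for tags listed in the TagRemovePreprocessor configuration
c = Config()
c.TagRemovePreprocessor.remove_cell_tags = {"remove_cell"}
c.TagRemovePreprocessor.enabled = True
exporter = PDFExporter(config=c)

with open("notebook.ipynb", encoding="utf-8") as f:
    nb = nbformat.read(f, as_version=4)
body, _ = exporter.from_notebook_node(nb)

with open("notebook.pdf", "wb") as f:
    f.write(body)
Other useful options on the same preprocessor: remove_input_tags (hide code, keep output) and remove_all_outputs_tags (hide output, keep code). Each takes the tag names you choose, for example remove_input and remove_output.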
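Tags are stored in each cell's metadata under the tags key, so a plain dictionary walk can report which cells carry a given tag before you export. This is a minimal sketch; cells_with_tag is a hypothetical helper, and it works on either the raw JSON dict or the object returned by nbformat.read.

```python
def cells_with_tag(nb, tag):
    """Return the indices of cells whose metadata.tags contains `tag`."""
    return [
        index
        for index, cell in enumerate(nb.get("cells", []))
        if tag in cell.get("metadata", {}).get("tags", [])
    ]
```

Checking the result before conversion is a cheap way to catch a misspelled tag, which would otherwise leave the cell in the PDF without any warning.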
Pattern 7 — Integration with Flask
Expose conversion as a web endpoint:
# app.py
from flask import Flask, request, send_file
import nbformat
from nbconvert import WebPDFExporter
import io

app = Flask(__name__)


@app.route("/convert", methods=["POST"])
def convert():
    if "notebook" not in request.files:
        return "No notebook uploaded", 400
    file = request.files["notebook"]
    nb = nbformat.reads(file.read().decode("utf-8"), as_version=4)
    exporter = WebPDFExporter()
    body, _ = exporter.from_notebook_node(nb)
    # Send the PDF bytes straight from memory
    return send_file(
        io.BytesIO(body),
        mimetype="application/pdf",
        as_attachment=True,
        download_name="converted.pdf",
    )


if __name__ == "__main__":
    app.run(debug=True)
Test with:
curl -F "notebook=@notebook.ipynb" http://localhost:5000/convert -o output.pdf
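An endpoint like this converts whatever it is sent, so a cheap sanity check on the upload before calling nbformat.reads is worth considering. The sketch below uses only the standard library. The 10 MiB cap, the error strings, and the name validate_upload are all assumptions for illustration, and it deliberately rejects pre-v4 notebooks rather than upgrading them.

```python
import json

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # hypothetical 10 MiB cap


def validate_upload(raw, max_bytes=MAX_UPLOAD_BYTES):
    """Return an error message for a bad upload, or None if it looks usable."""
    if len(raw) > max_bytes:
        return "Notebook too large"
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "Not valid UTF-8 JSON"
    if not isinstance(data, dict) or not isinstance(data.get("cells"), list):
        return "Not a notebook (no cell list)"
    nbformat_major = data.get("nbformat")
    if not isinstance(nbformat_major, int) or nbformat_major < 4:
        return "Unsupported notebook format"
    return None
```

In the Flask view this would run on the bytes from file.read(), returning the message with a 400 status when it is not None. The same function fits the Django view unchanged.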
Pattern 8 — Integration with Django
A Django view doing the same thing:
# views.py
from django.http import HttpResponse
import nbformat
from nbconvert import WebPDFExporter


def convert_notebook(request):
    if request.method != "POST":
        return HttpResponse(status=405)
    uploaded = request.FILES.get("notebook")
    if not uploaded:
        return HttpResponse("No notebook", status=400)
    nb = nbformat.reads(uploaded.read().decode("utf-8"), as_version=4)
    exporter = WebPDFExporter()
    body, _ = exporter.from_notebook_node(nb)
    response = HttpResponse(body, content_type="application/pdf")
    response["Content-Disposition"] = 'attachment; filename="converted.pdf"'
    return response
Pattern 9 — Asynchronous Batch with Celery
For high-volume conversion, offload to a worker:
# tasks.py
from celery import Celery
import nbformat
from nbconvert import WebPDFExporter

app = Celery("tasks", broker="redis://localhost:6379")


@app.task
def convert_notebook(nb_path, output_path):
    with open(nb_path, encoding="utf-8") as f:
        nb = nbformat.read(f, as_version=4)
    exporter = WebPDFExporter()
    body, _ = exporter.from_notebook_node(nb)
    with open(output_path, "wb") as f:
        f.write(body)
    return output_path
This lets your web app return immediately and email the user when the PDF is ready.
Pattern 10 — Streaming Conversion for Large Notebooks
For huge notebooks, a single-pass PDF conversion can cause memory spikes. nbconvert's exporters return the whole document at once rather than cell by cell, so one workaround is to export to HTML first, which is lighter than PDF, and then print that HTML to PDF in a separate step:
import nbformat
from nbconvert import HTMLExporter

with open("huge.ipynb", encoding="utf-8") as f:
    nb = nbformat.read(f, as_version=4)

# Convert to HTML first (lighter than PDF)
exporter = HTMLExporter()
body, _ = exporter.from_notebook_node(nb)

# Then print HTML to PDF via headless browser in chunks
# (Implementation depends on your browser automation tool)
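One way to get chunks to work with is to split the notebook's cell list before conversion, so that each piece can be exported on its own. The sketch below is an assumption about how that might look, not an nbconvert feature: split_notebook is a hypothetical helper that copies the top-level fields (metadata, nbformat version) into each chunk and slices the cells.

```python
def split_notebook(nb, chunk_size):
    """Yield notebook dicts holding at most `chunk_size` cells each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    cells = nb.get("cells", [])
    for start in range(0, len(cells), chunk_size):
        # Carry over everything except the cells, then attach this slice.
        chunk = {key: value for key, value in nb.items() if key != "cells"}
        chunk["cells"] = cells[start:start + chunk_size]
        yield chunk
```

Each chunk is a plain dict, so it would need converting back to a notebook object (nbformat.from_dict should do this) before being handed to an exporter, and the resulting PDFs would still have to be merged with a separate tool. Numbering and cross-references that span chunks will not survive the split.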
Performance Tips
- Reuse exporter instances. Creating a new PDFExporter for every conversion is expensive because it reloads templates. Cache it:

    # Module-level singleton
    _exporter = None

    def get_exporter():
        global _exporter
        if _exporter is None:
            _exporter = WebPDFExporter()
        return _exporter

- Use WebPDFExporter over PDFExporter. No LaTeX dependency, faster, and the output is just as good for most use cases.
- Set timeouts. A cell that never finishes, such as an infinite loop, can stall a conversion and tie up a worker. Always pass an explicit limit, for example ExecutePreprocessor(timeout=600).
- Process in parallel. For batch conversion, use multiprocessing.Pool:

    from multiprocessing import Pool
    from pathlib import Path

    def convert_one(nb_path):
        ...  # conversion logic

    # The guard is required on platforms that spawn worker processes
    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            pool.map(convert_one, Path("notebooks").glob("*.ipynb"))

- Watch memory with LaTeX. PDFExporter launches a separate xelatex process per conversion. Under load, this can OOM your server. WebPDFExporter is gentler.
Conclusion
The nbconvert Python API unlocks everything from simple batch scripts to fully integrated web apps. For most use cases, WebPDFExporter is the right choice — no LaTeX dependency, fast, and produces clean output. Reach for PDFExporter only when you need maximum fidelity or LaTeX-specific features like custom document classes.
Once you've internalized the patterns above, you can stop thinking about "converting notebooks" as a manual step and start treating it as just another function call in your pipeline.