Analyze 2.5 TB of Python Notebooks
In this example we:
Download and unzip the dataset, which is served as 27 zip files, each between 20 and 200 GB.
For each .ipynb file (5 million in total), extract the date and the Python packages used.
Graph trends in Python package popularity over time.
Example coming soon!
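Until then, here is a minimal sketch of what the per-notebook extraction step might look like. It is not the final example. It assumes notebooks follow the nbformat 4 layout (a top-level `cells` list), and it assumes the date for each notebook is already known from elsewhere, since this page does not say where the date comes from. The helper names `extract_packages` and `count_by_month` are hypothetical.

```python
import ast
import json
from collections import Counter


def extract_packages(notebook_text: str) -> set[str]:
    """Return the top-level package names imported in a notebook's code cells.

    Assumes the nbformat 4 layout (a top-level "cells" list); older
    notebooks that nest cells under "worksheets" yield an empty set.
    """
    notebook = json.loads(notebook_text)
    packages: set[str] = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells that still do not parse (e.g. Python 2 code)
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                packages.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                packages.add(node.module.split(".")[0])
    return packages


def count_by_month(records) -> Counter:
    """Tally notebooks per (month, package).

    `records` is an iterable of (iso_date, packages) pairs, where iso_date
    is a "YYYY-MM-DD" string obtained elsewhere (an assumption of this sketch).
    """
    counts: Counter = Counter()
    for iso_date, packages in records:
        month = iso_date[:7]  # "YYYY-MM"
        for package in packages:
            counts[(month, package)] += 1
    return counts
```

Parsing with `ast` rather than a regular expression avoids counting imports that only appear in comments or strings, at the cost of skipping cells that are not valid Python 3. The resulting counter maps `(month, package)` pairs to notebook counts, which is the shape a trend plot would need.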