rogdown

                   |
 ,_    __   __,  __|   __           _  _
/  |  /  \_/  | /  |  /  \_|  |  |_/ |/ |
   |_/\__/ \_/|/\_/|_/\__/  \/ \/    |  |_/
             /|
             \|  "rogdown" - read-only Google Drive PDF downloader


*What is this?* This is a simple script that allows you to download read-only
Google Drive PDF files to your local machine.

*How does it work?* The script uses Python 3 and Selenium to download the PDF
file. Read-only PDF documents are fetched as images in the browser, and the
script will try to redraw the image to a canvas element and save it as a JPEG
image. The images are then combined into a PDF file.
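The redraw-and-combine idea can be sketched roughly as follows. This is not the
script's actual code, only a minimal illustration under stated assumptions: the
names `CANVAS_TO_JPEG_JS`, `data_url_to_bytes` and `save_pages_as_pdf` are
hypothetical, and the JavaScript assumes the page image has already finished
loading in the browser.

```python
import base64
import io

# JavaScript to run inside the page (hypothetical helper, not from rogdown):
# redraw one already-loaded <img> element onto a canvas and return the
# canvas contents as a JPEG data URL.
CANVAS_TO_JPEG_JS = """
const img = arguments[0];
const canvas = document.createElement('canvas');
canvas.width = img.naturalWidth;
canvas.height = img.naturalHeight;
canvas.getContext('2d').drawImage(img, 0, 0);
return canvas.toDataURL('image/jpeg', 1.0);
"""


def data_url_to_bytes(data_url: str) -> bytes:
    """Decode a base64 data URL (as returned by toDataURL) into raw bytes."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    return base64.b64decode(payload)


def save_pages_as_pdf(jpeg_pages: list[bytes], out_path: str) -> None:
    """Combine JPEG-encoded pages into a single multi-page PDF with Pillow."""
    from PIL import Image  # imported lazily; requires the pillow package

    if not jpeg_pages:
        raise ValueError("no pages to save")
    images = [Image.open(io.BytesIO(page)).convert("RGB") for page in jpeg_pages]
    # Pillow writes the first image and appends the rest as further pages.
    images[0].save(out_path, "PDF", save_all=True, append_images=images[1:])
```

With a Selenium driver, each page image element could be passed to
`driver.execute_script(CANVAS_TO_JPEG_JS, element)`, the returned data URL
decoded with `data_url_to_bytes`, and the collected pages handed to
`save_pages_as_pdf`. How the image elements are located in Google Drive's DOM is
deliberately left out here, since that part depends on the page's current
markup.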

*What if the file is private or requires authentication?* By default, the
script will run the Chrome webdriver using the
`--user-data-dir=rogdown-chrome-profile` option. If you want to log in to your
Google account, just run Chrome manually using the
`--user-data-dir=rogdown-chrome-profile` option and log in to your account. The
script will then use the existing session to download the file. In short: log
in once by hand, and the script reuses that browser profile afterwards.
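As a rough sketch of how a persistent profile might be wired into Selenium, the
snippet below builds the same `--user-data-dir` flag and passes it to Chrome.
The helper names `chrome_arguments` and `make_driver` are hypothetical and not
taken from the script; only the profile directory name comes from the text
above.

```python
PROFILE_DIR = "rogdown-chrome-profile"  # directory name used in this README


def chrome_arguments(profile_dir: str = PROFILE_DIR) -> list[str]:
    """Build the Chrome flags that select a persistent profile directory."""
    return [f"--user-data-dir={profile_dir}"]


def make_driver(profile_dir: str = PROFILE_DIR):
    """Start Chrome through Selenium, reusing cookies stored in the profile."""
    from selenium import webdriver  # imported lazily; requires selenium

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(profile_dir):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Because the path is relative, a manually started Chrome should be launched from
the same working directory as the script so that both resolve to the same
profile. Chrome generally refuses to share one profile directory between two
running instances, so the manual window likely needs to be closed before the
script starts.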

*How do I use it?* The script requires a few Python dependencies such as
selenium and pillow. You can install them by running the `pip install .`
command.

*Tested on:*
1. Debian 12
2. Python 3.11.7
3. Output of `uv tree`
   .-------------------------------------------.
   | rogdown v0.1.0                            '
   | ├── pillow v11.0.0                        '
   | └── selenium v4.26.1                      '
   |     ├── certifi v2024.8.30                '
   |     ├── trio v0.27.0                      '
   |     │   ├── attrs v24.2.0                 '
   |     │   ├── idna v3.10                    '
   |     │   ├── outcome v1.3.0.post0          '
   |     │   │   └── attrs v24.2.0             '
   |     │   ├── sniffio v1.3.1                '
   |     │   └── sortedcontainers v2.4.0       '
   |     ├── trio-websocket v0.11.1            '
   |     │   ├── trio v0.27.0 (*)              '
   |     │   └── wsproto v1.2.0                '
   |     │       └── h11 v0.14.0               '
   |     ├── typing-extensions v4.12.2         '
   |     ├── urllib3[socks] v2.2.3             '
   |     │   └── pysocks v1.7.1 (extra: socks) '
   |     └── websocket-client v1.8.0           '
   `-------------------------------------------'
4. Date: 2024-11-06

*Potential future improvements:*
1. Better user experience and error handling
2. A more stable way to interact with Google Drive's potentially changing "DOM
   interface"

*Disclaimer:* This script is only a proof of concept and should not be used for
any malicious purposes. The author is not responsible for any misuse of this
script.