Cleaning Up a 20-Year-Old Gmail Inbox With 1 Lakh+ Unread Emails

My Gmail account is twenty years old. Over those two decades it had quietly accumulated more than 1,00,000 unread emails, sitting on top of an inbox that totalled well into the tens of thousands of conversations. Every “fresh start” attempt over the years had failed for the same reason: Gmail’s web interface gives up on you at scale.

If you’ve ever tried to bulk-delete from a very large Gmail inbox, you know the problem. You select all 50 visible conversations, expect Gmail to offer the friendly “Select all conversations that match this search” link, and… nothing. The link doesn’t appear. You’re stuck deleting 50 at a time, which would take literal weeks to clear an inbox this old.

This post walks through the strategy I used, with a reusable Python script and the full Google Cloud / OAuth setup so you can do the same.

The Strategy

The cleanup ran in three phases, in this order:

  1. Take a full Google Takeout backup first. No deletion happens until I have a complete mbox file of every email on disk. This is the safety net.
  2. Delete all unread mail. Twenty years of unread newsletters, transaction alerts, and notifications. If it was never opened, it almost certainly isn’t important.
  3. Delete everything in the inbox that isn’t marked “Important”. Gmail’s importance markers are a decent proxy for “actually mattered”. What’s left is the signal.

After all three phases, the inbox went from a hundred thousand-plus unread on top of tens of thousands of conversations down to roughly 4,000 emails — twenty years of meaningful correspondence, distilled.

The script we’ll use runs against the Gmail API, which doesn’t have the “Select all” limitation that breaks the web UI. It pages through every matching message and moves them to Trash (recoverable for 30 days) or permanently deletes them.
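To make the paging idea concrete, here is a minimal sketch of how a script can walk every page of results from the Gmail API's messages.list endpoint. It is not the Gist script itself: the function name iter_message_ids is hypothetical, and service is assumed to be an authenticated client built with google-api-python-client.

```python
from typing import Any, Iterator, Optional


def iter_message_ids(service: Any, query: str, page_size: int = 500) -> Iterator[str]:
    """Yield the ID of every message matching a Gmail search query.

    `service` is assumed to be a Gmail API client (googleapiclient "gmail", "v1").
    messages.list returns at most 500 results per page, so keep following
    nextPageToken until the API stops returning one.
    """
    page_token: Optional[str] = None
    while True:
        response = (
            service.users()
            .messages()
            .list(userId="me", q=query, maxResults=page_size, pageToken=page_token)
            .execute()
        )
        # A page with no matches has no "messages" key at all.
        for message in response.get("messages", []):
            yield message["id"]
        page_token = response.get("nextPageToken")
        if not page_token:
            break
```

Counting the matches for a dry run is then just sum(1 for _ in iter_message_ids(service, "is:unread")).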

Phase 0: Back Up With Google Takeout

Before touching anything, get a complete backup. If something goes wrong — wrong query, wrong account, second thoughts a year later — the mbox file is the only way back.

  1. Go to https://takeout.google.com
  2. Click “Deselect all”
  3. Scroll to Mail and check it
  4. Click “All Mail data included” if you want to filter to specific labels (otherwise it backs up everything)
  5. Next → Choose delivery method (download link by email is fine), file type (.zip), and size (50 GB is safest for big inboxes)
  6. Create export

For a 20-year-old inbox, this can take hours or even a day to assemble. Wait for the email with download links. You’ll get one or more zip files containing an .mbox file — a standard email archive format that Thunderbird, Apple Mail, or any mbox viewer can open. Treat this file like a photo of your inbox at that exact moment. Stash it somewhere safe.

Only proceed once you have the mbox file downloaded and verified.
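One way to do that verification, sketched with Python's standard mailbox module: open the extracted .mbox file and count the messages in it. The function name and the example path are placeholders of mine, not something Takeout prescribes, and a count that roughly matches what Gmail reports is only a sanity check, not proof that every message is intact.

```python
import mailbox
from pathlib import Path


def count_mbox_messages(mbox_path: str) -> int:
    """Return how many messages an mbox archive contains.

    The mailbox module scans the file for message boundaries, so this can
    take a while on a multi-gigabyte Takeout export.
    """
    path = Path(mbox_path)
    if not path.is_file():
        raise FileNotFoundError(f"No mbox file at {path}")
    # create=False: never silently create an empty archive if the path is wrong.
    box = mailbox.mbox(str(path), create=False)
    try:
        return len(box)
    finally:
        box.close()


# Example (hypothetical path; use wherever you unzipped the export):
# print(count_mbox_messages("/path/to/your-export.mbox"))
```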

Phase 1: Set Up Google Cloud and the Gmail API

The script uses Google’s official Gmail API, which means a little one-time setup in Google Cloud Console. None of this costs money — the Gmail API is free for personal use.

Step 1: Create a Google Cloud project

Go to https://console.cloud.google.com/projectcreate and create a new project. Name it anything, e.g. gmail-cleanup. Wait a few seconds for it to provision, then make sure it’s selected in the top bar of the console.

Step 2: Enable the Gmail API

Go to https://console.cloud.google.com/apis/library/gmail.googleapis.com, confirm your project is selected, and click Enable.

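Step 3: Configure the OAuth consent screen

Go to https://console.cloud.google.com/apis/credentials/consent. On the newer “Google Auth Platform” UI this lives under Audience in the sidebar. Add your own Gmail address as a test user here; the sign-in step later on expects it.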

Step 4: Create an OAuth client (credentials.json)

Go to https://console.cloud.google.com/apis/credentials and click Create Credentials → OAuth client ID.

A popup appears with your client ID and secret. Click Download JSON, rename the downloaded file to exactly credentials.json, and put it in the folder where you’ll run the script.

That’s it for the Cloud setup. You now have the credentials the script needs to talk to your Gmail.
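If you want to confirm the download is the right kind of file before running anything, a quick check like the following can help. It is a sketch of my own, not part of the Gist script: it assumes the usual layout of Google's OAuth client JSON, where everything sits under a top-level "installed" key (desktop clients) or "web" key (web clients), and the function name is hypothetical.

```python
import json
from pathlib import Path

# Fields the OAuth libraries read from the client section of the file.
REQUIRED_FIELDS = ("client_id", "client_secret", "auth_uri", "token_uri")


def check_client_secrets(path: str = "credentials.json") -> str:
    """Return the client type ("installed" or "web") if the file looks usable.

    Raises ValueError when the expected section or one of its fields is missing,
    which usually means the wrong file was downloaded or renamed.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    for client_type in ("installed", "web"):
        section = data.get(client_type)
        if isinstance(section, dict):
            missing = [name for name in REQUIRED_FIELDS if not section.get(name)]
            if missing:
                raise ValueError(f"{path} is missing fields: {', '.join(missing)}")
            return client_type
    raise ValueError(f"{path} has no 'installed' or 'web' section")


# Example: print(check_client_secrets("credentials.json"))
```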

Phase 2: Set Up the Script

Grab the script from this Gist: gmail_bulk_cleanup.py. Click Raw in the top right of the file, then save the page as gmail_bulk_cleanup.py.

Create a working folder and drop both credentials.json and gmail_bulk_cleanup.py into it.

mkdir -p ~/Desktop/gmail-cleanup
cd ~/Desktop/gmail-cleanup
# move credentials.json and gmail_bulk_cleanup.py into this folder
ls
# should show: credentials.json  gmail_bulk_cleanup.py

Install the Python dependencies:

pip3 install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib

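If you get an “externally-managed-environment” error (common with Homebrew or distro-managed Python), adding --user usually won’t get past it. Create a virtual environment in the working folder and install there instead:

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib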

First run: authenticate

Run the script with no arguments. It will do a dry run (count only, no deletion) but also trigger the OAuth flow on first launch.

python3 gmail_bulk_cleanup.py

Your browser will open. Sign in with the same Gmail address you added as a test user. Google will show a scary warning that says “Google hasn’t verified this app.” That’s normal — the OAuth client is yours and isn’t published. Click Advanced → Go to GmailCleanup (unsafe) and approve the scopes.

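A token.json will be saved next to the script, and later runs reuse it instead of opening the browser again. One catch: while the OAuth app stays in Testing status, Google expires its refresh token after about seven days, so a run a week later will ask you to sign in once more.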
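For the curious, the credentials.json to token.json handoff typically looks something like the sketch below, which follows the pattern in Google's Python quickstart. Treat the details as assumptions: the Gist script may request a different scope or organise this differently, and get_gmail_service is a name I made up.

```python
from pathlib import Path

# Assumption: gmail.modify is enough to move mail to Trash. Permanent deletion
# needs the broader https://mail.google.com/ scope, so the real script may differ.
SCOPES = ["https://www.googleapis.com/auth/gmail.modify"]


def get_gmail_service(credentials_file: str = "credentials.json",
                      token_file: str = "token.json"):
    """Return an authenticated Gmail API client, signing in only when needed."""
    # Imported here so this file can be loaded before the packages are installed.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow
    from googleapiclient.discovery import build

    creds = None
    if Path(token_file).exists():
        # Reuse the token saved by an earlier run.
        creds = Credentials.from_authorized_user_file(token_file, SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            # Access token expired: refresh silently, no browser needed.
            creds.refresh(Request())
        else:
            # First run: open the browser for the consent screen.
            flow = InstalledAppFlow.from_client_secrets_file(credentials_file, SCOPES)
            creds = flow.run_local_server(port=0)
        Path(token_file).write_text(creds.to_json(), encoding="utf-8")
    return build("gmail", "v1", credentials=creds)
```

Keep token.json as private as credentials.json: anyone holding it can act on your mailbox within the granted scope.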

The script will then list how many unread emails it found and exit without touching anything. Note the count.

Phase 3: Delete Unread Mail

The dry run already showed how many unread emails exist. Now actually move them to Trash:

python3 gmail_bulk_cleanup.py --trash

The script asks you to type YES to confirm, then pages through every matching message and applies the TRASH label in batches of 1000.
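The confirm-then-batch step can be sketched roughly as follows. This is my reconstruction under stated assumptions, not the Gist's code: the helper names are hypothetical, service is an authenticated Gmail API client, and the batch size reflects the documented limit of 1000 message IDs per batchModify request.

```python
from typing import Any, Iterable, Iterator, List

BATCH_LIMIT = 1000  # batchModify accepts at most 1000 message IDs per request


def is_confirmed(answer: str) -> bool:
    """Only an exact, upper-case YES counts as confirmation."""
    return answer.strip() == "YES"


def chunked(ids: Iterable[str], size: int = BATCH_LIMIT) -> Iterator[List[str]]:
    """Group message IDs into lists of at most `size` items."""
    batch: List[str] = []
    for message_id in ids:
        batch.append(message_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover partial batch
        yield batch


def trash_messages(service: Any, ids: Iterable[str]) -> int:
    """Add the TRASH label to every message ID; return how many were processed."""
    total = 0
    for batch in chunked(ids):
        service.users().messages().batchModify(
            userId="me",
            body={"ids": batch, "addLabelIds": ["TRASH"]},
        ).execute()
        total += len(batch)
    return total


# Example wiring (needs a real `service` and an iterable of message IDs):
# if is_confirmed(input("Type YES to move these messages to Trash: ")):
#     print(trash_messages(service, message_ids), "messages moved to Trash")
```

One request per thousand messages is why a six-figure backlog clears in minutes rather than hours.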

For my inbox, with 1,00,000+ unread emails, this took a few minutes. Watch Gmail refresh in your browser and the unread counter will plummet.

Important: --trash is reversible. Trashed messages stay in Gmail’s Trash for 30 days and can be restored. There’s also a --delete flag for permanent deletion, but skip it unless you’re sure.

Phase 4: Delete “Everything Else”

With unread mail gone, the inbox still likely contains a large pile of older read-but-irrelevant messages: old transaction alerts, expired offers, mailing list digests you actually opened once. Gmail’s “Important first” inbox layout helpfully splits the inbox into two sections, Important and Everything else.

The plan: delete everything in the second bucket. The query is in:inbox -is:important — “in inbox AND not marked important.”

Dry run first:

python3 gmail_bulk_cleanup.py --query "in:inbox -is:important"

Then trash:

python3 gmail_bulk_cleanup.py --query "in:inbox -is:important" --trash
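The command-line behaviour used so far (dry run by default, --query to change the search, --trash or --delete to act) could be wired up with argparse along these lines. This is an illustrative sketch, not the Gist's actual parser: the default query of is:unread is inferred from the no-argument run described earlier, and making --trash and --delete mutually exclusive is my own choice.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags used in this post."""
    parser = argparse.ArgumentParser(description="Bulk-clean a Gmail mailbox.")
    parser.add_argument(
        "--query",
        default="is:unread",  # assumed default, based on the no-argument dry run
        help="any Gmail search query",
    )
    # Assumption: only one destructive action may be chosen per run.
    action = parser.add_mutually_exclusive_group()
    action.add_argument("--trash", action="store_true",
                        help="move matches to Trash (recoverable for 30 days)")
    action.add_argument("--delete", action="store_true",
                        help="permanently delete matches")
    return parser


def chosen_action(args: argparse.Namespace) -> str:
    """Map parsed flags to one of: dry-run, trash, delete."""
    if args.delete:
        return "delete"
    if args.trash:
        return "trash"
    return "dry-run"  # no flag given: count only, change nothing
```

Defaulting to a dry run means a mistyped query costs nothing; you have to opt in to anything destructive.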

A note on counts: Gmail’s “Everything else” section header may show a different number than the API returns. In my case, the section showed 8,225 but the query returned 4,571. That’s because Gmail’s is:important search semantics don’t perfectly match what’s displayed in the section header (and the section was probably showing the total inbox count). Don’t worry about it — just run the trash, refresh Gmail, and look at what’s left.

After this phase, my inbox was essentially empty of clutter. The 4,197 “Important” emails remained untouched, and that’s the starting point for whatever filtering or further cleanup you want to do manually.

Final Tally

The 4,000 emails left are the actual signal in 20 years of correspondence. From here, you can browse, label, or archive at your leisure.

Useful Query Variations

The script’s --query flag accepts any Gmail search query. A few that might be useful:

Always dry-run first.
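As an illustration of how queries compose, here is a small sketch that joins standard Gmail search operators into query strings. The helper name is hypothetical and the specific combinations are examples of mine, so adjust them to your own mailbox before passing any to --query.

```python
def build_query(*terms: str) -> str:
    """Join Gmail search terms; space-separated terms are ANDed together."""
    return " ".join(term.strip() for term in terms if term.strip())


# Illustrative combinations built from standard Gmail search operators.
EXAMPLE_QUERIES = {
    "promotions older than two years": build_query("category:promotions", "older_than:2y"),
    "large messages with attachments": build_query("has:attachment", "larger:10M"),
    "unread mail from one sender": build_query("is:unread", "from:newsletter@example.com"),
    "inbox mail before 2015": build_query("in:inbox", "before:2015/01/01"),
}

for description, query in EXAMPLE_QUERIES.items():
    print(f"{description}: {query}")
```

A leading minus negates a term, which is how in:inbox -is:important works in the phase above.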

Caveats

The Script

The complete script lives here: gmail_bulk_cleanup.py on GitHub Gist.

To embed it inline in your own blog, drop this snippet wherever you want the code to appear:

<script src="https://gist.github.com/youngindian/98cb0a93ad9f023bc1d40999bfb1d521.js"></script>

Save it as gmail_bulk_cleanup.py in the same folder as your credentials.json and run as described above.