Cleaning Up a 20-Year-Old Gmail Inbox With 1 Lakh+ Unread Emails
My Gmail account is twenty years old. Over those two decades it had quietly accumulated more than 1,00,000 unread emails, sitting on top of an inbox that totalled well into the tens of thousands of conversations. Every “fresh start” attempt over the years had failed for the same reason: Gmail’s web interface gives up on you at scale.
If you’ve ever tried to bulk-delete from a very large Gmail inbox, you know the problem. You select all 50 visible conversations, expect Gmail to offer the friendly “Select all conversations that match this search” link, and… nothing. The link doesn’t appear. You’re stuck deleting 50 at a time, which would take literal weeks to clear an inbox this old.
This post walks through the strategy I used, with a reusable Python script and the full Google Cloud / OAuth setup so you can do the same.
The Strategy
The cleanup ran in three phases, in this order:
- Take a full Google Takeout backup first. No deletion happens until I have a complete mbox file of every email on disk. This is the safety net.
- Delete all unread mail. Twenty years of unread newsletters, transaction alerts, and notifications. If it was never opened, it almost certainly isn’t important.
- Delete everything in the inbox that isn’t marked “Important”. Gmail’s importance markers are a decent proxy for “actually mattered”. What’s left is the signal.
After all three phases, the inbox went from a hundred thousand-plus unread on top of tens of thousands of conversations down to roughly 4,000 emails — twenty years of meaningful correspondence, distilled.
The script we’ll use runs against the Gmail API, which doesn’t have the “Select all” limitation that breaks the web UI. It pages through every matching message and moves them to Trash (recoverable for 30 days) or permanently deletes them.
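For readers curious what "pages through every matching message" looks like, here is a minimal sketch. It is not the Gist's actual code: `iter_message_ids` is a hypothetical helper, and `service` is assumed to be a Gmail API client built with `googleapiclient.discovery.build("gmail", "v1", credentials=...)`.

```python
from typing import Iterator, Optional


def iter_message_ids(service, query: str, page_size: int = 500) -> Iterator[str]:
    """Yield the ID of every message matching a Gmail search query.

    `service` is assumed to be a Gmail API client. users.messages.list
    returns one page of IDs at a time, plus a nextPageToken while more remain.
    """
    page_token: Optional[str] = None
    while True:
        response = (
            service.users()
            .messages()
            .list(userId="me", q=query, maxResults=page_size, pageToken=page_token)
            .execute()
        )
        # A page with no matches has no "messages" key at all.
        for message in response.get("messages", []):
            yield message["id"]
        page_token = response.get("nextPageToken")
        if not page_token:
            break
```

Because the loop only stops when the API stops returning a page token, it keeps going well past the 50 conversations the web UI shows at once.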
Phase 0: Back Up With Google Takeout
Before touching anything, get a complete backup. If something goes wrong — wrong query, wrong account, second thoughts a year later — the mbox file is the only way back.
- Go to https://takeout.google.com
- Click “Deselect all”
- Scroll to Mail and check it
- Click “All Mail data included” if you want to filter to specific labels (otherwise it backs up everything)
- Next → Choose delivery method (download link by email is fine), file type (.zip), and size (50 GB is safest for big inboxes)
- Create export
For a 20-year-old inbox, this can take hours or even a day to assemble. Wait for the email with download links. You’ll get one or more zip files containing an .mbox file — a standard email archive format that Thunderbird, Apple Mail, or any mbox viewer can open. Treat this file like a photo of your inbox at that exact moment. Stash it somewhere safe.
Only proceed once you have the mbox file downloaded and verified.
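One way to do that verification is with Python's standard-library `mailbox` module. The sketch below counts the messages in the archive and prints a few subjects; `summarize_mbox` is a hypothetical helper name, and the path in the usage note is a placeholder for wherever you unzipped the export.

```python
import mailbox


def summarize_mbox(path: str, sample: int = 3) -> dict:
    """Return the message count and the first few subjects from an mbox file."""
    # create=False makes a wrong path fail loudly instead of creating an empty file.
    box = mailbox.mbox(path, create=False)
    try:
        count = 0
        subjects = []
        for message in box:
            if count < sample:
                subjects.append(str(message.get("Subject", "(no subject)")))
            count += 1
        return {"count": count, "sample_subjects": subjects}
    finally:
        box.close()
```

Calling `summarize_mbox("path/to/your-export.mbox")` on a 20-year archive will take a while, since it reads the whole file. If the count is in the right ballpark and the subjects look like your mail, the backup is usable.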
Phase 1: Set Up Google Cloud and the Gmail API
The script uses Google’s official Gmail API, which means a little one-time setup in Google Cloud Console. None of this costs money — the Gmail API is free for personal use.
Step 1: Create a Google Cloud project
Go to https://console.cloud.google.com/projectcreate and create a new project. Name it anything, e.g. gmail-cleanup. Wait a few seconds for it to provision, then make sure it’s selected in the top bar of the console.
Step 2: Enable the Gmail API
Go to https://console.cloud.google.com/apis/library/gmail.googleapis.com, confirm your project is selected, and click Enable.
Step 3: Configure the OAuth consent screen
Go to https://console.cloud.google.com/apis/credentials/consent. On the newer “Google Auth Platform” UI this lives under Audience in the sidebar.
- User type: External
- App name: anything (e.g. GmailCleanup)
- Support email and developer contact: your own email
- Scopes: skip for now (Save and Continue)
- Test users: click “+ Add Users” and add the Gmail address you want to clean up. This is the step everyone forgets, and without it Google blocks your own OAuth flow with “Access blocked: app has not completed verification.”
Step 4: Create an OAuth client (credentials.json)
Go to https://console.cloud.google.com/apis/credentials and click Create Credentials → OAuth client ID.
- Application type: Desktop app
- Name: anything
- Create
A popup appears with your client ID and secret. Click Download JSON, rename the downloaded file to exactly credentials.json, and put it in the folder where you’ll run the script.
That’s it for the Cloud setup. You now have the credentials the script needs to talk to your Gmail.
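A common slip at this stage is downloading a "Web application" client instead of a "Desktop app" one. A quick sanity check is sketched below; `describe_client_secrets` is a hypothetical helper, and it relies on the fact that a Desktop client file keeps its settings under a top-level `installed` key while a Web client uses `web`.

```python
import json


def describe_client_secrets(path: str = "credentials.json") -> str:
    """Return "desktop" or "web" depending on the OAuth client type in the file."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if "installed" in data:
        kind, client = "desktop", data["installed"]
    elif "web" in data:
        kind, client = "web", data["web"]
    else:
        raise ValueError("no 'installed' or 'web' section: not an OAuth client file")
    # Fields an OAuth flow needs in order to start at all.
    missing = [key for key in ("client_id", "auth_uri", "token_uri") if key not in client]
    if missing:
        raise ValueError(f"client file is missing fields: {missing}")
    return kind
```

If this returns "web", go back to Step 4 and create a Desktop app client instead.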
Phase 2: Set Up the Script
Grab the script from this Gist: gmail_bulk_cleanup.py. Click Raw in the top right of the file, then save the page as gmail_bulk_cleanup.py.
Create a working folder and drop both credentials.json and gmail_bulk_cleanup.py into it.
mkdir -p ~/Desktop/gmail-cleanup
cd ~/Desktop/gmail-cleanup
# move credentials.json and gmail_bulk_cleanup.py into this folder
ls
# should show: credentials.json gmail_bulk_cleanup.py
Install the Python dependencies:
pip3 install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
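If you get an “externally-managed-environment” error (common on newer macOS Python), pip is refusing to modify a Python installation that the system manages, and adding --user does not get around that. Create a virtual environment inside the working folder and install there instead. Keep the environment activated whenever you run the script later.
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib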
First run: authenticate
Run the script with no arguments. It will do a dry run (count only, no deletion) but also trigger the OAuth flow on first launch.
python3 gmail_bulk_cleanup.py
Your browser will open. Sign in with the same Gmail address you added as a test user. Google will show a scary warning that says “Google hasn’t verified this app.” That’s normal — the OAuth client is yours and isn’t published. Click Advanced → Go to GmailCleanup (unsafe) and approve the scopes.
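A token.json will be saved next to the script, so later runs skip the browser step. One caveat: while the OAuth app stays in “Testing” status, Google expires its refresh tokens after 7 days. If a later run fails to authenticate, delete token.json and sign in again.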
The script will then list how many unread emails it found and exit without touching anything. Note the count.
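The first-run authentication typically follows the pattern below. This is a sketch modelled on Google's standard installed-app flow, not a copy of the Gist. `get_credentials` is a hypothetical name, and the scope is an assumption: `gmail.modify` is enough to move mail to Trash, while permanent deletion through the API needs the broader `https://mail.google.com/` scope.

```python
import os

# Assumed scope: sufficient for trashing, not for permanent deletion.
SCOPES = ["https://www.googleapis.com/auth/gmail.modify"]


def get_credentials(credentials_path="credentials.json", token_path="token.json"):
    """Load cached credentials, refreshing or re-running the browser flow as needed."""
    # Imported lazily so this file still loads before the dependencies are installed.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    creds = None
    if os.path.exists(token_path):
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)
    if creds and creds.valid:
        return creds
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())  # silent refresh, no browser needed
    else:
        flow = InstalledAppFlow.from_client_secrets_file(credentials_path, SCOPES)
        creds = flow.run_local_server(port=0)  # opens the browser consent page
    # Cache the result so the next run can skip the browser step.
    with open(token_path, "w", encoding="utf-8") as handle:
        handle.write(creds.to_json())
    return creds
```

The returned object is what gets passed to `googleapiclient.discovery.build("gmail", "v1", credentials=...)` to create the API client.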
Phase 3: Delete Unread Mail
The dry run already showed how many unread emails exist. Now actually move them to Trash:
python3 gmail_bulk_cleanup.py --trash
The script asks you to type YES to confirm, then pages through every matching message and applies the TRASH label in batches of 1000.
For my inbox, with 1,00,000+ unread, this took a few minutes. Watch Gmail refresh in your browser: the unread counter will plummet.
Important: --trash is reversible. Trashed messages stay in Gmail’s Trash for 30 days and can be restored. There’s also a --delete flag for permanent deletion, but skip it unless you’re sure.
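The "batches of 1000" step can be sketched as follows. The helper names `chunked` and `trash_messages` are hypothetical, and `service` is again assumed to be a Gmail API client. The 1000 figure is the most IDs the API's `users.messages.batchModify` method accepts in one request.

```python
from typing import Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int = 1000) -> Iterator[List[str]]:
    """Split a stream of message IDs into lists of at most `size` items."""
    batch: List[str] = []
    for message_id in ids:
        batch.append(message_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly short, batch


def trash_messages(service, ids: Iterable[str], batch_size: int = 1000) -> int:
    """Add the TRASH label to every message ID; return how many were sent."""
    moved = 0
    for batch in chunked(ids, batch_size):
        service.users().messages().batchModify(
            userId="me",
            body={"ids": batch, "addLabelIds": ["TRASH"]},
        ).execute()
        moved += len(batch)
    return moved
```

One request per thousand messages is why a six-figure mailbox clears in minutes rather than weeks.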
Phase 4: Delete “Everything Else”
With unread mail gone, the inbox still likely contains a large pile of older read-but-irrelevant messages — old transaction alerts, expired offers, mailing list digests you actually opened once. Gmail’s “Important first” inbox layout helpfully splits these into two sections:
- Important — what Gmail’s importance heuristic flagged as mattering (replies, people you correspond with, etc.)
- Everything else — the rest
The plan: delete everything in the second bucket. The query is in:inbox -is:important — “in inbox AND not marked important.”
Dry run first:
python3 gmail_bulk_cleanup.py --query "in:inbox -is:important"
Then trash:
python3 gmail_bulk_cleanup.py --query "in:inbox -is:important" --trash
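The commands used so far imply a small command-line interface: a query, a dry run by default, and two opt-in destructive flags. A sketch of that interface with `argparse` is below. The flag names come from this post, but the default query of `is:unread`, the mutually exclusive grouping, and the `confirmed` helper are my assumptions about a reasonable design, not a description of the Gist's internals.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where doing nothing destructive is the default."""
    parser = argparse.ArgumentParser(description="Bulk-clean a Gmail mailbox.")
    parser.add_argument(
        "--query",
        default="is:unread",  # assumed default: a bare run counts unread mail
        help="any Gmail search query",
    )
    action = parser.add_mutually_exclusive_group()
    action.add_argument("--trash", action="store_true", help="move matches to Trash")
    action.add_argument("--delete", action="store_true", help="delete matches permanently")
    return parser


def confirmed(answer: str) -> bool:
    """Only an exact, upper-case YES counts as confirmation."""
    return answer.strip() == "YES"
```

With neither flag set, a script built this way only counts matches, which is what makes "dry run first" the path of least resistance.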
A note on counts: Gmail’s “Everything else” section header may show a different number than the API returns. In my case, the section showed 8,225 but the query returned 4,571. That’s because Gmail’s is:important search semantics don’t perfectly match what’s displayed in the section header (and the section was probably showing the total inbox count). Don’t worry about it — just run the trash, refresh Gmail, and look at what’s left.
After this phase, my inbox was essentially empty of clutter. The 4,197 “Important” emails remained untouched, and that’s the starting point for whatever filtering or further cleanup you want to do manually.
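If the mismatch between the section header and the API count bothers you, one thing you can pin down is what the API itself sees. The API lists individual messages, while the web UI groups them into conversations when conversation view is on. The sketch below counts both for a query. `count_messages_and_threads` is a hypothetical helper, `service` is assumed to be a Gmail API client, and this will not explain every discrepancy.

```python
from typing import Tuple


def count_messages_and_threads(service, query: str) -> Tuple[int, int]:
    """Return (message count, distinct conversation count) for a search query."""
    message_count = 0
    thread_ids = set()
    page_token = None
    while True:
        response = (
            service.users()
            .messages()
            .list(userId="me", q=query, maxResults=500, pageToken=page_token)
            .execute()
        )
        for message in response.get("messages", []):
            message_count += 1
            thread_ids.add(message["threadId"])  # each listed message carries its thread
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return message_count, len(thread_ids)
```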
Final Tally
- Started: 1,00,000+ unread on top of tens of thousands of inbox conversations, spread over 20 years
- After Phase 3 (unread deletion): unread count down to nearly zero
- After Phase 4 (Everything else deletion): ~4,000 emails total, all marked Important
- Time spent actively: maybe 30 minutes once the setup was done
- Cost: ₹0
The 4,000 emails left are the actual signal in 20 years of correspondence. From here, you can browse, label, or archive at your leisure.
Useful Query Variations
The script’s --query flag accepts any Gmail search query. A few that might be useful:
- category:promotions (marketing email)
- category:promotions older_than:1y (old marketing email)
- category:social (Facebook, LinkedIn, etc. notifications)
- from:noreply (automated emails)
- is:unread before:2024/01/01 (unread mail older than a date)
- has:attachment larger:10M (big attachments hogging storage)
- in:spam (accumulated spam; Gmail auto-deletes it after 30 days, but you can force it)
Always dry-run first.
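Most of these queries contain spaces, so they must be quoted on the command line or the shell will split them into separate arguments. A small sketch that builds correctly quoted commands is below. `cleanup_command` is a hypothetical helper, and it only assembles the command string; it does not run anything.

```python
import shlex

SCRIPT = "gmail_bulk_cleanup.py"  # file name used throughout this post


def cleanup_command(query: str, trash: bool = False) -> str:
    """Return a shell-safe command line; a dry run unless trash=True."""
    parts = ["python3", SCRIPT, "--query", query]
    if trash:
        parts.append("--trash")
    # shlex.quote leaves safe words alone and single-quotes anything with spaces.
    return " ".join(shlex.quote(part) for part in parts)


print(cleanup_command("category:promotions older_than:1y"))
```

Generating the dry-run and trash commands from the same query string also rules out the classic mistake of dry-running one query and trashing a slightly different one.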
Caveats
- Don’t skip the Takeout backup. Mistakes in the query happen. Trash is recoverable for 30 days; permanent deletion is not.
- Test users limit. The OAuth consent screen allows up to 100 test users without verification. For personal use that’s irrelevant.
- Quotas. The Gmail API has generous quotas (1 billion units per day per project) and this script uses tiny amounts. You won’t hit a limit.
- --trash vs --delete. Use --trash. Always. Empty Trash manually from Gmail when you’re sure.
- Important emails are safe. The -is:important filter in our final query specifically excludes them. Verify with a dry run if you’re nervous.
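To put "tiny amounts" in numbers, here is a back-of-the-envelope estimate. It assumes the per-method costs from Google's published usage-limits table as I understand them (5 quota units per list call, 50 per batchModify call), which are worth re-checking against the current documentation. It also assumes pages of 500 IDs and batches of 1000. `estimate_quota_units` is a hypothetical helper.

```python
import math

LIST_COST = 5           # assumed quota units per users.messages.list call
BATCH_MODIFY_COST = 50  # assumed quota units per users.messages.batchModify call


def estimate_quota_units(message_count: int, page_size: int = 500, batch_size: int = 1000) -> int:
    """Estimate the quota units needed to list and trash `message_count` messages."""
    list_calls = max(1, math.ceil(message_count / page_size))  # at least one list call
    modify_calls = math.ceil(message_count / batch_size)
    return list_calls * LIST_COST + modify_calls * BATCH_MODIFY_COST


print(estimate_quota_units(100_000))  # 200 list calls + 100 batchModify calls -> 6000
```

Under those assumptions, clearing 1,00,000 messages costs about 6,000 units, a rounding error against any of the published limits.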
The Script
The complete script lives here: gmail_bulk_cleanup.py on GitHub Gist.
To embed it inline in your own blog, drop this snippet wherever you want the code to appear:
<script src="https://gist.github.com/youngindian/98cb0a93ad9f023bc1d40999bfb1d521.js"></script>
Save it as gmail_bulk_cleanup.py in the same folder as your credentials.json and run as described above.