How to Import All Your Archive Email Into Gmail
Personal Productivity, SaaS March 28th, 2007
Update (10/24/2007): This post has become unexpectedly popular. After 12K page visits on day one, half a year later it still receives 500-600 visitors every day. However, now that Gmail supports IMAP, it has mostly become obsolete, so I suggest you read my Simplified Guide to Importing All Your Archive Email Into Gmail instead.
This post still has value, mostly in the comments section, where 120 or so readers help out each other on numerous related issues.
The original post:
I finally got sick of all the problems with Outlook, bit the bullet and transferred all my historical email online. Having spent a few days using “native” Gmail (vs. POP to Outlook) I already feel a lot more productive. Ironically I’m writing this on the very day when Yahoo announced unlimited storage – but I’m with Mike on this: message threading, labels and powerful search still make Gmail (the Google Apps flavor) the best choice for me. At least for now – but I keep an eye out for the next incarnation of another product – will name it in due course (if you guessed which one, you’re probably right).
Migrating to a new email service wouldn’t be complete if you couldn’t move all your old “baggage” with you. Apparently this is a burning problem for many, as a year-old post I wrote on the subject is one of my most popular hits ever. Back then I was still happily (?) POP-ing it down to Outlook, but wanted a fast all-in-one searchable archive, and Gmail was the perfect solution. But none of the import methods were perfect – until now. There are several “gmail-loader” tools on the Net, but some simply don’t work, others change the original sender information to the email account they use for the transfer – pretty bad, IMHO. My simple solution a year ago was using Thunderbird with a redirect extension. You can read the steps to achieve this here. Even this solution wasn’t flawless: Gmail listed all historical mail with the date of the transfer – the original date was still preserved and searchable, you just got the list display messed up. This still appears to be the biggest hurdle users face according to this new discussion on Lifehacker.
The final solution comes from Google themselves: now that they quietly expanded Mail Fetcher to Google Apps accounts, and removed the “non-gmail source” restriction, there is a simple yet perfect two-step process to get it all done. Gmail Mail Fetcher fixes the date problem, so now in two steps and using two email accounts you can get it all right.
Step 1: Load all your client-based email to a temporary Gmail account either using my Thunderbird procedure, or, for an easier and more elegant solution, get hold of an IMAP account. Gmail does not support IMAP, but my old provider, 1and1.com, is not a bad choice: 5 email accounts, 2 GB each, with IMAP support for $0.99 / month.
In Outlook (or whatever email client) set up an IMAP account according to the instructions from your online provider. Then folder by folder copy all email into the Inbox on the newly created IMAP account. Don’t forget your Sent Mail folder: yes, that goes into the IMAP Inbox, too. Open all your archives and repeat the same process. Don’t worry if it takes a while: Outlook doesn’t simply copy between local folders, it shoots up all your email to your temporary IMAP server on the web, and you’ll be constrained by your upstream speed (typically lower than downstream). If you have a spare PC, it’s a good idea to use that one.
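If you prefer a script to dragging folders around, the same upload can be sketched in Python. This is only a minimal sketch under a few assumptions of mine: your archive is available as an mbox file (Thunderbird keeps its local folders in that format; an Outlook PST would have to be converted first), and the host name, login and file name in the usage comment are placeholders. The helper name upload_mbox is my own invention, not part of any tool mentioned here.

```python
import imaplib
import mailbox
import time
from email.utils import parsedate_to_datetime


def upload_mbox(imap, mbox_path, target_folder="INBOX"):
    """Append every message in a local mbox file to target_folder.

    `imap` is any object offering imaplib's
    append(mailbox, flags, date_time, message) method, normally an
    imaplib.IMAP4 or IMAP4_SSL connection that has already logged in.
    Returns the number of messages uploaded.
    """
    uploaded = 0
    for msg in mailbox.mbox(mbox_path):
        # Pass the message's own Date header as the IMAP internal date,
        # so the server is not told everything arrived at upload time.
        try:
            when = parsedate_to_datetime(msg["Date"]).timestamp()
        except (TypeError, ValueError):
            when = time.time()  # missing or unparsable Date header
        imap.append(target_folder, None,
                    imaplib.Time2Internaldate(when), msg.as_bytes())
        uploaded += 1
    return uploaded


# Hypothetical usage; host, account and file name are placeholders:
# conn = imaplib.IMAP4_SSL("imap.example.com")
# conn.login("temp-account", "password")
# upload_mbox(conn, "Sent.mbox")
# conn.logout()
```

Like the manual procedure, this sends every message over your upstream link, so expect it to run for a while on a large archive.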
Step 2: Now that your email is online, make sure POP access is enabled from your temporary account. If this is a gmail account (not IMAP), this is the setting you need:
“Enable POP for all mail (even mail that’s already been downloaded)”
Then in your Gmail target account – the final destination where you want to have all your archive mail – set up Gmail Mail Fetcher to pick up all mail from your temporary account. The dates will magically be fixed!
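Before pointing Mail Fetcher at the temporary account, it can save some head-scratching to confirm that POP login actually works from outside. Here is a small sketch using Python's standard poplib; the function name pop_message_count and the credentials are placeholders of mine, and the server names in the usage comment are only the commonly documented ones, so check your provider's own instructions.

```python
import poplib


def pop_message_count(pop, user, password):
    """Log in on an open POP3 connection and return how many messages wait.

    `pop` is a poplib.POP3 or poplib.POP3_SSL object (or anything with
    the same user/pass_/stat/quit methods).
    """
    pop.user(user)
    pop.pass_(password)
    count, _size = pop.stat()  # (message count, mailbox size in bytes)
    pop.quit()
    return count


# Hypothetical usage with placeholder credentials:
# Gmail as the temporary account (POP over SSL, port 995):
#   pop_message_count(poplib.POP3_SSL("pop.gmail.com", 995), "temp@gmail.com", "password")
# A plain POP server without SSL:
#   pop_message_count(poplib.POP3("pop.example.com"), "user", "password")
```

If the login fails here, Mail Fetcher will most likely fail with the same settings, so this is a quick way to separate account problems from Gmail problems.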
Here are Google’s instructions on setting up Mail Fetcher. Do NOT check the button for “Leave a copy of retrieved messages on the server” – you do want Mail Fetcher to “eat” them all from the temporary account, in fact that will be one of your indicators that the transfer is finished. Be prepared for a slow process – Gmail will poll your temporary account at 60-90 minute intervals, fetching 200 emails at a time. At Settings > Accounts you can follow the progress, but ignore the “nnn mails remaining” indicator, as it’s totally wrong. When all done, don’t be alarmed that the number of fetched emails is less than what you started with: your email client (and the IMAP server) counted individual emails, while Gmail will group them into threads, and reports the thread count, which could be significantly lower.
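To get a feel for just how slow, here is a back-of-the-envelope calculation in Python. It simply assumes the figures I observed (200 emails per fetch, one fetch every 60 to 90 minutes) and counts one full interval per batch; the function name is made up for this illustration, and Google may well change the numbers.

```python
import math


def estimate_transfer_hours(total_messages, batch_size=200,
                            interval_minutes=(60, 90)):
    """Return (low, high) estimated hours for Mail Fetcher to fetch everything."""
    fetches = math.ceil(total_messages / batch_size)
    low, high = interval_minutes
    return fetches * low / 60, fetches * high / 60


# Example: a 16,000-message archive needs 80 fetches.
print(estimate_transfer_hours(16000))  # (80.0, 120.0)
```

In other words, an archive of that size would take somewhere between three and five days, so there is no point in watching the progress screen.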
Last, but not least a word on labels / categories: if you nicely organized your Outlook archive in folders, Gmail has no way to preserve that structure. The trick here is to do Steps 1 and 2 in iterations, completely transferring one folder at a time. Then you can set a label for all your fetched email to match the original Outlook folder, and keep on changing it folder by folder.
Finally there is the issue of backup: after all, we’ve heard of disappearing Gmail… If you trust Gmail and are just worried about what may happen to your individual account, there is always the option of setting up a shadow Gmail account which will fetch everything from your primary one. If you want a local archive, “just in case”, either run Outlook to periodically POP your mail down, or I believe Thunderbird has a plugin that allows it to be minimized to the system tray permanently and check the POP server in the background.
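For the local “just in case” archive, a script can stand in for a permanently running mail client. The following is a rough sketch with Python's standard poplib and mailbox modules, assuming POP access is enabled on the account; the function name backup_pop_to_mbox and the file name are placeholders of mine. It only reads messages and never issues a delete, and what the server does with downloaded mail afterwards depends on the account's own POP settings.

```python
import mailbox
import poplib


def backup_pop_to_mbox(pop, mbox_path):
    """Copy every message a logged-in POP3 connection offers into an mbox file.

    Appends to mbox_path if it already exists. Returns the number of
    messages written.
    """
    box = mailbox.mbox(mbox_path)
    count, _size = pop.stat()
    try:
        for number in range(1, count + 1):
            _resp, lines, _octets = pop.retr(number)
            # retr() returns the message as a list of lines without
            # line endings, so join them back into one byte string.
            box.add(b"\n".join(lines) + b"\n")
    finally:
        box.flush()
        box.close()
    return count


# Hypothetical usage with placeholder credentials:
# conn = poplib.POP3_SSL("pop.gmail.com", 995)
# conn.user("me@gmail.com")
# conn.pass_("password")
# backup_pop_to_mbox(conn, "gmail-backup.mbox")
# conn.quit()
```

The resulting mbox file can be opened by Thunderbird and most other mail tools.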
Update (3/28): One potential problem I forgot to mention is that all the fetched email becomes “unread”. Hard to believe, but Gmail does not have a “set all read” feature, and while there are some scripts, I’ve read stories of user accounts being suspended for 24 hours for scripting activity. If anyone has an idea how to change all mail to “read” please comment below, I’ll bring it up here. Thanks.
Update to the update: The solution comes from Jason Brown, and it’s a surprise: Gmail has added a trick, I have no idea when. In Inbox (or wherever the messages are) choose “Select: All” from the list just above the message list. That will select all of the messages that are visible in the list – but here’s the surprise: at the top and bottom of the list, where it used to only say “All 100 conversations on this page are selected”, there is an additional clickable message: “Select all xxxx conversations in Inbox”, which will in fact select *all* the messages in the Inbox. Then choose “Mark as read” from the “More actions…” drop-down list. Job done! You can do this on Inbox, labels, or if you select All Mail, then on the entire account in a single step. Thanks, Jason!
Somewhat related: The Yahoo Mail announcement (unlimited storage) is picking up steam on Techmeme: Google Blogoscoped, PC World: Techlog, Techdirt, Google Operating System, Monkey Bites, CyberNet Technology News, michael parekh on IT, PaulStamatiou.com, Web Strategy, Download Squad, WebProNews, franticindustries, The Webpreneur, Search Engine Land, Liquidmatrix Security Digest, Conversion Rater, larry borsato, Gizmodo, CrunchGear, CenterNetworks and parislemon.
Update (4/7): It’s somewhat obvious, but here’s a tip for backing up your archive online: create another (a third, fourth ..etc) Gmail account, configure Mail Fetcher there with your main account as the source, and voila! – you have a second, third..etc backup copy of all your email. I felt the need to spell this out upon reading Using Google Groups To Backup Gmail by The Google Tutor. It’s an interesting concept and nicely written up, but I think it’s built on fundamentally flawed logic:
- If you’re worried about losing content in your particular Gmail account, why not get a second / third backup as I described above? You have the full Gmail functionality, which you don’t get with Groups. What’s the chance of losing all the accounts at the same time? Besides, this method will back up your “Sent” mail, too, which forwarding to Groups can’t help with.
- On the other hand, if you’re worried about Google in general, then why trust yet another Google service? Groups is no safer than Gmail in that case.
Zoli Erdos
Yahoo Mail Announces Unlimited Storage
Yahoo is announcing that all Yahoo Mail users will have free unlimited email storage starting in May 2007. The current storage limit is 1 GB per account (2 GB for $20/year premium users). With this change, Yahoo leapfrogs Gmail (2.8 GB and growing) and…
I have just made the switch to Gmail this week myself. Did you experience any duplicates of emails appearing in Gmail, joined together in conversations? I got a few of them towards the end of the transfer.
Just great, thank you. I will spend my next weekend backing up my mails.
Shame that all the username combinations I normally use weren’t free when I signed up for Gmail.
(name.surname, name_surname, name+first letter of surname)
That’s the only reason why I don’t use Gmail as my real address.
Will they ever change that xy@gmail.de is the same as xy@gmail.com? I have such a common name, I guess *sigh*
very important: are the dates on your imported emails preserved? if you did a search in gmail for mail dated before 2003, would the correct messages be found?
baratunde: I believe so, and this has been confirmed by other people as well. We have been discussing Mail fetcher on the Google Apps newsgroup for some time and I am very impressed with it so far. Now if they would only implement a Chat Fetcher, as I probably have more archived chat messages in my old gmail account than I had archived email!
David, why not just get your own domain and use it on Google Apps? It’s free.
Yes, they are, that’s the whole point in my post. In fact even prior to Mail Fetcher, with the older import method the original date was preserved. You could search on it, and whenever you displayed an email, it also showed the correct original date.
Only the list display was messed up by the older method, as the list showed the fetch date.
That’s why I am now using the 2-step method: the second step by Mail Fetcher restores the original date even for the list display.
I am POPing mail from an IMAP account and seem to get the correct date also in list view on the first fetch, i.e. is there really a need for a temp Gmail account to fix the dates?
No, you’re right, IMAP preserves the correct date. I did not necessarily propose 2 gmail accounts, just 2 accounts meaning that Mail Fetcher can only get it from a web-account, not your email client directly. So in short:
- First step is upload from client.
- best process for this is IMAP, which Gmail does not support, hence the need for Mail Fetcher to a second account
- if not IMAP, then your dates will be messed up, but the Mail Fetcher process will fix it.
> Hard to believe, but Gmail does not have a “set all read” feature
Go to your Inbox (or wherever the messages are), and choose “Select: All” from the list just above the message list. That will select all of the messages that are visible in the list. Scroll to the bottom of the list and then choose “Select all xxxx conversations in Inbox”, which will select *all* the messages in the Inbox. Then choose “Mark as read” from the “More actions…” drop-down list. Job done!
Wow, I’ll be damned… I don’t know since when they have this option, but never noticed it! Thanks for opening my eyes – I added it to the post above.
SearchCap: The Day In Search, March 28, 2007
Below is what happened in search today, as reported on Search Engine Land and from other places across the web:…
Nice post, very helpful, I may follow suit.
Tech News 3/28/3007
Some technology stories making the headlines today/yesterday.
Yahoo Mail to offer unlimited storage In May, Yahoo will start giving users unlimited storage with their email, up from 1GB in the free version.
HP Exits Media Center Business, Drops DEC Lin…
Daily SearchCast, March 28, 2007: Search Engines Woo Mobile Handset Makers; Yahoo Goes Unlimited On Email Storage; MSN UK Runs Ad In The Search Box & More!
Google cuts a deal to bundle mobile apps on LG phones (the preferred phones of Googlemeister Matt Cutts), the latest in the battle by search engines to woo handset makers. Everyone’s going loco for local! Yahoo offers unlimited email storage — hey, wa…
yahoo mail also has imap …
Yes, if you pay for it, you get IMAP and POP from Yahoo.
thanks for confirming!
now to collect my old emails
Buy a domain on Godaddy or whatever, forward (or use pop) all email from myname@myname.com to a gmail account. Then configure Gmail to use myname@myname.com as the default send address.
That way you are using the best invention on the internet with your private and personal email.
Yes, that has been the way to do it prior to Google Apps For Your Domain, which allows you to use your Godaddy-registered domain directly with Gmail. The method you propose will result in the funny myname@gmail.com on behalf of myname@myname.com header.
Very cool. Just one question — why does it need to be the Google apps version of gmail?
I do have my own domain, but I do not want to hand over my mx to google. I prefer having it with my webserver / imap account (oh dreamhost, oh-so-cheap-and-shaky…), etc. Also, having email stuck at 2 gb may actually be an issue at some point
Question 2 would be whether anyone can point me to a good pro/con discussion of handing over my soul to google apps.
some users are able to get free imap (like myself), but it’s quite sporadic…
search for: yahoo mail imap next … you should find out all you need (don’t want a direct link since this post is read quite heavily)
Of course it does not have to be, that’s just my preference, but you can do this with a generic Gmail account.
My soul … oh, well, we all know since Scott McNealy’s famous declaration that our privacy has long gone anyway …
This method works great, thanks. Is there a way to filter duplicate copies of the same messages? I merged a few archives and apparently there are some overlaps. How to remove the duplicates, any ideas?
No idea, unfortunately. I’m sure I have duplicates, triplicates and worse… but the old stuff is there so I can find it by search, not for browsing, and since the dupes all come up in the same thread, they don’t really bother me.
I wanted a complete searchable archive, and rather err on the side of dupes than missing something.
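That said, if the duplicates come from merging overlapping local archives, they could in principle be dropped before the upload rather than inside Gmail. A minimal sketch, assuming the merged archive is an mbox file and that true duplicates share the same Message-ID header; the function name dedupe_mbox and the file names are placeholders, and I have not run this against my own archive.

```python
import mailbox


def dedupe_mbox(src_path, dst_path):
    """Copy src mbox to dst, keeping only the first message per Message-ID.

    Messages without a Message-ID header are always kept. If dst_path
    already exists, messages are appended to it. Returns (kept, dropped).
    """
    seen = set()
    kept = dropped = 0
    src = mailbox.mbox(src_path)
    dst = mailbox.mbox(dst_path)
    try:
        for msg in src:
            msg_id = str(msg["Message-ID"] or "").strip()
            if msg_id and msg_id in seen:
                dropped += 1  # an earlier copy was already written
                continue
            if msg_id:
                seen.add(msg_id)
            dst.add(msg)
            kept += 1
    finally:
        dst.flush()
        dst.close()
        src.close()
    return kept, dropped


# Hypothetical usage:
# kept, dropped = dedupe_mbox("merged.mbox", "merged-clean.mbox")
```

The original file is left untouched, so you can compare the two counts before deciding which one to upload.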
I feel the same way. As long as I did not miss any messages having multiple copies is not so bad. But being the above average (!) organized one (Monk?!) when I go to bed at night those duplicates keep me up all night.
I’ll probably hold off until Gmail finally offers IMAP: That would be the best of both worlds, surely, access mail over the web on the go and on my mail client at home.
Today GMail is not allowing pop download from other gmail accounts. So, I guess in your solution both the temp and the target accounts can not be from google.
Actually it does. That was my whole point: originally the source had to be non-gmail, and Google appears to have lifted that restriction. But they don’t make big announcements and don’t make the changes available to all accounts at once. I had a lot of feedback confirming that it works, and few users reported it did not – like you. So my guess is that it’s just a matter of time when you will also be able to fetch gmail. You probably have to check it from time to time – there won’t be any notice from Google.
Is there any way I can move my thunderbird “sent” folder into the gmail sent folder?
You actually don’t have to modify your MX records; just verify domain ownership, and then set up a forwarder in your domain email to point to [username]@[yourdomain.com].test-google-a.com
Email will then forward to your google apps mail account, and your ’sent’ address will be from your domain;
http://www.google.com/support/a/bin/answer.py?answer=47940&topic=9202
This is great, and it should work in theory. But I’m having trouble in practice: the problem seems to be with Gmail’s fetcher. It ate all the emails from my imap/pop account (16000 messages) but only 4000 conversations showed up in Gmail. I know this isn’t just a grouping thing because there were no conversations after around 2003. Anyone have a similar problem? I’m trying again in batches of 2000, but it looks like the first fetch of 200 lost about 3 emails (167 conversations in inbox, and I counted 20 emails accounted for by grouping!) The second fetch says it got 200 emails, and they disappeared from pop, but they have yet to show up in my inbox. Where did they go? I think there’s some delay in processing because emails slowly and inexplicably materialize after I’ve deleted all messages. Maybe I’ll try again, but turn off “delete from server”. Kinda annoying… living up to the beta name. Suggestions?
I did not experience losses. I wonder if it may have to do with read/unread setting in the source account?
Could you have a filter in Gmail (target account) that archives it immediately? Tip: find an email from the source account that you think is “missing”. Copy a fairly unique text string (like a long sentence) from the source, paste it into the search box in the target Gmail account and see if it’s there.
If you’re re-trying Fetcher from the same account, then I know what’s happening – Gmail is smarter than I thought, will not fetch the same email from the same source account again. It includes it in the 200 count though.
I think this must have been another one of google’s infamous “limited rollout” features, because I just went to the trouble of uploading a ton of email to a second gmail account, only to discover that attempting to add it to mail fetcher results in an error to the effect of “cannot fetch from gmail domain.” So I guess I’ll have to wait until I become one of the privileged few.
Thanks for the suggestions. Glad to hear it worked for you (and considering the size of your mailbox, I can infer that is not the problem). It’s not an archive filter, and I have looked for the emails in “all mail”.
Doing more research, I found that others have had similar problems (http://groups.google.com/group/Gmail-POP-and-Forwarding/browse_thread/thread/df5a4cc0e6f1e66b/16efd2132caab014#16efd2132caab014)
There is definitely a delay in processing between the “fetch” and appearing in the inbox. Apparently this can range from minutes to hours. But at least one other person has lost emails (days go by without them appearing).
Interesting point about re-fetching from the same account. I have removed and re-added the account, but perhaps google still remembers things. I can try renaming the account. Anyways, I’m not expecting you to have all the answers here, mostly posting so that if anyone else has trouble, they know they are not alone and know to look at the google groups to try and troubleshoot.
I really don’t know if the “dedupe” (undupe?) feature is related to the source account specifically, or just matching the exact info in message headers.
Here’s how I discovered it: since Fetcher does not recognize / import labels, I first looked at labels at the source account, then set up filters in the target account with the right criteria to re-create those labels on import. There was one label in the source account with less than 100 conversations that I simply could not find logical criteria for – this label was manually assigned. So I thought when everything else is done, I delete everything but this one label in the source, run Mail Fetcher again, and assign everything in this run to the one label I need.
To my surprise, I saw the fetch run, 200 processed, in the source account everything moved from inbox to archive, yet only 2-3 new mails showed up in the target account. That’s when I started to suspect Gmail may be checking for duplicates. Then I did the text string search I described in the previous comment, and voila! all the “missing” email was already sitting in the target account, I just had no way of locating them. (There are now 26K conversations in the account.)
Then, with everything fetched I test-deleted 2 emails in the target account, reset the “pop everything” setting in the destination, re-ran Fetcher, and yes, it picked up the deleted 2 emails, and only those again.
Thank you so much. This is so awesome as I’ve just bought a new computer and have decided that I will be switching to Gmail permanently to avoid the mess of exporting/importing e-mail every time I get a new computer.
Since I don’t have easy access to an IMAP server I used the option of exporting to Thunderbird and uploading to the temporary Gmail archive. One question if I may; what do you do with Sent mail? You upload it to the temporary Gmail and grab it using Mail Fetcher? Does the mail recognize that it’s Sent mail and drop it into the appropriate folder automagically? Thanks.
Not all my Email seems to move over using this method. I get maybe 1/8 of them and they seem to be from only specific email address like a good friend of mine. Anyone else experiencing this?
has anyone tried using AOL or fastmail.fm mail for the temp storage as they support IMAP?
AOL: http://support.microsoft.com/kb/883081
Fastmail: http://www.fastmail.fm/docs/faqparts/ExternalMail.htm#ServerDetails
Thanks for the great article. I’m wondering about Sent Items. You say to put the Sent Items in the Inbox, too. This I did, and they appeared in the Inbox over on the Gmail account. I was hoping they would show up in Sent Items instead. Any tricks up your sleeve for this problem? I’m not finding a way to move them from the Inbox to Sent Items but perhaps I’m not looking hard enough?
- PeterM
You can’t place email explicitly into the “Sent” folder, but you can set a filter to “skip inbox, archive” all the incoming (fetched) mail, which places it into the “All” folder.
This is quite enough for me, I don’t browse email that’s several years old, I typically search, and the results show up in a threaded conversation, where it’s obvious what’s sent and what’s received. I think with Gmail’s thread concept the notion of inbox/outbox is less important.
That said, I can see some of my fetched /archived email in the “Sent” folder, and not others, and I can’t figure out what made the difference.
Hi Zoli – I can’t tell you how useful this is – thanks. I have opened myself an account on 1and1.com, opened an email account and successfully set up an IMAP folder in Outlook 2K, and I am already transferring emails from O2K to my 1and1 email account online. However, I can’t add the account in Gmail, and you seem to say something about this: “hence the need for Mail Fetcher to a second account”. I’m confused – I thought the point was to (1) send all email from my local PC to an online IMAP account; and (2) tell Gmail to retrieve emails from that account. I can’t get the latter to work… Am I missing something obvious?? Many thanks…
You’re close: but Gmail needs to access the same 1and1 account not as IMAP but POP. pop.1and1.com and other account settings the same as what you have in Outlook now.
I have successfully uploaded 5000+ emails to my 1and1 account using the IMAP folder suggestion from Outlook 2K, but when I try to add the 1and1 email account using pop.1and1.com, I get the following error – see screenshot at http://gov20.info/files/1and1.JPG
I’ve used the same settings, so can’t imagine what’s wrong. The only difference is that I used my own domain at 1and1, not the long one they assign. Even then it created one with the xxxxx.onlinehome.us format and it calls it a subdomain, which I’ve never touched.
Since you are obviously using Outlook, have you tried setting up POP access to this account from Outlook? The Gmail setting should be the same.
If anyone else has ideas, feel free to jump in!
Yep… I’m continuing this thread into the following Google Forum post – thanks!
Oh, yeah, somewhere on their site 1and1 mentions not to use SSL.
I guess you’re almost OK then?
I have an old gmail account a@gmail.com.
I’m trying to move every mail to b@gmail.com.
I’ve enabled pop access on a@gmail.com, and set up the mail fetcher for b@gmail.com.
It copies about 100-150 emails, then fails with messages such as:
Error: Server is temporarily unavailable.
Server returned error “[SYS/TEMP] Temporary system problem. Please try again later.”
This error message continues and eventually I get a “Sector 4 Lockdown” for a@gmail.com and can’t access the account.
This has happened twice so far, and I’ve only been able to transfer 330 mails in 4 days, with lockdown happening twice.
I don’t think mail fetcher has been completely enabled for transferring between gmail accounts in all cases.
I also had server unavailable messages in the process, although only for hours. Don’t worry about it, just leave the setup as it is, it will continue fetching after a while. The “normal” rate of transfer is 200 per run, with a run every 1 to 1 1/2 hours.