Why the Library of Congress is Archiving Your Tweets

Deal Between Twitter and Government


Watch what you tweet. The Library of Congress is archiving everything on Twitter.

In addition to its millions of books, manuscripts, films and photographs, the Library of Congress holds a record of every public post on Twitter, a popular microblogging service, dating back to 2006. That amounts to about 55 million tweets every day.

It had stored more than 170 billion tweets in chronological order as of January 2013, including those posted by President Barack Obama on his social media pages.

Why the Library of Congress is Archiving Tweets

The Library of Congress said it undertook the effort to archive tweets because social media, including Twitter, were rapidly replacing earlier forms of communication.

"Twitter is a new kind of collection for the Library of Congress but an important one to its mission," wrote Gayle Osterberg, the director of communications for the Library of Congress, in January 2013.

"As society turns to social media as a primary method of communication and creative expression, social media is supplementing, and in some cases supplanting, letters, journals, serial publications and other sources routinely collected by research libraries."

The Library of Congress said its Twitter archive would allow researchers to glean "a fuller picture of today’s cultural norms, dialogue, trends and events to inform scholarship, the legislative process, new works of authorship, education and other purposes" in the distant future.

Usefulness of the Library of Congress Twitter Archive

The Library of Congress said its Twitter archive would allow researchers to learn about historic events, for example, by reading tweets posted during terrorist attacks across the world. A potential topic of research might be the accuracy of tweets posted soon after such attacks.

Another example cited by the Library of Congress is the ability of researchers to study the language charities use to solicit contributions after natural disasters, to see whether it was effective.

History of Library of Congress Deal With Twitter

Twitter agreed in April 2010 to provide the Library of Congress with an archive of public tweets dating back to the social media tool's inception in 2006.

"Over the years, tweets have become part of significant global events around the world—from historic elections to devastating disasters," Twitter stated in 2010. "It is our pleasure to donate access to the entire archive of public Tweets to the Library of Congress for preservation and research. It's very exciting that tweets are becoming part of history."

How You Can Access the Twitter Archive

Even though the Library of Congress is a public institution, its agreement with Twitter provided access only to "bona fide" researchers, who must promise not to use the information commercially or to redistribute tweets.

The Library of Congress has said it cannot provide the collection on its website for the public to download.

"The Twitter collection is not only very large, it also is expanding daily, and at a rapidly increasing velocity," the Library of Congress said. "The variety of tweets is also high, considering distinctions between original tweets, re-tweets using the Twitter software, re-tweets that are manually designated as such, tweets with embedded links or pictures and other varieties."

Technical Limitations of the Twitter Archive

Years into the project, the Library of Congress still could not grant researchers access to the Twitter archive because of the amount of data and the technical limitations on searching through it.

The Library of Congress said executing a single search could take as long as 24 hours. "This is an inadequate situation in which to begin offering access to researchers, as it so severely limits the number of possible searches," the Library said.

The Library said that enabling rapid searches of the archive would require an "extensive infrastructure of hundreds if not thousands of servers. This is cost-prohibitive and impractical for a public institution."

Your tweet, in other words, is being saved and stored for future generations to inspect and learn from. But whether it'll ever be found amid the millions of books, manuscripts, films and photographs being catalogued in the Library of Congress is another story.

©2014 About.com. All rights reserved.