RSS (Really Simple Syndication, a name that also covers Rich Site Summary and RDF Site Summary) is a web syndication format that many blogs and news websites use to distribute information; it saves readers from visiting several sites repeatedly to check for new content. Many RSS aggregators are currently available to the public, but none of them archives anything beyond the RSS metadata. Because the linked articles may move or be removed, anyone who wants to be sure of reaching them later has to archive them personally; it is also far easier to link collected articles when they have been archived. The purpose of this project is to create an RSS aggregator that archives the text of the articles linked from RSS feeds in a linkable, searchable database and, if all goes well, to add some data mining capability as well.

Background

RSS stands for Really Simple Syndication, a syndication format often used by weblogs and news sites. Technically, RSS is an XML-based standard that encompasses Rich Site Summary (RSS 0.91), RDF Site Summary (RSS 0.9 and 1.0), and Really Simple Syndication (RSS 2.0). It lets people gather new information by using an RSS aggregator (or "feed reader") to poll RSS-enabled sites for updates, so the user does not have to check each site manually. RSS aggregators are often extensions of browsers or email programs, or standalone programs; alternatively, they can be web-based, so that users can view their "feeds" from any computer with Web access.

Data mining is the search for information based on patterns present in large amounts of data.

//more will be here.

Process

//there will be more here.

References

"RSS (protocol)." Wikipedia. 8 Jan. 2005. 11 Jan. 2005. <http://en.wikipedia.org/wiki/RSS_%28protocol%29>

"Data mining." Wikipedia. 7 Jan. 2005. 12 Jan. 2005.
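
The polling step described in the Background section can be illustrated with a minimal sketch. It is not the project's implementation: the sample feed, the function name `new_items`, and the `seen_links` set are all hypothetical, and the code assumes a plain RSS 2.0 feed (RDF-based RSS 1.0 uses XML namespaces and would need different element lookups). The idea is only to show how an aggregator might parse a feed and report the entries it has not seen before, which are the articles an archiving aggregator would then fetch and store.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real aggregator would download this text
# from each subscribed site's feed URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Sample feed for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/stories/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/stories/2</link>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for items whose link is not yet in seen_links.

    seen_links is updated in place, so polling the same feed again
    reports only entries added since the previous poll.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        title = (item.findtext("title") or "").strip()
        # Items without a link cannot be archived, so they are skipped.
        if link and link not in seen_links:
            seen_links.add(link)
            fresh.append((title, link))
    return fresh


seen = set()
first_poll = new_items(SAMPLE_FEED, seen)   # both items are new
second_poll = new_items(SAMPLE_FEED, seen)  # nothing new on a repeat poll
```

Keeping the set of seen links between polls is what would let such an aggregator archive each article once rather than on every poll; a persistent store would replace the in-memory set in practice.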