Data Mining the StackOverflow Database
StackOverflow is a question-and-answer site for programmers. Anyone can post questions or answers, and each user's reputation rises or falls with the votes the community gives their posts. User contributions are licensed under Creative Commons, so the community is free to share and reuse the data posted to the site, with attribution.
You can download the StackOverflow database in XML format via BitTorrent. It's a great dataset to use when learning data mining because:
Here's how to get started:
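As a first taste of what working with the XML looks like, here is a minimal Python sketch that streams a file and counts rows by one attribute. It is only an illustration under stated assumptions: the `<row>` element and the `PostTypeId` and `Score` attribute names are assumed rather than taken from this article, the `count_by_attribute` helper is hypothetical, and the inline sample stands in for a real downloaded file.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny stand-in for a dump file. The element and attribute names
# (<row>, PostTypeId, Score) are assumptions about the dump's layout.
SAMPLE = b"""<posts>
  <row Id="1" PostTypeId="1" Score="10" />
  <row Id="2" PostTypeId="2" Score="3" />
  <row Id="3" PostTypeId="2" Score="-1" />
</posts>"""


def count_by_attribute(source, attribute, tag="row"):
    """Stream an XML source and count elements by one attribute's value."""
    counts = Counter()
    # iterparse yields each element once its closing tag has been read,
    # so the whole file never has to be loaded into memory at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            counts[elem.get(attribute)] += 1
            elem.clear()  # drop the element's attributes and children once counted
    return counts


counts = count_by_attribute(io.BytesIO(SAMPLE), "PostTypeId")
print(counts)
```

To try this on a real download, pass the path of an extracted XML file instead of the `io.BytesIO` sample. Streaming matters here because the extracted files can be far larger than available memory.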