StackOverflow Question Demographics Query
From SQLServerPedia
|
See Also: Main Page - Business Intelligence - Data Mining the StackOverflow Database - Interesting StackOverflow Database Queries
How The Query WorksThis query joins Posts to Users to find information about the kinds of people who ask questions. As of 6/2009, the results look like this:
The T-SQL Query
SELECT COALESCE(COUNT(DISTINCT p.ID),0) AS Questions
,COALESCE(AVG(p.Score * 1.00),0) AS AvgScore
,COALESCE(AVG(p.ViewCount * 1.00),0) AS AvgViewCount
,COALESCE(COUNT(DISTINCT p.OwnerUserId),0) AS DistinctQuestioners
,COALESCE(AVG(p.AnswerCount * 1.00),0) AS AvgAnswerCount
,COALESCE(AVG(p.CommentCount * 1.00),0) AS AvgCommentCount
,COALESCE(AVG(p.FavoriteCount * 1.00),0) AS AvgFavoriteCount
,COALESCE(AVG(u.Reputation * 1.00),0) AS AvgQuestionerReputation
,COALESCE(AVG(u.Age * 1.00),0) AS AvgQuestionerAge
,COALESCE(AVG(u.UpVotes * 1.00),0) AS AvgQuestionerUpVotes
,COALESCE(AVG(u.DownVotes * 1.00),0) AS AvgQuestionerDownVotes
FROM dbo.Posts p
INNER JOIN dbo.Users u
ON p.OwnerUserId = u.Id
WHERE p.PostTypeId = 1 AND ClosedDate IS NULL
Result Screenshot: Variation: Grouping Results by Tags
SELECT t.TagName
,COALESCE(COUNT(DISTINCT p.ID),0) AS Questions
,COALESCE(AVG(p.Score * 1.00),0) AS AvgScore
,COALESCE(AVG(p.ViewCount * 1.00),0) AS AvgViewCount
,COALESCE(COUNT(DISTINCT p.OwnerUserId),0) AS DistinctQuestioners
,COALESCE(AVG(p.AnswerCount * 1.00),0) AS AvgAnswerCount
,COALESCE(AVG(p.CommentCount * 1.00),0) AS AvgCommentCount
,COALESCE(AVG(p.FavoriteCount * 1.00),0) AS AvgFavoriteCount
,COALESCE(AVG(u.Reputation * 1.00),0) AS AvgQuestionerReputation
,COALESCE(AVG(u.Age * 1.00),0) AS AvgQuestionerAge
,COALESCE(AVG(u.UpVotes * 1.00),0) AS AvgQuestionerUpVotes
,COALESCE(AVG(u.DownVotes * 1.00),0) AS AvgQuestionerDownVotes
FROM dbo.Posts p
INNER JOIN dbo.Users u
ON p.OwnerUserId = u.Id
INNER JOIN dbo.PostTags pt
ON p.Id = pt.PostId
INNER JOIN dbo.Tags t
ON pt.TagId = t.Id
WHERE p.PostTypeId = 1 AND ClosedDate IS NULL
GROUP BY t.TagName
ORDER BY t.TagName
Result Screenshot: Variation: Grouping Results by Month
SELECT YEAR(p.CreationDate) AS PostYear
,MONTH(p.CreationDate) AS PostMonth
,COALESCE(COUNT(DISTINCT p.ID),0) AS Questions
,COALESCE(AVG(p.Score * 1.00),0) AS AvgScore
,COALESCE(AVG(p.ViewCount * 1.00),0) AS AvgViewCount
,COALESCE(COUNT(DISTINCT p.OwnerUserId),0) AS DistinctQuestioners
,COALESCE(AVG(p.AnswerCount * 1.00),0) AS AvgAnswerCount
,COALESCE(AVG(p.CommentCount * 1.00),0) AS AvgCommentCount
,COALESCE(AVG(p.FavoriteCount * 1.00),0) AS AvgFavoriteCount
,COALESCE(AVG(u.Reputation * 1.00),0) AS AvgQuestionerReputation
,COALESCE(AVG(u.Age * 1.00),0) AS AvgQuestionerAge
,COALESCE(AVG(u.UpVotes * 1.00),0) AS AvgQuestionerUpVotes
,COALESCE(AVG(u.DownVotes * 1.00),0) AS AvgQuestionerDownVotes
FROM dbo.Posts p
INNER JOIN dbo.Users u
ON p.OwnerUserId = u.Id
INNER JOIN dbo.PostTags pt
ON p.Id = pt.PostId
WHERE p.PostTypeId = 1 AND ClosedDate IS NULL
GROUP BY YEAR(p.CreationDate), MONTH(p.CreationDate)
ORDER BY YEAR(p.CreationDate), MONTH(p.CreationDate)
Result Screenshot: Related Articles About the StackOverflow Database
|


