sql - Get top three most common values from every column in a table -
i'm trying write query produce small sample of data each column of table, in sample made of top 3 common values. particular problem part of bigger task, write scripts can characterize database , tables, data integrity, , survey common values in table on per-column basis. think of automated "analysis" of table.
on single column basis, calculating frequency of values , sorting frequency. if had column called "color" , colors in it, , happened color "blue" in rows, top 1 occurring value "blue". in sql easy calculate.
however, i'm not sure how on multiple columns.
currently, when calculation on columns of table, perform following type of query:
use database; declare @t nvarchar(max) set @t = n'select ' select @t = @t + 'count(distinct cast(' + c.name + ' varchar(max))) "' + c.name + '",' sys.columns c c.object_id = object_id('table'); set @t = substring(@t, 1, len(@t) - 1) + ' table;' exec sp_executesql @t
however, not entirely clear me how here.
(sidenote:columns of type text, ntext, , image, since cause errors while counting distinct values, i'm less concerned solving that)
but problem of getting top 3 frequent values per column has got me absolutely stumped.
ideally, i'd end this:
col1 col2 col3 col4 col5 --------------------------------------------------------------------- 1,2,3 red,blue,green 29,17,0 c,d,j nevada,california,utah
i hacked together, seems work:
i cant think should using rank().
use <db>; declare @query nvarchar(max) declare @column nvarchar(max) declare @table nvarchar(max) declare @i int = 1 declare @maxi int = 10 declare @target nvarchar(max) = <table> declare @stage table (i int identity(1,1), col nvarchar(max), tbl nvarchar(max)) declare @results table (columnname nvarchar(max), columnvalue nvarchar(max), columncount int, tablename nvarchar(max)) insert @stage select c.name, o.name sys.columns c join sys.objects o on o.object_id=c.object_id , o.type = 'u' , c.system_type_id in (select system_type_id sys.types [name] not in ('text','ntext','image')) , o.name @target set @maxi = (select max(i) @stage) while @i <= @maxi begin set @column = (select col @stage = @i) set @table = (select tbl @stage = @i) set @query = n'select ' +''''+@column+''''+' , '+ @column select @query = @query + ', count( ' + @column + ' ) count' + @column + ' , ''' + @table + ''' tablename' select @query = @query + ' ' + @table + ' group ' + @column --select @query insert @results exec sp_executesql @query set @i = @i + 1 end select * @results ; cte ( select *, row_number() on (partition columnname order columncount desc) rn @results ) select * cte rn <=3
Comments
Post a Comment