|
我有一张与客户,用户和收入类似的表格(实际上成千上万条记录):
7 h* r$ A+ g* F, Q7 xCustomer User Revenue
* r0 L! \$ y/ U/ ?$ x001 James 500
, h S2 ?" M! E' v, u2 }8 C+ t002 James 7501 d3 w: y( ]8 N
003 James 450
; p4 {/ c5 ^$ f7 ^( L9 ^004 Sarah 100
5 y$ u9 @! A R$ M" ^2 V8 l! l005 Sarah 5000 h& b8 n( s( ^, @* p/ K
006 Sarah 150
* s# l0 f+ r% A4 E' G; \- \* N$ k9 }007 Sarah 600
. N6 p, n$ L8 A, x( Z008 James 1507 |. |, m+ O, Z- u4 G2 b
009 James 100: d+ x" f" {( r, @0 Z3 E m
我想做的是只回报支出最高的客户,这些客户占用户总收入的80%。
3 q, y9 w( t) [! c/ b$ l要手动执行此操作,我将按詹姆斯的客户的收入对其进行排序,计算出总计的百分比和运行的总计百分比,然后仅返回记录,直到运行的总计达到80%:8 s' P5 |! e6 l/ {
Customer User Revenue % of total Running Total %( r1 N# g+ o+ |
002 James 750 0.38 0.38
0 B) E B9 ^: ?( }- f8 e: x001 James 500 0.26 0.64 3 @' L3 I6 J" S' M
003 James 450 0.23 0.87 我已经尝试过使用CTE,但到目前为止还是空白。有没有办法通过单个查询而不是在Excel工作表中手动执行此操作?# I4 s9 L* e& z. v6 A
6 M8 E1 P5 `+ ]8 f; |5 V7 p$ B, V
解决方案:
* l0 W6 C# H; Z l4 [ H
9 f t2 x2 e( Z7 c: n& t
7 Z5 O6 o+ i) c3 ?
2 _2 l: @3 q L6 k; T SQL Server 2012+ 只要
q: I8 }, X5 ~' k1 M& P* x; b您可以使用windowed SUM:, c( a, p4 b: G& L2 l# i" c
WITH cte AS1 D' ^7 x- j3 A3 P
(1 Q* F Y$ R% b
SELECT *,3 {& K4 |6 f) a9 b
1.0 * Revenue/SUM(Revenue) OVER(PARTITION BY [User]) AS percentile,
( _5 Z1 w! ]- N7 j% H" `$ x- R7 h0 B 1.0 * SUM(Revenue) OVER(PARTITION BY [User] ORDER BY [Revenue] DESC)
+ G- m3 p' |( r& | /SUM(Revenue) OVER(PARTITION BY [User]) AS running_percentile f# R- O' |1 O' Z4 w$ u
FROM tab
+ a$ d7 m; l' \- [9 e)
/ `! p; \+ c; b q/ P. RSELECT *) h* u ]- \* S- `; g2 d1 _
FROM cte
) s3 ~. F- }4 I- k; f( bWHERE running_percentile **[LiveDemo](http://rextester.com/IDYCVJ26861)**
5 E( K7 G' t7 R. H
$ W- ]2 y! g( m) i2 u: bSQL Server 2008:; P. l. v6 e, d `
WITH cte AS% x9 ]; ?3 ?/ T1 A! H; ~6 c
(" t) ]! f, d2 p7 E" T+ z
SELECT *, ROW_NUMBER() OVER(PARTITION BY [User] ORDER BY Revenue DESC) AS rn
3 o8 t% R7 j# d' K FROM t 1 @: n8 e0 P+ k( q
), cte2 AS* T' w, o! }* N* F% w
(
- C4 z# f9 q: l( H: z6 e" L* @ SELECT c.Customer, c.[User], c.[Revenue]
* Z: ?) ?# E7 M ,percentile = 1.0 * Revenue / NULLIF(c3.s,0)
5 U7 x$ G" Y; I- v8 C- n h ,running_percentile = 1.0 * c2.s / NULLIF(c3.s,0)
3 I: n6 e4 l5 R# g FROM cte c) v1 R: {8 T. r' H G* T
CROSS APPLY
5 ^2 G/ d2 q( a% s& r (SELECT SUM(Revenue) AS s( j8 }1 d+ }( ]( g
FROM cte c2$ ]& n9 o8 k6 p2 S
WHERE c.[User] = c2.[User]/ ]/ z0 W- [9 p$ E
AND c2.rn **[LiveDemo2](http://rextester.com/TUQ82944)**2 O4 ^* w3 I, v' ^4 [
输出:
: u2 o2 `7 [( ~( d+ B╔══════════╦═══════╦═════════╦════════════════╦════════════════════╗
8 O" v# t$ ] b' R# M. o" U║ Customer ║ User ║ Revenue ║ percentile ║ running_percentile ║# m" U$ d7 w3 N) V/ j1 M
╠══════════╬═══════╬═════════╬════════════════╬════════════════════╣4 j: W# C# K9 z2 O1 n
║ 2 ║ James ║ 750 ║ 0,384615384615 ║ 0,384615384615 ║7 c" n- j3 f! s
║ 1 ║ James ║ 500 ║ 0,256410256410 ║ 0,641025641025 ║. s8 Z# q9 p, W4 b& H# _4 u
║ 7 ║ Sarah ║ 600 ║ 0,444444444444 ║ 0,444444444444 ║
- O/ k' R) f/ Q& g. Q╚══════════╩═══════╩═════════╩════════════════╩════════════════════╝$ W' G; V( A4 k3 X( y( J
( S7 H7 [) i" H9 N& l编辑2:
. J5 C$ Q8 u, b/ Z- ?* K看上去差不多,唯一的麻烦是缺少最后一行,詹姆斯的第三行让他超过0.80,但需要包括在内。$ f8 a5 Y. f' e& y/ K7 I
WITH cte AS
2 m( }/ S: P) b4 T(
) Y1 i* g0 m* D/ w9 j SELECT *, ROW_NUMBER() OVER(PARTITION BY [User] ORDER BY Revenue DESC) AS rn
. l( n# p- j! u9 y FROM t
1 Y+ l: T4 Z3 Q) v# D4 ~), cte2 AS8 u" ?: Y# i% H. r" [/ L
(
8 Q1 g, s: ?0 x3 D! W/ d; G i# p$ D SELECT c.Customer, c.[User], c.[Revenue]- I4 i7 N5 H$ C \6 B* m
,percentile = 1.0 * Revenue / NULLIF(c3.s,0)
/ I$ X" y9 l' G0 P& c; r4 z4 [ ,running_percentile = 1.0 * c2.s / NULLIF(c3.s,0)
4 p) ^5 G. g. g# _) S5 |* q0 f FROM cte c ^5 t5 P( c8 [( J W E8 P# L
CROSS APPLY" A S3 P5 P( q1 D9 P' P+ v
(SELECT SUM(Revenue) AS s
* }3 }" K, _' B7 X FROM cte c2) Q+ U5 `$ B) _6 V; |, f' x$ T
WHERE c.[User] = c2.[User]
9 T0 B* v7 c m) c3 L AND c2.rn = 0.8
4 ?: A$ {. r( G AND cte2.[User] = a.[User]) AS s. ^) r* V( ?, j; f( L: [
WHERE a.running_percentile 输出:, k$ e- L( n1 Y8 Y- r6 {& w8 H
╔══════════╦═══════╦═════════╦════════════════╦════════════════════╗
, K" b3 a9 t% _( D2 k2 L( s2 T║ Customer ║ User ║ Revenue ║ percentile ║ running_percentile ║
' Y. H/ P9 U3 j. }! o$ l6 ]╠══════════╬═══════╬═════════╬════════════════╬════════════════════╣
' @. j9 n! n* d; w: c║ 2 ║ James ║ 750 ║ 0,384615384615 ║ 0,384615384615 ║1 H, d8 r2 S8 _. D# v
║ 1 ║ James ║ 500 ║ 0,256410256410 ║ 0,641025641025 ║
8 W9 m2 g+ T+ q. L" D* ^ j# r║ 3 ║ James ║ 450 ║ 0,230769230769 ║ 0,871794871794 ║
# v' _' F( G" q Z3 E% A! h! }2 g║ 7 ║ Sarah ║ 600 ║ 0,444444444444 ║ 0,444444444444 ║, T7 L( C1 g# I* d
║ 5 ║ Sarah ║ 500 ║ 0,370370370370 ║ 0,814814814814 ║
! Y8 S# a- v Z% {3 Q. d9 T! _9 [# O$ G8 [╚══════════╩═══════╩═════════╩════════════════╩════════════════════╝" ?, S/ I( \- j3 Q6 B) m2 x
$ o* G+ o- m! Z9 C$ p* V+ Z6 |
看起来很完美,已翻译成我的大桌子并返回了我需要的东西,花了5分钟时间完成它,但仍然无法跟随你所做的事情!6 {" a' z3 Q- l8 E9 E. |+ ^
SQL Server 2008不支持OVER()子句中的所有内容,但支持ROW_NUMBER。
3 r8 O9 s& ~ d j+ M首先,CTE只是计算组内的位置:6 q& e6 j- w5 r# |% M5 g* @% c
╔═══════════╦════════╦══════════╦════╗
: }+ m. T5 q; k5 `║ Customer ║ User ║ Revenue ║ rn ║2 }/ D) G, Q+ A- d
╠═══════════╬════════╬══════════╬════╣
7 C8 k0 X. Q# n w8 V║ 2 ║ James ║ 750 ║ 1 ║
, k$ l4 ^5 w: p G4 G: r( i4 X; T║ 1 ║ James ║ 500 ║ 2 ║9 h3 ~/ D0 I6 i: K- o, }8 W
║ 3 ║ James ║ 450 ║ 3 ║
/ y7 H* T" W- v. t5 I& D║ 8 ║ James ║ 150 ║ 4 ║
, s0 H! j6 m- Q( r! w9 e" b2 _║ 9 ║ James ║ 100 ║ 5 ║
+ w. b! `- ?. [║ 7 ║ Sarah ║ 600 ║ 1 ║; x0 w9 g" [9 M+ z( w: p& Q: j" c
║ 5 ║ Sarah ║ 500 ║ 2 ║! }* L: }$ B6 L3 Y+ b, s2 \
║ 6 ║ Sarah ║ 150 ║ 3 ║
5 Q3 q f* }) B; s; K║ 4 ║ Sarah ║ 100 ║ 4 ║6 {2 {5 s* w! U- P9 U* z3 F
╚═══════════╩════════╩══════════╩════╝
: { g3 O1 w6 S3 }- B/ h& r+ P. `# l- _- T% I+ B7 R, e' T8 H
第二个CTE:
% e: h6 K1 s/ \. L' Cc2 子查询根据以下排名计算运行总计 ROW_NUMBER1 K2 x' y( ]9 N7 g3 v: u
c3 计算每位用户的总和
5 t1 ?" @5 v6 u% t% Z在最终查询中,s子查询找到了running总数超过80%的最低总数。. p' m* Q) H6 {
编辑3:
) o: R! u2 Q1 }- S! M使用ROW_NUMBER实际上是多余的。
1 R2 C5 b% ?. i3 ]/ y% R& D0 nWITH cte AS1 q+ L. y0 ] j n+ |) \% P
(
& B" R: `5 M0 V. r/ G$ _# i SELECT c.Customer, c.[User], c.[Revenue]
/ F/ Q) W( G& i0 O ,percentile = 1.0 * Revenue / NULLIF(c3.s,0)' `5 H8 i! @3 l, l+ M* \
,running_percentile = 1.0 * c2.s / NULLIF(c3.s,0)
0 w+ Z9 q' y! c5 I" F4 t6 V+ Q FROM t c
6 v; N* ]. b- W! z' q" a3 m% `+ Q CROSS APPLY- b2 i6 e$ O1 l- G! F
(SELECT SUM(Revenue) AS s" b4 q' \ n% ?
FROM t c2
+ L" a( l' I2 Q1 c3 @ WHERE c.[User] = c2.[User]
( s% }* w: B; z$ a+ J, a; J AND c2.Revenue >= c.Revenue) c2
4 ~- q6 R3 {4 _8 q8 O. O CROSS APPLY" I/ ?+ k, f) F( J) [( {5 r
(SELECT SUM(Revenue) AS s
; ^0 b- P) g3 y8 X( W FROM t c2. j* I1 x# ?; r" Y7 |$ c9 _
WHERE c.[User] = c2.[User]) AS c33 h4 K; w7 r& d0 f* D5 P, T
) / |0 Y+ r- y, b' {& w" u u
SELECT a.*0 Y2 e2 t( z) D8 w( j8 D
FROM cte a
% Q4 N! v) s) x, \7 t( ~- K7 z1 CCROSS APPLY (SELECT MIN(running_percentile) AS rp9 W# s; I, U- a% b
FROM cte c2: R4 H: m! ~+ j7 u& D3 Y' D* j
WHERE running_percentile >= 0.8, J1 X6 @' i. i n4 r
AND c2.[User] = a.[User]) AS s
7 e. K3 s1 Y T* B6 A$ v. w' AWHERE a.running_percentile <= s.rp
9 B1 }0 j$ h) D6 l" p3 }1 W: }ORDER BY [User], Revenue DESC; |
|