|
I have a table (let's call it data) that contains a set of object IDs, numeric values, and dates. I want to identify the objects whose values have shown a positive trend over the last X minutes (say, one hour).
Sample data:
entity_id | value | date
1234 | 15 | 2014-01-02 11:30:00
5689 | 21 | 2014-01-02 11:31:00
1234 | 16 | 2014-01-02 11:31:00
I tried looking at similar questions, but unfortunately did not find anything that helped…

Solution:

You inspired me to implement linear regression in SQL Server. It can be adapted to MySQL / Oracle / whatever without much trouble. Fitting a least-squares line is the mathematically best way to determine each entity_id's trend over the hour, and the query returns only the entities whose trend is positive.
It implements the formula for B1hat listed here: https://en.wikipedia.org/wiki/Regression_analysis#Linear_regression
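For reference, the slope estimate (B1hat) for a simple linear regression can be written as below. The notation is my own reading rather than part of the original answer: I am assuming x_i is the number of seconds elapsed since the start of the window and y_i is the value column, which matches the aliases x, y, xbar, and ybar used in the query.

```latex
\hat{\beta}_1 \;=\; \frac{\sum_{i} (x_i - \bar{x})\,(y_i - \bar{y})}{\sum_{i} (x_i - \bar{x})^2},
\qquad
\bar{x} = \frac{1}{n}\sum_{i} x_i,
\quad
\bar{y} = \frac{1}{n}\sum_{i} y_i
```

A positive value means the fitted line rises over the window; the sums run over the rows of a single entity_id.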
create table #temp
(
    entity_id int,
    value int,
    [date] datetime
)

insert into #temp (entity_id, value, [date])
values
(1,10,'20140102 07:00:00 AM'),
(1,20,'20140102 07:15:00 AM'),
(1,30,'20140102 07:30:00 AM'),
(2,50,'20140102 07:00:00 AM'),
(2,20,'20140102 07:47:00 AM'),
(3,40,'20140102 07:00:00 AM'),
(3,40,'20140102 07:52:00 AM')

-- Beta is the least-squares slope of value against elapsed seconds, per entity.
-- nullif() avoids a divide-by-zero error when an entity has a single timestamp in the window.
select entity_id, 1.0*sum((x-xbar)*(y-ybar))/nullif(sum((x-xbar)*(x-xbar)),0) as Beta
from
(
    select entity_id,
        -- 1.0* forces a decimal average; avg() over an int column would truncate
        avg(1.0*value) over(partition by entity_id) as ybar,
        value as y,
        avg(1.0*datediff(second,'20140102 07:00:00 AM',[date])) over(partition by entity_id) as xbar,
        datediff(second,'20140102 07:00:00 AM',[date]) as x
    from #temp
    -- The source text is cut off after the second [date] on the next line.
    -- The upper bound and everything after it are a reconstruction, assuming
    -- a one-hour window and the positive-trend filter described above.
    where [date]>='20140102 07:00:00 AM' and [date]<'20140102 08:00:00 AM'
) as calcs
group by entity_id
having 1.0*sum((x-xbar)*(y-ybar))/nullif(sum((x-xbar)*(x-xbar)),0) > 0
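As a sanity check on the slope formula, here is a minimal Python sketch (not part of the original answer) that applies the same calculation to the sample rows inserted above. The function name slopes_by_entity and the one-hour window from 07:00 to 08:00 are my own assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Same sample rows as the insert statement above (entity_id, value, date).
ROWS = [
    (1, 10, "2014-01-02 07:00:00"),
    (1, 20, "2014-01-02 07:15:00"),
    (1, 30, "2014-01-02 07:30:00"),
    (2, 50, "2014-01-02 07:00:00"),
    (2, 20, "2014-01-02 07:47:00"),
    (3, 40, "2014-01-02 07:00:00"),
    (3, 40, "2014-01-02 07:52:00"),
]

# Assumed one-hour window: start inclusive, end exclusive.
START = datetime(2014, 1, 2, 7, 0, 0)
END = datetime(2014, 1, 2, 8, 0, 0)


def slopes_by_entity(rows, start, end):
    """Return {entity_id: least-squares slope of value per second} for rows in [start, end)."""
    groups = defaultdict(list)
    for entity_id, value, stamp in rows:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if start <= ts < end:
            # x is seconds elapsed since the start of the window, y is the value
            groups[entity_id].append(((ts - start).total_seconds(), value))

    slopes = {}
    for entity_id, points in groups.items():
        n = len(points)
        xbar = sum(x for x, _ in points) / n
        ybar = sum(y for _, y in points) / n
        sxx = sum((x - xbar) ** 2 for x, _ in points)
        if sxx == 0:
            # Only one distinct timestamp: the slope is undefined, so skip the entity.
            continue
        sxy = sum((x - xbar) * (y - ybar) for x, y in points)
        slopes[entity_id] = sxy / sxx
    return slopes


if __name__ == "__main__":
    slopes = slopes_by_entity(ROWS, START, END)
    positive = sorted(e for e, beta in slopes.items() if beta > 0)
    print(slopes)
    print("positive trend:", positive)
```

On this data, entity 1 rises by 10 every 900 seconds (slope 1/90 per second), entity 2 falls, and entity 3 is flat, so only entity 1 passes the positive-trend filter.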
|