Opened 3 months ago

Closed 4 weeks ago

#1557 closed enhancement (fixed)

stddev / var condensers

Reported by: dmisev
Owned by: mtarigradschi
Priority: major
Milestone: 9.5
Component: qlparser
Version: development
Keywords:
Cc: pbaumann, bbell, vmerticariu
Complexity: Medium

Description (last modified by dmisev)

SQL and various array DBMSs support standard deviation and variance aggregation. Quoting from the SQL standard:

  • If VAR_POP is specified, then the population variance of <value expression> evaluated for each row that qualifies, defined as the sum of squares of the difference of <value expression> from the mean of <value expression>, divided by the number of rows that qualify.
  • If VAR_SAMP is specified, then the sample variance of <value expression> evaluated for each row that qualifies, defined as the sum of squares of the difference of <value expression> from the mean of <value expression>, divided by the number of rows that qualify minus 1 (one).
  • If STDDEV_POP is specified, then the population standard deviation of <value expression> evaluated for each row that qualifies, defined as the square root of the population variance.
  • If STDDEV_SAMP is specified, then the sample standard deviation of <value expression> evaluated for each row that qualifies, defined as the square root of the sample variance.
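The four definitions above can be written down directly as a two-pass computation (one pass for the mean, one for the squared differences). This is only a minimal sketch in Python to pin down the arithmetic; the function names mirror the SQL ones, and returning None for too few values is an assumption standing in for SQL NULL, not something this ticket specifies.

```python
import math

def var_pop(values):
    """Population variance: sum of squared differences from the mean, divided by n."""
    n = len(values)
    if n == 0:
        return None  # assumption: no qualifying values gives no result
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def var_samp(values):
    """Sample variance: same sum of squares, divided by n - 1."""
    n = len(values)
    if n < 2:
        return None  # assumption: n - 1 would be zero, so no result
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def stddev_pop(values):
    """Population standard deviation: square root of the population variance."""
    v = var_pop(values)
    return None if v is None else math.sqrt(v)

def stddev_samp(values):
    """Sample standard deviation: square root of the sample variance."""
    v = var_samp(values)
    return None if v is None else math.sqrt(v)

cells = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, sum of squared differences 32
print(var_pop(cells), var_samp(cells), stddev_pop(cells), stddev_samp(cells))
```

For the sample cells the population variance is 32 / 8 and the sample variance is 32 / 7, so the only difference between the _POP and _SAMP variants is the divisor.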

So we have these functions:

  • VAR_POP(generalExp)
  • VAR_SAMP(generalExp)
  • STDDEV_POP(generalExp)
  • STDDEV_SAMP(generalExp)

Change History (4)

comment:1 Changed 3 months ago by dmisev

  • Description modified (diff)

comment:2 Changed 2 months ago by bbell

Found a one-pass algorithm for computing the standard deviation:

https://www.strchr.com/standard_deviation_in_one_pass

Does anyone know of a better algorithm than this?
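For reference, one widely used one-pass method is Welford's online algorithm, which updates a running mean and a running sum of squared differences per value. The sketch below is in Python for brevity and is not necessarily the method described at the link; the class and method names are made up for illustration.

```python
class RunningVariance:
    """Welford's online algorithm: a single pass, no stored values."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared differences from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the old and the updated mean, which keeps m2 non-negative.
        self.m2 += delta * (x - self.mean)

    def var_pop(self):
        return self.m2 / self.n if self.n > 0 else None

    def var_samp(self):
        return self.m2 / (self.n - 1) if self.n > 1 else None

acc = RunningVariance()
for cell in [2, 4, 4, 4, 5, 5, 7, 9]:
    acc.add(cell)
print(acc.var_pop(), acc.var_samp())
```

Compared with accumulating the sum and the sum of squares and subtracting at the end, this form avoids subtracting two large, nearly equal numbers when the values are large relative to their spread.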

comment:3 Changed 5 weeks ago by dmisev

  • Description modified (diff)
  • Owner set to mtarigradschi
  • Status changed from new to assigned

I'd say let's go ahead and implement support for these functions. They are really standard stuff and straightforward to implement, so it's pretty low-hanging fruit.

You could start by cloning QtAddCells, for example, and adapting it to compute stddev/var (I'm not sure whether it can be done with the usual condenser implementation in ops.cc, though). Brennan is quite familiar with the tile-based evaluation in qlparser and can help with questions on this.
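One way to see why a plain per-cell fold may not be enough: variance needs a small state (count, mean, sum of squared differences) rather than a single running value, but such states can be combined pairwise across tiles. The Python sketch below assumes each tile is reduced on its own and the partial results are merged afterwards, using the standard pairwise combination formula; it is purely illustrative, the function names are hypothetical, and it does not reflect the actual QtAddCells or ops.cc interfaces.

```python
from functools import reduce

def tile_state(cells):
    """Reduce one tile to (count, mean, sum of squared differences)."""
    n = len(cells)
    if n == 0:
        return (0, 0.0, 0.0)
    mean = sum(cells) / n
    m2 = sum((x - mean) ** 2 for x in cells)
    return (n, mean, m2)

def merge(a, b):
    """Combine the partial states of two disjoint sets of cells."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return (0, 0.0, 0.0)
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    # The last term accounts for the distance between the two tile means.
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (n, mean, m2)

def condense(tiles):
    """Fold all tile states into one state for the whole array."""
    return reduce(merge, (tile_state(t) for t in tiles), (0, 0.0, 0.0))

n, mean, m2 = condense([[2, 4, 4], [4, 5, 5, 7, 9]])
print("var_pop:", m2 / n, "var_samp:", m2 / (n - 1))
```

Because merge is associative up to rounding, the tiles can be processed in any order, and the final division by n or n - 1 (plus an optional square root) is applied once at the end.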

comment:4 Changed 4 weeks ago by dmisev

  • Resolution set to fixed
  • Status changed from assigned to closed