Opened 4 years ago

Closed 4 years ago

#266 closed defect (fixed)

Petascope use of sdom

Reported by: dmisev
Owned by: pcampalani
Priority: minor
Milestone: 8.4
Component: petascope
Version: 8.3
Keywords:
Cc:
Complexity:

Description

I see this in the logs:

select sdom(c[0:180258,-29295:-27372]) from haiti_vnir AS c

This is an obvious inefficiency: there is no need to call sdom, since the domain is apparently already known once this subset is applied: c[0:180258,-29295:-27372]

Change History (11)

comment:1 Changed 4 years ago by pcampalani

I see.
I can look at this in January, if not extremely urgent.
It is probably mainly a matter of adding a check in the WCS setBounds() function.

comment:2 Changed 4 years ago by dmisev

Yes sure; the offending place is probably AbstractFormatExtension:68

comment:3 Changed 4 years ago by dmisev

But we should definitely optimize this somehow: it more or less doubles evaluation time, e.g. 5s for the sdom and 5s for getting the actual data.

comment:4 Changed 4 years ago by pcampalani

I agree with you.

I believe the sdom request should actually reside in the DbMetadataSource.read() method, where the cellDomain objects are created, whereas setting the bounds (setBounds()) shouldn't actually be needed, since:

  • asterisks are allowed in the rasql queries, in case the subsets in the W*S request apply only to a subset of the coverage's dimensions (e.g. mean_summer_airtemp[0:10,*:*]);
  • rasql does not break in case the pixel bounds are outside the range (e.g. mean_summer_airtemp[-10:10,-10:10]=mean_summer_airtemp[0:10,0:10]);
  • Petascope anyway knows about the grid-domain ranges (now by means of ps_cellDomain, in new implementations by means of sdom).

This way an sdom would still be issued for each request, but we might build a cache of coverage metadata that is checked at every DbMetadataSource.read() and refreshed e.g. when the domain extents (ps_domain currently) change?
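
A minimal sketch of such a metadata cache, to make the idea concrete. Everything here is an assumption for illustration: the class CoverageMetadataCache, the cellDomain() method, and the domainVersion counter (a stand-in for "the domain extents changed") are hypothetical names and do not exist in Petascope. The loader function represents whatever issues the expensive sdom query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: these class and method names are illustrative
// and are not part of Petascope.
final class CoverageMetadataCache {

    /** Cached grid extents and the domain version they were read at. */
    private static final class Entry {
        final String cellDomain;   // e.g. "[0:180258,-29295:-27372]"
        final long domainVersion;  // assumed stand-in for "ps_domain changed"

        Entry(String cellDomain, long domainVersion) {
            this.cellDomain = cellDomain;
            this.domainVersion = domainVersion;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, String> sdomLoader;

    /** The loader runs the expensive sdom query for a coverage name. */
    CoverageMetadataCache(Function<String, String> sdomLoader) {
        this.sdomLoader = sdomLoader;
    }

    /** Returns the cached extents, reloading only if the version has changed. */
    String cellDomain(String coverage, long currentVersion) {
        Entry entry = entries.compute(coverage, (name, cached) ->
            cached != null && cached.domainVersion == currentVersion
                ? cached
                : new Entry(sdomLoader.apply(name), currentVersion));
        return entry.cellDomain;
    }
}
```

With this shape, repeated reads of the same coverage hit the loader once, and a second load happens only after the version passed by the caller changes. How that version would be derived from ps_domain is left open here.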

comment:5 Changed 4 years ago by dmisev

Yeah that's all good, but also we need to push some optimizations to rasdaman as the problem is that the sdom() function requires evaluation of its arguments in order to compute the result.

E.g. sdom(c) alone is instant, but sdom(c[*:*,*:*,..]) requires loading c from the database and that's very inefficient.

comment:6 Changed 4 years ago by dmisev

Ok forget about my talk above, it's a bit wrong :) But your post is valid.

comment:7 Changed 4 years ago by pcampalani

Ah, I see... but if sdom(c) is instant, then the extents of the mdd are stored in some table, right?
Then sdom(c[*:*,...]) should also just look there when it detects that there are "no operations" to be done, as in this case, whereas it must load the mdd in the remaining cases (e.g. sdom(scale(...)))?
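
To illustrate the "just look at the stored extents" idea, here is a rough sketch that computes the domain of a pure trim from known extents, without touching any cell data. It is not rasdaman or Petascope code: TrimDomain, Interval, resolve() and sdomOfTrim() are hypothetical names, and the clipping of out-of-range bounds is an assumption taken from the behaviour described earlier in this ticket.

```java
// Hypothetical sketch: not rasdaman or Petascope code.
final class TrimDomain {

    /** Closed integer interval lo:hi along one axis. */
    static final class Interval {
        final long lo;
        final long hi;

        Interval(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }
    }

    /** Resolves one axis trim such as "0:10" or "*:*" against the stored extent. */
    static Interval resolve(String trim, Interval stored) {
        String[] bounds = trim.trim().split(":");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Not a trim: " + trim);
        }
        String loText = bounds[0].trim();
        String hiText = bounds[1].trim();
        // An asterisk means "use the stored bound on that side".
        long lo = loText.equals("*") ? stored.lo : Long.parseLong(loText);
        long hi = hiText.equals("*") ? stored.hi : Long.parseLong(hiText);
        // Assumption (from earlier in this ticket): out-of-range bounds are clipped.
        lo = Math.max(lo, stored.lo);
        hi = Math.min(hi, stored.hi);
        if (lo > hi) {
            throw new IllegalArgumentException("Empty trim: " + trim);
        }
        return new Interval(lo, hi);
    }

    /** Computes the domain of c[subset] from the stored extents only. */
    static String sdomOfTrim(String subset, Interval[] stored) {
        String[] axes = subset.split(",");
        if (axes.length != stored.length) {
            throw new IllegalArgumentException("Dimension mismatch: " + subset);
        }
        StringBuilder out = new StringBuilder("[");
        for (int i = 0; i < axes.length; i++) {
            Interval resolved = resolve(axes[i], stored[i]);
            if (i > 0) {
                out.append(',');
            }
            out.append(resolved.lo).append(':').append(resolved.hi);
        }
        return out.append(']').toString();
    }
}
```

For example, with stored extents 0:10 on both axes, sdomOfTrim("-10:10,*:*", stored) resolves to the full stored domain. Anything that is not a plain trim, such as scale(...), falls outside this sketch and would still need real evaluation.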

comment:8 Changed 4 years ago by abeccati

  • Priority changed from critical to minor

Does not look so critical, unless it slows the show down considerably. Setting prio to minor for now.

comment:9 Changed 4 years ago by dmisev

A 2x slow-down is considerable I'd say. Rasdaman enterprise is not affected because it has some query optimizations, but for community we need to fix petascope.

comment:10 Changed 4 years ago by abeccati

  • Milestone set to 8.4

comment:11 Changed 4 years ago by pcampalani

  • Resolution set to fixed
  • Status changed from new to closed

We can fix this:

commit fc3a97ba0e43a1546eeb01da543e200c43560797
Author: Piero Campalani <cmppri@unife.it>
Date:   Fri Jan 25 18:16:04 2013 +0100

    Avoid sdom request when updating GetCoverage metadata (ticket #266).