Opened 5 years ago

Closed 4 years ago

#267 closed task (fixed)

Tiling with rasimport

Reported by: dmisev
Owned by: aherzig
Priority: critical
Milestone: 9.0
Component: rasgeo
Version: 8.3
Keywords:
Cc: a.beccati@…, pbaumann, joachim.ungar@…, HerzigA@…, pcampalani
Complexity: Trivial

Description (last modified by abeccati)

Check what tiling strategy rasimport uses. Is it fixed to a certain tile size and configuration, or is it adaptable to the input?

Check whether it's easily possible to make it flexible (i.e. allow using the rasql storage layout sublanguage).

Update the documentation to cover the tiling parameter, which has been implemented as the flexible way to specify a tiling strategy.

Attachments (4)

a.png (201 bytes) - added by dmisev 4 years ago.
b.png (190 bytes) - added by dmisev 4 years ago.
tilingtest_2.txt (3.2 KB) - added by aherzig 4 years ago.
0001-provisional-patch-adding-tiling-support-to-rasimport.patch (7.5 KB) - added by aherzig 4 years ago.


Change History (35)

comment:1 Changed 5 years ago by dmisev

  • Cc HerzigA@… pcampalani added
  • Description modified (diff)

comment:2 Changed 4 years ago by abeccati

Probably the most flexible solution would be a command-line option that takes the tiling substring and passes it to the insert statement inside rasgeo.

comment:3 in reply to: ↑ description ; follow-up: Changed 4 years ago by herziga@…

Replying to dmisev:

Check what is the tiling strategy that rasimport uses.

rasimport uses rasql's 'insert into COLLNAME values ... ' statement without specifying any tiling scheme at all. BTW, is there a default tiling scheme for this case?

Check whether it's easily possible to make it flexible (i.e. allow using the rasql storage layout sublanguage).

As Alan has suggested, easiest would be to have an option like
--tiling '<here comes the tiling spec as string parameter>'
and if it's specified, it just goes at the end of the 'insert into COLLNAME values ...' statement.
Would that work?

comment:4 Changed 4 years ago by pbaumann

Alex, that should work. The only inconvenience is that the string has to be properly enclosed in quotes to make it one shell word, such as:

$ rasimport ... --tiling "area of interest [blabla]"

...which seems acceptable. So Alan's suggestion is favored by me, too.
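To illustrate the quoting point, here is a small sketch using Python's standard `shlex` module, which follows POSIX shell word-splitting rules. The command line is only the example from this comment; the literal `...` stands for the other rasimport options and is not a real argument.

```python
import shlex

# Example command from the comment above; "..." is a stand-in for other options.
quoted = 'rasimport ... --tiling "area of interest [blabla]"'
unquoted = 'rasimport ... --tiling area of interest [blabla]'

# With quotes, the whole tiling spec arrives as one argument.
quoted_argv = shlex.split(quoted)

# Without quotes, the spec is split at every space into separate arguments.
unquoted_argv = shlex.split(unquoted)

print(quoted_argv)
print(unquoted_argv)
```

Without the quotes, only the first word would be seen as the value of the option, which is why the spec has to be one shell word.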

comment:5 in reply to: ↑ 3 ; follow-up: Changed 4 years ago by dmisev

Replying to herziga@…:

rasimport uses rasql's 'insert into COLLNAME values ... ' statement without specifying any tiling scheme at all. BTW, is there a default tiling scheme for this case?

If I understood correctly, rasimport imports data partitioning it manually into chunks of a variable size (the chunk size is computed depending on some parameters)? So the tiles in the object are equivalent to the chunks that rasimport commits.

By default there's no tiling; I still need to change this to the most meaningful generic tiling spec.

comment:6 in reply to: ↑ 5 Changed 4 years ago by herziga@…

Replying to dmisev:

If I understood correctly, rasimport imports data partitioning it manually into chunks of a variable size (the chunk size is computed depending on some parameters)? So the tiles in the object are equivalent to the chunks that rasimport commits.

Yes, that's correct. rasimport first creates (insert into COLLNAME ...) an initial one pixel image (e.g. [0:0,0:0] for 2D) and then subsequently updates it by chunks of rows (e.g. update <COLLNAME> as m set m assign shift(<MDD>, <r_Point>) where oid(m) = <OID>). If I now were to specify a tiling scheme with the initial insert statement, would those incoming chunks (update statement) be adjusted to that scheme automatically or would they have to be put in in appropriate chunks (tiles) according to the scheme?

Changed 4 years ago by dmisev

Changed 4 years ago by dmisev

comment:7 follow-up: Changed 4 years ago by dmisev

Yes it will automatically partition the update chunks according to the tiling scheme, but the problem is that it won't automatically accommodate the existing tiles.

To give you an example, suppose the tiling is regular 512x512 tiles, but rasimport commits 750x512 chunks. Then the resulting tiles in rasdaman after inserting two chunks with rasimport will be as [figure missing], but they should be as [figure missing].

But this is a general issue of partial updates, not of rasimport I'd say. So as long as we use some fixed larger chunk size in rasimport (e.g. 100MB) I think issues like this will be minimized.
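Since the two figures are not available here, the following is a rough one-dimensional sketch of one possible reading of the example: each committed chunk is cut at the 512 grid lines, but pieces from different chunks are never merged. The function names are made up for illustration, and the assumption that the 750 extent lies along the chunked axis is mine; this is not how the server is actually implemented.

```python
def split_by_grid(start, end, tile_extent):
    """Split the inclusive index range [start, end] at multiples of tile_extent."""
    pieces = []
    lo = start
    while lo <= end:
        # Last index of the grid cell that contains lo, clipped to the range end.
        hi = min(end, (lo // tile_extent + 1) * tile_extent - 1)
        pieces.append((lo, hi))
        lo = hi + 1
    return pieces


def tiles_after_chunked_import(total, chunk_extent, tile_extent):
    """Tiles produced when each chunk is split on its own and never merged."""
    tiles = []
    for chunk_start in range(0, total, chunk_extent):
        chunk_end = min(total, chunk_start + chunk_extent) - 1
        tiles.extend(split_by_grid(chunk_start, chunk_end, tile_extent))
    return tiles


# Two chunks of extent 750 against a tile grid of extent 512 (assumed reading).
fragmented = tiles_after_chunked_import(1500, 750, 512)
# What a single insert of the whole range would give.
ideal = split_by_grid(0, 1499, 512)

print(fragmented)
print(ideal)
```

Under these assumptions the chunk boundary at 750 leaves two partial tiles where a single full tile would be expected.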

comment:8 follow-up: Changed 4 years ago by dmisev

So in conclusion: I think the --tiling "tiling_spec" which will be passed verbatim at the end of the first insert statement as "tiling tiling_spec" is a pretty good solution.
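A minimal sketch of that idea, assuming a hypothetical helper `build_initial_insert` that assembles the first insert statement as a string. Neither the helper nor the `$1` placeholder used as the values expression is taken from the rasimport sources; they only show where the spec would go.

```python
def build_initial_insert(collection, values_expr, tiling_spec=None):
    """Build the initial insert statement, appending the tiling spec verbatim.

    collection, values_expr and this function itself are illustrative only.
    """
    statement = f"insert into {collection} values {values_expr}"
    if tiling_spec:
        # The user-supplied spec goes unchanged after the keyword "tiling".
        statement += f" tiling {tiling_spec}"
    return statement


# Without --tiling the statement stays as it is today.
print(build_initial_insert("COLLNAME", "$1"))

# With --tiling "aligned [0:499,0:499] tile size 250000".
print(build_initial_insert("COLLNAME", "$1",
                           "aligned [0:499,0:499] tile size 250000"))
```

Passing the spec through unparsed keeps the client simple; any validation is then left to the server.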

comment:9 in reply to: ↑ 8 Changed 4 years ago by herziga@…

Replying to dmisev:

So in conclusion: I think the --tiling "tiling_spec" which will be passed verbatim at the end of the first insert statement as "tiling tiling_spec" is a pretty good solution.

Sweet! BTW, rasimport uses 128MiB as chunksize

comment:10 follow-up: Changed 4 years ago by dmisev

To me it seemed like it's variable, because rasimport in certain cases was creating a huge number of small tiles in my experience.

comment:11 in reply to: ↑ 10 Changed 4 years ago by herziga@…

Replying to dmisev:

To me it seemed like it's variable, because rasimport in certain cases was creating a huge number of small tiles in my experience.

Mmmh, that's interesting. rasimport uses a fixed number of rows (nrows = maxMem_bytes / (numColumns * pixelsize_bytes)) for each iteration step, only the last chunk may be smaller. Perhaps I'm missing something in my own code?? BTW it's in rasimport's importImage(...) function.
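For reference, the formula above can be sketched as follows. The function names are illustrative and not those used in importImage(...); the image dimensions in the example are made up, and the `max(1, ...)` guard is an addition of this sketch.

```python
def rows_per_chunk(max_mem_bytes, num_columns, pixel_size_bytes):
    """nrows = maxMem_bytes / (numColumns * pixelsize_bytes), rounded down."""
    # The guard for rows wider than the memory limit is this sketch's own addition.
    return max(1, max_mem_bytes // (num_columns * pixel_size_bytes))


def chunk_row_ranges(num_rows, nrows):
    """Inclusive (first_row, last_row) ranges; only the last may be shorter."""
    return [(start, min(start + nrows, num_rows) - 1)
            for start in range(0, num_rows, nrows)]


MAX_MEM_BYTES = 128 * 1024 * 1024  # 128 MiB

# Hypothetical image: 10000 columns, 8000 rows, 4 bytes per pixel.
nrows = rows_per_chunk(MAX_MEM_BYTES, 10000, 4)
ranges = chunk_row_ranges(8000, nrows)

print(nrows)
print(ranges)
```

Note that the number of rows per chunk depends on the image width and pixel size, so different images get different chunk shapes even though the memory limit is the same.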

comment:12 Changed 4 years ago by dmisev

Yes, this formula is pretty much what I ended up with when I investigated, but I didn't have time to try to understand the reason for it. So apparently the chunk size is not fixed, but perhaps what you mean is that it's limited to 128MB?

comment:13 Changed 4 years ago by herziga@…

Sorry, you're right, that's what I meant. The reason is to be able to process images which don't fit into RAM; 128MiB is just an arbitrary choice. We could turn it into a parameter though?

comment:14 Changed 4 years ago by abeccati

  • Milestone set to 8.4

comment:15 in reply to: ↑ 7 Changed 4 years ago by aherzig

Replying to dmisev:
I just made an initial test with rasgeo and the new tiling option. Unfortunately, it doesn't seem to work with the current rasgeo workflow logic of importing an image as chunks of rows by partial updates:

rasimport -f t1.img -coll t1tiled1 -tiling "tiling regular [0:499,0:499] index rc_index"
ERROR - rimport::main, l. 1371: Exception: The tile configuration is incompatible to the marray domain.

I assume 'compatible' means that the chunk size must not be smaller than the tile size (for any or all dimensions?). If that's the case, we would have to adjust rasgeo so that the chunk size is made compatible with the tile size. This would involve revising the whole logic for partitioning the data, as well as adding the capability to parse the tiling specification in the first place. Since you mentioned earlier that partial updates and accommodating existing tiles is more of a server problem than a client problem, I was wondering whether you've got any ideas how to proceed in this case? Is this something you're going to address in the future, or do we have to implement 'tiling upon import' for large data on the client side?

comment:16 follow-up: Changed 4 years ago by dmisev

The regular tiling is a bit constrained: it has to divide the image domain evenly. But even then there may still be a problem with the chunks, I'm not sure.
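The "divides evenly" constraint can be sketched as a simple check on the extents. This is only an illustration of the constraint as stated here; the function name is made up and the actual check in the server may differ.

```python
def divides_evenly(image_extents, tile_extents):
    """True if every tile extent divides the matching image extent exactly."""
    return all(image % tile == 0
               for image, tile in zip(image_extents, tile_extents))


# A 1000x1500 image can be covered exactly by 500x500 tiles.
print(divides_evenly((1000, 1500), (500, 500)))

# A 750x512 domain cannot.
print(divides_evenly((750, 512), (500, 500)))

# Neither can a one-pixel domain such as [0:0,0:0].
print(divides_evenly((1, 1), (500, 500)))
```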

Can you try maybe with aligned tiling and leave out the index? E.g.

tiling aligned [0:499,0:499] tile size 250000

(multiply the tile size by the type size)
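As a worked example of that multiplication, here is a small sketch that builds the aligned tiling string from the tile extents and the cell type size in bytes. The helper name is hypothetical; the output format simply mirrors the example spec given in this comment.

```python
def aligned_tiling_spec(tile_extents, cell_size_bytes):
    """Build 'tiling aligned [...] tile size N' with N given in bytes."""
    cells = 1
    for extent in tile_extents:
        cells *= extent
    # Tile domain starts at 0 in every dimension, e.g. [0:499,0:499].
    domain = ",".join(f"0:{extent - 1}" for extent in tile_extents)
    return f"tiling aligned [{domain}] tile size {cells * cell_size_bytes}"


# 500x500 tile of 1-byte cells: 250000 bytes, as in the example above.
print(aligned_tiling_spec((500, 500), 1))

# The same tile for a 4-byte cell type needs 4 times the tile size.
print(aligned_tiling_spec((500, 500), 4))
```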

comment:17 in reply to: ↑ 16 Changed 4 years ago by aherzig

Replying to dmisev:
Aligned tiling seems to work with rasimport; at least it doesn't throw any exceptions and the image is imported correctly. However, I don't know how to check whether the tiling is correct.
Strangely enough, I couldn't import an image using partial updates and aligned tiling on the command line (see tilingtest_2.txt). I also tried directional tiling on the command line, but it didn't work either using the 'partial update' workflow (and hence failed with rasimport). So, it seems only aligned tiling is working with partial updates and therefore with rasimport. See tilingtest_2.txt for the few tests I did.

Changed 4 years ago by aherzig

comment:18 Changed 4 years ago by dmisev

Yes, with directional tiling it won't work, because it expects that the limits you give when you insert the array match the domain of the inserted array, unless the dimension is marked as *

Maybe you can attach a patch here and I'll check if the aligned tiling worked well.

comment:19 Changed 4 years ago by dmisev

  • Owner changed from dmisev to aherzig
  • Status changed from new to assigned

comment:20 Changed 4 years ago by pbaumann

Dimitar, did you have a chance to check the patch?

comment:21 Changed 4 years ago by dmisev

Oh I didn't notice a patch was uploaded, trac doesn't seem to send notifications for attachment uploads.

The patch is fine; it just doesn't update the README with the new parameter. It can be applied and we can fix the README later.

comment:22 Changed 4 years ago by dmisev

Alex can you please upload the patch to the patchmanager?

comment:23 Changed 4 years ago by dmisev

  • Resolution set to fixed
  • Status changed from assigned to closed

comment:24 follow-up: Changed 4 years ago by ungarj

  • Complexity set to Medium

Alex, thanks a lot for patching! However, I agree with Dimitar that a README file or help entry would be useful for us. Should we reopen the ticket?

comment:25 in reply to: ↑ 24 Changed 4 years ago by aherzig

  • Cc a.beccati@… added

Replying to ungarj:
Very good point, Joachim, we shouldn't forget about that. Not quite sure how to handle this,
shall we re-open this one or open a new one?

comment:26 Changed 4 years ago by abeccati

  • Complexity changed from Medium to Trivial
  • Description modified (diff)
  • Priority changed from major to minor
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopened and updated accordingly.

comment:27 Changed 4 years ago by dmisev

  • Milestone changed from 8.4 to 8.5

comment:28 Changed 4 years ago by abeccati

  • Description modified (diff)
  • Priority changed from minor to critical

We got some feedback from users about the missing documentation, so I'm raising the priority.

comment:29 Changed 4 years ago by abeccati

  • Status changed from reopened to assigned

comment:30 Changed 4 years ago by dmisev

  • Milestone changed from 8.5 to 9.0

comment:31 Changed 4 years ago by dmisev

  • Resolution set to fixed
  • Status changed from assigned to closed

Documentation fixed.
