Opened 5 years ago

Closed 19 months ago

#240 closed defect (fixed)

Exporting large amounts of data truncates the first ~100 bytes

Reported by: dmisev Owned by: dmisev
Priority: critical Milestone: 9.2
Component: conversion Version: 8.3
Keywords: Cc: pbaumann, mdumitru, atoader
Complexity: Medium

Description

More specifically, the export produces invalid files: neither file nor gdalinfo recognizes them.

Tested by exporting a 500MB collection; Konstantin reported a failure with tiff on a collection of just 16MB.
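
A quick way to confirm that an exported file is invalid, without relying on file or gdalinfo, is to compare its first bytes with the standard signature of the target format. Below is a minimal sketch in Python; the helper names `looks_valid` and `check_file` are made up for illustration and are not part of rasdaman.

```python
# Standard leading signatures ("magic bytes") of the formats in this ticket.
SIGNATURES = {
    "png": [b"\x89PNG\r\n\x1a\n"],
    "tiff": [b"II*\x00", b"MM\x00*"],  # little-endian and big-endian classic TIFF
    "jpeg": [b"\xff\xd8\xff"],
}


def looks_valid(data: bytes, fmt: str) -> bool:
    """Return True if data starts with a known signature for fmt."""
    return any(data.startswith(sig) for sig in SIGNATURES[fmt])


def check_file(path: str, fmt: str) -> bool:
    """Read only the first few bytes of path and check the signature."""
    with open(path, "rb") as f:
        return looks_valid(f.read(8), fmt)
```

For example, `check_file("rasql_1.png", "png")` should return False for an export whose leading bytes were cut off. BigTIFF uses a different signature and is not covered by this sketch.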

Attachments (2)

reproduce.sh (1.0 KB) - added by dmisev 5 years ago.
test.sh (601 bytes) - added by dmisev 3 years ago.


Change History (23)

comment:1 Changed 5 years ago by dmisev

  • Summary changed from Exporting large amounts of data to tiff/png fails to Exporting large amounts of data to tiff/png/jpeg fails

It's also reproducible with the new encode function, so it doesn't look to be specific to the conversion code itself.

comment:2 Changed 5 years ago by dmisev

Tested some more

png(), encode(c, "PNG") start failing when exporting 4MB, with "Error: unable to save PNG stack"

tiff(), encode(c, "GTiff") start failing at 8MB

comment:3 Changed 5 years ago by pbaumann

Maybe we should isolate this in a small GDAL program and then file a GDAL ticket and/or ask on the GDAL list.

comment:4 Changed 5 years ago by dmisev

  • Owner changed from dmisev to mdumitru
  • Status changed from new to assigned

We suspect it may be some memory corruption problem, so it might be helpful to run valgrind on directql (can be found in source:applications/directql).

I attached the script I used to reproduce this problem.

Changed 5 years ago by dmisev

comment:5 Changed 4 years ago by pbaumann

  • Milestone set to 8.4

comment:6 Changed 4 years ago by pbaumann

  • Milestone changed from 8.4 to 8.5

comment:7 Changed 4 years ago by dmisev

  • Complexity set to Medium
  • Owner changed from mdumitru to klipskoch

Conversion sources are in source:conversion, e.g. source:conversion/png.cc

Source for the encode function is in source:qlparser/qtencode.cc

It should also be tested with Java (e.g. a WCPS query): encoding a lot of data via the Java API does not seem to be a problem, so this may give a clue as to why it fails with C++ (rasql client).

comment:8 Changed 4 years ago by dmisev

  • Milestone changed from 8.5 to 9.0

comment:9 Changed 3 years ago by dmisev

Have you looked at this, Kinga? I suspect it is a bug in the C++ client API, as it doesn't happen via rasj.

comment:10 Changed 3 years ago by dmisev

  • Owner changed from klipskoch to fxavier

comment:11 Changed 3 years ago by fxavier

By executing reproduce.sh, I get the following outputs:


Testing with 1 MB (786432 B)


output.txt rasql_1.png rasql_1.tif reproduce.sh function tiff(c)...failed, file not found.
output.txt rasql_1.png rasql_1.tif reproduce.sh function png(c)...failed, file not found.
output.txt rasql_1.png rasql_1.tif reproduce.sh function jpeg(c)...failed, file not found.
output.txt rasql_1.png rasql_1.tif reproduce.sh function encode(c, "GTiff")...failed, file not found.
output.txt rasql_1.png rasql_1.tif reproduce.sh function encode(c, "PNG")...failed, file not found.


.............
(this repeats throughout all the iterations, until my pc crashes).

rasdaman error 0: Exception: Transfer Failed
rasdaman error 801: RasManager Error: System overloaded, please try again later.
rasdaman error 801: RasManager Error: System overloaded, please try again later.
rasdaman error 801: RasManager Error: System overloaded, please try again later.
rasdaman error 801: RasManager Error: System overloaded, please try again later.
.......
(this repeats throughout all the iterations).

I don't know if this is the error or if I haven't reproduced it yet.

comment:12 Changed 3 years ago by fxavier

Update in this ticket:

TIFF encode is simply not supported:
rasql -q 'select encode(c, "tiff") from greytest as c' --out file
...
Executing retrieval query...rasdaman error 0: Exception: Feature is not supported

JPEG encode is OK for small datasets (up to about 186KB?), but fails for bigger ones:
(With 64KB)
Executing retrieval query...ok
Query result collection has 1 element(s):

Result object 1: going into file rasql_1.jpg...ok.

(With more than about 186KB)
Executing retrieval query...rasdaman error 0: Exception: ODMG General

PNG also has its limit, but it works better than JPEG. The error only appears for images of about 3MB and larger:
Executing retrieval query...rasdaman error 0: Exception: ODMG General

[EDIT]:
GTIFF works up to a point: I could encode 100MB, but at 150MB it crashed, so the limit is somewhere in between.
Encoding 100MB to GTIFF makes rasdaman use 1.6GB of RAM on my machine, whereas directql under valgrind doesn't behave like that.
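
The limit between a size that works and a size that crashes could be narrowed down by bisection. Below is a minimal sketch in Python. It assumes that every size above the limit fails and every size below it works, and `export_ok` is a hypothetical callback that would run the export for a given size in MB and report success; it is not an existing rasdaman function.

```python
from typing import Callable, Tuple


def narrow_limit(works_mb: int, fails_mb: int,
                 export_ok: Callable[[int], bool],
                 tolerance_mb: int = 1) -> Tuple[int, int]:
    """Bisect between a size known to work and a size known to fail.

    Returns (largest size seen working, smallest size seen failing).
    """
    lo, hi = works_mb, fails_mb
    while hi - lo > tolerance_mb:
        mid = (lo + hi) // 2
        if export_ok(mid):   # hypothetical: run the export with mid MB of data
            lo = mid
        else:
            hi = mid
    return lo, hi
```

Starting from `narrow_limit(100, 150, export_ok)`, this needs about six exports to pin the limit down to 1MB, instead of trying every size in between.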


comment:13 Changed 3 years ago by dmisev

For tiff it should be

rasql -q 'select encode(c, "gtiff") from greytest as c' --out file

Can you check in the logs whether there's a more specific error than ODMG General?

Changed 3 years ago by dmisev

comment:14 Changed 3 years ago by dmisev

I attached a simpler script for testing: attachment:test.sh

comment:15 Changed 3 years ago by dmisev

  • Summary changed from Exporting large amounts of data to tiff/png/jpeg fails to Exporting large amounts of data truncates the first ~100 bytes

OK, a further clue: the bug is independent of the encoding part; it fails even with csv. It seems that the first 100 bytes or so are cut from the result somewhere on the client side. This does not happen in rasj, so it must be in the C++ client code.

Exporting a [0:9999,0:99] mdd to csv produces the correct first 300 bytes:

$ head -c 300 rasql_1.csv 
{2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2},{2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2

Exporting a [0:99999,0:99] mdd to csv, however, truncates the first 106 bytes (I assume; it may be even more):

$ head -c 300 rasql_1.csv 
,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2},{2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2},{2
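
The figure of 106 bytes can be checked mechanically by sliding the observed head of the file along the expected csv stream. Below is a minimal sketch in Python. It assumes, as in the outputs above, that every cell is 2 and that each row has 100 values; the function name `missing_prefix` is made up for illustration.

```python
def missing_prefix(head: bytes, row_len: int = 100, value: bytes = b"2") -> int:
    """Return the smallest number of leading bytes that must be missing
    for head to match the expected csv stream, or -1 if it never matches."""
    row = b"{" + b",".join([value] * row_len) + b"}"
    period = row + b","  # rows are separated by a comma
    # Enough repetitions to cover any offset within one period plus the head.
    stream = period * (len(head) // len(period) + 2)
    for offset in range(len(period)):
        if stream[offset:offset + len(head)] == head:
            return offset
    return -1
```

With 100 single-digit values a row plus its separator is 202 bytes long, so the offset is only determined modulo 202: a result of 106 is also consistent with 308, 510, and so on being cut, which is why it may be even more.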

comment:16 Changed 3 years ago by dmisev

  • Priority changed from major to critical

comment:17 Changed 3 years ago by dmisev

  • Cc mdumitru added; kozlov@… removed
  • Owner changed from fxavier to dmisev

So hopefully the new protocol will fix this :)
Right now there is some leak going on in the RNP protocol that causes this.

comment:18 Changed 3 years ago by dmisev

  • Resolution set to wontfix
  • Status changed from assigned to closed

The new protocol fixes this, wontfix for RNP.

comment:19 Changed 19 months ago by dmisev

  • Cc atoader added
  • Milestone changed from 9.0.x to 9.2
  • Resolution wontfix deleted
  • Status changed from closed to reopened

comment:20 Changed 19 months ago by dmisev

I submitted a patch that fixes it in RNP; however, rasnet is still not fixed.

comment:21 Changed 19 months ago by dmisev

  • Resolution set to fixed
  • Status changed from reopened to closed