#1148 closed defect (fixed)

Failed to connect to rasserver by rasnet when inserting data concurrently

Reported by: bphamhuu Owned by: atoader
Priority: critical Milestone: 9.2
Component: rasserver Version: development
Keywords: rasserver error Cc: dmisev
Complexity: Medium

Description

The error is shown below (the client should instead report an exception message, as with RNP, e.g. "Write transaction is locked, please try again later.").

Executing insert query...[ERROR] - The client failed to connect to rasserver.
terminate called after throwing an instance of 'std::runtime_error'
  what():  
Aborted (core dumped)

How to test: open two terminals and run the query below in both at the same time.

rasql -q 'insert into test_001 values decode($1)' -f /home/rasdaman/images/multi_cubs/0_backup.tiff --user rasadmin --passwd rasadmin

The error then appears in the terminal that started later. Use a big file (~50 MB) to see it.
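
A minimal sketch for scripting the two-terminal reproduction above instead of starting the queries by hand, assuming a POSIX shell. `run_twice_concurrently` is a hypothetical helper, not part of rasdaman; it starts the same command twice in the background and prints each exit status. In bash, a client that aborts as shown above would typically report status 134 (128 + SIGABRT).

```shell
# Hypothetical helper (not part of rasdaman): run the given command twice
# in parallel and print the exit status of each run.
run_twice_concurrently() {
    "$@" &
    pid1=$!
    "$@" &
    pid2=$!

    # Collect both exit statuses without stopping on the first failure.
    status1=0
    wait "$pid1" || status1=$?
    status2=0
    wait "$pid2" || status2=$?

    echo "first exit status: $status1"
    echo "second exit status: $status2"

    # Succeed only if both runs succeeded.
    [ "$status1" -eq 0 ] && [ "$status2" -eq 0 ]
}

# Usage, with the rasql command from this ticket:
# run_twice_concurrently rasql -q 'insert into test_001 values decode($1)' \
#     -f /home/rasdaman/images/multi_cubs/0_backup.tiff \
#     --user rasadmin --passwd rasadmin
```

Because both clients start almost simultaneously, which of the two fails may vary between runs.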

Attachments (4)

0.tiff (10.0 MB) - added by bphamhuu 15 months ago.
input file
0_backup.tiff.tar.gz (765.2 KB) - added by bphamhuu 15 months ago.
George, use this file, thanks.
log3.tar.gz (30.6 MB) - added by bphamhuu 14 months ago.
rasmgr_error
log_5.tar.gz (53.6 KB) - added by bphamhuu 14 months ago.
log for patch

Change History (16)

comment:1 Changed 15 months ago by mdumitru

  • Priority changed from major to critical

comment:2 Changed 15 months ago by gmerticariu

Can you please provide the file so we can reproduce the error?

Changed 15 months ago by bphamhuu

input file

Changed 15 months ago by bphamhuu

George, use this file, thanks.

comment:3 Changed 15 months ago by bphamhuu

I've uploaded the file for you, George.

comment:4 Changed 15 months ago by atoader

  • Owner changed from atoader to gmerticariu
  • Status changed from new to assigned

comment:5 Changed 14 months ago by bphamhuu

Another case: while a script is running wcst_import.sh to import data, open a rasql client and run a query; it fails with the same error (I thought a select is not a write transaction?).

 rasql -q "select c[0:500, 0:500] + 5 from multiple_cov_01 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gf4a50b8 -- generated on 18.01.2016 08:06:30.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...terminate called after throwing an instance of 'std::runtime_error'
  what():  
Aborted (core dumped)

Changed 14 months ago by bphamhuu

rasmgr_error

comment:6 Changed 14 months ago by bphamhuu

After a few tries with the rasql query above, stop_rasdaman.sh could not stop rasdaman: two rasserver processes show memory "N/A", and the script has been hanging for a few minutes now.

[rasdaman@gonzo multi_cov]$ stop_rasdaman.sh 
stop_rasdaman.sh: terminating all rasdaman servers

comment:7 Changed 14 months ago by atoader

  • Owner changed from gmerticariu to atoader
  • Status changed from assigned to accepted

comment:8 Changed 14 months ago by bphamhuu

@AToader: even when I run stop_rasdaman.sh, start rasdaman again, and run a plain query without wcst_import.sh running, I see this kind of error (so you can narrow down the scope of the problem).

rasql -q "select c[0:500, 0:500] from multiple_cov_001 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gf4a50b8 -- generated on 18.01.2016 08:06:30.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...terminate called after throwing an instance of 'std::runtime_error'
  what():  
Aborted (core dumped)

comment:9 Changed 14 months ago by atoader

  • Resolution set to fixed
  • Status changed from accepted to closed

comment:10 Changed 14 months ago by bphamhuu

  • Resolution fixed deleted
  • Status changed from closed to reopened

@AToader: The problem still seems to be here (I've pulled, reinstalled, and started the servers).

rasql -q "select c[0:500, 0:500] from multiple_cov_001 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gc9085a3 -- generated on 19.01.2016 07:51:15.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...rasdaman error 0: General error received from the server.
aborting transaction...E0119 08:16:15.128748962   12906 tcp_client_posix.c:171]     failed to connect to 'ipv4:10.70.11.237:7002': socket error: connection refused
E0119 08:16:16.130202929   12906 tcp_client_posix.c:171]     failed to connect to 'ipv4:10.70.11.237:7002': socket error: connection refused
ok

Changed 14 months ago by bphamhuu

log for patch

comment:11 Changed 14 months ago by atoader

Bang, as you can see, the client is not crashing anymore. This ticket is about the crash.
From what I can see in the logs, your system is in an inconsistent state. Please kill any running rasmgr and rasserver processes and remove the data/TRANSACTION folder.
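
A minimal sketch of the cleanup steps above, assuming a Linux system with `pkill` available. `print_cleanup_commands` is a hypothetical helper, not part of rasdaman, and the data directory passed to it is an assumption to adjust for the local installation. It only prints the commands so they can be reviewed before anything is killed or deleted.

```shell
# Hypothetical helper (not part of rasdaman): print the cleanup commands
# for review instead of executing them.
# $1: rasdaman data directory (an assumption; adjust to your installation).
print_cleanup_commands() {
    data_dir=${1:?usage: print_cleanup_commands DATA_DIR}
    # Stop the manager first, then the remaining server processes.
    echo "pkill -x rasmgr"
    echo "pkill -x rasserver"
    # Remove the TRANSACTION folder mentioned in this comment.
    echo "rm -rf '$data_dir/TRANSACTION'"
}
```

Calling it with the path of the local data directory prints three commands, which can be run by hand once they look correct.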

comment:12 Changed 14 months ago by bphamhuu

  • Resolution set to fixed
  • Status changed from reopened to closed

AToader, I've done as you suggested (removed the TRANSACTION folder, killed all rasmgr and rasserver processes) and started again, and the error is still here. However, since you said it is not related to this ticket, I will close it.

-q "select c[0:500, 0:500] from multiple_cov_001 as c" --out string
rasql: rasdaman query tool v1.0, rasdaman v9.2.0-beta1-gf4a50b8 -- generated on 18.01.2016 08:06:30.
opening database RASBASE at localhost:7001...ok
Executing retrieval query...rasdaman error 237: Exception: Client communication failure
aborting transaction...E0118 13:08:39.628931307   22984 tcp_client_posix.c:171]     failed to connect to 'ipv4:10.70.11.237:7002': socket error: connection refused
ok
rasql done.
