Opened 3 years ago

Closed 3 years ago

#625 closed defect (fixed)

target tiles still locked after commit

Reported by: aherzig
Owned by: klipskoch
Priority: major
Milestone: 9.0
Component: lockmgr
Version: development
Keywords: rasgeo, tile lock
Cc: pbaumann, pcampalani
Complexity: Medium

Description

Within a series of rasql queries, wrapped as individual transactions, i.e.

// query one
transaction.begin(read_write);
r_OQL_Query q(...);
r_oql_execute(q);
transaction.commit();

// query two
transaction.begin(read_write);
...

the tile lock seems to get stuck at one point and results in

Error: One or more of the target tiles are locked by another transaction

I've attached a minimal example.
Just extract the archive into "manuals_and_examples/examples/c++", reconfigure, and build tilelockbug. rasgeo's RasdamanHelper2 class makes heavy use of the above pattern, hence rasimport and raserase virtually don't work at all.

Attachments (1)

lockbug_demo.tar (10.0 KB) - added by aherzig 3 years ago.
minimal bug example


Change History (17)

Changed 3 years ago by aherzig

minimal bug example

comment:1 Changed 3 years ago by klipskoch

This problem is related to ticket #622.
The same happens when trying to run queries with directql.
I submitted a patch fixing that.

But basically, your code needs the lock manager's connection to the database (the lock manager has its own connection).
Therefore, you can add the following after opening the database in your minimal example:

#if LOCKMANAGER_ON
    LockManager * lockManager = LockManager::Instance();
    lockManager->connect();
#endif

and the following code after closing the database:

#if LOCKMANAGER_ON
    LockManager * lockManager = LockManager::Instance();
    lockManager->disconnect();
#endif
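
For illustration only: the manual connect()/disconnect() pairing above could be wrapped in a small scope guard so that the disconnect cannot be forgotten. This is a minimal sketch, not rasdaman code. FakeLockManager is a hypothetical stand-in that only records calls, and LockManagerSession is a hypothetical helper name; the real LockManager interface may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the real LockManager, so the sketch compiles on
// its own. It records calls instead of talking to a database.
class FakeLockManager {
public:
    static FakeLockManager& instance() {
        static FakeLockManager mgr;
        return mgr;
    }
    void connect()    { connected_ = true;  log_.push_back("connect"); }
    void disconnect() { connected_ = false; log_.push_back("disconnect"); }
    bool connected() const { return connected_; }
    const std::vector<std::string>& log() const { return log_; }

private:
    FakeLockManager() = default;
    bool connected_ = false;
    std::vector<std::string> log_;
};

// Hypothetical scope guard: connects on construction, disconnects on
// destruction, so both calls always come as a pair.
class LockManagerSession {
public:
    LockManagerSession() {
        // Assumption of this sketch: sessions are not nested.
        assert(!FakeLockManager::instance().connected());
        FakeLockManager::instance().connect();
    }
    ~LockManagerSession() { FakeLockManager::instance().disconnect(); }

    LockManagerSession(const LockManagerSession&) = delete;
    LockManagerSession& operator=(const LockManagerSession&) = delete;
};

// Returns whether the lock manager was connected while the queries ran.
bool run_queries_with_session() {
    LockManagerSession session;  // created right after opening the database
    // ... the per-query transactions would run here ...
    return FakeLockManager::instance().connected();
}  // session is destroyed here, which calls disconnect()
```

With such a guard, an early return or an exception between opening and closing the database would still trigger the disconnect.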

comment:2 Changed 3 years ago by dmisev

  • Cc pbaumann added

Kinga, why is it necessary for the client to do this?

Can't you do the connect() (if not already connected) in the LockManager::Instance() itself?

comment:3 Changed 3 years ago by klipskoch

Actually, connect() is not called in Instance(). It is called after the server (in the case of rasql) connects to the database, and disconnect() is called when the server calls closeDB().
So, if you have code like directql, you have to call these functions yourself.

I could put the connect in Instance(), but the only place I am sure that I can call disconnect is when the client/server closes the database. Putting the connect() in Instance() and disconnect() outside would not look nice.

comment:4 Changed 3 years ago by dmisev

What about putting disconnect() in the destructor? That should work, no?

comment:5 Changed 3 years ago by klipskoch

That should theoretically work. I will have to test it.

comment:6 Changed 3 years ago by dmisev

A client like rasimport should not need to do this in any case: it is not even linked with the server's libs, so it could not call these functions at all.

I believe the problem with rasimport is a different one, and the ticket is missing an email that Alex sent to the mailing list. See below:

The issue I reported only seems to come up when the rasdaman data base name deviates from its default name RASBASE, as it was the case when I first observed the tile lock error. Using the default data base name RASBASE, I don't get the error with the sample queries attached to the ticket (and rasimport/raserase work just fine).

comment:7 Changed 3 years ago by dmisev

It looks like in your connect you assume a default RASBASE database, but it can be different - you can get the name from the configuration.

comment:8 Changed 3 years ago by klipskoch

Yes, you are right. I will change that. Thanks for pointing this out.

comment:9 Changed 3 years ago by dmisev

But it would still be good for the connect/disconnect to reside within the LockManager class, otherwise we always need to remember to do this manually.

comment:10 Changed 3 years ago by klipskoch

I will try that as well and let you know.

comment:11 Changed 3 years ago by klipskoch

I submitted a patch which moves connect() to Instance() and disconnect() to the destructor.
It also solves the hardcoded database name problem. I extracted the dbId, dbUser and dbPasswd
from the configuration.
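
The approach described in this comment can be sketched roughly as follows. This is not the submitted patch: SketchLockManager, DbConfig, Shutdown() and openConnections() are hypothetical names used only to make the idea testable, and no real database is contacted. The sketch assumes single-threaded use.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical holder for the values read from the configuration
// (the thread names dbId, dbUser and dbPasswd).
struct DbConfig {
    std::string dbId;
    std::string dbUser;
    std::string dbPasswd;
};

class SketchLockManager {
public:
    // The first call creates the instance and connects using the configured
    // database name; later calls return the same, already connected instance.
    static SketchLockManager* Instance(const DbConfig& cfg) {
        if (!instance_) {
            instance_.reset(new SketchLockManager(cfg));
        }
        return instance_.get();
    }

    // Hypothetical helper: destroys the instance, which runs the destructor.
    static void Shutdown() { instance_.reset(); }

    static int openConnections() { return openConnections_; }

    // disconnect() lives in the destructor, so callers never call it by hand.
    ~SketchLockManager() { disconnect(); }

    const std::string& database() const { return database_; }
    bool connected() const { return connected_; }

private:
    explicit SketchLockManager(const DbConfig& cfg) { connect(cfg); }
    SketchLockManager(const SketchLockManager&) = delete;
    SketchLockManager& operator=(const SketchLockManager&) = delete;

    void connect(const DbConfig& cfg) {
        // The database name comes from the configuration, not a hardcoded
        // "RASBASE". A real connect would also pass cfg.dbUser and cfg.dbPasswd.
        assert(!cfg.dbId.empty());
        database_ = cfg.dbId;
        connected_ = true;
        ++openConnections_;
    }

    void disconnect() {
        if (connected_) {
            connected_ = false;
            --openConnections_;
        }
    }

    std::string database_;
    bool connected_ = false;
    static std::unique_ptr<SketchLockManager> instance_;
    static int openConnections_;
};

std::unique_ptr<SketchLockManager> SketchLockManager::instance_;
int SketchLockManager::openConnections_ = 0;
```

Note that in this sketch the configuration is only read on the first Instance() call; a later call with a different database name returns the existing connection unchanged.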

comment:12 Changed 3 years ago by dmisev

Is it possible to also extract the dbPort? Piero has reported that it doesn't work with his database which is at port 5433, instead of the default one 5432.

comment:13 Changed 3 years ago by klipskoch

From the configuration I can extract only dbConnectionId. This seems to contain the database name, but no port.

comment:14 Changed 3 years ago by dmisev

  • Cc pcampalani added

comment:15 Changed 3 years ago by dmisev

  • Priority changed from blocker to major

Lowering priority, non-default port issue still to be fixed.

comment:16 Changed 3 years ago by dmisev

  • Resolution set to fixed
  • Status changed from new to closed