Skip to content

fix(udr): Add cleanup of UDR objects from Engine on attachment disconnect#8992

Open
TreeHunter9 wants to merge 1 commit intoFirebirdSQL:v5.0-releasefrom
TreeHunter9:v5.0-release-udr-cleanup
Open

fix(udr): Add cleanup of UDR objects from Engine on attachment disconnect#8992
TreeHunter9 wants to merge 1 commit intoFirebirdSQL:v5.0-releasefrom
TreeHunter9:v5.0-release-udr-cleanup

Conversation

@TreeHunter9
Copy link
Copy Markdown
Contributor

There is a problem with UDR objects cleanup that can lead to server crash.

Detailed description of the Issue

As I understand:

  1. The Engine is the representation of plugin that is loaded lazy when a UDR function (or any other UDR object) is called;
  2. After Engine is created, it creates UdrPluginImpl which is basically loads UDR factory objects from plugin's dynamic library;
  3. Then, using these factories, we create an instance of the UDR function (SharedFunction) and store a pointer to this function inside Engine. The actual owner of SharedFunction is attachment;
  4. When SharedFunction is executed, it basically creates a copy of itself and store it in children map where IExternalContext* is a key. Upon the next execution, the already created copy will be taken from children map;
  5. IExternalContext is created per attachment and per engine;

So, after execution of UDR function, there will be one main SharedFunction and one child, but on attachment disconnect (Engine::closeAttachment()), only the child will be removed, while the SharedFunction* will still be stored in Engine. The attachment is dead, so SharedFunction too, and now we have a dangling pointer inside SortedArray<class SharedFunction*> functions;. Referencing the dangling pointer itself doesn't cause the server to crash, but when a new object will be placed at address to which pointer is pointing to, this will become a problem.

I found the way to reproduce the crash by using 2 UDR modules. I'm using http_client_udr and udf_compat (comes with firebird), but I'm guessing that any UDR will do the job.

  1. Execute udr.sh like this ./udr.sh "/home/treehunter/Firebird_5/firebird/gen/Release/firebird" 100 10, it will create database in root folder, populate it with UDR functions and procedures, then it will wait;
  2. Connect to created database, execute some created UDR functions to prevent the Engine from being unloaded when the script ends;
  3. Resume execution of script, it will create 10 workers that will connect to database and execute some UDR functions + procedures and then disconnect, this will be repeated 100 times. We do it to populate Engine with dangling pointers after attachment disconnect;
  4. Once the script has finished running, use attachment from step 2. to execute select div(2, 1) from rdb$database. This is necessary to load the new Engine from udf_compat;
  5. Now disconnect from database. Server will crash trying to close the attachment. Stacktrace is attached.
stacktrace
#0  Firebird::SortedVector<Firebird::BePlusTree<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> >*, Firebird::IExternalContext*, Firebird::FirstObjectKey<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> > >, Firebird::DefaultComparator<Firebird::IExternalContext*> >::NodePtr, 375u, Firebird::IExternalContext*, Firebird::BePlusTree<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> >*, Firebird::IExternalContext*, Firebird::FirstObjectKey<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> > >, Firebird::DefaultComparator<Firebird::IExternalContext*> >::NodeList, Firebird::DefaultComparator<Firebird::IExternalContext*> >::find (this=0xcccccccccccccccc, item=@0x7fffd79fdaa8: 0x7fffd41974d8, pos=@0x7fffd79fd9f8: 3838171680) at src/include/../common/classes/vector.h:200
#1  Firebird::BePlusTree<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> >*, Firebird::IExternalContext*, Firebird::FirstObjectKey<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> > >, Firebird::DefaultComparator<Firebird::IExternalContext*> >::ConstAccessor::locate (this=0x7fffd79fda70, lt=Firebird::locEqual, key=@0x7fffd79fdaa8: 0x7fffd41974d8) at src/include/../common/classes/tree.h:389
#2  Firebird::BePlusTree<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> >*, Firebird::IExternalContext*, Firebird::FirstObjectKey<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> > >, Firebird::DefaultComparator<Firebird::IExternalContext*> >::ConstAccessor::locate (this=0x7fffd79fda70, key=@0x7fffd79fdaa8: 0x7fffd41974d8) at src/include/../common/classes/tree.h:373
#3  Firebird::GenericMap<Firebird::Pair<Firebird::NonPooled<Firebird::IExternalContext*, Firebird::IExternalProcedure*> >, Firebird::DefaultComparator<Firebird::IExternalContext*> >::get (this=0x7fffe4cfe3b8, key=@0x7fffd79fdaa8: 0x7fffd41974d8, value=@0x7fffd79fdac8: 0x7fffc80075b8) at src/include/../common/classes/GenericMap.h:265
#4  Firebird::Udr::Engine::closeAttachment (this=0x7fffe4cf9f40, context=0x7fffd41974d8) at src/plugins/udr_engine/UdrEngine.cpp:671
#5  Firebird::IExternalEngineBaseImpl<Firebird::Udr::Engine, Firebird::ThrowStatusWrapper, Firebird::IPluginBaseImpl<Firebird::Udr::Engine, Firebird::ThrowStatusWrapper, Firebird::Inherit<Firebird::IReferenceCountedImpl<Firebird::Udr::Engine, Firebird::ThrowStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Firebird::Udr::Engine, Firebird::ThrowStatusWrapper, Firebird::Inherit<Firebird::IExternalEngine> > > > > > >::cloopcloseAttachmentDispatcher (self=0x7fffe4cf9f48, status=0x7fffd79fdd68, context=0x7fffd41974d8) at src/include/firebird/IdlFbInterfaces.h:15891
#6  Firebird::IExternalEngine::closeAttachment<Firebird::CheckStatusWrapper> (this=0x7fffe4cf9f48, status=0x7fffd79fdd60, context=0x7fffd41974d8) at src/include/firebird/IdlFbInterfaces.h:4639
#7  Jrd::ExtEngineManager::closeAttachment (this=0x7ffff61c3760, tdbb=0x7fffd79fe538, attachment=0x7fffe407ab50) at src/jrd/ExtEngineManager.cpp:1295
#8  release_attachment (tdbb=0x7fffd79fe538, attachment=0x7fffe407ab50, dropGuard=0x0) at src/jrd/jrd.cpp:7733
#9  purge_attachment (tdbb=0x7fffd79fe538, sAtt=0x7ffff61d7c90, flags=2) at src/jrd/jrd.cpp:8504
#10 Jrd::JAttachment::freeEngineData (this=0x7ffff61d7ea0, user_status=0x7fffd79fe6f0, forceFree=false) at src/jrd/jrd.cpp:3367
#11 Jrd::JAttachment::internalDetach (this=0x7ffff61d7ea0, user_status=0x7fffd79fe6f0) at src/jrd/jrd.cpp:3304
#12 Jrd::JAttachment::detach (this=0x7ffff61d7ea0, user_status=0x7fffd79fe6f0) at src/jrd/jrd.cpp:3316
#13 Firebird::IAttachmentBaseImpl<Jrd::JAttachment, Firebird::CheckStatusWrapper, Firebird::IReferenceCountedImpl<Jrd::JAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Jrd::JAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IAttachment> > > > >::cloopdetachDispatcher (self=0x7ffff61d7ea8, status=0x7fffd79fe948) at src/include/firebird/IdlFbInterfaces.h:12157
#14 Firebird::IAttachment::detach<Firebird::CheckStatusWrapper> (this=0x7ffff61d7ea8, status=0x7fffd79fe940) at src/include/firebird/IdlFbInterfaces.h:2814
#15 operator() (__closure=0x7fffd79fe8c0) at src/yvalve/why.cpp:6080
#16 std::__invoke_impl<void, Why::YAttachment::detach(Firebird::CheckStatusWrapper*)::<lambda()>&>(std::__invoke_other, struct {...} &) (__f=...) at /usr/include/c++/15.2.1/bits/invoke.h:63
#17 std::__invoke_r<void, Why::YAttachment::detach(Firebird::CheckStatusWrapper*)::<lambda()>&>(struct {...} &) (__fn=...) at /usr/include/c++/15.2.1/bits/invoke.h:113
#18 std::_Function_handler<void(), Why::YAttachment::detach(Firebird::CheckStatusWrapper*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/15.2.1/bits/std_function.h:292
#19 std::function<void()>::operator() (this=0x7fffd79fe8c0) at /usr/include/c++/15.2.1/bits/std_function.h:593
#20 Why::done<Why::YAttachment> (status=0x7fffd79fe940, entry=..., y=0x7ffff7961e50, newClose=..., oldClose=...) at src/yvalve/why.cpp:1354
#21 Why::YAttachment::detach (this=0x7ffff7961e50, status=0x7fffd79fe940) at src/yvalve/why.cpp:6079
#22 Firebird::IAttachmentBaseImpl<Why::YAttachment, Firebird::CheckStatusWrapper, Firebird::IReferenceCountedImpl<Why::YAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IVersionedImpl<Why::YAttachment, Firebird::CheckStatusWrapper, Firebird::Inherit<Firebird::IAttachment> > > > >::cloopdetachDispatcher (self=0x7ffff7961e58, status=0x7fffd79fe9e8) at src/include/firebird/IdlFbInterfaces.h:12157
#23 Firebird::IAttachment::detach<Firebird::CheckStatusWrapper> (this=0x7ffff7961e58, status=0x7fffd79fe9e0) at src/include/firebird/IdlFbInterfaces.h:2814
#24 rem_port::end_database (this=0x7fffe4c8b3d0, sendL=0x7fffce0fdc68) at src/remote/server/server.cpp:3250
#25 process_packet (port=0x7fffe4c8b3d0, sendL=0x7fffce0fdc68, receive=0x7fffce0fe238, result=0x7fffd79fec98) at src/remote/server/server.cpp:5160
#26 loopThread () at src/remote/server/server.cpp:6944
#27 (anonymous namespace)::ThreadArgs::run (this=0x7fffd79fedd0) at src/common/ThreadStart.cpp:78
#28 (anonymous namespace)::threadStart (arg=0x7ffff7605ca0) at src/common/ThreadStart.cpp:94
#29 start_thread (arg=<optimized out>) at pthread_create.c:454
#30 __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

I have found that half of dangling pointers SharedFunction* inside SortedArray<class SharedFunction*> functions is pointing to SharedProcedure, but no crashes occur when iterating over these pointers. I suspect this is because SharedFunction and SharedProcedure are binary compatible. However, when we adding new UDR engine to the game, the UdrPluginImpl can be created at the address where SharedFunction was stored, and this would certainly cause a crash.

Proposed fix

Track the created main UDR objects that is stored inside Engine by attaching them to attachment, so that we can track them and remove them from Engine when the attachment is disconnecting.

Behavior on v6

The metacache has drastically change the logic of handling main UDR objects, they no longer belong to connections. It means that after creation of main SharedFunction it will be shared between connections, and execution from different attachment will now create a child. The child is normally removed on attachment disconnect, so there is no dangling pointer problem as I understand.
For example, my test case from above do not reproduce the crash, and if look at SortedArray<class SharedFunction*> functions inside Engine, on v5 there is around 200 elements (dangling pointers) due to creation of SharedFuncion per attachment, but on v6 there is 2 elements - these are functions that are being called during the test (every new attachment creates only the child of these main SharedFunctions, which are normally removed when attachment disconnects).
I'm not very familiar with the new matacache code, but at first glance, there is no bug in v6. Maybe Alex can confirm this.

@AlexPeshkoff
Copy link
Copy Markdown
Member

The only detail in your logic does not let me answer definitely 'yes'.

You say that depending upon type of external engine you get crush or not. This is explained by creation of binary compatible or not object in RAM pointed to by dangling pointer. But pay attention - pre-6 external routines are created in attachment pool, which is destroyed together with attachment. If it was not for destroyed pool I could agree: possibility that object of same size is created in same block rather high. In a test we can expect same sequence of RAM allocation for old and new attachment making pool to have same blocks map. But in real life that hardly works, i.e. looks like external engine should segfault much more often.

In all other aspects specially re v6 I agree with you - external routines were moved to database's pool, and no dangling pointers should arrive.

@TreeHunter9
Copy link
Copy Markdown
Contributor Author

But pay attention - pre-6 external routines are created in attachment pool

Jrd::ExtEngineManager::ExtRoutines yes, but not the Udr::Shared* objects, they are created using default pool, for example, inside Engine::makeFunction().

@AlexPeshkoff
Copy link
Copy Markdown
Member

But pay attention - pre-6 external routines are created in attachment pool

Jrd::ExtEngineManager::ExtRoutines yes, but not the Udr::Shared* objects, they are created using default pool, for example, inside Engine::makeFunction().

Yes - now I've got what objects do you talk about. They are really in default pool, moreover - in plugin's default pool. In this case no more questions remain.

@asfernandes asfernandes self-requested a review April 16, 2026 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants