Hello! I'm in search of advice

I recently bought an Intel RealSense D435i for a project of mine. I want to use it, a DSLR, and Open3D as the basis for creating scenes, which I then hope to augment using differentiable rendering.

Right now I am experimenting with the Python code examples (I am new to coding in general). I have a vision for a pipeline which I want to share with you.

Right now the streaming options contain too much data for me! At present I think the only data I need to stream continuously is the IMU odometry.

So the idea is that I have a script for capturing single depth and colour frames from the D435i and the DSLR (which should be aligned), and then use the odometry from the IMU to ‘estimate’ the pose for those frames. From there, I would hope registration would be pretty easy and very fast!
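A rough sketch of the pre-alignment step I have in mind, in plain NumPy — the rotation and translation here are made-up stand-ins for whatever the IMU integration would actually produce:

```python
import numpy as np

# Sketch of the proposed pre-alignment: pack an IMU-estimated rotation
# and translation into a homogeneous 4x4 pose matrix and use it to move
# a captured frame's points into the scene frame before registration.

def pose_matrix(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Made-up IMU estimate: 90 degrees about z, 10 cm along x.
R_imu = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
t_imu = np.array([0.1, 0.0, 0.0])
T = pose_matrix(R_imu, t_imu)

# Apply the rough pose to a couple of points from the "captured" frame.
points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.5]])
homog = np.c_[points, np.ones(len(points))]
pre_aligned = (homog @ T.T)[:, :3]
print(pre_aligned)   # points moved into the scene frame
```

The registration step would then only need to clean up the residual error left by this rough guess.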

I’m guessing that what I describe are the basics of most SLAM applications, but I am not interested in it being real time. I am interested in accurate scenes with very little noise.

P.S. Any help and advice would be greatly appreciated.

Usually, IMU odometry is very unreliable due to accumulated noise, especially for the cheap sensors on the D435i.
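To see why, here is a minimal sketch of dead reckoning from a biased accelerometer. The 0.05 m/s² bias and 200 Hz rate are made-up but plausible figures for a cheap MEMS IMU; a real D435i stream would supply timestamped samples instead:

```python
import numpy as np

# Double-integrate an accelerometer stream that carries a small
# constant bias while the camera is actually held still.

dt = 0.005                         # 200 Hz accelerometer
t = np.arange(0.0, 1.0, dt)       # one second between captured frames
true_accel = np.zeros_like(t)     # camera not moving at all
bias = 0.05                       # m/s^2, constant sensor bias
measured = true_accel + bias

velocity = np.cumsum(measured) * dt   # first integration
position = np.cumsum(velocity) * dt   # second integration

# The bias grows quadratically into position: ~2.5 cm after one second,
# and it only gets worse over a longer trajectory.
print(f"drift after 1 s: {position[-1]:.4f} m")
```

And this ignores gyro noise, scale errors, and vibration, which make the real picture worse.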

Recommended approaches are RGB-D odometry (i.e. aligning RGB-D images) or visual-inertial odometry (i.e. combining both image and IMU data). However, the estimated frame poses will still drift (again due to accumulated noise, but at a smaller magnitude). Therefore we need some global constraints to correct the error. Typically, if you revisit a place after a time interval (we call this loop closure), you expect the new frame's pose to be close to the earlier frame's. This intuitive constraint helps reduce drift by forcing these frames to align.
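The drift-plus-loop-closure idea can be sketched as a toy 1-D pose graph. Real systems optimise full 6-DoF poses, but the correction mechanism is the same; all numbers here are invented for illustration:

```python
import numpy as np

# Five poses along a line. Odometry edges between consecutive poses all
# over-estimate the true 1.0 m step; one loop-closure edge independently
# measures the offset between the first and last pose.

odometry = [1.1, 1.1, 1.1, 1.1]     # drifting step measurements
loop_closure = (0, 4, 4.0)          # revisit: an edge saying x4 - x0 = 4.0

# Dead reckoning (just accumulating odometry) ends 0.4 m off.
dead_reckoned = np.concatenate([[0.0], np.cumsum(odometry)])

# Least squares over x1..x4 (x0 anchored at 0); every edge (i, j, d)
# contributes a residual (x_j - x_i - d).
edges = [(i, i + 1, d) for i, d in enumerate(odometry)] + [loop_closure]
A = np.zeros((len(edges), 4))
b = np.zeros(len(edges))
for row, (i, j, d) in enumerate(edges):
    if i > 0:
        A[row, i - 1] = -1.0
    A[row, j - 1] = 1.0
    b[row] = d
x, *_ = np.linalg.lstsq(A, b, rcond=None)
optimized = np.concatenate([[0.0], x])

print("dead reckoned:", dead_reckoned)   # ends at 4.4
print("optimised:   ", optimized)        # pulled back towards 4.0
```

The single loop-closure edge spreads the accumulated error across all the odometry steps instead of letting it pile up at the end.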

All these functions are implemented in our reconstruction system (though we do not support visual-inertial odometry). It would be best if you follow the tutorial, collect some data (with the D435i only, without the DSLR), and see how it works. It is a bit slow, but it is a state-of-the-art system with very accurate poses.

I’m a little sad to find out that they used such a cheap sensor… maybe it will be possible to replace it :thinking: I was still hoping that the IMU data would give good enough poses for global registration to work, or at least well enough to use ICP.
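For what it's worth, a rough pose only needs to land inside ICP's convergence basin. Here is a toy point-to-point ICP in plain NumPy — not Open3D's implementation (with Open3D you would pass the rough pose as the `init` argument of `registration_icp`), and the clouds and poses are invented:

```python
import numpy as np

def best_rigid(src, dst):
    """Closed-form rigid alignment (Kabsch/SVD) given correspondences."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

# Reference cloud: a 6x6x6 grid, standing in for a registered fragment.
axis = np.linspace(-1.0, 1.0, 6)
target = np.array(np.meshgrid(axis, axis, axis)).reshape(3, -1).T

# Ground truth: a small rotation about z plus a small translation,
# i.e. the kind of residual error a rough IMU-based guess might leave.
theta = np.deg2rad(3.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.04, -0.02, 0.01])
source = target @ R_true.T + t_true          # the "new" frame

# ICP loop: alternate nearest-neighbour matching and closed-form alignment.
moved = source.copy()
for _ in range(10):
    d2 = ((moved[:, None, :] - target[None, :, :]) ** 2).sum(-1)
    matched = target[d2.argmin(1)]           # brute force, fine for a toy cloud
    R_step, t_step = best_rigid(moved, matched)
    moved = moved @ R_step.T + t_step

rmse = np.sqrt(((moved - target) ** 2).sum(-1).mean())
print(f"post-ICP RMSE: {rmse:.2e}")
```

With a larger initial misalignment the nearest-neighbour matches go wrong and this gets stuck in a local minimum, which is exactly why the pose prior (IMU or otherwise) matters.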

I noticed from my own data collection that loop closure is very important for these algorithms. I have had some pretty interesting reconstructions without implementing it ^^

Is there a specific reason why visual-inertial odometry is not supported?

As you suggest, I am going to work through all these examples, as I’m sure that doing so will answer most of my questions naturally. Btw, I have to commend the effort put into the examples. Thanks to all who are working on this project.

Our library started with RGB-D reconstruction, where the IMU was not considered. At the current stage, we focus more on 3D processing and perception. Since VIO generally falls in the SLAM domain (which overlaps with our interests but is not a dominant factor), the priority of implementing it is relatively low. Personally, I do hope we can touch that part when the community grows!

Feel free to send us feedback when you play with the examples!

I got pretty wrapped up in SLAM applications for this exact reason… but to be honest I really think I wasted a lot of time. O3D is way closer to what I am aiming for, IMO, and apparently more friendly than PCL and OpenCV.

From what I have seen in the examples, I’m pretty sure I could use my methodology. It won’t be pretty, and it will most likely be held together with sellotape, but I’m better off giving it a shot instead of sitting around waiting for someone else to do it. I’ll be sure to report back with any successes :slight_smile: Thanks for the warm welcome.