Use virtual camera with real depth image

General Information

  • Product: N46-802-BL
  • Serial Number: 201234
  • Ensenso SDK Version: 4.3.1037
  • Operating System: Windows

Problem / Question

I want to create a pipeline in which several Raspberry Pis read raw images along with the camera parameters and send that information to a central PC. On that PC I would like to process the data to generate point clouds in Python. My plan was to create virtual cameras and pass the information from the Raspberry Pis to the main PC over the network. However, I cannot process anything with NxLib without having a camera created. Furthermore, I cannot set the camera parameters because that information is protected. Could you help me solve this?

Thank you in advance.

Hi @ptavares,

why do you want to use the Raspberry Pis for image capture in the first place, instead of connecting the cameras to the central processing PC directly? I imagine this would only be necessary if either

  • the data connection to the Pis is different than to the cameras, or
  • you are using many cameras and are doing some filtering directly on the Pis to reduce the network and computational load on the central PC.

Unless it is really necessary for your application, I would recommend connecting the cameras to the PC directly.

If you need to go through the Pis, you should be using file cameras instead of virtual cameras:

  • Virtual cameras are designed for simulation: they render synthetic scenes from 3D objects (STL/PLY files) to help you test algorithms without physical hardware. They don’t accept external raw images, which is why you ran into issues.
  • File cameras are designed for exactly your use case: they allow you to replay saved raw images and calibration data through the NxLib pipeline, behaving like a regular hardware camera.

You can try:

1. On the Raspberry Pi:

  • Capture the raw images and save them using the SaveFileCamera command.
  • Transfer the resulting .enscam file(s) to the central PC.

2. On the central PC:

  • Receive the .enscam file.
  • Create a file camera for it using the CreateFileCamera command.
  • Operate the created file camera like a normal camera:
    • Open the camera to use it.
    • Run the usual Capture, ComputeDisparityMap, ComputePointMap pipeline.
  • Delete the camera using the DeleteCamera command once you are done with it. This unregisters the camera from the NxLib; it does not delete the underlying .enscam file.
  • Delete the .enscam file.
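
The transfer step between the Pi and the PC is not part of NxLib, so it has to be written separately. The following is a minimal sketch, assuming a plain TCP connection and a made-up framing (a 4-byte name length and an 8-byte payload length, followed by the file name and the file contents). The function names `send_file` and `recv_file` are hypothetical and only serve as illustration:

```python
import socket
import struct
from pathlib import Path


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(min(65536, n - len(buf)))
        if not chunk:
            raise ConnectionError("connection closed mid-transfer")
        buf.extend(chunk)
    return bytes(buf)


def send_file(sock: socket.socket, path: Path) -> None:
    """Pi side: send one saved .enscam file, prefixed by a small header."""
    name = path.name.encode("utf-8")
    data = path.read_bytes()
    # Header (assumed format): 4-byte name length, 8-byte payload length, big-endian.
    sock.sendall(struct.pack("!IQ", len(name), len(data)))
    sock.sendall(name)
    sock.sendall(data)


def recv_file(sock: socket.socket, target_dir: Path) -> Path:
    """PC side: receive one file and return the path it was stored under."""
    name_len, data_len = struct.unpack("!IQ", _recv_exact(sock, 12))
    # Keep only the final path component so a sender cannot pick the directory.
    name = Path(_recv_exact(sock, name_len).decode("utf-8")).name
    data = _recv_exact(sock, data_len)
    out = target_dir / name
    out.write_bytes(data)
    return out
```

The path returned by `recv_file` is what would then be handed to the file camera creation step on the PC. Any other transport (scp, a shared network folder, a message queue) would work just as well.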

Please check out the linked documentation. If you have any more questions after that, please let me know.

Regards,
Raphael

One of the reasons I am capturing images on a Raspberry Pi is that the trigger is based on a distance sensor connected to it. My intention is to build a dataset that only captures when the subject is between x and y cm from the sensor. However, there may be more than one camera, so my intention would also be to reduce the load on the PC, as you mentioned.
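
The distance-window trigger described here could look roughly like the sketch below. It is only an illustration: the class name `RangeTrigger` is hypothetical, and firing once per entry into the window (rather than on every reading inside it) is an assumption, not something stated in this thread.

```python
from dataclasses import dataclass


@dataclass
class RangeTrigger:
    """Fire once each time the measured distance enters [min_cm, max_cm]."""

    min_cm: float
    max_cm: float
    _inside: bool = False  # whether the previous reading was inside the window

    def update(self, distance_cm: float) -> bool:
        """Feed one sensor reading; return True if a capture should be triggered."""
        inside = self.min_cm <= distance_cm <= self.max_cm
        # Assumption: trigger only on the transition from outside to inside.
        fire = inside and not self._inside
        self._inside = inside
        return fire
```

Calling `update()` with every new sensor reading returns `True` only on the first reading after the subject enters the window, so a subject standing still does not produce a burst of captures.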

At first I tried using the Raspberry Pi along with the SDK, but when rectifying the images I got a CUDA-related error. As far as I know, this is because the Pi does not have CUDA. For that reason, I tried disabling CUDA in the tree through CUDA > Available. Still, exception.get_command_error_symbol() returned “CUDA”. If I can fix this error, it may solve my issues as well, because I could then process each camera's data on its respective Pi.

Because I got this error, I tried capturing the raw images on the Pi and processing them on the PC to take advantage of the SDK's 3D reconstruction features, because that would be easier to implement and the results would likely be better. However, I cannot write the images to [ITM_IMAGES][ITM_RAW][ITM…].

The file camera, I think, would only allow me to capture that one image from the ensparam file, which would require that I constantly load new camera parameters into the program.

If I were able to fix the CUDA issue, I would be able to test one approach, but it would be more interesting to understand when it is more efficient to process everything on the PC instead of having the Pis do all the work and send only the final result to the PC.

Hopefully this makes sense to you, if it does not please call me out. If you usually recommend approaching things differently, which is likely, I would be glad to hear it.

Thank you for your time.

Just as a side note, I am starting to get somewhere using OpenCV, although I am clearly doing something wrong there as well. It is not related to Ensenso's hardware or software, so I won't bother you with it.

In that case I would recommend connecting the cameras directly to the PC and sending the distance sensor readings from the Pis to the PC, so that it can trigger the relevant camera(s). This assumes your application can tolerate the additional latency, which would likely still be a lot lower than with the file camera approach.
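
As a rough sketch of that arrangement, each Pi could send its readings as small JSON messages and the PC could decide which camera(s) to trigger. Everything here is an assumption made for illustration: the message fields, the function names, and the mapping from a Pi identifier to camera serial numbers. The actual trigger call through NxLib is left out.

```python
import json
from typing import Dict, List


def encode_reading(pi_id: str, distance_cm: float, timestamp: float) -> bytes:
    """Pi side: pack one distance reading into a small JSON message."""
    return json.dumps({"pi": pi_id, "cm": distance_cm, "t": timestamp}).encode("utf-8")


def cameras_to_trigger(payload: bytes, cameras_by_pi: Dict[str, List[str]],
                       min_cm: float, max_cm: float) -> List[str]:
    """PC side: return the camera serial numbers to trigger for this reading."""
    msg = json.loads(payload.decode("utf-8"))
    # Ignore readings outside the distance window of interest.
    if not (min_cm <= float(msg["cm"]) <= max_cm):
        return []
    # Unknown senders map to no cameras.
    return cameras_by_pi.get(msg["pi"], [])
```

The encoded message is small enough to send as a single UDP datagram (`socket.sendto`), which keeps the added latency low.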

If latency is a big concern, you also have the option to hardware trigger the cameras from the Pis with some additional hardware overhead.

If you intend to send all of the images to the PC, without any additional selection based on image processing on the Pis, I expect you are not taking much networking or computational load off the PC.

Getting 3D data directly on a Pi is currently not possible. Our Arm64 SDK is targeted specifically at Nvidia Jetson and requires CUDA for all expensive computations. This cannot currently be disabled.

By the way, the correct node to disable CUDA by default for all commands is CUDA / Enabled. CUDA / Available, the node you mentioned, is a read-only node indicating whether CUDA is available at all.

Depending on your application, you could either capture each FlexView sequence into a separate file and send and process it directly, or store multiple sequences and process them on the PC all at once. Both would add considerable overhead for encoding and decoding the images, so latency would not be great either way, and both would require a small amount of code for managing the parameters.

Since the Raspberry Pi is not one of our targeted platforms, I do not have any performance data that I could base a recommendation on. Processing on the Pi would likely be very slow, as we do not have optimized implementations of our algorithms for Arm Neon.

I will check tomorrow if there is anything more we can provide you with to help you. In the meantime, please consider if any of the alternate approaches I suggested might work for you.

Thank you Raphael for your answers.

I wrote the node from memory and got it wrong, sorry. What I meant was CUDA / Enabled, as you mentioned.

I believe most of it is clear for me now. It will require some refactoring to be able to use your sdk to the fullest.

The one thing I did not understand is what you mean by FlexView sequences. I might be wrong, but from what I understand, FlexView is, in broad terms, the ability to use up to 16 image pairs to perform a better reconstruction of the environment. However, I do not understand what you mean by this:

Does that mean that there is a way to save a “FlexView sequence” file and process it on another device, or am I just off?

The performance data would be for us to understand which would be more computationally cost-efficient, but we would be more than happy doing that part ourselves. No need to worry about that. At least when it comes to us.

No worries. I just wanted the correct information here for the LLMs.

You are correct: FlexView is our term for the ability to capture a sequence of multiple images and use all images in the sequence together in stereo reconstruction. I used the term to mean the sequence of images resulting from a single Capture command or Trigger/Retrieve command pair, which also covers a single left/right image pair as a FlexView sequence of length 1.

The SaveFileCamera command always saves the entire FlexView sequence. When you specify the same Path in multiple executions of the SaveFileCamera command, all sequences are stored together. A file camera created from that Path will then sequentially load a new sequence from all the saved sequences when executing Capture or Trigger/Retrieve on the file camera. Path can either be a directory or an .enscam file. By copying the resulting directory/file to another PC, you can run the NxLib processing pipeline on that PC.
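
The behavior described above can be pictured with a toy model. This is plain Python and not NxLib code: the class and method names are hypothetical stand-ins, image pairs are represented by strings, and what happens once all saved sequences have been consumed is not stated in this thread, so the model simply raises an error at that point.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

ImagePair = Tuple[str, str]  # stand-in for one (left, right) raw image pair


class FileCameraModel:
    """Toy file camera: each capture() returns the next saved sequence."""

    def __init__(self, sequences: List[List[ImagePair]]) -> None:
        self._sequences = sequences
        self._next = 0

    def capture(self) -> List[ImagePair]:
        # Assumption: running past the last saved sequence is an error here.
        if self._next >= len(self._sequences):
            raise IndexError("no more saved sequences")
        sequence = self._sequences[self._next]
        self._next += 1
        return sequence


class FileCameraStore:
    """Toy stand-in for saved file cameras, keyed by their Path."""

    def __init__(self) -> None:
        self._saved: Dict[str, List[List[ImagePair]]] = defaultdict(list)

    def save_file_camera(self, path: str, sequence: List[ImagePair]) -> None:
        # Saving to the same path again appends another whole sequence.
        self._saved[path].append(list(sequence))

    def create_file_camera(self, path: str) -> FileCameraModel:
        return FileCameraModel(self._saved[path])
```

Saving a sequence of length 1 and then a sequence of length 2 to the same path, and creating a file camera from that path, yields the first sequence on the first `capture()` and the second sequence on the next one.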

I hope that clears everything up. If not, please just let me know.