Combining LIDAR and Camera elements
The final step of the project was to combine the object tracking from the LIDAR with the object identification from the camera. The final code and all of its functions can be downloaded here. The code essentially performs the following functions:
- Tracks multiple objects using the LIDAR, labeling each object as object 1 or object 2.
- Estimates the trajectory of an object while it is obscured from view.
- Corrects any object swapping (caused by objects crossing in front of each other) by calling the camera function, which returns an array of [1 2] or [2 1] depending on the actual order of the objects as seen from the camera (a minimal sketch of this correction step follows the list).
- Plots the corrected object trajectories.
It should be noted that the code in its current form does not successfully identify and track objects in the scene due to time constraints.
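To make the correction step more concrete, the sketch below shows one possible way to relabel the LIDAR tracks using the camera's reported ordering and then plot the corrected trajectories. This is a minimal, hypothetical Python example written for this write-up, not the project's actual code: the function `correct_object_swap`, the synthetic trajectories, and the assumption that the camera reports a left-to-right order of [1 2] or [2 1] are placeholders for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def correct_object_swap(tracks, camera_order):
    """Relabel LIDAR tracks using the order reported by the camera.

    tracks: dict mapping label (1 or 2) -> Nx2 array of (x, y) positions.
    camera_order: [1, 2] if the LIDAR labels already match the camera's
    ordering, [2, 1] if the two objects have been swapped.
    """
    if camera_order == [2, 1]:
        # The camera says the labels are swapped, so exchange them.
        tracks = {1: tracks[2], 2: tracks[1]}
    return tracks

# Example with synthetic trajectories (two objects crossing paths).
t = np.linspace(0, 1, 50)
tracks = {
    1: np.column_stack([t, 1 - t]),   # object 1 moves down and to the right
    2: np.column_stack([t, t]),       # object 2 moves up and to the right
}
camera_order = [2, 1]                  # camera reports the labels were swapped

corrected = correct_object_swap(tracks, camera_order)
for label, xy in corrected.items():
    plt.plot(xy[:, 0], xy[:, 1], label=f"object {label}")
plt.legend()
plt.xlabel("x (m)")
plt.ylabel("y (m)")
plt.show()
```

In the actual project the swap check would be triggered whenever tracks cross, rather than once at the end as in this toy example.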
Closing Remarks
Several lessons were learned over the course of the project. These lessons come from the perspective of the team and are not meant to generalize to every mechatronics project. However, we believe these observations could be useful to fellow project-doers who are undertaking object tracking with LIDARs and object detection with cameras. These lessons are:
- If object identification and tagging are needed (as was the case in this project), research and try to understand the mechanics of each method thoroughly. In retrospect, this seems obvious, but too often we rushed into coding and testing different methods, only to find that a method was not robust enough or not suitable for our application. If we had listed the criteria we needed for image processing up front, we might have been able to select the appropriate method without so much testing.
- Try to code for as general a number of objects as possible. In other words, code for "i" objects rather than just 1 or 2. This is particularly important when interfacing multiple sensors (LIDAR and camera), because the tracking and object-detection methods differ so much between the two codes that even slight inconsistencies when merging them can cause multiple errors (read: headaches).
- Starting by trying to track humans with a camera is a bad idea. Humans have multiple moving parts (arms, legs, shirt), causing the program to think it sees about 16 different moving points entering the scene.
- Multiply the 'estimated project duration' by 5. Now that's the REAL amount of time required to ACTUALLY complete the project.