# Coordinate Systems for Robot Sensor Fusion

CS 480: Robotics & 3D Printing Lecture, Dr. Lawlor

One of the hardest problems in fusing multiple robot sensors into a coherent view of the world is consistently representing their coordinate systems.  For example, if I put a Kinect sensor on a ground robot outdoors, the positions and orientations of both the robot and the Kinect are fairly arbitrary 3D values.  Smashing these down to 2D just can't represent things like tunnels or overpasses, and ignores dangerous things like cliffs.
• You need to represent the positions of the robot and sensors.  Use 3D XYZ vectors for this.  You also need the ability to manipulate them easily: 3D vector arithmetic demo in THREE.js
• You need to represent the orientation of objects in 3D.  Comparison of methods to represent 3D rotation.  I prefer to use 3 vectors for the XYZ axes of each coordinate system; this is equivalent to keeping a 3x3 matrix to represent the object's rotation.
• Is the world's Y axis up, or is Z up?  The answer varies depending on the programmer and file format.
• Making Z up means the X and Y axes are flat like a map.  (John Carmack and I prefer this.)
• Making Y up means the camera's initial Y axis matches the world Y axis.  (Some games and file formats use this.)
• Is that number in robot-local coordinates, or global coordinates?  Kinect depth values are in the Kinect's nonlinear pixel-and-disparity coordinate system; even after converting to XYZ, the point (0,0,0) is centered on the Kinect, not the robot or world origin.
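The vector and three-axis-vector rotation representations above can be sketched in a few lines of plain Python.  (The course demos use THREE.js; this standalone sketch is equivalent in spirit, and the function names here are illustrative, not from any particular library.)

```python
# Minimal sketch: 3D vectors as tuples, rotation as three axis vectors.
# Storing the X, Y, Z axes of a frame is the same as storing the
# columns of a 3x3 rotation matrix.

def add(a, b):    # componentwise vector addition
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(s, a):  # scalar times vector
    return (s * a[0], s * a[1], s * a[2])

def dot(a, b):    # dot product
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def cross(a, b):  # right-handed cross product
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def rotate(x_axis, y_axis, z_axis, p):
    # Rotate local point p into the parent frame: a weighted sum of
    # the frame's axis vectors (i.e., a 3x3 matrix-vector product).
    return add(add(scale(p[0], x_axis), scale(p[1], y_axis)),
               scale(p[2], z_axis))

# Example: a frame rotated 90 degrees about Z, so its X axis points
# along the world's +Y direction.
x_axis = (0.0, 1.0, 0.0)
y_axis = (-1.0, 0.0, 0.0)
z_axis = (0.0, 0.0, 1.0)
print(rotate(x_axis, y_axis, z_axis, (1.0, 0.0, 0.0)))  # -> (0.0, 1.0, 0.0)
```

Note that keeping the axes as explicit vectors makes debugging easy: you can print each axis and check it points where you expect in the world.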
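The up-axis question and the local-versus-global question both come down to small, easy-to-get-wrong transforms.  A hedged sketch, assuming a Z-up world and an illustrative 0.5 m sensor mounting offset (the specific numbers and the -90°-about-X convention for the up-axis swap are assumptions, not from any particular file format):

```python
# Sketch: converting between up-axis conventions, and moving a
# sensor-local point into world coordinates.

def z_up_to_y_up(p):
    # Rotate -90 degrees about X: the old Z (up) becomes the new Y (up).
    x, y, z = p
    return (x, z, -y)

def sensor_to_world(sensor_origin, sensor_axes, local_p):
    # sensor_axes = (x_axis, y_axis, z_axis): the sensor's axes expressed
    # in world coordinates (the columns of a 3x3 rotation matrix).
    # World point = sensor origin + rotation applied to the local point.
    wx = sum(local_p[i] * sensor_axes[i][0] for i in range(3))
    wy = sum(local_p[i] * sensor_axes[i][1] for i in range(3))
    wz = sum(local_p[i] * sensor_axes[i][2] for i in range(3))
    return (sensor_origin[0] + wx,
            sensor_origin[1] + wy,
            sensor_origin[2] + wz)

# Hypothetical mount: a Kinect 0.5 m up the robot's Z axis, with its
# axes aligned to the world's (identity rotation).
kinect_origin = (0.0, 0.0, 0.5)
kinect_axes = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
# A depth return 2 m out along the Kinect's local Y axis:
print(sensor_to_world(kinect_origin, kinect_axes, (0.0, 2.0, 0.0)))  # -> (0.0, 2.0, 0.5)
```

Forgetting the `sensor_origin` offset here is exactly the (0,0,0)-is-the-Kinect bug described above.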
You can even use a 4x4 homogeneous matrix to represent both position and orientation.  These even compose, so you can incrementally compute the world-to-tool coordinate system by multiplying the world-to-robot, robot-to-arm, and arm-to-tool coordinate system offsets.

Coordinate system malfunctions are incredibly common when setting up a new robot, or when doing any sort of sensor filtering or combination.  Typical problems: the sensor data arrives rotated 90 or 180 degrees from reality; the sensor data is the mirror image of what it should be; or everything works fine until the robot moves, and the sensor data is then projected at some new and invalid location.