
Patents

Precise vehicle pose estimation is essential for the effective operation of vehicle marshaling systems within assembly plants, ensuring the proper navigation of vehicles to various fitting junctions. This is typically enabled by 3D pose detectors, which require extensively labeled training data, a process that is both costly and labor-intensive. In contrast, 2D keypoint detection offers a simpler method for obtaining pixel-coordinate landmarks but does not provide the necessary 3D poses directly. This innovation addresses the challenge by using 2D keypoints to generate accurate 3D vehicle poses, which are then used to train a pose estimator network. Additionally, the invention serves as an auto-corrector for 3D labels provided by human annotators. The invention combines vehicle physical dimensions from CAD models with the detected 2D landmarks to formulate a nonlinear optimization problem, which is solved using the Levenberg-Marquardt algorithm. This solution saves more than $500k that Ford Motor Company spends on obtaining 3D vehicle pose annotations.
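The core idea can be sketched as a reprojection-error minimization: given known 3D landmark positions from the CAD model and their detected 2D pixel locations, solve for the 6-DoF pose that best aligns the two. Below is a minimal, self-contained illustration (not the filed implementation) using `scipy.optimize.least_squares` with `method="lm"` for Levenberg-Marquardt; the landmark coordinates, intrinsics, and initial guess are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def reprojection_residuals(params, pts_3d, pts_2d, K):
    """Residuals between detected 2D keypoints and projected CAD landmarks.

    params: [rx, ry, rz, tx, ty, tz] -- axis-angle rotation and translation.
    """
    R = Rotation.from_rotvec(params[:3]).as_matrix()
    t = params[3:6]
    cam_pts = pts_3d @ R.T + t         # transform CAD points into camera frame
    proj = cam_pts @ K.T               # apply pinhole intrinsics
    proj = proj[:, :2] / proj[:, 2:3]  # perspective divide -> pixel coords
    return (proj - pts_2d).ravel()


# Hypothetical inputs: four CAD landmarks (metres, vehicle frame) and
# hypothetical camera intrinsics.
pts_3d = np.array([[ 2.1,  0.9, 0.4],   # e.g. front-left wheel centre
                   [ 2.1, -0.9, 0.4],
                   [-2.1,  0.9, 0.4],
                   [-2.1, -0.9, 0.4]])
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0,    0.0,   1.0]])

# Synthesize 2D detections from a ground-truth pose so the example runs end to end.
gt = np.array([0.0, 0.3, 0.0, 0.5, 0.2, 10.0])
pts_2d = reprojection_residuals(gt, pts_3d, np.zeros((4, 2)), K).reshape(-1, 2)

x0 = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 8.0])  # rough initial guess
result = least_squares(reprojection_residuals, x0, method="lm",
                       args=(pts_3d, pts_2d, K))
print("estimated pose (rotvec + translation):", result.x)
```

In practice many more landmarks than the six pose parameters would be detected, which keeps the least-squares problem well constrained even with noisy or occluded keypoints.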

The thousands of AVs in Ford’s fleet in any given city need to be housed and maintained, and multi-level garages are one place where this is expected to happen. The vehicles and their sensors need to be checked and calibrated, and their compute and software stacks updated. At any given time, hundreds of vehicles are expected to be operating in any garage. A major bottleneck is localization and mapping within a GPS-denied environment such as an enclosed depot or terminal. A network of downward-looking ceiling cameras and edge devices can be used to route autonomous vehicles through such GPS-denied environments. When many such cameras or edge devices are distributed throughout an environment, a fundamental prerequisite is the extrinsic calibration of each camera to the environment. Manually calibrating thousands of edge devices would require several days of skilled labor and downtime. This invention disclosure describes an automated method to complete the 6 DoF extrinsic calibration of these cameras using a mobile calibration robot. The calibration robot carries an upward-looking fisheye camera and an ArUco marker. The fisheye camera is used to perform visual odometry. This is followed by detection of the ArUco marker from the environment cameras and a machine-learning-based detector that detects the environment cameras from the fisheye camera. The solution was tested on a large network of 40 environment cameras over an 800 sq m area.
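One step of this pipeline can be illustrated with standard OpenCV primitives: an environment camera observes the robot's ArUco marker, PnP recovers the marker pose in the camera frame, and chaining that with the robot's world pose from visual odometry yields the camera's 6-DoF extrinsics. This is a minimal sketch under assumed marker size, dictionary, and intrinsics (requires OpenCV >= 4.7 for the `ArucoDetector` API), not the filed method.

```python
import cv2
import numpy as np


def marker_pose_in_camera(image, K, dist, marker_len=0.3):
    """Detect the ArUco marker and return its 4x4 pose in the camera frame."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_5X5_100)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _ = detector.detectMarkers(image)
    if ids is None:
        return None
    # 3D marker corners in the marker's own frame (z = 0 plane).
    half = marker_len / 2.0
    obj = np.array([[-half,  half, 0], [ half,  half, 0],
                    [ half, -half, 0], [-half, -half, 0]], dtype=np.float32)
    ok, rvec, tvec = cv2.solvePnP(obj, corners[0].reshape(4, 2), K, dist)
    if not ok:
        return None
    T = np.eye(4)
    T[:3, :3], _ = cv2.Rodrigues(rvec)
    T[:3, 3] = tvec.ravel()
    return T  # T_camera_marker: marker frame -> camera frame


def camera_extrinsics(T_world_marker, T_camera_marker):
    """World pose of the environment camera, by chaining the two transforms.

    T_world_marker comes from the robot's visual odometry (the marker is
    rigidly mounted on the robot); the composition gives T_world_camera.
    """
    return T_world_marker @ np.linalg.inv(T_camera_marker)
```

Driving the robot through the facility and repeating this detect-and-chain step at each camera is what removes the need for manual per-device calibration.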

2D-aided 3D Perception Network (Filed, Appl No: 18/489263)

Perception solutions for tasks like 3D object detection, traffic signal/sign detection, and lane detection are crucial for ADAS. Camera-based perception is preferred mainly due to the low cost and reliability of camera sensors. Lately, vision-based Bird’s Eye View (BEV) perception solutions are used for their ability to represent features in the 3D world at metric scale. But the ability to learn and represent depth remains a challenge, leading to poor understanding of 3D relationships between components of the scene. The invention aids the learning of the relationship between the 2D and 3D worlds by reconstructing the 2D features from BEV features using a transformer-based attention network aligned with the camera's perspective projection geometry, and by performing auxiliary downstream 2D object detection and lane line detection. The solution further enforces consistency losses to learn global relationships between the reconstructed pixel features. This generalizable method works with any camera-based BEV perception system and is only needed at training time.
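A simplified PyTorch sketch of the central mechanism follows: BEV cell centres are projected into the image with the camera matrix, and cross-attention from image tokens to BEV tokens is biased toward geometrically consistent pairs, with a consistency loss tying the reconstructed 2D features to the backbone's. This is an illustrative toy, not the filed architecture; the shapes, the ground-plane (z = 0) assumption, the bias weight, and the assumption that `P` projects into the feature-grid coordinate frame are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProjectionAlignedDecoder(nn.Module):
    def __init__(self, dim, bev_extent=50.0, bev_size=32, img_size=(16, 28)):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # Metric (x, y, z=0) coordinates of BEV cells, flattened to (H*W, 3).
        xs = torch.linspace(-bev_extent, bev_extent, bev_size)
        grid = torch.stack(torch.meshgrid(xs, xs, indexing="ij"), dim=-1)
        self.register_buffer("bev_xyz", F.pad(grid.reshape(-1, 2), (0, 1)))
        self.img_size = img_size

    def forward(self, bev_feats, img_feats, P):
        """bev_feats: (B, N_bev, C); img_feats: (B, N_img, C); P: (B, 3, 4).

        P = K [R | t], assumed scaled to feature-grid resolution.
        """
        B = bev_feats.shape[0]
        # Project BEV cell centres into the image plane.
        xyz1 = F.pad(self.bev_xyz, (0, 1), value=1.0)         # (N_bev, 4)
        uvw = torch.einsum("bij,nj->bni", P, xyz1)            # (B, N_bev, 3)
        uv = uvw[..., :2] / uvw[..., 2:3].clamp(min=1e-3)
        # Grid locations of the image tokens on their feature map.
        h, w = self.img_size
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        img_uv = torch.stack([xs, ys], -1).reshape(-1, 2).float().to(uv.device)
        # Additive attention bias: penalize image/BEV pairs that are far apart
        # under the perspective projection (geometry-aligned attention).
        dist = torch.cdist(img_uv.expand(B, -1, -1), uv)      # (B, N_img, N_bev)
        attn = (self.q(img_feats) @ self.k(bev_feats).transpose(1, 2)
                / bev_feats.shape[-1] ** 0.5) - 0.1 * dist
        recon = attn.softmax(-1) @ self.v(bev_feats)          # (B, N_img, C)
        # Training-time consistency loss between reconstructed and backbone
        # 2D features; the decoder is discarded at inference.
        loss = 1 - F.cosine_similarity(recon, img_feats, dim=-1).mean()
        return recon, loss
```

Because the decoder and its losses sit on an auxiliary branch, they shape how the BEV encoder learns depth and 2D-3D correspondence during training while adding no cost at inference.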