Editor’s note: This is the latest post in our NVIDIA DRIVE Labs series, which takes an engineering-focused look at individual autonomous vehicle challenges and how NVIDIA DRIVE addresses them. Catch up on all of our automotive posts, here.
Anyone who’s circled a busy parking lot or city block knows that finding an open spot can be tricky. Faded line markings. Big trucks hiding smaller cars. Other drivers on the hunt. It all can turn a quick trip to the store into a high-stress ordeal.
To park in these environments, autonomous vehicles need a visual perception system that can detect an open spot under a variety of conditions. Such a system must perceive both indoor and outdoor spaces separated by single, double or faded lane markings, and differentiate between occupied, unoccupied and partially obscured spots, all under varying lighting conditions.
Geometry also introduces complexity. Not every parking space is a perfect rectangle. Spaces can be angled or slanted, perpendicular or parallel. This results in a diverse set of potential spatial orientations between the parking space and the car looking to park in it.
AI Marks the Spot
To enable parking space perception, we use camera image data collected in various conditions, and deep neural network processing through our ParkNet DNN.
To address the geometric diversity in parking shape and orientation, we trained ParkNet to detect parking spaces as four-sided polygons rather than rectangles. The DNN generalizes to define four lines connected at arbitrary, rather than right, angles. This enables it to perceive parking spaces regardless of the lane markings’ orientation with respect to the car.
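To make the polygon idea concrete, here is a minimal sketch of how a four-sided detection could be represented and inspected. The `ParkingSpacePolygon` class and its methods are hypothetical names chosen for illustration, not ParkNet's actual output format. The sketch only shows that four corners connected in order can describe slanted spaces as easily as rectangular ones.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (u, v) pixel coordinates in the image


@dataclass
class ParkingSpacePolygon:
    """Hypothetical container: four image-space corners, listed in order around the polygon."""
    corners: List[Point]

    def __post_init__(self) -> None:
        if len(self.corners) != 4:
            raise ValueError("a parking space polygon needs exactly four corners")

    def sides(self) -> List[Tuple[Point, Point]]:
        # Each side joins one corner to the next, wrapping around at the end.
        return [(self.corners[i], self.corners[(i + 1) % 4]) for i in range(4)]

    def interior_angles_deg(self) -> List[float]:
        # Angle between the two sides meeting at each corner.
        # This equals the interior angle for convex polygons with non-zero side lengths.
        angles = []
        for i in range(4):
            prev_pt = self.corners[i - 1]
            cur = self.corners[i]
            nxt = self.corners[(i + 1) % 4]
            v1 = (prev_pt[0] - cur[0], prev_pt[1] - cur[1])
            v2 = (nxt[0] - cur[0], nxt[1] - cur[1])
            dot = v1[0] * v2[0] + v1[1] * v2[1]
            norm = math.hypot(*v1) * math.hypot(*v2)
            cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
            angles.append(math.degrees(math.acos(cos_angle)))
        return angles


# A slanted space: opposite sides are parallel, but no corner is a right angle.
slanted = ParkingSpacePolygon([(0, 0), (2, 0), (3, 2), (1, 2)])
print(slanted.interior_angles_deg())
```

A rectangle-only representation would have to reject or distort the slanted example, while the general polygon keeps the corners exactly where they were detected.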
ParkNet also determines which of the four sides represents the “entry line” of the parking spot, that is, the open side of the polygon where the car is supposed to enter the space. Since entry line information is a key input into self-parking planning and control software, it needs to be classified with high confidence and to obey both traffic rules and common sense.
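As a rough illustration of why classification confidence matters, the sketch below picks an entry line from per-side scores and refuses to answer when the best score is too low. The score format, the `pick_entry_line` function and the 0.8 threshold are all assumptions made for this example. The post does not describe how ParkNet encodes or thresholds its entry line output.

```python
from typing import Optional, Sequence


def pick_entry_line(side_scores: Sequence[float], min_confidence: float = 0.8) -> Optional[int]:
    """Return the index (0-3) of the most likely entry side, or None if unsure.

    side_scores: assumed per-side probabilities that a side is the entry line.
    min_confidence: hypothetical threshold below which no side is reported.
    """
    if len(side_scores) != 4:
        raise ValueError("expected one score for each of the four sides")
    best = max(range(4), key=lambda i: side_scores[i])
    if side_scores[best] < min_confidence:
        return None  # leave the decision to downstream logic rather than guess
    return best


print(pick_entry_line([0.05, 0.90, 0.03, 0.02]))  # side 1 clearly wins
print(pick_entry_line([0.30, 0.30, 0.20, 0.20]))  # ambiguous, so no entry line is reported
```

Returning no answer for an ambiguous detection is one simple way to keep a low-confidence entry line from reaching the planner.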
ParkNet outputs parking space detections and entry line classifications in 2D image space. So a change from 2D to 3D coordinates is needed to use ParkNet outputs in autonomous parking planning and control software.
By using camera self-calibration results (that is, estimates of the camera pitch, yaw and roll, which describe its up/down, left/right and clockwise/anticlockwise rotation), ParkNet results can be converted into 3D coordinates. This enables 3D position estimation that’s particularly accurate for short-distance self-parking maneuvers.
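One common way to perform this kind of 2D-to-3D conversion is to cast a ray through each pixel and intersect it with the ground. The sketch below does that under several assumptions that are not from the post: a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a known camera height, a flat ground plane, and a particular rotation order and sign convention for pitch, yaw and roll. It illustrates the geometry only and is not NVIDIA DRIVE's actual implementation.

```python
import math
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def rotation_from_pitch_yaw_roll(pitch: float, yaw: float, roll: float) -> Matrix:
    """Camera-to-level-frame rotation, angles in radians.

    Assumed axes: x right, y down, z forward. Assumed order: R = Ry(yaw) * Rx(pitch) * Rz(roll).
    With this convention a negative pitch tilts the camera toward the ground.
    """
    cos_p, sin_p = math.cos(pitch), math.sin(pitch)
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    cos_r, sin_r = math.cos(roll), math.sin(roll)
    rx = [[1.0, 0.0, 0.0], [0.0, cos_p, -sin_p], [0.0, sin_p, cos_p]]
    ry = [[cos_y, 0.0, sin_y], [0.0, 1.0, 0.0], [-sin_y, 0.0, cos_y]]
    rz = [[cos_r, -sin_r, 0.0], [sin_r, cos_r, 0.0], [0.0, 0.0, 1.0]]
    return matmul(ry, matmul(rx, rz))


def pixel_to_ground(u: float, v: float, fx: float, fy: float, cx: float, cy: float,
                    rotation: Matrix, camera_height: float) -> Optional[Tuple[float, float]]:
    """Project pixel (u, v) onto a flat ground plane.

    Returns (lateral, forward) in the units of camera_height, measured from the
    point on the ground directly below the camera, or None if the pixel's ray
    never reaches the ground (at or above the horizon).
    """
    ray_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]  # pinhole back-projection
    ray = [sum(rotation[i][k] * ray_cam[k] for k in range(3)) for i in range(3)]
    if ray[1] <= 1e-9:  # y points down, so the ray must have a positive y component
        return None
    scale = camera_height / ray[1]  # stretch the ray until it hits the plane y = camera_height
    return (scale * ray[0], scale * ray[2])


# Hypothetical calibration values, for illustration only.
R = rotation_from_pitch_yaw_roll(pitch=math.radians(-10.0), yaw=0.0, roll=0.0)
corners_px = [(500.0, 600.0), (780.0, 600.0), (820.0, 700.0), (460.0, 700.0)]
corners_ground = [pixel_to_ground(u, v, 1000.0, 1000.0, 640.0, 360.0, R, 1.5)
                  for (u, v) in corners_px]
print(corners_ground)
```

Small errors in pitch shift distant ground points far more than nearby ones, which is consistent with this kind of projection being most accurate at the short ranges involved in parking.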