Labeling: Annotation Tool



We need a labeled dataset to train the model; more importantly, the quality and consistency of the labels are crucial for getting a good one.

In my experience, the labels I received from a Korean annotation company did not meet my quality standards, and the feedback loop took a long time. Although I wrote a 20-page annotator instruction manual (referencing the nuScenes annotator instructions) to ensure high-quality labeling, it was hard to get an outsourcing company to maintain that level of quality.

Therefore, I decided it was essential to establish an in-house labeling process. I also wanted that process not to be labor-intensive, even though graduate-student labor is famously cheap.

My approach was to train a roughly functioning LiDAR-based 3D object detector on only a small portion of the data, use its high-confidence detections as pseudo-labels, and manually annotate only the instances the model missed. Iterating this way would gradually reduce the amount of manual labeling needed as the model improved.
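
To make the loop concrete, here is a minimal sketch of one round of it. Everything in it is a placeholder rather than my actual pipeline: the detector interface (`run_detector`), the box layout, the 0.7 score threshold, and the KITTI-style `.bin` input are all assumptions you would swap for your own setup.

```python
# Hypothetical sketch of one pseudo-labeling round: keep high-confidence
# detections as pseudo-labels, flag uncertain frames for manual review.
from pathlib import Path
import numpy as np

SCORE_THRESHOLD = 0.7  # assumed cutoff for "trustworthy" pseudo-labels


def run_detector(points: np.ndarray) -> np.ndarray:
    """Placeholder: returns an (N, 8) array of boxes [x, y, z, dx, dy, dz, yaw, score]."""
    raise NotImplementedError


def pseudo_label_scan(bin_path: Path, out_dir: Path, review_list: list) -> None:
    # KITTI-style .bin point cloud: (M, 4) float32 = x, y, z, intensity
    points = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
    boxes = run_detector(points)

    confident = boxes[boxes[:, 7] >= SCORE_THRESHOLD]
    uncertain = boxes[boxes[:, 7] < SCORE_THRESHOLD]

    # High-confidence boxes become pseudo-labels; a human only revisits
    # frames where the detector was unsure or found nothing at all.
    np.savetxt(out_dir / f"{bin_path.stem}.txt", confident[:, :7], fmt="%.3f")
    if len(uncertain) > 0 or len(confident) == 0:
        review_list.append(bin_path)
```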


3D bounding box annotation using Supervisely


To start, I needed to choose an annotation tool, and my final candidates were CVAT and Supervisely. Since my goal was to upload LiDAR points along with 3D bounding boxes (pseudo-labels), manually correct and add labels, and then download the final data, I found Supervisely and its Python API better suited to this workflow.
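
As a rough illustration of why the Python API mattered, here is a hedged sketch of uploading one scan and attaching a cuboid pseudo-label through the Supervisely SDK. The class and method names (`sly.Api`, `api.pointcloud.upload_path`, `PointcloudAnnotation`, `Cuboid3d`, and so on) follow my reading of the SDK documentation and may differ across versions, and all IDs, paths, and box values are placeholders; treat it as pseudocode against the API rather than a drop-in script.

```python
# Hedged sketch: push a LiDAR scan plus one cuboid pseudo-label to Supervisely
# so it appears in the labeling tool ready for manual correction.
import os

import supervisely as sly
from supervisely.geometry.cuboid_3d import Cuboid3d, Vector3d

api = sly.Api(server_address="https://app.supervisely.com",
              token=os.environ["SLY_API_TOKEN"])

DATASET_ID = 12345  # placeholder: a dataset inside a point-cloud project

# 1) Upload the raw LiDAR scan (.pcd) to the dataset.
pcd_info = api.pointcloud.upload_path(DATASET_ID, name="scan_000001.pcd",
                                      path="scan_000001.pcd")

# 2) Turn one detector output (center, yaw, size) into a Supervisely cuboid.
#    Assumes a "car" class with Cuboid3d geometry already exists in the project meta.
car_class = sly.ObjClass("car", Cuboid3d)
geometry = Cuboid3d(
    position=Vector3d(10.2, -3.1, 0.8),
    rotation=Vector3d(0.0, 0.0, 1.57),   # yaw around the z-axis
    dimensions=Vector3d(4.5, 1.9, 1.6),
)
obj = sly.PointcloudObject(car_class)
figure = sly.PointcloudFigure(obj, geometry)
ann = sly.PointcloudAnnotation(sly.PointcloudObjectCollection([obj]), [figure])

# 3) Attach the pseudo-label; the annotator then only corrects or adds boxes.
api.pointcloud.annotation.append(pcd_info.id, ann)
```

Downloading the corrected boxes afterwards goes through the same `api.pointcloud.annotation` interface, which is what closes the loop back into the training set.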

I spent some time annotating a few hundred samples in my spare time (while watching Netflix). In parallel, I worked on developing an offboard 3D LiDAR detector.