At Deepen, we’re focused on building tools that help autonomous systems — vehicles, robots, and beyond — process sensor data more efficiently and accurately. With our AI-powered tools, teams building such systems can annotate massive amounts of 2D and 3D data gathered from multiple sensor types. We help them iterate faster with more reliable data, which makes their systems safer and frees them to focus on innovation.
Our 3D annotation and visualization tools were designed to run on-premise. This post tells the story of how we used parts of the open AVS stack — created by Uber’s Advanced Technologies Group (ATG) and Visualization team — to scale our 3D tools up to work on the web. Our experience left us believing that the future of autonomous systems may be much more dependent on open source projects like AVS than we had previously thought.
Our first 3D tool was built natively and can seamlessly handle millions of LiDAR data points. To provide quick turnaround on annotation tasks, our tools allow hundreds of people to work simultaneously on the same labeling task without stepping on each other. But they were all designed to be installed as a single, private on-premise instance.
We already had our entire 2D toolset available on the web, which made it much more accessible and widely used. We wanted to do the same for the 3D tools, so we embarked on creating a web tool that would make our tools more accessible and close the feedback loop between the multiple parties that care about the data. At its core, the ability to view, manipulate, and iterate on large datasets was integral to making the tool powerful. We would also need to support multiple annotation types, from 3D bounding boxes, polygons, and polylines all the way down to point-level segmentation.
3D Programming on the Web
As we kicked off this project, the first question that came up was the choice of web framework or library. We were looking for a solid foundation that would be extendable, performant, and well documented.
Our existing stack used Python on the backend, and we weren’t married to any particular front-end framework. 3D graphics on the web is still in its relative infancy; the last time any of us had built an end-to-end product with a 3D library, it was with three.js, a cross-browser library used to create 3D graphics without having to worry about the nitty-gritty details of WebGL.
Using three.js, we were able to put together a quick demo to visualize some of the LiDAR data that we work with. It was relatively easy to get started, but we quickly realized that we would have a lot of work ahead of us in terms of controls, interactions, and UI. On top of all of that, we would need to make this entire experience buttery smooth so our users don’t go crazy.
We landed on deck.gl after searching through other visualization frameworks. Just by looking at the examples, we were easily able to grasp the key concepts: deck.gl is built around layers, which offer a flexible API for feeding in data and customizing how each layer is rendered.
Out of the box, deck.gl comes with numerous layers, including TextLayer, PolygonLayer, LineLayer, and, importantly for us, a PointCloudLayer. We also created our own layers by extending the base layer and writing our own shaders.
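To give a flavor of the layer API, here is a minimal sketch of how LiDAR points might be shaped into the props a PointCloudLayer consumes. The `getPosition`/`getColor` accessor and `pointSize` names mirror deck.gl’s documented PointCloudLayer API; the input record shape (`position`, `intensity`) and the helper function are our hypothetical example, not Deepen’s actual code.

```javascript
// Sketch: shaping LiDAR points into PointCloudLayer-style props.
// Each input record is assumed to look like:
//   { position: [x, y, z], intensity: 0..255 }
function pointCloudLayerProps(points) {
  return {
    id: 'lidar-points',
    data: points,
    // Accessor that tells the layer where each point sits in 3D space.
    getPosition: (d) => d.position,
    // Map LiDAR intensity to a grayscale RGB color.
    getColor: (d) => [d.intensity, d.intensity, d.intensity],
    pointSize: 2,
  };
}

// In an app these props would be spread into a layer instance, e.g.
//   new PointCloudLayer(pointCloudLayerProps(frame.points))
const props = pointCloudLayerProps([
  { position: [1.0, 2.0, 0.5], intensity: 200 },
]);
```

Because the layer only sees data through accessors, the same raw records can feed several layers (points, labels, bounding boxes) without copying.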
We found many positives from adopting deck.gl:
- The layering system makes rendering scalable and lets the user control what they see.
- We could manipulate data and see it rendered across multiple visualizations.
- Layers only update when the underlying data changes, and you can control which attributes are recomputed (e.g., if only the colors change, deck.gl can be told to re-render just that property).
- It let us cleanly integrate luma.gl into our custom layers.
- It integrates with React out of the box. This played very nicely with the rest of our application, as we had decided to use React in general. Having React as a first-class citizen made it seamless to build a robust web application, with the UI built in React and the state maintained with Redux.
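The selective-update point above can be sketched in plain JavaScript. The `updateTriggers` prop follows deck.gl’s documented pattern for invalidating a single accessor; the `selectedIds` selection state and the helper function are hypothetical, assumed for illustration.

```javascript
// Sketch: only the color attribute is invalidated when the selection
// changes, so the (expensive) position buffer is never rebuilt.
function makeLayerProps(points, selectedIds) {
  return {
    id: 'annotated-points',
    data: points,
    getPosition: (d) => d.position,
    // Highlight selected points in red, everything else in gray.
    getColor: (d) => (selectedIds.has(d.id) ? [255, 0, 0] : [160, 160, 160]),
    updateTriggers: {
      // deck.gl re-runs getColor only when this value changes between
      // renders; positions are left untouched.
      getColor: [...selectedIds].sort().join(','),
    },
  };
}
```

When the trigger value is unchanged between renders, deck.gl reuses the existing GPU color buffer, which is what keeps interaction smooth on large point clouds.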
deck.gl does a great job with its built-in optimizations, handling re-renders efficiently and providing a smooth, interactive experience across different layers — including our core, data-intensive PointCloudLayer — even on low-end computers. However, once datasets grew to more than a few million points, we started hitting performance issues while interacting with the layers because of the hard memory limits of WebGL and the browser.
Dealing with data-intensive 3D graphics in a browser is a nightmare for most developers, but with the WebGL context access that the deck.gl library provides, we were able to optimize performance to a reasonable extent.
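One general mitigation for those memory limits is to cut the data down before it ever reaches WebGL. The sketch below uniformly subsamples a frame to a fixed point budget; the function and the budget are illustrative assumptions, not Deepen’s actual pipeline.

```javascript
// Sketch: uniformly subsample a point array down to `maxPoints` so the
// GPU buffers stay within browser/WebGL memory limits. Uniform striding
// preserves the overall shape of the cloud at reduced density.
function subsamplePoints(points, maxPoints) {
  if (points.length <= maxPoints) return points; // nothing to do
  const stride = points.length / maxPoints;
  const sampled = new Array(maxPoints);
  for (let i = 0; i < maxPoints; i++) {
    sampled[i] = points[Math.floor(i * stride)];
  }
  return sampled;
}
```

A sampled copy like this can be shown while the camera is moving, with the full-resolution cloud swapped back in once interaction stops.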
As previously mentioned, the layering system works well for dividing work among peers. We found it very easy to carve out and distribute tasks that were focused on specific layers, and as a result we had very few conflicts while building the product. Organically, the developer who initially wrote a layer became responsible for how that layer would interact with and fit into the entire application.
We look forward to seeing deck.gl grow with open source support and gain new features such as streamable data layers, the ability to lock layers together so they move as a group, and panning and rotation of point cloud layers using only sampled points for smoother movement. We would also like to see more layers published in the future, such as smooth-curved polygons, 3D polygons, and 3D paths.
As the community around deck.gl grows, it will be interesting to see how we can promote the production and availability of community-built open source layers. With the framework’s rapid development speed, there are challenges around keeping the documentation updated with every release; you’ll often find documentation or examples written for previous versions of deck.gl that no longer work.
Over the last few months, we’ve deployed the first version of our annotation tools on the web, making sure to optimize for productivity. We wanted the frontend to be one seamless experience from the moment users log in and upload their data through to visualization and feedback, so we structured it as a single-page application (SPA). One of the reasons we were able to move quickly and solve key issues is the community behind deck.gl. The documentation is always improving, and new releases frequently resolve issues opened by the community. The deck.gl code is also thoroughly documented and structured in a way that makes it very easy to navigate; we found ourselves patching it with relative ease.
Our backend involves a lot of Python code for data pipelining and AI-based inference on the data we collect. Our extensive data pipelines make it easy for customers to bring their data as-is; our goal is that teams don’t have to spend any time wrangling data and can focus on the annotation problem. We’re constantly exploring ways to make our web tool more performant, using various methods of data serialization and implementing tiling to reduce annotation time.
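As one example of the serialization direction, points can be shipped to the browser as a single flat typed array instead of per-point JSON objects: one binary payload, no parsing into millions of small objects, and a buffer that WebGL-based renderers can consume directly. The helper names below are hypothetical illustrations of the technique, not our actual wire format.

```javascript
// Sketch: pack [x, y, z] points into one flat Float32Array. A typed
// array serializes to a compact binary payload (12 bytes per point)
// and can be handed to a GPU-backed layer without per-point objects.
function packPositions(points) {
  const buffer = new Float32Array(points.length * 3);
  points.forEach(([x, y, z], i) => {
    buffer[i * 3] = x;
    buffer[i * 3 + 1] = y;
    buffer[i * 3 + 2] = z;
  });
  return buffer;
}

// Read one point back out of the flat buffer by index.
function unpackPosition(buffer, index) {
  const o = index * 3;
  return [buffer[o], buffer[o + 1], buffer[o + 2]];
}
```

Tiling composes naturally with this: each tile becomes one such buffer, fetched and uploaded to the GPU independently.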
If you’re interested in taking on some of these challenges with our team, or learning more in general, visit deepen.ai.