Machine Vision Project
Abstract
So far, you have successfully programmed robots using reactive control strategies and taken a deep dive into the robot localization problem. Next, you will learn about machine vision and the role it plays in robotics. In contrast to the previous projects, this time you will have a great deal of freedom to shape your project toward your interests and learning goals.
Learning Objectives
- Self-directed learning of a new robotics algorithm.
- Implementing a robotics system with minimal scaffolding.
Teaming
For this project, you can work with one other student. If you want to have a team of three, please talk to the instructors beforehand.
Please fill out this Google Sheet when you have a project team and GitHub repo.
Project Topic
Your project should be about machine vision and its intersection with robotics. You will carry out that project either using a robot simulator, a computer vision dataset, the Neato, or some combination.
In your project proposal you will be coming up with an implementation plan. That is, if you are using a particular algorithm to solve a problem, which parts of the algorithm will you implement, and which will you use pre-built implementations for? Be strategic in these decisions to balance learning about algorithms with system building (e.g., programming the simulated Neatos (or another robot) to do something interesting). You have substantial latitude in shaping your project to focus on the parts you really want to learn (e.g., system design versus basic understanding of algorithms). That said, we expect that you will do some exploration of algorithms as part of this project (which could include implementation or perhaps substantial learning about an algorithm or class of algorithms).
Potential Algorithm Topics
- Object tracking or detection
- Image de-noising or correction; in-painting
- Image segmentation; semantic labeling
- Text recognition
- Gesture recognition
- Fiducial tracking and navigation
- Visual odometry or SLAM
- Structure from motion
- Neural rendering (radiance fields; Gaussian splats)
- Vision-Lidar or Vision-Inertial sensor fusion
(Note: there are some resources for these topics later in the document)
Robot Platform and Data Pipelines
- One option is to continue to work with the Neato. In class, you’ll see the Neato camera setup and go through a simple demo of how to use the images to control a robot (a minimal sketch of that pattern appears after this list).
- You may want to use an external dataset for your project. Here are some possible starting points.
  - Visual Data and Papers with Code have nice collections of computer vision datasets and projects.
  - If you want a huge (but very cool) dataset for self-driving vehicles, consider using Waymo’s Open Dataset.
  - If you’re interested in machine learning for robot control, you might consider building off some of the datasets (and code) from the 2018 Robot Learning project. You also might want to check out the Google Slides presentation summarizing their results.
  - Pick any open dataset of imagery in a domain of interest to you (e.g., for ocean biology FathomNet is a nice option; for city landscapes Cityscapes could be good; for food check out Food-101…and so on!)
- A lot of these datasets are big and fancy, but don’t be afraid to start with smaller, more classic datasets. That’s a great way to learn without having to deal with the greater complexity that comes with some of this data.
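If you do work with the Neato (real or simulated), the basic vision-to-motion pipeline is usually: subscribe to the camera topic, convert each ROS image message to an OpenCV array with CvBridge, compute something from the pixels, and publish a velocity command. The sketch below is a minimal example of that pattern. It assumes a ROS 2 / rclpy setup, hypothetical topic names (`camera/image_raw`, `cmd_vel`), and a deliberately simple "steer toward the brightest region" behavior; treat it as a starting point to adapt, not as the in-class demo code.

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from geometry_msgs.msg import Twist
from cv_bridge import CvBridge
import cv2


class ImageFollower(Node):
    """Steer toward the brightest region of the camera image."""

    def __init__(self):
        super().__init__('image_follower')
        self.bridge = CvBridge()
        # Topic names are assumptions; check what your robot/simulator publishes.
        self.create_subscription(Image, 'camera/image_raw', self.on_image, 10)
        self.cmd_pub = self.create_publisher(Twist, 'cmd_vel', 10)

    def on_image(self, msg: Image):
        # Convert the ROS image message to an OpenCV BGR array.
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Use the horizontal centroid of bright pixels as a steering signal.
        _, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
        moments = cv2.moments(mask)
        cmd = Twist()
        if moments['m00'] > 0:
            cx = moments['m10'] / moments['m00']
            error = (cx - frame.shape[1] / 2) / (frame.shape[1] / 2)
            cmd.linear.x = 0.1
            cmd.angular.z = -0.5 * error   # turn toward the bright region
        self.cmd_pub.publish(cmd)


def main():
    rclpy.init()
    node = ImageFollower()
    rclpy.spin(node)
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```

The same structure carries over to more interesting behaviors: swap the thresholding step for object detection, fiducial tracking, segmentation, or whatever your project is about, and keep the subscribe/convert/compute/publish loop.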
Deliverables
There are four deliverables for this project.
Project Proposal (due 10/22)
At a minimum, please include the answers to the following questions. You should include enough detail for us to be able to give you useful feedback. Submit your proposal on Canvas.
- Who is on your team?
- What is the main idea of your project?
- What are your learning goals for this project?
- What algorithms or computer vision areas will you be exploring?
- What components of the algorithm will you implement yourself, and which will you use built-in code for? Why?
- What is your MVP?
- What is a stretch goal?
- What do you view as the biggest risks to you being successful (where success means achieving your learning goals) on this project?
- What might you need from the teaching team for you to be successful on this project?
In-class Presentation / Demo (11/11)
We’d like each team to spend about 10 minutes presenting what they did for this project. You can structure the presentation in whatever manner you’d like; however, you should try to meet these goals:
- Explain the goal of your project
- At a high level, explain how your system works
- Demonstrate your system in action (either in a video [recommended] or live). If your system doesn’t work completely yet, that is fine; try to show at least one component of your system in action.
- This presentation / demo should be very informal. It will be assessed in a purely binary fashion (basically, did you do the things above).
Please submit your presentation on Canvas.
Code (Due 11/12)
- You should turn in your code and writeup via GitHub. Submit a link on Canvas when you’re all set!
Writeup (Due 11/12)
In your ROS package, create a README.md file to hold your project writeup. Your writeup should touch on the following topics. We expect this writeup to be done in such a way that you are proud to include it as part of your professional portfolio. As such, please make sure to write the report so that it is understandable to an external audience. Make sure to add pictures to your report, links to YouTube videos, and embedded animated GIFs (these can be recorded with the tool peek).
- What was the goal of your project? Since everyone is doing a different project, you will have to spend some time setting this context.
- How did you solve the problem (i.e., what methods / algorithms did you use and how do they work)? As above, since not everyone will be familiar with the algorithms you have chosen, you will need to spend some time explaining what you did and how everything works.
- Describe a design decision you had to make when working on your project and what you ultimately did (and why). These design decisions could be particular choices for how you implemented some part of an algorithm or perhaps a decision regarding which of two external packages to use in your project.
- What, if any, challenges did you face along the way?
- What would you do to improve your project if you had more time?
- Did you learn any interesting lessons for future robotic programming projects? These could relate to working on robotics projects in teams, working on more open-ended (and longer term) problems, or any other relevant topic.
Resources and Potential Project Directions
Resources
Some picks (good starting points for machine learning flavored computer vision projects):
- UZH FPV Drone Racing Dataset: Get OpenVINS or GTSAM running to do some pose estimation. Try to implement or compute the optical flow between image frames (see the optical flow sketch after this list).
- KITTI Autonomous Car Dataset: Implement or use YOLO or another CNN to do 2D object detection in an AV environment (e.g., this tutorial), or build some form of 360° video to bird’s-eye-view (BEV) model similar to this tutorial.
- Some foundation model (e.g., Segment Anything): Try to run and use the SAM model for some structured navigation task with the Neatos (e.g., detect stop signs…).
- Try to run ViNT on the Neato.
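As a concrete example of the "implement or compute the optical flow" suggestion above, here is a minimal sketch that uses OpenCV’s built-in dense Farnebäck flow between two consecutive frames. The filenames are placeholders; writing your own Lucas–Kanade or Farnebäck implementation and comparing it against this baseline would be the deeper learning exercise.

```python
import cv2
import numpy as np

# Two consecutive frames from a dataset or video (placeholder filenames).
prev = cv2.imread('frame_000.png', cv2.IMREAD_GRAYSCALE)
curr = cv2.imread('frame_001.png', cv2.IMREAD_GRAYSCALE)

# Dense optical flow: one (dx, dy) displacement vector per pixel.
flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# Visualize flow direction as hue and magnitude as brightness.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
hsv = np.zeros((*prev.shape, 3), dtype=np.uint8)
hsv[..., 0] = ang * 180 / np.pi / 2          # hue encodes direction
hsv[..., 1] = 255
hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite('flow_vis.png', cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
```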
In-Class Activities from Past CompRobo Offerings and Other Tutorials
- Object Tracking
- 3D Structure from Motion
- Image Filtering
- Object Recognition (the instructions on running the code are out of date, but the ideas might help)
- Convert ROS image messages to OpenCV using CvBridge
- Visual Navigation for Flying Robots is a course on said topic. The linked page includes lectures and even some bag files.
- Connecting a webcam to ROS and OpenCV
- An overview of methods for object tracking
- Slides from the CompRobo Learning Project 2018 Class Report Out
- Mastering OpenCV with Practical Computer Vision Projects
- Visual odometry resources (one example)
- Canny edge detection
- Template matching
- Hough line transform (a combined Canny + Hough sketch follows this list)
- Basics of histograms and histogram equalization
- Basic Numpy tutorials
- GUI Features in OpenCV
- Basic Operations on Images
- Arithmetic Operations on Images
- Corner Detection
- Numpy Examples List
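Several of the classic OpenCV topics linked above (Canny edge detection, the Hough line transform) take only a few lines to try on a single image, which makes them a low-risk way to experiment before wiring anything into ROS. A rough sketch, with a placeholder filename:

```python
import cv2
import numpy as np

img = cv2.imread('test_image.png')          # placeholder filename
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Canny: blur slightly, then keep pixels with strong intensity gradients.
edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)

# Probabilistic Hough transform: fit line segments to the edge pixels.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=10)

# Draw any detected segments on the original image.
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)

cv2.imwrite('lines.png', img)
```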
Past projects to draw from
- Machine Vision Projects from Fall 2022
- Robot Learning Report out 2018
- Computer Vision Project Writeups from 2017
  - Self Driving Neato
  - Computer Vision Emotion Detection
  - Visual Localization
  - Predicting Paths of Tracked Objects
  - Neato Keeper
  - Lane Follower
  - Meal Recognition
  - Pac Neato
  - Neato Augmented Reality Parking