Machine Vision Project
Abstract
So far, you have successfully programmed robots using reactive control strategies and taken a deep dive into the robot localization problem. Next, you will learn about machine vision and the role it plays in robotics. In contrast to the previous projects, this time you will have a great deal of freedom to shape your project toward your interests and learning goals.
Learning Objectives
- Self-directed learning of a new robotics algorithm.
- Implementing a robotics system with minimal scaffolding.
Teaming
For this project, you can work with one other student. If you want to have a team of three, please talk to the instructors beforehand.
Please fill out this Google Sheet when you have a project team and GitHub repo.
Project Topic
Your project should be about machine vision and its intersection with robotics. You will carry out that project either using a robot simulator, a computer vision dataset, the Neato, or some combination.
In your project proposal you will be coming up with an implementation plan. That is, if you are using a particular algorithm to solve a problem, which parts of the algorithm will you implement, and which will you use pre-built implementations for? Be strategic in these decisions to balance learning about algorithms with system building (e.g., programming the simulated Neatos (or another robot) to do something interesting). You have substantial latitude in shaping your project to focus on the parts you really want to learn (e.g., system design versus basic understanding of algorithms). That said, we expect that you will do some exploration of algorithms as part of this project (which could include implementation or perhaps substantial learning about an algorithm or class of algorithms).
Potential Algorithm Topics
- Object tracking or detection
- Image de-noising or correction; in-painting
- Image segmentation; semantic labeling
- Text recognition
- Gesture recognition
- Fiducial tracking and navigation
- Visual odometry or SLAM
- Structure from motion
- Neural rendering (radiance fields; Gaussian splats)
- Vision-Lidar or Vision-Inertial sensor fusion
(Note: there are some resources for these topics later in the document)
Robot Platform and Data Pipelines
- One option is to continue to work with the Neato. In class, you’ll see the Neato camera setup and go through a simple demo of how to use the images to control a robot (a minimal sketch of that pattern appears after this list).
- You may want to use an external dataset for your project. Here are some possible starting points.
  - Visual Data and Papers with Code have nice collections of computer vision datasets and projects.
  - If you want a huge (but very cool) dataset for self-driving vehicles, consider using Waymo’s Open Dataset.
  - If you’re interested in machine learning for robot control, you might consider building off some of the datasets (and code) from the 2018 Robot Learning project. You also might want to check out the Google Slides presentation summarizing their results.
  - Pick any open dataset of imagery in a domain of interest to you (e.g., for ocean biology FathomNet is a nice option; for city landscapes Cityscapes could be good; for food check out Food-101…and so on!)
- A lot of these datasets are big and fancy, but don’t be afraid to start with smaller, more classic datasets. That’s a great way to learn without having to deal with the greater complexity that comes with some of this data.
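If you do work with the Neato (real or simulated), the basic vision-to-motion pipeline is usually: subscribe to the camera topic, convert each ROS image message to an OpenCV array with CvBridge, compute something from the pixels, and publish a velocity command. The sketch below is a minimal example of that pattern. It assumes a ROS 2 / rclpy setup, hypothetical topic names (`camera/image_raw`, `cmd_vel`), and a deliberately simple "steer toward the brightest region" behavior; treat it as a starting point to adapt, not as the in-class demo code.

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from geometry_msgs.msg import Twist
from cv_bridge import CvBridge
import cv2


class ImageFollower(Node):
    """Steer toward the brightest region of the camera image."""

    def __init__(self):
        super().__init__('image_follower')
        self.bridge = CvBridge()
        # Topic names are assumptions; check what your robot/simulator publishes.
        self.create_subscription(Image, 'camera/image_raw', self.on_image, 10)
        self.cmd_pub = self.create_publisher(Twist, 'cmd_vel', 10)

    def on_image(self, msg: Image):
        # Convert the ROS image message to an OpenCV BGR array.
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Use the horizontal centroid of bright pixels as a steering signal.
        _, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
        moments = cv2.moments(mask)
        cmd = Twist()
        if moments['m00'] > 0:
            cx = moments['m10'] / moments['m00']
            error = (cx - frame.shape[1] / 2) / (frame.shape[1] / 2)
            cmd.linear.x = 0.1
            cmd.angular.z = -0.5 * error   # turn toward the bright region
        self.cmd_pub.publish(cmd)


def main():
    rclpy.init()
    node = ImageFollower()
    rclpy.spin(node)
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```

The same structure carries over to more interesting behaviors: swap the thresholding step for object detection, fiducial tracking, segmentation, or whatever your project is about, and keep the subscribe/convert/compute/publish loop.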
Deliverables
There are four deliverables for this project.
Project Proposal (due 10/22)
At a minimum, please include the answers to the following questions. You should include enough detail for us to be able to give you useful feedback. Submit your proposal on Canvas.
- Who is on your team?
- What is the main idea of your project?
- What are your learning goals for this project?
- What algorithms or computer vision areas will you be exploring?
- What components of the algorithm will you implement yourself, and which will you use built-in code for? Why?
- What is your MVP?
- What is a stretch goal?
- What do you view as the biggest risks to you being successful (where success means achieving your learning goals) on this project?
- What might you need from the teaching team for you to be successful on this project?
In-class Presentation / Demo (11/11)
We’d like each team to spend about 10 minutes presenting what they did for this project. You can structure the presentation in whatever manner you’d like; however, you should try to meet these goals:
- Explain the goal of your project
- At a high level, explain how your system works
- Demonstrate your system in action (either in a video [recommended] or live). If your system doesn’t work completely yet, that is fine; try to show at least one component of your system in action.
- This presentation / demo should be very informal. It will be assessed in a purely binary fashion (basically, did you do the things above).
Please submit your presentation on Canvas.
Code (Due 11/12)
- You should turn in your code and writeup via GitHub. Submit a link on Canvas when you’re all set!
Writeup (Due 11/12)
In your ROS package, create a README.md file to hold your project writeup. Your writeup should touch on the following topics. We expect this writeup to be done in such a way that you are proud to include it as part of your professional portfolio. As such, please make sure to write the report so that it is understandable to an external audience. Make sure to add pictures to your report, links to YouTube videos, and embedded animated GIFs (these can be recorded with the tool peek).
- What was the goal of your project? Since everyone is doing a different project, you will have to spend some time setting this context.
- How did you solve the problem (i.e., what methods / algorithms did you use and how do they work)? As above, since not everyone will be familiar with the algorithms you have chosen, you will need to spend some time explaining what you did and how everything works.
- Describe a design decision you had to make when working on your project and what you ultimately did (and why). These design decisions could be particular choices for how you implemented some part of an algorithm or perhaps a decision regarding which of two external packages to use in your project.
- What, if any, challenges did you face along the way?
- What would you do to improve your project if you had more time?
- Did you learn any interesting lessons for future robotic programming projects? These could relate to working on robotics projects in teams, working on more open-ended (and longer term) problems, or any other relevant topic.
Resources and Potential Project Directions
Resources
Some picks (good starting points for machine learning flavored computer vision projects):
- UZH FPV Drone Racing Dataset: Get OpenVINS or GTSAM running to do some pose estimation. Try to implement or compute the optical flow between image frames (see the optical flow sketch after this list).
- KITTI Autonomous Car Dataset: Implement or use YOLO or another CNN to do 2D object detection in an AV environment (e.g., this tutorial), or build some form of 360° video to bird’s-eye-view (BEV) model similar to this tutorial.
- Some foundation model (e.g., Segment Anything): Try to run and use the SAM model for some structured navigation task with the Neatos (e.g., detect stop signs…).
- Try to run ViNT on the Neato.
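As a concrete example of the "implement or compute the optical flow" suggestion above, here is a minimal sketch that uses OpenCV’s built-in dense Farnebäck flow between two consecutive frames. The filenames are placeholders; writing your own Lucas–Kanade or Farnebäck implementation and comparing it against this baseline would be the deeper learning exercise.

```python
import cv2
import numpy as np

# Two consecutive frames from a dataset or video (placeholder filenames).
prev = cv2.imread('frame_000.png', cv2.IMREAD_GRAYSCALE)
curr = cv2.imread('frame_001.png', cv2.IMREAD_GRAYSCALE)

# Dense optical flow: one (dx, dy) displacement vector per pixel.
flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# Visualize flow direction as hue and magnitude as brightness.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
hsv = np.zeros((*prev.shape, 3), dtype=np.uint8)
hsv[..., 0] = ang * 180 / np.pi / 2          # hue encodes direction
hsv[..., 1] = 255
hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite('flow_vis.png', cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
```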
In-Class Activities from Past CompRobo Offerings and Other Tutorials
- Object Tracking
- 3D Structure from Motion
- Image Filtering
- Object Recognition (the instructions on running the code are out of date, but the ideas might help)
- Convert ROS image messages to OpenCV using CvBridge
- Visual Navigation for Flying Robots is a course on said topic. The linked page includes lectures and even some bag files.
- Connecting a webcam to ROS and OpenCV
- An overview of methods for object tracking
- Slides from the CompRobo Learning Project 2018 Class Report Out
- Mastering OpenCV with Practical Computer Vision Projects
- Visual odometry resources (one example)
- Canny edge detection
- Template matching
- Hough line transform (a combined Canny + Hough sketch follows this list)
- Basics of histograms and histogram equalization
- Basic Numpy tutorials
- GUI Features in OpenCV
- Basic Operations on Images
- Arithmetic Operations on Images
- Corner Detection
- Numpy Examples List
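Several of the classic OpenCV topics linked above (Canny edge detection, the Hough line transform) take only a few lines to try on a single image, which makes them a low-risk way to experiment before wiring anything into ROS. A rough sketch, with a placeholder filename:

```python
import cv2
import numpy as np

img = cv2.imread('test_image.png')          # placeholder filename
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Canny: blur slightly, then keep pixels with strong intensity gradients.
edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)

# Probabilistic Hough transform: fit line segments to the edge pixels.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=10)

# Draw any detected segments on the original image.
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)

cv2.imwrite('lines.png', img)
```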
Past projects to draw from
- Machine Vision Projects from Fall 2022
- Robot Learning Report out 2018
- Computer Vision Project Writeups from 2017
  - Self Driving Neato
  - Computer Vision Emotion Detection
  - Visual Localization
  - Predicting Paths of Tracked Objects
  - Neato Keeper
  - Lane Follower
  - Meal Recognition
  - Pac Neato
  - Neato Augmented Reality Parking