How to use the HTML slides

All materials can be found here
- usr: CV
- pwd: sose24

Use the keys left/right for navigating through the slides.
Click the icon (top left) to open the navigation menu.
Press f/ESC to enter/leave fullscreen mode.
Double-click an item (e.g. an image) to zoom in/out.
If the bottom boundary flashes on slide change, something was written on the virtual whiteboard.
- Scroll down to see it.

Who am I?

Born in Darmstadt
- Grew up in Wiesbaden
JoGu Mainz
TU Darmstadt
MPI Informatik, Saarbrücken
Daimler Chrysler Research, Ulm
RheinMain University of Applied Sciences, Wiesbaden
- Founding member of hessian.AI
- Co-founder of aivju.de

Course Goal and Content

Goal
- Gain an understanding of the theoretical and practical concepts of computer vision
  - Focus on 2D vision
- After this course, you should be able to
  - develop and train computer vision models
  - reproduce results and
  - conduct original research

(Planned) Content
1. Introduction, Organization
2. Primitives, Transformations, Geometric Image Formation
3. Photometric Image Formation, Image Sensing Pipeline
4. Image Filtering
5. Orthogonal Basis Transformation (Fourier)
6. Features
7. Motion
8. Introduction to Machine Learning, Neural Networks
9. Transfer Learning for Image Classification
10. Object Detection
11. Image Segmentation
12. Image Manipulation

Organization

Weekly hours (SWS): 2 lecture + 2 exercise, 6 ECTS, Total Workload: 180h
Lecture (14)
- Friday, 10:00-11:30, D17/18
- Apr. 19/26, May 03/10/17/24/31, June 07/14/21/28, July 05/12/19
Exercise Sessions
- Friday, 11:45-13:15, D17/18. Submission each Thursday until 16:00 via read.MI
- Exercises are mandatory
Exam
- Content: lectures and exercises
- Very likely oral (date and time will be announced)

Course Materials

Books
- R. Szeliski, Computer Vision: Algorithms and Applications, Springer 2011
  https://szeliski.org/Book
- I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press 2016
  https://www.deeplearningbook.org
- J. E. Solem, Programming Computer Vision with Python, O’Reilly 2012
- V. K. Ayyadevara, Y. Reddy, Modern Computer Vision with PyTorch, Packt 2020
- M. P. Deisenroth, A. A. Faisal, C. S. Ong, Mathematics for Machine Learning
  https://mml-book.github.io
- K. B. Petersen, M. S. Pedersen, The Matrix Cookbook
  http://www.cs.toronto.edu/~bonner/courses/2012s/csc338/matrix_cookbook.pdf

Course Materials

Tutorials
- The Python Tutorial: https://docs.python.org/3/tutorial
- Numpy Quickstart: https://numpy.org/devdocs/user/quickstart.html
- PyTorch Tutorial: https://pytorch.org/tutorials
Frameworks, IDEs
- Visual Studio Code: https://code.visualstudio.com/
- Google Colab: https://colab.research.google.com
Courses
- Slide deck covering Szeliski’s book https://szeliski.org/Book
- I. Gkioulekas, Computer Vision https://www.cs.cmu.edu/~16385/
- A. Owens, Foundations of Computer Vision https://web.eecs.umich.edu/~ahowens/eecs504/w20/

Prerequisites

Basic math skills
- Linear Algebra, Calculus, Probability
Basic computer science skills
- Variables, functions, loops, classes, algorithms
Basic Python coding skills
- https://docs.python.org/3/tutorial/
Basic PyTorch coding skills
- https://pytorch.org/tutorials

Prerequisites

Linear Algebra
- Vectors: \(\mathbf{x}, \mathbf{y} \in \mathbb{R}^n\)
- Matrices: \(\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m\times n}\)
- Operations:
  - \(\mathbf{x}^\top\mathbf{y}, \mathbf{x}\times\mathbf{y}\)
  - \(\mathbf{A}\mathbf{x}\)
  - \(\mathbf{A}^\top, \mathbf{A}^{-1}, \text{trace}(\mathbf{A}), \text{det}(\mathbf{A}), \mathbf{A}+\mathbf{B}, \mathbf{AB}\)
- Norms: \(||\mathbf{x}||_1, ||\mathbf{x}||_2, ||\mathbf{x}||_\infty, ||\mathbf{A}||_F\)
- Eigenvalues, Eigenvectors, SVD: \(\mathbf{A}=\mathbf{UDV}^\top\)
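
As a quick self-check of the vector and matrix operations listed above, here is a minimal sketch in plain Python (standard library only, so it runs without NumPy or PyTorch). The helper names `dot`, `norm_1`, `norm_2`, `norm_inf`, `matvec` and `frobenius`, and the example values, are illustrative assumptions and not part of the course materials.

```python
import math

def dot(x, y):
    """Inner product x^T y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def norm_1(x):
    """L1 norm: sum of absolute values."""
    return sum(abs(a) for a in x)

def norm_2(x):
    """L2 (Euclidean) norm: square root of x^T x."""
    return math.sqrt(dot(x, x))

def norm_inf(x):
    """Infinity norm: largest absolute entry."""
    return max(abs(a) for a in x)

def matvec(A, x):
    """Matrix-vector product A x, with A given as a list of rows."""
    return [dot(row, x) for row in A]

def frobenius(A):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

# Made-up example values
x = [3.0, -4.0]
y = [1.0, 2.0]
A = [[1.0, 2.0], [3.0, 4.0]]

print(dot(x, y))                          # 3*1 + (-4)*2 = -5.0
print(norm_1(x), norm_2(x), norm_inf(x))  # 7.0 5.0 4.0
print(matvec(A, x))                       # [-5.0, -7.0]
print(frobenius(A))                       # sqrt(30)
```

In practice the same quantities would normally be computed with NumPy or PyTorch; the explicit loops are only meant to make the definitions visible.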
Calculus
- Multivariate functions: \(f:\mathbb{R}^{n}\rightarrow \mathbb{R}\)
- Partial derivatives: \(\frac{\partial f}{\partial x_i}, i=1,\ldots, n\), Gradient
- Integrals: \(\int f(x)dx\)
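
To make partial derivatives and the gradient concrete, the sketch below approximates the gradient of a small multivariate function with central differences and compares it to the analytic result. The function `f`, the helper `numerical_gradient` and the step size `h` are assumptions chosen for illustration.

```python
def f(x):
    """Example function f: R^2 -> R with f(x) = x_0^2 + 3 * x_0 * x_1."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def numerical_gradient(func, x, h=1e-5):
    """Approximate each partial derivative df/dx_i by a central difference."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2.0 * h))
    return grad

point = [1.0, 2.0]
g = numerical_gradient(f, point)

# Analytic gradient: (2*x_0 + 3*x_1, 3*x_0), which is (8, 3) at the point (1, 2)
analytic = [2.0 * point[0] + 3.0 * point[1], 3.0 * point[0]]
print(g, analytic)
```

Frameworks such as PyTorch compute gradients by automatic differentiation instead; a finite-difference check like this one is mainly useful for verifying a hand-derived gradient.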

Probability
- Probability distributions: \(P(X=x)\)
- Expectation: \(\mathbb{E}_{x\sim p}[f(x)] = \int_{x}p(x)f(x)dx\)
- Variance: \(\text{Var}(f(x))=\mathbb{E}[(f(x)-\mathbb{E}[f(x)])^2]\)
- Marginal: \(p(x)=\int p(x,y)dy\)
- Conditional: \(p(x,y)=p(x|y)p(y)\)
- Bayes' rule: \(p(x|y) = p(y|x)p(x)/p(y)\)
- Distributions: Uniform, Gaussian
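
The relations between joint, marginal and conditional probabilities can be checked numerically on a small discrete example, where the integrals become sums. The sketch below uses a made-up joint table over two binary variables and verifies Bayes' rule in the form p(x|y) = p(y|x) p(x) / p(y); all names and numbers are illustrative assumptions.

```python
# Made-up joint distribution p(x, y) over weather x and ground state y
joint = {
    ("rain", "wet"): 0.27,
    ("rain", "dry"): 0.03,
    ("sun", "wet"): 0.07,
    ("sun", "dry"): 0.63,
}

def marginal_x(x):
    """p(x) = sum over y of p(x, y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    """p(y) = sum over x of p(x, y)."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_x_given_y(x, y):
    """p(x | y) = p(x, y) / p(y)."""
    return joint[(x, y)] / marginal_y(y)

def cond_y_given_x(y, x):
    """p(y | x) = p(x, y) / p(x)."""
    return joint[(x, y)] / marginal_x(x)

# Bayes' rule: p(rain | wet) = p(wet | rain) * p(rain) / p(wet)
bayes = cond_y_given_x("wet", "rain") * marginal_x("rain") / marginal_y("wet")
direct = cond_x_given_y("rain", "wet")
print(bayes, direct)  # both are approximately 0.27 / 0.34
```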

Time Management

Activity	Times	Total
Attending (watching) the lecture	2h / week	24h
Self-study of lecture materials	2h / week	24h
Participation in exercise	2h / week	24h
Solving the assignments	6h / week	72h
Preparation for the final exam	36h	36h
Total workload		180h

Computer Vision

The goal of Computer Vision is to convert light into meaning (geometric, semantic, …)

Computer Vision Applications

Optical Character Recognition (a)
Mechanical Inspection / 3D Modelling (b)
Retail (c)
Medical Applications (d)
Automotive (Safety and Driving) (e)
Surveillance (f)

images/CV_Applications_1.png — [R. Szeliski ©]

Computer Vision Applications

Image Stitching / Video Stabilization
Exposure Bracketing
Robotics
Mobile Devices
Accessibility (e.g. Image Captioning), …

“A bird that is sitting on a branch”

images/ImageStichingSzelisky.png [R. Szeliski ©]
images/ExposureBracketing.png [R. Szeliski ©]
images/Quadruped_A1.png [quadruped.de ©]
images/AR-Raccon-On-S21.png Mobile AR

Biological Vision vs. Computer Vision

Human Vision is the process of discovering what is present in the world and where it is by looking

images/HumanVisionScheme.png — [Adapted from K. Sutliff/Science ©]

Biological Vision vs. Computer Vision

Over 50% of the processing in the human brain is dedicated to visual information

images/BiiologicalOpticalSystem.png — [OpenStax College ©]

Biological Vision vs. Computer Vision

Computer Vision is the study of analyzing images to achieve results similar to those achieved by humans

images/ComputerVisionScheme.png — [Adapted from K. Sutliff/Science ©]

Artificial Intelligence

“An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves”

[John McCarthy at Dartmouth Summer Research Project on Artificial Intelligence, 1956]

Machine Learning
Computer Vision
Computer Graphics
Natural Language Processing
Robotics & Control
Art, Industry 4.0, Education, …

Computer Vision vs. Computer Graphics

Computer Vision is an ill-posed inverse problem
- Many 3D scenes yield the same 2D image
- Additional constraints (knowledge about world) are required

Computer Vision vs. Image Processing

Computer Vision seeks to achieve full scene understanding (in contrast to (classical) Image Processing)

images/CV-ImageProcessing.png — [R. Szeliski ©]

Computer Vision and Machine Learning

ImageNet https://www.image-net.org/

images/imagenet.png — [https://image-net.org/static_files/papers/imagenet_cvpr09.pdf]

The Deep Learning Revolution

images/image_classification_006.png — [https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/ ©]

Why is Visual Perception hard?

images/EinsteinMatrix.png — What the computer sees
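
To illustrate what "a matrix of numbers" means here, the following sketch builds a tiny grayscale image as a nested list of intensity values. The 5x5 pattern and the brightness offset are made up for illustration; real images are far larger and are usually stored as NumPy arrays or PyTorch tensors.

```python
# A hypothetical 5x5 grayscale image: a bright diamond (200) on a dark background (50)
image = [
    [ 50,  50, 200,  50,  50],
    [ 50, 200, 200, 200,  50],
    [200, 200, 200, 200, 200],
    [ 50, 200, 200, 200,  50],
    [ 50,  50, 200,  50,  50],
]

height, width = len(image), len(image[0])

# Slightly brighter lighting: a human still sees the same diamond,
# but every single number the computer receives is different.
brighter = [[min(255, p + 40) for p in row] for row in image]

changed = sum(
    a != b
    for row_a, row_b in zip(image, brighter)
    for a, b in zip(row_a, row_b)
)

for row in brighter:
    print(row)
print(f"{changed} of {height * width} pixel values changed")
```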

Why is Visual Perception hard?

Images are 2D Projections of the 3D World
- Many 3D scenes yield the same 2D image
- Additional constraints (knowledge about world) are required
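
The ambiguity can be seen with a minimal sketch of an ideal pinhole projection, which maps a 3D point (X, Y, Z) to the image point (f X / Z, f Y / Z). The function `project`, the focal length of 1.0 and the example points are assumptions for illustration; lens distortion and pixel coordinates are ignored.

```python
def project(point, focal_length=1.0):
    """Ideal pinhole projection of a 3D point in front of the camera (Z > 0)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)

near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # same viewing ray, three times farther away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Every point on the same ray through the camera center lands on the same image location, so depth cannot be recovered from a single projected point without additional constraints.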

Images are 2D Projections of the 3D World

Adelson and Pentland’s workshop metaphor:

To explain an image (a) in terms of reflectance, lighting and shape, a painter (b), a light designer (c) and a sculptor (d) will design three different, but plausible, solutions.

images/AdelsonPentland.png — E. H. Adelson, A. P. Pentland: The perception of shading and reflectance, 1996. D. C. Knill: Perception as Bayesian inference, 1996

Images are 2D Projections of the 3D World

Perspective Illusion:

Images are 2D Projections of the 3D World

Perspective Illusion (Ames Room)

Challenges: Occlusion

images/StarwarsMagritt_small.png — [https://imgur.com/a/nQJss ©]

Challenges: Illumination

Challenges: Motion

images/BlurryBee.png — [https://commons.wikimedia.org/wiki/File:Heliopsis_helianthoides_var._scabra_Summer_Sun_4zz.jpg#/media/File:Heliopsis_helianthoides_var._scabra_Summer_Sun_4zz.jpg]

Challenges: Motion

images/Rolling_shutter.png — [https://commons.wikimedia.org/wiki/File:Rolling_shutter_näidis.png]

images/Rolling_shutter_effect_animation.gif

[https://commons.wikimedia.org/wiki/File:Rolling_shutter_effect.svg]

Challenges: Perception vs. Measurement

images/checkershadowillusion.png — [http://persci.mit.edu/gallery/checkershadow]

Challenges: Perception vs. Measurement

images/RotatingSnakes.png — Rotating Snakes by Akiyoshi Kitaoka http://www.ritsumei.ac.jp/~akitaoka/index-e.html

Challenges: Deformation and Intra-Class Variation

images/Chairs.png — [M. Aubry, D. Maturana, A. Efros, B. Russell and J. Sivic, Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models]

Timeline of Computer Vision

Next Lecture

Primitives
- Points, Lines and Planes
- Homogeneous Coordinates
Transformations
- 2D / 3D Transformations
- Homography Estimation
Geometric Image Formation
- Pinhole Camera
- Projection Models
- Lens Distortion
