Computer Vision

Introduction

Prof. Dr. Ulrich Schwanecke

RheinMain University of Applied Sciences

How to use the HTML slides

  • All materials can be found here
    • usr: CV
    • pwd: sose24
images/QR2DCVws2324.svg
  • Use the keys left/right for navigating through the slides.
  • Click icon (top left) to open the navigation menu.
  • Press f/ESC to enter/leave fullscreen mode.
  • Double-click an item (e.g. an image) to zoom in/out.
  • If the bottom boundary flashes on slide change, something was written on the virtual whiteboard.
    • Scroll down to see it.

About myself

Who am I?

  • Born in Darmstadt
    • Grown up in Wiesbaden
  • JoGu Mainz
  • TU Darmstadt
  • MPI Informatik, Saarbrücken
  • Daimler Chrysler Research, Ulm
  • RheinMain University of Applied Sciences, Wiesbaden
images/uli-map.png

About this course

Course Goal and Content

  • Goal
    • Gain an understanding of the theoretical and practical concepts of computer vision
      • Focus on 2D vision
    • After this course, you should be able to
      • develop and train computer vision models
      • reproduce results and
      • conduct original research
  • (Planned) Content
    1. Introduction, Organization
    2. Primitives, Transformations, Geometric Image Formation
    3. Photometric Image Formation, Image Sensing Pipeline
    4. Image Filtering
    5. Orthogonal Basis Transformation (Fourier)
    6. Features
    7. Motion
    8. Introduction to Machine Learning, Neural Networks
    9. Transfer Learning for Image Classification
    10. Object Detection
    11. Image Segmentation
    12. Image Manipulation

Organization

  • 4 weekly contact hours (SWS): 2 lecture + 2 exercise, 6 ECTS, total workload: 180h
  • Lecture (14)
    • Friday, 10:00-11:30, D17/18
    • Apr. 19/26, May 03/10/17/24/31, June 07/14/21/28, July 05/12/19
  • Exercise Sessions
    • Friday, 11:45-13:15, D17/18
    • Submission each Thursday by 16:00 via read.MI
    • Exercises are mandatory
  • Exam
    • Content: lectures and exercises
    • Very likely oral (date and time will be announced)

Course Materials

Prerequisites

  • Linear Algebra
    • Vectors: \(\mathbf{x}, \mathbf{y} \in \mathbb{R}^n\)
    • Matrices: \(\mathbf{A}, \mathbf{B} \in \mathbb{R}^{m\times n}\)
    • Operations:
      • \(\mathbf{x}^\top\mathbf{y}, \mathbf{x}\times\mathbf{y}\)
      • \(\mathbf{A}\mathbf{x}\)
      • \(\mathbf{A}^\top, \mathbf{A}^{-1}, \text{trace}(\mathbf{A}), \text{det}(\mathbf{A}), \mathbf{A}+\mathbf{B}, \mathbf{AB}\)
    • Norms: \(||\mathbf{x}||_1, ||\mathbf{x}||_2, ||\mathbf{x}||_\infty, ||\mathbf{A}||_F\)
    • Eigenvalues, Eigenvectors, SVD: \(\mathbf{A}=\mathbf{UDV}^\top\)
  • Calculus
    • Multivariate functions: \(f:\mathbb{R}^{n}\rightarrow \mathbb{R}\)
    • Partial derivatives: \(\frac{\partial f}{\partial x_i}, i=1,\ldots, n\), Gradient
    • Integrals: \(\int f(x)dx\)
  • Probability
    • Probability distributions: \(P(X=x)\)
    • Expectation: \(\mathbb{E}_{x\sim p}[f(x)] = \int_{x}p(x)f(x)dx\)
    • Variance: \(\text{Var}(f(x))=\mathbb{E}[(f(x)-\mathbb{E}[f(x)])^2]\)
    • Marginal: \(p(x)=\int p(x,y)dy\)
    • Conditional: \(p(x,y)=p(x|y)p(y)\)
    • Bayes' rule: \(p(x|y) = p(y|x)p(x)/p(y)\)
    • Distributions: Uniform, Gaussian
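
As a quick self-check of these prerequisites, the following is a minimal sketch in Python with NumPy (an assumption; the slides do not prescribe a language or library). All vectors, matrices and probabilities are made-up illustration values. It evaluates a few of the listed operations and verifies numerically that the conditional obtained directly from a joint distribution agrees with the one obtained via Bayes' rule.

```python
import numpy as np

# Hypothetical example data (not from the lecture)
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])

# Operations: inner product, cross product, matrix-vector product
inner = x @ y              # x^T y
cross = np.cross(x, y)     # x × y (3-vectors)
Ax = A @ x

# Norms of x (1, 2, infinity) and Frobenius norm of A
l1 = np.linalg.norm(x, 1)
l2 = np.linalg.norm(x, 2)
linf = np.linalg.norm(x, np.inf)
fro = np.linalg.norm(A, "fro")

# SVD: A = U D V^T, with the singular values returned as a vector d
U, d, Vt = np.linalg.svd(A)
A_rec = U @ np.diag(d) @ Vt

# Discrete joint distribution p(x, y): rows index x, columns index y
joint = np.array([[0.1, 0.3],
                  [0.2, 0.4]])
p_x = joint.sum(axis=1)                    # marginal p(x)
p_y = joint.sum(axis=0)                    # marginal p(y)
p_x_given_y = joint / p_y                  # p(x|y) = p(x,y) / p(y)
p_y_given_x = joint / p_x[:, None]         # p(y|x) = p(x,y) / p(x)
bayes = p_y_given_x * p_x[:, None] / p_y   # p(y|x) p(x) / p(y)

print(inner, cross, l1, l2, linf, fro)
print(np.allclose(A_rec, A), np.allclose(bayes, p_x_given_y))
```

If any line here looks unfamiliar, that topic is a good candidate for review before the second lecture.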

Time Management


Activity                            Time        Total
Attending (watching) the lecture    2h / week   24h
Self-study of lecture materials     2h / week   24h
Participation in exercise           2h / week   24h
Solving the assignments             6h / week   72h
Preparation for the final exam      36h         36h
Total workload                                  180h

About Computer Vision

Computer Vision

  • The goal of Computer Vision is to convert light into meaning (geometric, semantic, …)
images/lightpainting.png

Computer Vision Applications

  • Optical Character Recognition (a)

  • Mechanical Inspection / 3D Modelling (b)

  • Retail (c)

  • Medical Applications (d)

  • Automotive (Safety and Driving) (e)

  • Surveillance (f)

images/CV_Applications_1.png
[R. Szeliski ©]

Computer Vision Applications

  • Image Stitching / Video Stabilization
  • Exposure Bracketing
  • Robotics
  • Mobile Devices
  • Accessibility (e.g. Image Captioning), …
    images/SchnulliTaucht.png
    “A bird that is sitting on a branch”

images/ImageStichingSzelisky.png
[R. Szeliski ©]
images/ExposureBracketing.png
[R. Szeliski ©]
images/Quadruped_A1.png
[quadruped.de ©]
images/AR-Raccon-On-S21.png
Mobile AR

Biological Vision vs. Computer Vision

  • Human Vision is the process of discovering what is present in the world and where it is by looking
images/HumanVisionScheme.png
[Adapted from K. Sutliff/Science ©]

Biological Vision vs. Computer Vision

  • Over 50% of the processing in the human brain is dedicated to visual information
images/BiiologicalOpticalSystem.png
[OpenStax College ©]

Biological Vision vs. Computer Vision

  • Computer Vision is the study of analyzing images to achieve results similar to those achieved by humans
images/ComputerVisionScheme.png
[Adapted from K. Sutliff/Science ©]

Artificial Intelligence

“An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves”

[John McCarthy at Dartmouth Summer Research Project on Artificial Intelligence, 1956]

  • Machine Learning
  • Computer Vision
  • Computer Graphics
  • Natural Language Processing
  • Robotics & Control
  • Art, Industry 4.0, Education, …
images/AI-Enviornment-Agent.png

Computer Vision vs. Computer Graphics

images/CV-CG.png

  • Computer Vision is an ill-posed inverse problem
    • Many 3D scenes yield the same 2D image
    • Additional constraints (knowledge about world) are required

Computer Vision vs. Image Processing

  • Computer Vision seeks to achieve full scene understanding, in contrast to classical Image Processing
images/CV-ImageProcessing.png
[R. Szeliski ©]

Computer Vision and Machine Learning

images/imagenet.png
[https://image-net.org/static_files/papers/imagenet_cvpr09.pdf]

The Deep Learning Revolution

images/image_classification_006.png
[https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/ ©]

Why is Visual Perception hard?

images/Einstein.png
What we see
images/EinsteinMatrix.png
What the computer sees

Why is Visual Perception hard?

images/CV-CG.png

  • Images are 2D Projections of the 3D World
    • Many 3D scenes yield the same 2D image
    • Additional constraints (knowledge about world) are required
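
The ambiguity can be sketched numerically. The following Python/NumPy snippet assumes an ideal pinhole camera at the origin looking along the positive Z axis with focal length 1 (a simplification; the function name `project` and all point values are hypothetical). Two different 3D points on the same viewing ray land on the same image point, so depth cannot be recovered from a single projection without additional constraints.

```python
import numpy as np

def project(point_3d, focal_length=1.0):
    """Ideal pinhole projection of a 3D point (X, Y, Z) with Z > 0 to (x, y)."""
    X, Y, Z = point_3d
    return np.array([focal_length * X / Z, focal_length * Y / Z])

# Two distinct 3D points on the same ray through the camera center
near = np.array([1.0, 2.0, 4.0])
far = 3.0 * near

p_near = project(near)
p_far = project(far)

print(p_near, p_far)  # both projections coincide
```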

Images are 2D Projections of the 3D World

Adelson and Pentland’s workshop metaphor:

  • To explain an image (a) in terms of reflectance, lighting and shape, a painter (b), a light designer (c) and a sculptor (d) will design three different, but plausible, solutions.
images/AdelsonPentland.png
E. H. Adelson, A. P. Pentland: The perception of shading and reflectance, 1996. D. C. Knill: Perception as Bayesian inference, 1996
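
A toy version of this ambiguity, as a sketch only: assume the simplified multiplicative model image = reflectance × shading per pixel (an assumption made here for illustration, not a formula stated on the slide), with made-up values. The "painter" explains the image purely by paint, while the "lighting designer" and the "sculptor" explain it by shading that varies due to light or surface orientation. Both factorizations reproduce the same observed image.

```python
import numpy as np

# Observed image: two pixels, the left one darker (hypothetical values)
image = np.array([0.2, 0.8])

# Painter's explanation: constant shading, all variation is painted on
painter_reflectance = np.array([0.2, 0.8])
painter_shading = np.array([1.0, 1.0])

# Lighting designer's / sculptor's explanation: uniform paint, variation
# comes from shading (changed by the lights or by the surface orientation)
uniform_reflectance = np.array([0.8, 0.8])
varying_shading = np.array([0.25, 1.0])

# Assumed model: image = reflectance * shading, element-wise
from_painter = painter_reflectance * painter_shading
from_shading = uniform_reflectance * varying_shading

print(np.allclose(from_painter, image), np.allclose(from_shading, image))
```

In this simplified model the lighting designer and the sculptor are indistinguishable, since both only change the shading term; separating them would require modeling light direction and surface normals explicitly.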

Images are 2D Projections of the 3D World

Perspective Illusion:

Images are 2D Projections of the 3D World

Perspective Illusion (Ames Room)

images/AmesRoomFront.png
images/AmesRoomAbove.png

Challenges: Occlusion

images/StarwarsMagritt_small.png
[https://imgur.com/a/nQJss ©]

Challenges: Illumination

Challenges: Motion

images/BlurryBee.png
[https://commons.wikimedia.org/wiki/File:Heliopsis_helianthoides_var._scabra_Summer_Sun_4zz.jpg#/media/File:Heliopsis_helianthoides_var._scabra_Summer_Sun_4zz.jpg]

Challenges: Motion

images/Rolling_shutter.png
[https://commons.wikimedia.org/wiki/File:Rolling_shutter_näidis.png]

images/Rolling_shutter_effect_animation.gif[https://commons.wikimedia.org/wiki/File:Rolling_shutter_effect.svg]

Challenges: Perception vs. Measurement

images/checkershadowillusion.png
[http://persci.mit.edu/gallery/checkershadow]

Challenges: Perception vs. Measurement

images/PerceptionVsMeasurement.png

Challenges: Perception vs. Measurement

images/dalmatian.png

Challenges: Perception vs. Measurement

images/ParrotOrWomen.png

Challenges: Perception vs. Measurement

images/RotatingSnakes.png
Rotating Snakes by Kitaoka Akiyoshi http://www.ritsumei.ac.jp/~akitaoka/index-e.html

Challenges: Deformation and Intra Class Variation

images/Chairs.png
[M. Aubry, D. Maturana, A. Efros, B. Russell and J. Sivic, Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models]

Timeline of Computer Vision

images/TimelineOfComputerVision.svg

Next Lecture

  • Primitives
    • Points, Lines and Planes
    • Homogeneous Coordinates
  • Transformations
    • 2D / 3D Transformations
    • Homography Estimation
  • Geometric Image Formation
    • Pinhole Camera
    • Projection Models
    • Lens Distortion