Pattern Recognition and Image Processing

A course covering the basics of pattern recognition and image understanding aimed at upper level undergraduate students.

  • cameras, camera models, biological eyes
  • images as arrays, linear filters, gradient filters, anisotropic filters
  • frequency domain, Fourier transform
  • image compression
  • nonlinear filters, document normalization
  • binary morphology, grayscale morphology
  • distance transform, labeling
  • template matching
  • introduction to classification
  • nearest neighbor classification
  • probability theory, Bayesian decision theory
  • parametric Gaussian classifer
  • perceptrons
  • linear least square, logistic regression
  • gradient descent
  • nonlinear classifiers
  • k-means algorithm
  • PCA
  • dynamic time warping
  • Markov models
  • HMM recognizers and OCR

Materials

Exam Questions

These are sample questions like they might be asked in an exam. You may be asked these questions in an oral exam, or additional questions. You should be able to give clear and concise answers quickly to these kinds of questions, since there isn’t much time in an oral exam. In preparing for the exam, try to find answers for these questions, using the lecture notes as a starting point and additional reading as needed.

Lecture 1

  • How are images represented in Python?
  • Write an expression to cut out an 11x11 central square of a color image stored in array a.
  • What do the interpolation and cmap arguments to imshow mean? What are possible values?
  • What is the difference between amin, min, and minimum in Python?
  • Write an expression to rotate an image stored in array a by 90 degrees counterclockwise.
  • Convert a grayscale image into a color image that displays like the grayscale image.

Lecture 2

  • What is the tristimulus theory?
  • How was it first determined experimentally that human color perception is based on three different receptors?
  • What is the definition of “color”?
  • What is the Bayer pattern?
  • How does the image recorded by the human eye differ from that recorded by a digital camera?
  • Does every pure color correspond to a single wavelength of light?
  • What is a metameric color?
  • Assign wavelengths to the perimeter of the CIE color space.
  • Discuss: should the emission spectra of a monitor be the same as the sensitivity curves of human photo receptors?
  • Why is the gamut of a monitor typically a triangle in color space?
  • What is color constancy?
  • How can color constancy be achieved?

Lecture 3

  • What is an impulse?
  • What is a box filter?
  • What is a Gaussian filter?
  • Name common boundary conditions for filters and explain their uses.
  • Give a statistical justification for blur filters.
  • What is a Mondrian model of images?
  • What is the definition of a linear filter?
  • What is an impulse response?
  • How do linearity, convolution, and impulse response relate?
  • What are important algebraic properties of convolution?
  • How might associativity of convolution be used to speed up filtering operations?
  • What is separability and how does it speed up convolutions?
  • What is the formula for a Gaussian filter?
  • How can you speed up the implementation of convolutions using array operations?

Lecture 4

  • What is a gradient filter?
  • What is the response of a gradient filter to a uniform input?
  • How do you compute a simple gradient filter on an array “a” (no library functions)?
  • What are the Prewitt and Sobel filters?
  • What does the impulse response of a gradient filter look like?
  • What does an ideal step edge look like in 1D?
  • What does a noisy step edge look like in 1D?
  • Write Python code to generate a noisy step edge in 1D.
  • What is the relationship between gradient filters on discrete images and derivatives?
  • What is the gradient magnitude filter on a 2D image, and how do you compute it?
  • Is gradient magnitude a linear filter?
  • How do you use histogramming to find a good edge detection threshold?
  • What is hysteresis thresholding?
  • What is the Laplace filter?
  • How do you use the Laplace filter for edge detection?
  • What is the definition of an anisotropic filter?
  • Give an example of an anisotropic filter.
  • How do you compute an anisotropic Gaussian with the major axis at an angle of theta?
  • What is the Mondrian model?
  • What is the blobworld model?
  • What are precision, recall, and the f-measure?

Lecture 5

  • What kind of boundary conditions are implied by the use of the Fourier transform?
  • Given that the Fourier transform is linear, does it make sense to talk of its impulse response?
  • What is the Fourier transform of a shifted impulse? Why?
  • What is Euler’s formula?
  • In what sense is the Fourier transform a “linear decomposition”?
  • Describe the relationship between the Fourier transform and orthonormal bases.
  • What are the basis functions of the 2D Fourier transform?
  • What does the Fourier transform of a lowpass filter look like?
  • What does the Fourier transform of a highpass filter look like?
  • What kind of filters show “ringing”? What does “ringing” look like?
  • What is the relationship between the Fourier transform and convolution?
  • What is the Fast Fourier Transform?
  • What does the FFT imply for the complexity of convolution? Why?
  • What is the Fourier Transform of a Gaussian?

Lecture 6

  • What is a Harmonic oscillator?
  • Why are decompositions into sines and cosines so important?
  • What is the relationship between sin and cos?
  • What is the frequency of a sine wave?
  • What is the phase of an oscillation?
  • Define: octave.
  • Given a collection of sine waves of different phases and frequencies, which sine waves are orthogonal to each other when the signal is viewed as a vector?
  • For a single frequency, how many sine waves of different phases do you need to reconstruct all possible phases by linear combination?
  • For signals of length N, how many sine and how many cosine components does the Fourier transform have?
  • What is a spectrum and how is it related to the Fourier transform?
  • Write down the formulas for addition/subtraction/multiplication/division of complex numbers.
  • When viewing complex numbers as 2D vectors, what do multiplication and division do?
  • What is Euler’s formula?
  • What is the formula for the DFT using complex numbers?
  • What is the Fourier transform of a time-translated signal?
  • What is the Fourier transform of a time-scaled signal?
  • What is the Thomasson-Lanczos / Cooley-Tukey / Gauss Lemma?
  • Explain the FFT algorithm.

Lecture 7

  • Show that (some filter) is not a linear filter.
  • Show that the median filter is not a linear filter.
  • What is the definition of a linear filter?
  • Name common nonlinear filters.
  • Which of these images was median-filtered?
  • What is the median filter used for?
  • Is the median filter separable? Prove it.
  • What is a rank filter?
  • How would you use nonlinear filters for removing stains and weak printing for a historical document?
  • How do you express binary images as sets in mathematical morphology?
  • What Python expressions correspond to these set operations?
  • What are the basic operations of binary morphology?
  • How are they defined?
  • What are the duality relations between mathematical morphology operators?
  • What algebraic properties do the operators of mathematical morphology obey?
  • What is the hit-or-miss transform?
  • Identify from these images which mathematical morphology operator was applied.

Lecture 8

  • What are the four basic operations of binary morphology?
  • Define the four basic operations of binary morphology in terms of sets.
  • Express the basic operations of binary morphology in terms of Python image processing primitives over binary images.
  • Define the hit-or-miss transform.
  • What are the four basic operations of grayscale morphology?
  • Express the basic operations of grayscale morphology in terms of Python image processing primitives over grayscale images.
  • Given this ____ input/output pair, what morphological operation produced it?
  • What is the distance transform?
  • How can you compute a distance transform using grayscale morphological operations?
  • Describe an efficient dynamic programming algorithm for computing the distance transform.

Lecture 9

  • What is the connected component labeling algorithm?
  • What is the union-find algorithm?
  • How do you use the union-find algorithm for efficient connected component labeling?
  • Describe the union-find algorithm.
  • What Python library functions are used for connected component labeling and extraction?
  • What is template matching?
  • What is the relationship between template matching and cross-correlation?
  • What is the relationship between cross-correlation and convolution?
  • What is the relationship between template matching and nearest neighbor classification?
  • Describe robust methods for peak finding in an image based on Gaussian filtering and local maxima operations.
  • Describe robust methods for peak finding using grayscale morphology.

Lecture 10

  • Describe, in an object oriented way, what the relationship between nature and a classifier is.
  • What is a nearest neighbor classifier?
  • What is the misclassification rate?
  • What is a training set? What is a test set?
  • Why don’t we estimate error rates on the training set?
  • What is the margin of error of the estimate of a classifier’s error rate?
  • What is the MNIST classification problem?
  • Describe common methods of feature extraction and preprocessing for handwritten digit recognition?
  • What are common norms or metrics used for nearest neighbor classification?
  • What is the definition of a norm?
  • What is cosine dissimilarity?
  • Is cosine dissimilarity a norm?
  • What is a “canonical element”?
  • What are common transformations for canonicalization?

Lecture 11

  • What is a binomial distribution?
  • What is the goal of parameter estimation?
  • How do you estimate the parameter of a binomial distribution?
  • What are the mean and variance of the binomial distribution?
  • What is an extreme value distribution?
  • Given two normal distributions with the same mean and fixed size blocks of samples, what can you say about the maxima of each block?
  • Given a medical test with an error rate of 1%, what can you say about the probability that a patient has the disease given a positive test result? What factors is this influenced by?
  • What is Bayes rule?
  • What is the decision rule that minimizes the misclassification rate?
  • What is a 0-1 loss function?
  • What is a class conditional density?
  • What is a posterior distribution?
  • How do you compute the posterior distribution from the class conditional density?

Lecture 12

  • Given a symmetric real matrix, how do you decompose it into a diagonal matrix and rotations?
  • What is the relationship between a real symmetric matrix and an ellipse?
  • What is a quadratic form?
  • How do quadratic forms relate to norms and metrics?
  • What is positive-definiteness?
  • What is the definition of the determinant?
  • What is the trace of a matrix?
  • How does the trace of a matrix relate to its eigenvalues?
  • How does the determinant of a matrix relate to its eigenvalues?
  • What is a standard normal distribution?
  • Why are normal densities so important and common?
  • What is the probability that a random sample is outside the range of 1, 2, 3 standard deviations from the mean?
  • What is the central limit theorem?
  • What is a characteristic function?
  • Show that the covariance matrix can be interpreted as a quadratic form.
  • How do you obtain an arbitrary $n$-dimensional normal density from a set of $n$ independent normally distributed random variables with mean 0 and standard deviation 1.
  • Explain the relationship between the quadratic form determined by the covariance matrix and measurements of independent normal variables.
  • How do you compute a parametric classifier for normal densities?

Lecture 13

  • What is a parametric classifier?
  • What is a discriminative classifier?
  • What is a generative classifier?
  • Given the class conditional density, how do you compute the posterior?
  • Describe how you would construct a parametric classifier for the MNIST task.
  • What is a discriminant function?
  • What is a decision function?
  • What is a decision region?
  • What is a decision boundary?
  • Assume a two class classification problem with normal class conditional densities. What form do the decision boundaries take? Under what conditions?
  • What is a perceptron classifier?
  • Describe the perceptron learning algorithm.
  • A perceptron classifier and a parametric classifier with two normal densities both have linear decision boundaries. What is the difference between the two tasks?