004_comission/vinniesniper-54816/task1/digest.md
louiscklaw cacba493ca update,
2025-01-31 22:31:22 +08:00

CS4185 Multimedia Technologies and Applications

What does this program do?

  • Loads an input image and 1000 database images to be compared with it.

  • Converts the images to grayscale.

  • Compares the input image with each database image using a pixel-by-pixel difference.

  • Displays the numerical matching parameters obtained.

  • Displays the input image and the best match result.
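
The steps above can be sketched in a few lines. This is only an illustration, not the course's starter program: it assumes images are nested lists of (R, G, B) tuples of equal size, uses the common BT.601 luma weights for the grayscale conversion, and takes the sum of absolute differences as the "pixel-by-pixel difference" (the notes do not say which metric the program uses). The names `to_gray`, `pixel_difference` and `best_match` are hypothetical.

```python
# Images are plain nested lists of (R, G, B) tuples so the sketch has no
# third-party dependencies; a real program would load them with an image
# library instead.

def to_gray(image):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale rows."""
    # ITU-R BT.601 luma weights, a common choice for RGB -> gray.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]

def pixel_difference(gray_a, gray_b):
    """Sum of absolute per-pixel differences; 0 means identical images."""
    if len(gray_a) != len(gray_b) or len(gray_a[0]) != len(gray_b[0]):
        raise ValueError("images must have the same size")
    return sum(abs(a - b)
               for row_a, row_b in zip(gray_a, gray_b)
               for a, b in zip(row_a, row_b))

def best_match(query, database):
    """Return (index, score) of the database image closest to the query."""
    gray_query = to_gray(query)
    scores = [pixel_difference(gray_query, to_gray(img)) for img in database]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Tiny 2x2 example images (made-up data).
query = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
database = [
    [[(0, 0, 0), (0, 0, 0)], [(0, 0, 0), (0, 0, 0)]],              # all black
    [[(250, 0, 0), (0, 250, 0)], [(0, 0, 250), (255, 255, 255)]],  # near copy
]
index, score = best_match(query, database)
print("best match:", index, "difference:", round(score, 2))
```

Because the score depends on exact pixel positions, a shifted or rescaled copy of the same scene can score badly, which is one reason the tasks below move to color, layout and keypoint features.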

Basic Requirements (80%)

  • Automatically changing some settings according to the extracted features is allowed.

Students are required to finish the following four tasks in the basic requirements:

  1. Improve the number of correctly matched images (20%)

  2. Modify the above program to retrieve similar images (20%)

    • Utilize color information.

      • Color Models

        • The RGB Color Model
          • Quantization
      • Color Histograms

        • Disadvantages:
          1. Color similarity across histogram bins is not considered
          2. Spatial color layout is not considered
    • Use a different layout.

      • Color Layout
        • http://en.wikipedia.org/wiki/Color_layout_descriptor
        • Need for Color Layout -> Global color features give too many false positives.
        • How it works:
          • Divide the whole image into sub-blocks.
          • Extract features from each sub-block.
        • Can we go one step further?
          • Divide the image into regions based on color feature concentration.
          • This process is called segmentation.
    • Utilize edge and shape information.

      • Circle Hough transform

      • Segmentation

        • Features for local regions in the image

        • Interest points: corners, edges and others

        • Keypoints:

          • Points in an image that are invariant to image translation, scale and rotation, and are minimally affected by noise and small distortions
        • Scale-invariant feature transform (SIFT) by David Lowe

          • Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters

          • Advantages:

            • Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
            • Distinctiveness: individual features can be matched to a large database of objects
            • Quantity: many features can be generated for even small objects
            • Efficiency: close to real-time performance
            • Extensibility: can easily be extended to a wide range of differing feature types, with each adding robustness
          • Detect keypoints using the SIFT detector

            • tutorial3 -> page 13 ~ 18
          • R-trees, SR-Trees ?

    • Features fusion

      • Image Features Measures
      • Image Distance Measures
      • db
        • tutorial4 -> page 9
  3. Improve on the Precision (20%)

    • The percentage of retrieved images that are correct matches.
  4. Improve on the Recall (20%)

    • The percentage of correctly matching images in the database that are retrieved.
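
As a rough illustration of tasks 2 to 4, the sketch below ranks database images by a quantized RGB color histogram and then scores the result with precision and recall. It is an assumption-laden example, not the assignment's reference solution: images are nested lists of (R, G, B) tuples, similarity is histogram intersection, and the ground-truth set `relevant` is invented. All function names are hypothetical.

```python
def color_histogram(image, bins_per_channel=4):
    """Normalized histogram over quantized RGB; image is rows of (r, g, b)."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    count = 0
    for row in image:
        for r, g, b in row:
            # Quantization: map each 0..255 channel value to a bin in 0..n-1.
            rq, gq, bq = r * n // 256, g * n // 256, b * n // 256
            hist[(rq * n + gq) * n + bq] += 1
            count += 1
    return [h / count for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def retrieve(query, database, top_k):
    """Indices of the top_k database images most similar to the query."""
    q = color_histogram(query)
    sims = [histogram_intersection(q, color_histogram(img)) for img in database]
    ranked = sorted(range(len(database)), key=lambda i: sims[i], reverse=True)
    return ranked[:top_k]

def precision_recall(retrieved, relevant):
    """Precision = hits / #retrieved; recall = hits / #relevant."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def flat(color, size=4):
    """Helper: a size x size image filled with one color (test data only)."""
    return [[color] * size for _ in range(size)]

query = flat((200, 30, 30))
database = [flat((220, 10, 10)), flat((10, 10, 220)),
            flat((210, 40, 20)), flat((20, 200, 20))]
relevant = {0, 2}  # invented ground truth: the two reddish images

retrieved = retrieve(query, database, top_k=3)
precision, recall = precision_recall(retrieved, relevant)
print(retrieved, precision, recall)
```

Retrieving more images (a larger `top_k`) tends to raise recall and lower precision, so the two targets have to be balanced. The histogram here is global, so it also shows both disadvantages listed under Color Histograms: colors that fall in neighboring bins contribute nothing to the intersection, and the spatial layout of colors is ignored.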

The extension includes two parts: technical improvement and UI design.

Technical improvement

  • 15% of the marks will be given based on technical difficulty.
  • It may include:
    • new retrieval algorithms (e.g., 80+% precision and 55+% recall),
    • high-dimensional data indexing (efficiently storing and managing the features extracted from the database, modifying the program so that it does not need to compute the features every time),
    • retrieval algorithms for particular types of images (e.g., sunset images, images containing human faces),
    • a crawler to obtain images from the internet, or
    • adding semantic information to help improve the retrieval performance.
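
One simple way to avoid computing the features every time is to cache them on disk. The sketch below is a minimal example under stated assumptions: features are small lists of numbers, the cache is a single JSON file, and `mean_color` stands in for whatever feature extractor is actually used. It is a flat cache, not a true high-dimensional index such as an R-tree, and every name in it is hypothetical.

```python
import json
import os
import tempfile

def load_cache(path):
    """Load the feature cache from disk, or start empty if none exists."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    return {}

def save_cache(path, cache):
    """Write the feature cache to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f)

def get_features(image_id, image, cache, extract):
    """Return cached features for image_id, computing them only on a miss."""
    if image_id not in cache:
        cache[image_id] = extract(image)
    return cache[image_id]

def mean_color(image):
    """Placeholder feature: the average (R, G, B) of the image."""
    pixels = [p for row in image for p in row]
    return [sum(p[c] for p in pixels) / len(pixels) for c in range(3)]

calls = []  # records every real feature computation

def counting_extract(image):
    calls.append(1)
    return mean_color(image)

cache_path = os.path.join(tempfile.mkdtemp(), "features.json")
image = [[(10, 20, 30), (30, 40, 50)]]

# First run: the feature is computed and stored.
cache = load_cache(cache_path)
first = get_features("img_0001", image, cache, counting_extract)
save_cache(cache_path, cache)

# Second run: the feature is read back from disk, not recomputed.
cache = load_cache(cache_path)
second = get_features("img_0001", image, cache, counting_extract)
print(first, second, "computations:", len(calls))
```

A cache like this removes the repeated extraction cost but still compares the query against every stored feature vector. Structures such as the R-trees and SR-trees mentioned earlier are meant to cut down that search as well.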

UI design

  • 10% of the marks will be given based on the UI design.

Submission

  • Program
  • Demo
  • Report