commit 3b0b154910
louiscklaw
2025-02-01 01:19:51 +08:00
32597 changed files with 1171319 additions and 0 deletions


@@ -0,0 +1,17 @@
vecs/
venv/
opencv/
samples/
annotations/
stage_outputs/
negative_images/
positive_images/
positive_images_old/
positive_images_gray/
.vscode/
.venv
*.MOV
*.vec
*.jpg
*.png
*.avi


@@ -0,0 +1,207 @@
`createsamples.pl`: Copyright (c) 2008, Naotoshi Seo
From: https://github.com/sonots/tutorial-haartraining
`mergevec.py`: Copyright (c) 2014, Blake Wulfe
From: https://github.com/wulfebw/mergevec
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2021 Eech Hsiao, ARVI AI INC.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@@ -0,0 +1,285 @@
# Training, Classifying (Haar Cascades), and Tracking
In this project, we train our own Haar cascade in OpenCV with Python. A cascade classifier has multiple stages of filtration: using a sliding-window approach, the image region within the window is passed through the cascade. You can easily test the accuracy of a cascade with the `classifier.py` script, which accepts single images, directories of images, videos, and camera input. However, we also want to track our ROI (region of interest). This is because detection (through cascades, etc.) is in general more computationally taxing. Tracking algorithms are generally less taxing, because you already know a lot about the appearance of the object in the ROI; in the next frame, you use information from the previous frame to predict and localize where the object is in the frames after.
There are many different types of tracking algorithms that are available through `opencv_contrib/tracking`, such as KCF, MOSSE, TLD, CSRT, etc. Here's a [good video](https://www.youtube.com/watch?v=61QjSz-oLr8) that demonstrates some of these tracking algorithms. Depending on your use case, the one you choose will differ.
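As a minimal sketch of this detect-then-track handoff (assuming OpenCV 4.4 with the contrib modules, the trained cascade at `stage_outputs/cascade.xml`, and a placeholder `video.mp4` — `classifier.py` in this repo is the full version):
```
import cv2 as cv

cascade = cv.CascadeClassifier("stage_outputs/cascade.xml")
vid = cv.VideoCapture("video.mp4")
tracker = None
while True:
    ok, frame = vid.read()
    if not ok:
        break
    if tracker is None:
        # Detect: slide the cascade over the whole frame (expensive)
        gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
        boxes = cascade.detectMultiScale(gray)
        if len(boxes) > 0:
            tracker = cv.TrackerKCF_create()
            tracker.init(frame, tuple(boxes[0]))
    else:
        # Track: localize from the previous frame's ROI (cheap)
        ok, box = tracker.update(frame)
        if not ok:
            tracker = None  # lost the object; fall back to detection
vid.release()
```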
## Jump to Section
* [Environment Setup](#environment-setup)
* [Image Scraping](#image-scraping)
* [Positive & Negative Image Sets](#positive--negative-image-sets)
* [Positive Samples Image Augmentation](#positive-samples-image-augmentation)
* [CSV Bounding Box Coordinates](#csv-bounding-box-coordinates)
* [Training](#training)
* [Testing Cascade](#testing-cascade)
* [Video Conversions](#video-conversions)
* [Contributing](#contributing)
* [Acknowledgements](#acknowledgements)
* [References](#references)
## Environment Setup
* Ubuntu 18.04; 20.04
* OpenCV 3.x.x (for running cascade training functions; built from source)
* OpenCV 4.4.0 (for running tracking algorithms)
* OpenCV Contrib (branch matching OpenCV 4.4.0)
As a clarification, the listed Ubuntu variants are for quick, easy installs, depending on whether you're using `apt` or `pip`. To get the specific OpenCV version you want, you will need to build from source (especially when you want to downgrade packages). I mainly used a Python virtual environment, `venv`, for package management. You can build and install OpenCV from source inside the virtual environment (especially if you want a specific development branch or full control of compile options), or you can use `pip` locally in the `venv`. The packages used are listed in `requirements.txt` for reproducing the specific environment.
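A minimal environment setup, assuming Python 3 and the `requirements.txt` in this repository, might look like:
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```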
The project directory tree will look similar to the following, and might change depending on the arguments passed to the scripts.
<blockquote>
```
.
├── classifier.py
├── bin
│   └── createsamples.pl
├── negative_images
│   └── *.jpg / *.png
├── positive_images
│   └── *.jpg / *.png
├── negatives.txt
├── positives.txt
├── requirements.txt
├── samples
│   └── *.vec
├── stage_outputs
│   ├── cascade.xml
│   ├── params.xml
│   └── stage*.xml
├── tools
│   ├── bbox_from_vid.py
│   └── mergevec.py
└── venv
```
</blockquote>
## Image Scraping
Go ahead and web-scrape relevant negative images for training. Once you have a good amount, filter out extensions that aren't `*.jpg` or `*.png`, such as `*.gif`. Afterwards, we'll convert all the `*.png` images to `*.jpg` using the following command:
```
mogrify -format jpg *.png
```
Then we can delete the `*.png` images. Let's also rename all the images within the directory to sequentially numbered `img<n>.jpg` files:
```
ls | cat -n | while read n f; do mv "$f" "img$n.jpg"; done
```
To check if all files within our directory are valid `*.jpg` files:
```
find -name '*.jpg' -exec identify -format "%f" {} \; 1>pass.txt 2>errors.txt
```
## Positive & Negative Image Sets
Positive images are images that contain the object to be detected; these were cropped to 150 × 150 px for the training set. Negative images are images that are visually close to the positive images, but *must not* contain the object of interest.
<blockquote>
```
/images
img1.png
img2.png
positives.txt
```
</blockquote>
To generate your `*.txt` file, run the following command; make sure to change the image extension to whatever file type you're using.
```
find ./positive_images -iname "*.png" > positives.txt
```
As a quote from OpenCV docs:
<blockquote>
Negative samples are taken from arbitrary images. These images must not contain detected objects. [...] Described images may be of different sizes. But each image should be (but not necessarily) larger than a training window size, because these images are used to subsample the negative image to the training size.
</blockquote>
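The matching `negatives.txt` referenced by the training commands can be generated the same way, assuming your negatives live in `negative_images/` as in the directory tree above:
```
find ./negative_images -iname "*.jpg" > negatives.txt
```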
## Positive Samples Image Augmentation
We need to create a whole bunch of image samples, and we'll be using OpenCV 3.x.x to augment these images. [These tools were disabled in OpenCV 4 along with the legacy C API](https://github.com/opencv/opencv/issues/13231#issuecomment-440577461), so we'll first need to be on a downgraded version of OpenCV; once we have our trained cascade model, we can upgrade back to `4.x.x`. As mentioned in the link above, most modern approaches use deep learning. However, having used cascades, I find they still have their applications! Anyway, to create a training set as a collection of PNG images:
```
opencv_createsamples -img ~/opencv-cascade-tracker/positive_images/img1.png \
    -bg ~/opencv-cascade-tracker/negatives.txt -info ~/opencv-cascade-tracker/annotations/annotations.lst \
    -pngoutput -maxxangle 0.1 -maxyangle 0.1 -maxzangle 0.1
```
But we need a whole bunch of these. To augment a set of positive samples with negative samples, let's run the perl script that Naotoshi Seo wrote:
```
perl bin/createsamples.pl positives.txt negatives.txt samples 1500 \
    "opencv_createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 \
    -maxyangle 1.1 -maxzangle 0.5 -maxidev 40 -w 50 -h 50"
```
Merge all `*.vec` files into a single `samples.vec` file:
```
python ./tools/mergevec.py -v samples/ -o samples.vec
```
**Errors**: If you run into the following error when running `mergevec.py`:
```
Traceback (most recent call last):
File "./tools/mergevec.py", line 170, in <module>
merge_vec_files(vec_directory, output_filename)
File "./tools/mergevec.py", line 133, in merge_vec_files
val = struct.unpack('<iihh', content[:12])
struct.error: unpack requires a string argument of length 12
```
You need to remove all `*.vec` files with size 0 from your `samples` directory. Simply `cd samples`, double-check file sizes with `ls -l -S`, and run:
```
find . -type f -empty -delete
```
Note: others have said that using artificial data vectors is not the best way to train a classifier. Personally, I have used this method and it worked fine for my use cases. However, you may want to take this approach with a grain of salt and skip this step. If you want to hand-select your ROIs, you can use OpenCV's `opencv_annotation` tool to select regions of interest in a directory of images.
If you don't want to use artificial data, you will need your images and the coordinates of your regions of interest. You can then use `opencv_createsamples` to pack all these images into a single `.vec` vector file and use that for training instead. You will need to supply an `info.dat` file that contains each image path and the object coordinates as `(x, y, width, height)`. Your `info.dat` should look something like the example below, and can also be created using `opencv_annotation` or whatever customized data-pipelining method you implement; a sketch of the packing command follows the example.
<blockquote>
```
img/img1.jpg 1 140 100 45 45
img/img2.jpg 2 100 200 50 50 50 30 25 25
```
</blockquote>
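As a rough sketch (the sample count and window size here are placeholders, not values taken from this project), packing the annotated images into a single `.vec` file would then look like:
```
opencv_createsamples -info info.dat -vec samples.vec -num 1500 -w 50 -h 50
```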
## CSV Bounding Box Coordinates
In `tools`, there is a script called `bbox_from_vid.py` that can be used to generate `(x_min, y_min, x_max, y_max)` coordinates, where `(x_min, y_min)` is the top-left point of the bounding box and `(x_max, y_max)` is the bottom-right point. It uses OpenCV's tracking algorithms to track the selected object.
```
usage: bbox_from_vid.py [-h] [-v] [-o] [-c] [-z]
Get bbox / ROI coords and training images from videos
optional arguments:
-h, --help show this help message and exit
-v, --vid specify video to be loaded
-o, --center select bounding box / ROI from center point
-c, --csv export CSV file with bbox coords
-z, --scale decrease video scale by scale factor
```
For example, if you wanted to save the bounding box coordinates of a tracked object:
```
./bbox_from_vid.py -v video_input.avi -c vid_bbox_coords.csv
```
## Training
There are two applications in OpenCV for training a cascade classifier:
* `opencv_haartraining` - Older, now-obsolete version
* `opencv_traincascade` - Newer version; supports both Haar and LBP (Local Binary Patterns) features
These were the parameters I used for my initial cascade training. Later, we can introduce a larger dataset. To begin training using `opencv_traincascade`:
```
opencv_traincascade -data stage_outputs -vec samples.vec -bg negatives.txt \
    -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 390 \
    -numNeg 600 -w 50 -h 50 -mode ALL -precalcValBufSize 8192 \
    -precalcIdxBufSize 8192
```
The first cascade worked relatively well, but performance suffered under different lighting conditions. As a result, I trained a second cascade with a larger dataset that included varied lighting conditions.
```
opencv_traincascade -data stage_outputs -vec samples.vec -bg negatives.txt \
    -numStages 22 -minHitRate 0.993 -maxFalseAlarmRate 0.5 -numPos 1960 \
    -numNeg 1000 -w 50 -h 50 -mode ALL -precalcValBufSize 16384 \
    -precalcIdxBufSize 16384
```
Parameters for tuning `opencv_traincascade` are described in the [documentation](https://docs.opencv.org/4.4.0/dc/d88/tutorial_traincascade.html). `precalcValBufSize` and `precalcIdxBufSize` are precalculation buffer sizes in MB (8192 MB in the first command above, 16384 MB in the second). If you have memory available, increase these values, as training will be faster.
Something important to note is that
> vec-file has to contain `>= [numPos + (numStages - 1) * (1 - minHitRate) * numPos] + S`, where `S` is a count of samples from vec-file that can be recognized as background right away
`numPos` and `numNeg` are the numbers of positive and negative samples used to train each classifier stage. Therefore, `numPos` should be somewhat lower than the total number of positive samples, taking into account the number of stages that will be run.
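For example, with the second training run above (`numPos 1960`, `numStages 22`, `minHitRate 0.993`), the vec-file must hold at least `1960 + 21 * (1 - 0.993) * 1960 ≈ 2249` samples, plus `S`.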
```
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 1960 : 1960
NEG count : acceptanceRatio 1000 : 1
Precalculation time: 106
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 1|
+----+---------+---------+
| 3| 1| 1|
+----+---------+---------+
| 4| 1| 0.568|
+----+---------+---------+
| 5| 0.99949| 0.211|
+----+---------+---------+
END>
Training until now has taken 0 days 1 hours 3 minutes 47 seconds.
```
Each row of the training output for a stage represents a feature being trained. HR stands for Hit Rate and FA for False Alarm rate. Note that if a training stage only has a few features (e.g. N = 1–3), that can suggest the training data was not well optimized.
**Note**: You can always pause / stop your cascade training and still build a final `cascade.xml` from the training stages completed so far. Just rerun your `opencv_traincascade` command with `-numStages` set to whichever completed stage you want, while *keeping every other parameter the same*. Your `cascade.xml` will then be created.
```
opencv_traincascade -data stage_outputs -vec samples.vec -bg negatives.txt -numStages 22 -minHitRate 0.993 -maxFalseAlarmRate 0.5 -numPos 1960 -numNeg 1000 -w 50 -h 50 -mode ALL -precalcValBufSize 16384 -precalcIdxBufSize 16384
```
## Testing Cascade
To test how well our cascade performs, run the `classifier.py` script.
```
usage: classifier.py [-h] [-s] [-c] [-i] [-d] [-v] [-w] [-f] [-o] [-z] [-t]
Cascade Classifier
optional arguments:
-h, --help show this help message and exit
-s, --save specify output name
-c, --cas specify specific trained cascade
-i, --img specify image to be classified
-d, --dir specify directory of images to be classified
-v, --vid specify video to be classified
-w, --cam enable camera access for classification
-f, --fps enable frames text (TODO)
-o, --circle enable circle detection
-z, --scale decrease video scale by scale factor
-t, --track select tracking algorithm [KCF, CSRT, MEDIANFLOW]
```
When testing a tracking algorithm, **pass the scale parameter**. For example, to run a video through the classifier and save the output:
```
./classifier.py -v ~/video_input.MOV -s ~/video_output -z 2 -t KCF
```
## Video Conversions
`classifier.py` automatically saves output videos as `*.avi` (FourCC: XVID). If you need other video types, conversion is easy with `ffmpeg`. There are many more command arguments, especially if you want control over encoding and compression. The following command converts an `*.avi` to `*.mp4` and compresses it:
```
ffmpeg -i video_input.avi -vcodec libx264 -crf 30 video_output.mp4
```
## Contributing
Pull requests are welcome.
## Acknowledgements
Thanks to the following for releasing their tools and notes under the MIT license:
* [Naotoshi Seo](https://github.com/sonots) - `createsamples.pl`
* [Blake Wulfe](https://github.com/wulfebw) - `mergevec.py`
* [Thorsten Ball](https://github.com/mrnugget)
## References
* [OpenCV - Cascade Classifier Training](https://docs.opencv.org/master/dc/d88/tutorial_traincascade.html)
* [OpenCV - Face Detection using Haar Cascades](https://docs.opencv.org/master/d2/d99/tutorial_js_face_detection.html)
* [Naotoshi Seo - Tutorial: OpenCV haartraining](http://note.sonots.com/SciSoftware/haartraining.html)
* [Thorsten Ball - Train your own OpenCV Haar Classifier](https://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html)


@@ -0,0 +1,79 @@
#!/usr/bin/perl
use File::Basename;
use strict;
##########################################################################
# Create samples from an image applying distortions repeatedly
# (create many many samples from many images applying distortions)
#
# perl createtrainsamples.pl <positives.dat> <negatives.dat> <vec_output_dir>
# [<totalnum = 7000>] [<createsample_command_options = ./createsamples -w 20 -h 20...>]
# ex) perl createtrainsamples.pl positives.dat negatives.dat samples
#
# Author: Naotoshi Seo
# Date : 09/12/2008 Add <totalnum> and <createsample_command_options> options
# Date : 06/02/2007
# Date : 03/12/2006
#########################################################################
my $cmd = './createsamples -bgcolor 0 -bgthresh 0 -maxxangle 1.1 -maxyangle 1.1 -maxzangle 0.5 -maxidev 40 -w 20 -h 20';
my $totalnum = 7000;
my $tmpfile = 'tmp';
if ($#ARGV < 2) {
print "Usage: perl createtrainsamples.pl\n";
print " <positives_collection_filename>\n";
print " <negatives_collection_filename>\n";
print " <output_dirname>\n";
print " [<totalnum = " . $totalnum . ">]\n";
print " [<createsample_command_options = '" . $cmd . "'>]\n";
exit;
}
my $positive = $ARGV[0];
my $negative = $ARGV[1];
my $outputdir = $ARGV[2];
$totalnum = $ARGV[3] if ($#ARGV > 2);
$cmd = $ARGV[4] if ($#ARGV > 3);
open(POSITIVE, "< $positive");
my @positives = <POSITIVE>;
close(POSITIVE);
open(NEGATIVE, "< $negative");
my @negatives = <NEGATIVE>;
close(NEGATIVE);
# number of generated images from one image so that total will be $totalnum
my $numfloor = int($totalnum / @positives);
my $numremain = $totalnum - $numfloor * @positives;
# Get the directory name of positives
my $first = $positives[0];
my $last = $positives[$#positives];
while ($first ne $last) {
$first = dirname($first);
$last = dirname($last);
if ( $first eq "" ) { last; }
}
my $imgdir = $first;
my $imgdirlen = length($first);
for (my $k = 0; $k < @positives; $k++ ) {
my $img = $positives[$k];
my $num = ($k < $numremain) ? $numfloor + 1 : $numfloor;
# Pick up negative images randomly
my @localnegatives = ();
for (my $i = 0; $i < $num; $i++) {
        my $ind = int(rand(@negatives));
push(@localnegatives, $negatives[$ind]);
}
open(TMP, "> $tmpfile");
print TMP @localnegatives;
close(TMP);
#system("cat $tmpfile");
    chomp($img);
my $vec = $outputdir . substr($img, $imgdirlen) . ".vec" ;
print "$cmd -img $img -bg $tmpfile -vec $vec -num $num" . "\n";
system("$cmd -img $img -bg $tmpfile -vec $vec -num $num");
}
unlink($tmpfile);


@@ -0,0 +1,240 @@
#!/usr/bin/env python3
#import matplotlib.pyplot as plt
import argparse as ap
import numpy as np
import cv2 as cv
import os
import sys
class CustomFormatter(ap.HelpFormatter):
def _format_action_invocation(self, action):
if not action.option_strings:
metavar, = self._metavar_formatter(action, action.dest)(1)
return metavar
else:
parts = []
if action.nargs == 0:
parts.extend(action.option_strings)
else:
default = action.dest.upper()
args_string = self._format_args(action, default)
for option_string in action.option_strings:
#parts.append('%s %s' % (option_string, args_string))
parts.append('%s' % option_string)
parts[-1] += ' %s'%args_string
return ', '.join(parts)
# Parser Arguments
parser = ap.ArgumentParser(description='Cascade Classifier', formatter_class=CustomFormatter)
parser.add_argument("-s", "--save", metavar='', help="specify output name")
parser.add_argument("-c", "--cas", metavar='', help="specify specific trained cascade", default="./stage_outputs/cascade.xml")
parser.add_argument("-i", "--img", metavar='', help="specify image to be classified")
parser.add_argument("-d", "--dir", metavar='', help="specify directory of images to be classified")
parser.add_argument("-v", "--vid", metavar='', help="specify video to be classified")
parser.add_argument("-w", "--cam", metavar='', help="enable camera access for classification")
parser.add_argument("-f", "--fps", help="enable frames text (TODO)", action="store_true")
parser.add_argument("-o", "--circle", help="enable circle detection", action="store_true")
parser.add_argument("-z", "--scale", metavar='', help="decrease video scale by scale factor", type=int, default=1)
parser.add_argument("-t", "--track", metavar='', help="select tracking algorithm [KCF, CSRT, MEDIANFLOW]", choices=['KCF', 'CSRT', 'MEDIANFLOW'])
args = parser.parse_args(sys.argv[1:])
# Load the trained cascade
cascade = cv.CascadeClassifier()
if not cascade.load(args.cas):
print("Can't find cascade file. Do you have the directory ./stage_outputs/cascade.xml")
exit(0)
def plot():
pass
def detect_circles(src):
img = src
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
img_blur = cv.medianBlur(img_gray, 5)
rows = img_blur.shape[0]
    # For still images: circles = cv.HoughCircles(img_blur, cv.HOUGH_GRADIENT, 1, rows / 3, param1=100, param2=40, maxRadius=40)
circles = cv.HoughCircles(img_blur, cv.HOUGH_GRADIENT, 1, rows/3, param1=100, param2=15, minRadius=10, maxRadius=15)
if circles is not None:
circles = np.uint16(np.around(circles))
for i in circles[0, :]:
center = (i[0], i[1])
# circle center
cv.circle(img, center, 1, (0, 100, 100), 3)
# circle outline
radius = i[2]
cv.circle(img, center, radius, (255, 0, 255), 3)
return img
def choose_tracker():
    # Map the CLI choice to a tracker factory and construct only the one requested
    OPENCV_TRACKERS = {
        'KCF': cv.TrackerKCF_create,
        'CSRT': cv.TrackerCSRT_create,
        'MEDIANFLOW': cv.TrackerMedianFlow_create
    }
    tracker = OPENCV_TRACKERS[args.track]()
return tracker
def tracking(vid, tracker):
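    # Advance one frame, update the tracker, and draw the current bounding box and center point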
    ok, frame = vid.read()
    if not ok:
        return None
    frame = scale(frame, args.scale)
ok, roi = tracker.update(frame)
if ok:
p1 = (int(roi[0]), int(roi[1]))
p2 = (int(roi[0] + roi[2]), int(roi[1] + roi[3]))
cv.rectangle(frame, p1, p2, (0,255,0), 2, 1)
cpoint_circle = cv.circle(frame, (int(roi[0]+(roi[2]/2)), int(roi[1]+(roi[3]/2))), 3, (0,255,0), 3)
return frame
def save(frame):
# Need dimensions of frame to determine proper video output
fourcc = cv.VideoWriter_fourcc(*'XVID')
height, width, channels = frame.shape
out = cv.VideoWriter(args.save + '.avi', fourcc, 30.0, (width, height))
return out
def get_roi(frame):
# Get initial bounding box by running cascade detection on first frame
frame_gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
frame_gray = cv.GaussianBlur(frame_gray, (3, 3), 0)
cas_object = cascade.detectMultiScale(frame_gray)
if len(cas_object) == 0:
return []
roi = (cas_object[0][0], cas_object[0][1], cas_object[0][2], cas_object[0][3])
return roi
def get_cascade(frame):
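    # Run cascade detection on a frame and draw a box and center point on each hit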
frame_gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
frame_gray = cv.GaussianBlur(frame_gray, (3, 3), 0)
cas_object = cascade.detectMultiScale(frame_gray)
for (x, y, w, h) in cas_object:
cv.rectangle(frame, (x,y), (x+w, y+h), (0,0,255), 2)
cpoint_circle = cv.circle(frame, (int(x+(w/2)), int(y+(h/2))), 3, (0,0,255), 3)
return frame
def scale(frame, scale_factor):
height, width, channels = frame.shape
scaled_height = int(height/scale_factor)
scaled_width = int(width/scale_factor)
resized_frame = cv.resize(frame, (scaled_width, scaled_height))
return resized_frame
def img_classifier():
# Read image, convert to gray, equalize histogram, and detect.
img = cv.imread(args.img, cv.IMREAD_COLOR)
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
#img_gray = cv.equalizeHist(img_gray)
cas_object = cascade.detectMultiScale(img_gray)
for (x, y, w, h) in cas_object:
roi = cv.rectangle(img, (x,y), (x+w, y+h), (0,0,255), 2)
cpoint_circle = cv.circle(img, (int(x+(w/2)), int(y+(h/2))), 3, (0,0,255), 3)
if args.circle is True:
roi = img[y:y+h, x:x+w]
img = detect_circles(roi)
cv.imshow('image', img)
cv.waitKey(0)
cv.destroyAllWindows()
def dir_classifier():
imgs = []
for filename in os.listdir(args.dir):
img = cv.imread(os.path.join(args.dir, filename))
if img is not None:
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cas_object = cascade.detectMultiScale(img_gray)
for (x, y, w, h) in cas_object:
cv.rectangle(img, (x,y), (x+w, y+h), (0,0,255), 2)
cpoint_circle = cv.circle(img, (int(x+(w/2)), int(y+(h/2))), 3, (0,0,255), 3)
if args.circle is True:
roi = img[y:y+h, x:x+w]
img = detect_circles(roi)
cv.imshow(str(filename), img)
cv.waitKey(0)
imgs.append(img)
#print(imgs)
#return imgs
def vid_classifier():
vid = cv.VideoCapture(args.vid)
if not vid.isOpened():
print("Could not open video")
sys.exit()
    # Read the first frame and check it before scaling
    _, frame = vid.read()
    if not _:
        print("Cannot read video file")
        sys.exit()
    frame = scale(frame, args.scale)
if args.save is not None and _ is True:
out = save(frame=frame)
if args.track is not None and _ is True:
        # Advance frames until the cascade finds an initial ROI
        cas_roi = get_roi(frame)
        while not cas_roi:
            _, frame = vid.read()
            if not _:
                print("No object detected in the video")
                sys.exit()
            frame = scale(frame, args.scale)
            cas_roi = get_roi(frame)
tracker = choose_tracker()
tracker.init(frame, cas_roi)
    while vid.isOpened():
        ok, frame = vid.read()
        if not ok:
            break
        frame = scale(frame, args.scale)
        frame = get_cascade(frame)
        if args.track is not None:
            frame = tracking(vid=vid, tracker=tracker)
            if frame is None:
                break
        if args.circle is True:
            roi = get_roi(frame)
            if len(roi) > 0:
                roi_circle = frame[int(roi[1]):int(roi[1] + roi[3]), int(roi[0]):int(roi[0] + roi[2])]
                frame = detect_circles(roi_circle)
cv.imshow('video', frame)
if args.save is not None:
out.write(frame)
if cv.waitKey(1) & 0xFF == ord('q'):
break
if args.save is not None:
out.release()
vid.release()
cv.destroyAllWindows()
def cam_classifier():
cam = cv.VideoCapture(0)
if not cam.isOpened():
raise IOError("Cannot access camera")
    while cam.isOpened():
        ok, frame = cam.read()
        if not ok:
            break
        frame = get_cascade(frame)
        cv.imshow('camera', frame)
        if cv.waitKey(10) & 0xFF == ord('q'):
            break
cam.release()
cv.destroyAllWindows()
if __name__ == "__main__":
if args.img is not None:
img_classifier()
elif args.vid is not None:
vid_classifier()
elif args.dir is not None:
dir_classifier()
elif args.cam is not None:
        cam_classifier()
else:
parser.print_help()


@@ -0,0 +1,11 @@
cycler==0.10.0
kiwisolver==1.3.1
matplotlib==3.3.4
numpy==1.20.0
opencv-contrib-python==4.4.0.44
Pillow==8.1.0
pip-autoremove==0.9.1
pyparsing==2.4.7
python-dateutil==2.8.1
six==1.15.0


@@ -0,0 +1,12 @@
#!/usr/bin/env bash
set -ex
opencv_createsamples \
  -img ./positive_images/img1.png \
  -bg ./negatives.txt \
  -info ./annotations/annotations.lst \
  -pngoutput \
  -maxxangle 0.1 \
  -maxyangle 0.1 \
  -maxzangle 0.1


@@ -0,0 +1,99 @@
#!/usr/bin/env python3
import argparse as ap
import numpy as np
import cv2 as cv
import sys
class CustomFormatter(ap.HelpFormatter):
def _format_action_invocation(self, action):
if not action.option_strings:
metavar, = self._metavar_formatter(action, action.dest)(1)
return metavar
else:
parts = []
if action.nargs == 0:
parts.extend(action.option_strings)
else:
default = action.dest.upper()
args_string = self._format_args(action, default)
for option_string in action.option_strings:
#parts.append('%s %s' % (option_string, args_string))
parts.append('%s' % option_string)
parts[-1] += ' %s'%args_string
return ', '.join(parts)
# Parser Arguments
parser = ap.ArgumentParser(description='Get bbox / ROI coords and training images from videos', formatter_class=CustomFormatter)
parser.add_argument("-v", "--vid", metavar='', help="specify video to be loaded")
parser.add_argument("-o", "--center", help="select bounding box / ROI from center point", action='store_true')
parser.add_argument("-c", "--csv", metavar='', help="export CSV file with bbox coords")
parser.add_argument("-z", "--scale", metavar='', help="decrease video scale by scale factor", type=int, default=1)
args = parser.parse_args(sys.argv[1:])
# Tracker factories; construct only the tracker that is actually used
TRACKER_TYPES = {
    'CSRT': cv.TrackerCSRT_create,
    'KCF': cv.TrackerKCF_create,
    'MEDIANFLOW': cv.TrackerMedianFlow_create
}
def scale(frame, scale_factor):
height, width, channels = frame.shape
scaled_height = int(height/scale_factor)
scaled_width = int(width/scale_factor)
resized_frame = cv.resize(frame, (scaled_width, scaled_height))
return resized_frame
def create_csv(values):
np.savetxt(args.csv, values, delimiter=',', fmt='%s')
if __name__ == '__main__':
if args.vid is not None:
vid = cv.VideoCapture(args.vid)
if not vid.isOpened():
print("Could not open video")
sys.exit()
        _, frame = vid.read()
        if not _:
            print("Cannot read video file")
            sys.exit()
        frame = scale(frame, args.scale)
bbox = cv.selectROI(frame, showCrosshair=True, fromCenter=args.center)
csv_values = np.array([["x_min", "y_min", "x_max", "y_max", "frame_num"]])
        tracker = TRACKER_TYPES['CSRT']()
tracker.init(frame, bbox)
        while True:
            _, frame = vid.read()
            if not _:
                break
            frame = scale(frame, args.scale)
frame_number = vid.get(cv.CAP_PROP_POS_FRAMES)
_, roi = tracker.update(frame)
if _:
p1 = (int(roi[0]), int(roi[1]))
p2 = (int(roi[0] + roi[2]), int(roi[1] + roi[3]))
cv.rectangle(frame, p1, p2, (0,255,0), 2, 1)
cpoint_circle = cv.circle(frame, (int(roi[0]+(roi[2]/2)), int(roi[1]+(roi[3]/2))), 3, (0,255,0), 3)
csv_data = np.array([[int(roi[0]), int(roi[1]), int(roi[0] + roi[2]), int(roi[1] + roi[3]), int(frame_number)]])
# If your object is stationary, and you just want to train different lighting conditions
# csv_data = np.array([[441, 328, 612, 482, int(frame_number)]])
csv_values = np.append(csv_values, csv_data, 0)
create_csv(csv_values)
else:
# Tracking failure
cv.putText(frame, "Tracking Failure", (100,80), cv.FONT_HERSHEY_SIMPLEX, 0.75,(0,0,255),2)
# Display result
cv.imshow("Video", frame)
# Quit with "Q"
if cv.waitKey(1) & 0xFF == ord('q'):
break
else:
parser.print_help()


@@ -0,0 +1,170 @@
###############################################################################
# Copyright (c) 2014, Blake Wulfe
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
###############################################################################
"""
File: mergevec.py
Author: blake.w.wulfe@gmail.com
Date: 6/13/2014
File Description:
This file contains a function that merges .vec files called "merge_vec_files".
I made it as a replacement for mergevec.cpp (created by Naotoshi Seo.
See: http://note.sonots.com/SciSoftware/haartraining/mergevec.cpp.html)
in order to avoid recompiling openCV with mergevec.cpp.
To use the function:
(1) Place all .vec files to be merged in a single directory (vec_directory).
(2) Navigate to this file in your CLI (terminal or cmd) and type "python mergevec.py -v your_vec_directory -o your_output_filename".
The first argument (-v) is the name of the directory containing the .vec files
The second argument (-o) is the name of the output file
To test the output of the function:
(1) Install openCV.
(2) Navigate to the output file in your CLI (terminal or cmd).
        (3) Type "opencv_createsamples -w img_width -h img_height -vec output_filename".
This should show the .vec files in sequence.
"""
import sys
import glob
import struct
import argparse
import traceback
def exception_response(e):
exc_type, exc_value, exc_traceback = sys.exc_info()
lines = traceback.format_exception(exc_type, exc_value, exc_traceback)
for line in lines:
print(line)
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('-v', dest='vec_directory')
parser.add_argument('-o', dest='output_filename')
args = parser.parse_args()
return (args.vec_directory, args.output_filename)
def merge_vec_files(vec_directory, output_vec_file):
"""
    Iterates through the .vec files in a directory and combines them.
(1) Iterates through files getting a count of the total images in the .vec files
(2) checks that the image sizes in all files are the same
    The format of a .vec file is:

        4 bytes denoting number of total images (int)
        4 bytes denoting size of images (int)
        2 bytes denoting min value (short)
        2 bytes denoting max value (short)

        example header:
            hex: 6400 0000    4605 0000      0000       0000
                 # of images  size of h * w  min value  max value
            dec: 100          1350           0          0
:type vec_directory: string
:param vec_directory: Name of the directory containing .vec files to be combined.
Do not end with slash. Ex: '/Users/username/Documents/vec_files'
:type output_vec_file: string
:param output_vec_file: Name of aggregate .vec file for output.
Ex: '/Users/username/Documents/aggregate_vec_file.vec'
"""
# Check that the .vec directory does not end in '/' and if it does, remove it.
if vec_directory.endswith('/'):
vec_directory = vec_directory[:-1]
# Get .vec files
files = glob.glob('{0}/*.vec'.format(vec_directory))
# Check to make sure there are .vec files in the directory
if len(files) <= 0:
print('Vec files to be merged could not be found from directory: {0}'.format(vec_directory))
sys.exit(1)
# Check to make sure there are more than one .vec files
if len(files) == 1:
print('Only 1 vec file was found in directory: {0}. Cannot merge a single file.'.format(vec_directory))
sys.exit(1)
    # Get the value for the first image size
    prev_image_size = 0
    try:
        with open(files[0], 'rb') as vecfile:
            # Read the 12-byte header as raw bytes ('<iihh' = little-endian int, int, short, short)
            val = struct.unpack('<iihh', vecfile.read(12))
            prev_image_size = val[1]
    except IOError as e:
        print('An IO error occurred while processing the file: {0}'.format(files[0]))
        exception_response(e)
    # Get the total number of images
    total_num_images = 0
    for f in files:
        try:
            with open(f, 'rb') as vecfile:
                val = struct.unpack('<iihh', vecfile.read(12))
                num_images = val[0]
                image_size = val[1]
                if image_size != prev_image_size:
                    err_msg = """The image sizes in the .vec files differ. These values must be the same. \n The image size of file {0}: {1}\n
                    The image size of previous files: {2}""".format(f, image_size, prev_image_size)
                    sys.exit(err_msg)
                total_num_images += num_images
        except IOError as e:
            print('An IO error occurred while processing the file: {0}'.format(f))
            exception_response(e)
    # Iterate through the .vec files, writing their data (not the header) to the output file
    # '<iihh' means 'little endian, int, int, short, short'
    header = struct.pack('<iihh', total_num_images, image_size, 0, 0)
    try:
        with open(output_vec_file, 'wb') as outputfile:
            outputfile.write(header)
            for f in files:
                with open(f, 'rb') as vecfile:
                    # Skip each file's 12-byte header and append the raw sample bytes
                    data = vecfile.read()[12:]
                    outputfile.write(data)
except Exception as e:
exception_response(e)
if __name__ == '__main__':
vec_directory, output_filename = get_args()
if not vec_directory:
        sys.exit('mergevec requires a directory of vec files. Call mergevec.py with -v /your_vec_directory')
if not output_filename:
sys.exit('mergevec requires an output filename. Call mergevec.py with -o your_output_filename')
merge_vec_files(vec_directory, output_filename)