This commit is contained in:
louiscklaw
2025-01-31 22:53:38 +08:00
parent c7fb335275
commit e475cf3096
11 changed files with 546 additions and 0 deletions

View File

@@ -0,0 +1,142 @@
# Cascade_tools
Automate the retraining of cascade classifiers with OpenCV by providing:
- a simple annotation tool for image/video capture and save
- generation of additional positive images from existing positives and negatives
- retraining of OpenCV Cascade classifiers using HAAR, LBP features
<img src="./info/qc.png" alt="qc" width="400">
# Directory structure
```
.
├── annotate.py -> image/video annotation tool
├── clean.sh -> clean all generated files/directories for a new training
├── data -> cascade classifier output directory
├── genpos -> directory will contain the additional positives generated from the existing ones
├── genpos.sh -> generate additional positives from the existing ones
├── info -> images displayed in the readme file
├── LICENSE
├── neg -> holds the negative images for training
├── pos -> holds the positive images for training
├── prepare_data.py -> generate the dat files needed for OpenCV cascade tool
├── raw -> directory to store annotated images
├── README.md
├── test_cascade.py -> simple tool to test the cascade classifier
└── train.sh -> start the training process
```
# How to use
The scripts rely on OpenCV's built-in tools, so OpenCV must be installed first.
### Step 1 - OpenCV install
```
sudo apt-get install python-opencv
```
To check that the OpenCV cascade trainer tools are present, type in a terminal window:
```
opencv_[TAB]
```
The following list should appear:
```
opencv_annotation opencv_traincascade opencv_visualisation
opencv_createsamples opencv_version
```
If this list appears, the tools are installed and we can proceed.
### Step 2 - Collect negative and positive images
Cascade classifiers are very sensitive to the training data, so when collecting negative samples make sure that none of them contains the object to be detected. Negatives can be collected from the internet; the format must be *.jpg, and the size does not matter.
Positive samples can be collected with the annotation tool; captures will be saved in the "raw" directory:
```
python3 annotate.py --help
usage: annotate.py [-h] [-cam CAM] [-vid VID]
Simple annotation tool Use "a" to start to annotate, ENTER to return
optional arguments:
-h, --help show this help message and exit
-cam CAM Camera index
-vid VID Video file
```
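Note that annotate.py saves its captures to the "raw" directory, while the training scripts read images from "pos" and "neg"; presumably the usable crops are reviewed and copied from "raw" into "pos" before training.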
Usually about 300 positive images and 1000-1200 negatives are needed for a reasonable training run with few false positives.
### Step 3 - Retrain Cascade classifier
The train.sh script does the 'dirty work'. There are many hyperparameters to set, which influence both the classification accuracy and the training time.
```
# Number of training stages
nrStages=20
# detection types [HAAR,LBP]
fn=HAAR
# object size, width, height
imgw=30
imgh=21
```
Usually a well-trained cascade reaches all 20 training stages. HAAR is slower than LBP, but more accurate. The default object size is (24,24); here it is set for rectangular objects. Changing these parameters has a high impact on the training time.
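For reference, training consumes two descriptor files: 'info.dat' lists each positive image together with an object count and bounding box, and 'neg.txt' lists the negative image paths (both are written by prepare_data.py, which genpos.sh calls). The file names below are hypothetical examples; the fields follow the `path count x y w h` layout that prepare_data.py writes:
```
info.dat:  ./pos/img_1612000000.jpg 1 0 0 80 60
neg.txt:   ./neg/background_001.jpg
```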
If everything went well, the training will start:
```
===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 13 : 13
NEG count : acceptanceRatio 13 : 1
Precalculation time: 0
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 1|
+----+---------+---------+
| 2| 1| 0|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 0 seconds.
```
After the training has finished (i.e. the maximum number of stages is reached), a 'cascade.xml' file will be created automatically in the 'data' folder.
### Step 4 - testing the results
After the 'cascade.xml' file has been created, testing follows. Use the utility below for fast feedback.
```
python3 test_cascade.py --help
usage: test_cascade.py [-h] [-cam CAM] [-n N] [-s S]
Cascade tester. Defaults: -cam 0 -n 0 -s 1.1
optional arguments:
-h, --help show this help message and exit
-cam CAM Camera ID
-n N Number of neighbors for detections
-s S Scale factor
```
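For orientation, the -s and -n flags map to the scaleFactor and minNeighbors arguments of OpenCV's detectMultiScale. A minimal detection loop, roughly what test_cascade.py does internally (a sketch assuming the trained file sits at ./data/cascade.xml and a webcam is available at index 0), could look like this:
```
import cv2

# load the trained cascade and open the camera
cascade = cv2.CascadeClassifier('data/cascade.xml')
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    # the detector works on an equalized grayscale image
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)
    # -s maps to scaleFactor, -n maps to minNeighbors
    detections = cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=2)
    for (x, y, w, h) in detections:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('detections', frame)
    if cv2.waitKey(10) == 27:  # ESC quits
        break

cap.release()
cv2.destroyAllWindows()
```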
If a lot of random detections (false positives) appear:
<img src="./info/false_positives.png" alt="false positives" width="400">
- try to adjust the classifier hyperparameters passed to 'test_cascade.py':
```
python3 test_cascade.py -n 2 -s 1.2
```
- increase the number of positive/negative samples, use 'clean.sh' to clear the intermediate files and retrain the cascade classifier until the results are acceptable; additional positives can also be synthesized with 'genpos.sh', as sketched below.
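A typical retraining iteration, using only the repository's own scripts and run from the repository root, might look like this:
```
./clean.sh     # remove data/, genpos/ contents and the old descriptor files
./genpos.sh    # optional: synthesize extra positives and rebuild info.dat / neg.txt
./train.sh     # rebuild positives.vec and retrain the cascade
```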
# Resources
https://docs.opencv.org/2.4/doc/user_guide/ug_traincascade.html
/Enjoy.

View File

@@ -0,0 +1,62 @@
#! /usr/bin/python
from datetime import datetime
import cv2
import numpy as np
import argparse
import os
# input arguments
parser = argparse.ArgumentParser(description='Simple annotation tool \n \
Use "a" to start annotate, ENTER to return \n')
parser.add_argument('-cam', type=int, help='Camera index', default=1)
parser.add_argument('-vid', type=str, help='Video file', default='')
args = parser.parse_args()
path, filename = os.path.split(os.path.realpath(__file__))
# select video or camera
if args.vid != '':
    cap = cv2.VideoCapture(args.vid)
else:
    cap = cv2.VideoCapture(args.cam)

roi = []
# placeholder buffers for the full frame and the cropped capture
frame = np.zeros((480, 640, 3), dtype=np.uint8)
img = np.zeros((120, 160, 3), dtype=np.uint8)
ccount = 0

cv2.namedWindow('annotate')
cv2.namedWindow('capture')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.resize(frame, (640, 480))
    k = cv2.waitKey(10)
    # quit on ESC key
    if k == 27:
        break
    # start annotation, save the cropped capture with a timestamped name
    if k == ord('a'):
        roi = cv2.selectROI('annotate', frame)
        img = frame[int(roi[1]):int(roi[1] + roi[3]), int(roi[0]):int(roi[0] + roi[2])]
        now = datetime.now()
        ts = datetime.timestamp(now)
        cv2.imwrite(path + '/raw/' + 'img_' + str(int(ts)) + '.jpg', img)
        ccount += 1
        print("Capture count: " + str(ccount))
        cv2.imshow('capture', img)
    cv2.imshow('annotate', frame)

cap.release()
cv2.destroyAllWindows()

View File

@@ -0,0 +1,9 @@
#!/bin/bash
rm -rfv data/*
rm -rfv genpos/*
rm -f neg.txt
rm -f info.dat
rm -f positives.vec

View File

@@ -0,0 +1,44 @@
#!/bin/bash
echo "Generates positive pictures from existing positives and negatives"
echo "Parameters: postine width, hight and number of positives to generate"
# celanup
rm -rfv genpos/*
# script parameters
imgw=30
imgh=21
# number of positives from a single image
nrPos=10
# generate positives for all images from ./genpos folder
FILES="./pos"
# create positive dat file and negative txt file
python3 ./prepare_data.py -posWidthX 80 -posWidthY 60 -negWidthX 160 -negWidthY 120
for f in $FILES/*
do
# create positives from a single image
opencv_createsamples -img $f -bg neg.txt -info ./genpos/genpos.dat -pngoutput genpos -maxxangle 1.1 -maxyangle 1.1 -maxzangle 1.1 -num $nrPos -w $imgw -h $imgh
done
# concatenate the original positives with the generated ones
SCRIPTPATH="$( cd "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"
FILEGP='./genpos/genpos.dat'
while read line; do
  # for each generated sample, prepend the absolute pos/ path and append the line to info.dat
  str=$SCRIPTPATH/pos/$line
  echo $str >> ./info.dat
done < $FILEGP
# copy all generated .jpg files to pos folder
cp ./genpos/*jpg ./pos

View File

@@ -0,0 +1,27 @@
import os
import glob
from PIL import Image
# convert negative PNGs to JPEG
for folder in ['neg']:
    for f in glob.glob(folder + '/*.png'):
        img = Image.open(f)
        rgb_img = img.convert('RGB')
        rgb_img.save(f.replace('.png', '.jpg'), 'JPEG')

# convert positive PNGs to JPEG, then grayscale and resize the JPEGs in place
for folder in ['pos']:
    for f in glob.glob(folder + '/*.png'):
        img = Image.open(f)
        rgb_img = img.convert('RGB')
        rgb_img.save(f.replace('.png', '.jpg'), 'JPEG')
    for f in glob.glob(folder + '/*.jpg'):
        img = Image.open(f)
        gray_img = img.convert('L')
        gray_img.save(f, 'JPEG')
    for f in glob.glob(folder + '/*.jpg'):
        img = Image.open(f)
        img = img.resize((24, 24), Image.ANTIALIAS)
        rgb_img = img.convert('RGB')
        rgb_img.save(f, 'JPEG')

View File

@@ -0,0 +1,90 @@
#!/usr/bin/env python3
import cv2
import numpy as np
import sys
import os
import time
import glob
import argparse
# input arguments
parser = argparse.ArgumentParser(description='Prepare dataset for the Cascade trainer.\n Defaults: posWidthX=80, posWidthY=60, negWidthX=160, negWidthY=120, location /pos/*.*, /neg/*.*')
parser.add_argument('-posWidthX', type=int, help='Positive sample x Width', default=80)
parser.add_argument('-posWidthY', type=int, help='Positive sample y Width', default=60)
parser.add_argument('-negWidthX', type=int, help='Negative sample x Width', default=160)
parser.add_argument('-negWidthY', type=int, help='Negative sample y Width', default=120)
parser.add_argument('-pos', type=str, help='Positive samples location', default='/pos/*.*')
parser.add_argument('-neg', type=str, help='Negative samples location', default='/neg/*.*')
args = parser.parse_args()
# sample sizes
POS_SIZE=(args.posWidthX,args.posWidthY)
NEG_SIZE = (args.negWidthX,args.negWidthY)
# current path
path, filename = os.path.split(os.path.realpath(__file__))
# read files
pfiles = os.popen('find ./pos |grep -i jpg').read().splitlines()
nfiles = os.popen('find ./neg |grep -i jpg').read().splitlines()
# check directory structure
if len(pfiles) == 0:
    print('no positive images found, check pos dir!')
    sys.exit(1)
if len(nfiles) == 0:
    print('no negative images found, check neg dir!')
    sys.exit(1)
# create positive images descriptor
f = open(path + '/info.dat', 'w')
for pf in pfiles:
    # descriptor line for info.dat => <path> <object count> <x> <y> <w> <h>
    infoline = pf + ' 1 0 0 ' + str(POS_SIZE[0]) + ' ' + str(POS_SIZE[1]) + '\n'
    f.write(infoline)
    img = cv2.imread(pf)
    h, w = img.shape[:2]
    # resize only if the picture dimensions are different
    if (w, h) != POS_SIZE:
        img = cv2.resize(img, POS_SIZE)
    # convert to grayscale and equalize; skip if the conversion fails (e.g. already grayscale)
    try:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = cv2.equalizeHist(img)
    except:
        pass
    cv2.imwrite(pf, img)
f.close()
# create negative images descriptor
f = open(path + '/neg.txt', 'w')
for nf in nfiles:
    f.write(nf + '\n')
    img = cv2.imread(nf)
    h, w = img.shape[:2]
    # resize only if the picture dimensions are different
    if (w, h) != NEG_SIZE:
        img = cv2.resize(img, NEG_SIZE)
    # convert to grayscale and equalize; skip if the conversion fails (e.g. already grayscale)
    try:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = cv2.equalizeHist(img)
    except:
        pass
    cv2.imwrite(nf, img)
f.close()

View File

@@ -0,0 +1,94 @@
#!/usr/bin/env python3
import cv2
import numpy as np
import os
import argparse
class Cascade():
    def __init__(self, data_file):
        """Thin wrapper around an OpenCV cascade classifier.
        Args:
            data_file (str): path to the cascade file
        """
        self.cc = cv2.CascadeClassifier()
        self.cc.load(data_file)
        self.objects = []
        self.scale = 1.1
        self.neighbor = 0

    def set_parameters(self, scale=1.1, neighbor=0):
        """
        Set detection parameters
        Args:
            scale (float, optional): Scale factor. Defaults to 1.1.
            neighbor (int, optional): Number of neighbors for detections. Defaults to 0.
        """
        self.scale = scale
        self.neighbor = neighbor

    def get_detections(self, img):
        """
        Run the cascade detector on an image
        Args:
            img (image): input BGR image
        Returns a list of (x,y,w,h) for the objects detected
        """
        img_g = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img_g = cv2.equalizeHist(img_g)
        detections = self.cc.detectMultiScale(img_g, self.scale, self.neighbor, flags=cv2.CASCADE_DO_CANNY_PRUNING)
        return detections

    def display(self, img):
        """
        Draw the detections on an image
        Args:
            img (image): color BGR image
        """
        detections = self.get_detections(img)
        for (x, y, w, h) in detections:
            cv2.rectangle(img, (x, y), (x + w, y + h), [0, 255, 0], 2)
        return img


if __name__ == '__main__':
    # input arguments
    parser = argparse.ArgumentParser(description='Cascade tester.\n Defaults: -cam 0 -n 0 -s 1.1')
    parser.add_argument('-cam', type=int, help='Camera ID', default=0)
    parser.add_argument('-n', type=int, help='Number of neighbors for detections', default=0)
    parser.add_argument('-s', type=float, help='Scale factor', default=1.1)
    args = parser.parse_args()

    path, filename = os.path.split(os.path.realpath(__file__))
    f = path + '/data/cascade.xml'

    cap = cv2.VideoCapture(args.cam)
    cd = Cascade(f)
    cd.set_parameters(args.s, args.n)

    if not cap.isOpened():
        exit(0)

    while True:
        ret, frame = cap.read()
        if frame is None:
            break
        if cv2.waitKey(10) == 27:
            break
        frame = cd.display(frame)
        cv2.imshow('Capture', frame)

    cap.release()
    cv2.destroyAllWindows()

View File

@@ -0,0 +1,78 @@
python png_to_jpg.py
echo "Create vector file from positives, negatives and starts the training"
echo "Parameters: posivies size, number of posties, "
# Number of training stages
nrStages=20
# detection types [HAAR,LBP]
# fn=HAAR
# object size, width, height
imgw=24
imgh=24
# # # create positive dat file and negative txt file
# python3 ./prepare_data.py \
# -posWidthX 24 \
# -posWidthY 24 \
# -negWidthX 64 \
# -negWidthY 64
# number of images containing the object to detect (see pos directory content)
nrf=0
FILES=$(find ./pos -type f -name "*.jpg")
for f in $FILES; do
nrf=$((nrf+1))
done
nrPos=$nrf
nrPosThreeTimes=$(($nrPos * 3))
# number of images NOT containing the object to detect (see neg directory content)
nrf=0
FILES=$(find ./neg -type f -name "*.jpg")
for f in $FILES; do
nrf=$((nrf+1))
done
nrNeg=$nrf
# create vector file from positives
opencv_createsamples \
-info info.dat \
-vec positives.vec \
-bg neg.txt \
-maxxangle 1.1 \
-maxyangle 1.1 \
-maxzangle 1.1 \
-w $imgw -h $imgh \
-num 700
# create positives from a single image
#opencv_createsamples \
# -img object.jpg \
# -bg neg.txt \
# -info info/info.lst \
# -pngoutput info \
# -maxxangle 1.1 \
# -maxyangle 1.1 \
# -maxzangle 1.1 \
# -num $nrPos \
# -w $imgw -h $imgh
# train cascade
opencv_traincascade \
-data data \
-vec positives.vec \
-bg neg.txt \
-numPos 700 -numNeg $nrNeg \
-numStages $nrStages \
-w $imgw -h $imgh \
-minHitRate 0.999 -maxFalseAlarmRate 0.5 \
-mode ALL \
-numThreads 4 \
-precalcValBufSize 2048 -precalcIdxBufSize 2048
# -featureType $fn \