Project author: Sharpiless

Project description:
Social-distancing detection for pedestrians based on PaddlePaddle
Primary language: Jupyter Notebook
Project URL: git://github.com/Sharpiless/Social-safety-distance-detection-with-paddlepaddle.git


[PaddleX for Epidemic Prevention] Pedestrian Social-Distancing Detection Based on PaddleX

Background:

During the current crisis, one of the measures for reducing transmission is social distancing. Although many cities are now cautiously reopening, people still need to keep a safe distance from one another when they go out.

It is therefore important for cities to assess whether people are complying with distancing rules and to take action accordingly. If most people follow the rules during the epidemic, more public spaces can be reopened safely.

Overview:

This project trains the YOLOv3 model provided by PaddleX on the Pascal VOC dataset.

The trained model can detect pedestrians (along with the other VOC classes, such as vehicles) in surveillance footage, reaching an mAP of about 0.6.

The detection boxes are then vectorized, and pairwise L2 norms between these representations are computed to group people who are standing too close to one another.
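
To make this concrete before the full pipeline below, here is a minimal sketch of the idea (illustrative only; the `close_pairs` helper and the pixel threshold are assumptions, not project code) that flags pairs of boxes whose centers are within a given L2 distance:

    import numpy as np

    def close_pairs(boxes, dist_thresh=50.0):
        # boxes: [[x, y, w, h], ...]; returns index pairs whose centers are
        # closer than dist_thresh pixels under the L2 norm
        centers = np.array([[x + w / 2.0, y + h / 2.0] for x, y, w, h in boxes])
        pairs = []
        for i in range(len(centers)):
            for j in range(i + 1, len(centers)):
                if np.linalg.norm(centers[i] - centers[j]) < dist_thresh:
                    pairs.append((i, j))
        return pairs

    print(close_pairs([[0, 0, 20, 40], [30, 0, 20, 40], [300, 0, 20, 40]]))
    # -> [(0, 1)]: only the first two boxes are within 50 px of each other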

1. Install PaddleX

    !pip install paddlex -i https://mirror.baidu.com/pypi/simple

2. Unzip the dataset

    # !unzip /home/aistudio/data/data4379/pascalvoc.zip -d /home/aistudio/work/

3. Set the working directory

    import matplotlib
    matplotlib.use('Agg')  # non-interactive backend for the AI Studio environment
    import os

    os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # use GPU 0

    import paddlex as pdx

    os.chdir('/home/aistudio/work/')

4. Generate the dataset's TXT files

PaddleX accepts VOC-format data. The training and validation sets are each defined by a txt file that stores an image path and the corresponding annotation path, in the following format:

JPEGImages/2009_003143.jpg Annotations/2009_003143.xml

JPEGImages/2012_001604.jpg Annotations/2012_001604.xml

    base = '/home/aistudio/work/pascalvoc/VOCdevkit/VOC2012/'
    imgs = os.listdir(os.path.join(base, 'JPEGImages'))
    print('total:', len(imgs))

    # All but the last 200 images form the training set
    with open(os.path.join(base, 'train_list.txt'), 'w') as f:
        for im in imgs[:-200]:
            info = 'JPEGImages/' + im + ' '
            info += 'Annotations/' + im[:-4] + '.xml\n'
            f.write(info)

    # The last 200 images are held out as the validation set
    with open(os.path.join(base, 'val_list.txt'), 'w') as f:
        for im in imgs[-200:]:
            info = 'JPEGImages/' + im + ' '
            info += 'Annotations/' + im[:-4] + '.xml\n'
            f.write(info)

    # The 20 Pascal VOC classes
    CLASSES = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
               'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant', 'sheep', 'sofa',
               'train', 'tvmonitor']
    with open('labels.txt', 'w') as f:
        for v in CLASSES:
            f.write(v + '\n')

Output:

    total: 17125

5. Define the data preprocessing pipeline

Data augmentation here includes image mixup, random pixel distortion, random expansion, random cropping, and random horizontal flipping.

    from paddlex.det import transforms

    train_transforms = transforms.Compose([
        transforms.MixupImage(mixup_epoch=250),
        transforms.RandomDistort(),
        transforms.RandomExpand(),
        transforms.RandomCrop(),
        transforms.Resize(target_size=512, interp='RANDOM'),
        transforms.RandomHorizontalFlip(),
        transforms.Normalize(),
    ])

    eval_transforms = transforms.Compose([
        transforms.Resize(target_size=512, interp='CUBIC'),
        transforms.Normalize(),
    ])

6. Define the training and validation sets

The last 200 images are held out as the validation set.

    train_dataset = pdx.datasets.VOCDetection(
        data_dir=base,
        file_list=os.path.join(base, 'train_list.txt'),
        label_list='labels.txt',
        transforms=train_transforms,
        shuffle=True)
    eval_dataset = pdx.datasets.VOCDetection(
        data_dir=base,
        file_list=os.path.join(base, 'val_list.txt'),
        label_list='labels.txt',
        transforms=eval_transforms)

Output:

    2020-07-20 22:13:24,797-INFO: font search path ['/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf', '/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/afm', '/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/pdfcorefonts']
    2020-07-20 22:13:25,134-INFO: generated new fontManager
    2020-07-20 22:13:25 [INFO] Starting to read file list from dataset...
    2020-07-20 22:13:51 [INFO] 16925 samples in file /home/aistudio/work/pascalvoc/VOCdevkit/VOC2012/train_list.txt
    creating index...
    index created!
    2020-07-20 22:13:52 [INFO] Starting to read file list from dataset...
    2020-07-20 22:13:52 [INFO] 200 samples in file /home/aistudio/work/pascalvoc/VOCdevkit/VOC2012/val_list.txt
    creating index...
    index created!

7. Define and train the model

This defines a YOLOv3 model with DarkNet53 as the backbone network.

    # Note: PaddleX's YOLOv3 counts only foreground classes (no background),
    # so len(train_dataset.labels) alone would suffice; the extra +1 slot
    # here is unused but does not prevent training.
    num_classes = len(train_dataset.labels) + 1
    print('class num:', num_classes)

    model = pdx.det.YOLOv3(
        num_classes=num_classes,
        backbone='DarkNet53'
    )

    model.train(
        num_epochs=20,
        train_dataset=train_dataset,
        train_batch_size=4,
        eval_dataset=eval_dataset,
        learning_rate=0.00025,
        lr_decay_epochs=[10, 15],
        save_interval_epochs=4,
        log_interval_steps=100,
        save_dir='./YOLOv3',
        pretrain_weights='IMAGENET',
        use_vdl=True)

8. Evaluate the model

    model = pdx.load_model('./YOLOv3/best_model')
    model.evaluate(eval_dataset, batch_size=1, epoch_id=None, metric=None, return_details=False)

Output:

    2020-07-20 22:15:01 [INFO] Model[YOLOv3] loaded.
    2020-07-20 22:15:01 [INFO] Start to evaluating(total_samples=200, total_steps=200)...
    100%|██████████| 200/200 [00:17<00:00, 11.76it/s]
    OrderedDict([('bbox_map', 61.626957651198275)])

9. Test the model's detection results

    import cv2
    import time
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline

    image_name = './test.jpg'
    start = time.time()
    result = model.predict(image_name, eval_transforms)
    print('infer time:{:.6f}s'.format(time.time() - start))
    print('detected num:', len(result))

    im = cv2.imread(image_name)
    font = cv2.FONT_HERSHEY_SIMPLEX
    threshold = 0.0  # keep all detections; raise this to drop low-confidence boxes
    for value in result:
        # PaddleX returns boxes as [xmin, ymin, width, height]
        xmin, ymin, w, h = np.array(value['bbox']).astype(int)
        cls = value['category']
        score = value['score']
        if score < threshold:
            continue
        cv2.rectangle(im, (xmin, ymin), (xmin + w, ymin + h), (0, 255, 0), 4)
        cv2.putText(im, '{:s} {:.3f}'.format(cls, score),
                    (xmin, ymin), font, 0.5, (255, 0, 0), thickness=2)

    cv2.imwrite('result.jpg', im)
    plt.figure(figsize=(15, 12))
    plt.imshow(im[:, :, [2, 1, 0]])  # BGR -> RGB for matplotlib
    plt.show()

Output:

    infer time:0.118807s
    detected num: 17

(Figure: detection results drawn on test.jpg)

10. Export a quantized model

Post-training quantization is applied through the slim interface provided by PaddleX.

    pdx.slim.export_quant_model(model, eval_dataset, batch_size=2, batch_num=200, save_dir='./quant_model', cache_dir='./temp')
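
The exported model can then be reloaded for inference. A minimal sketch, assuming the quantized model directory loads through the same `pdx.load_model` API as a regular model:

    # Assumption: a quantized PaddleX model loads through the standard API
    quant_model = pdx.load_model('./quant_model')
    result = quant_model.predict('./test.jpg')
    print('detected num:', len(result))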

11. Define the pedestrian social-distance detector:

Following the paper "Monitoring COVID-19 social distancing with person detection and tracking via fine-tuned YOLO v3 and Deepsort techniques", a clustering-style grouping (implemented below as pairwise circle-intersection tests) is used to detect people standing too close together.

    class PersonDistanceDetector(object):

        def __init__(self):
            self.model = pdx.load_model('./YOLOv3/best_model')
            self.boxes = []
            self.threshold = 0.3
            self.process_this_frame = False

        def feedCap(self, frame):
            # Run detection on every other frame; reuse the cached boxes otherwise
            self.process_this_frame = not self.process_this_frame
            retDict = {
                'frame': None,        # frame with the detection results drawn on it
                'list_of_cars': [],   # cropped pedestrian images (from the first tracked frame)
                'list_of_ids': [],    # pedestrian IDs
            }
            height, width, channels = frame.shape
            if self.process_this_frame:
                result = self.model.predict(frame)
                # Keep only confident detections of the 'person' class
                self.boxes = [v['bbox'] for v in result
                              if v['score'] > self.threshold and v['category'] == 'person']

            # Mark each person with a circle at the mid-height of the box
            circles = []
            for i in range(len(self.boxes)):
                x, y = int(self.boxes[i][0]), int(self.boxes[i][1])
                w, h = int(self.boxes[i][2]), int(self.boxes[i][3])
                cx, cy = x + w // 2, y + h
                frame = cv2.line(frame, (cx, cy), (cx, cy - h // 2), (0, 255, 0), 2)
                frame = cv2.circle(frame, (cx, cy - h // 2), 5, (255, 20, 200), -1)
                circles.append([cx, cy - h // 2, h])

            # Pairwise test: two people are flagged when their circles
            # (radii proportional to box height) touch or overlap and the
            # two boxes sit at roughly the same depth in the image
            int_circles_list = []
            indexes = []
            for i in range(len(circles)):
                x1, y1, r1 = circles[i]
                for j in range(i + 1, len(circles)):
                    x2, y2, r2 = circles[j]
                    if int_circle(x1, y1, x2, y2, r1 // 2, r2 // 2) >= 0 and abs(y1 - y2) < r1 // 4:
                        indexes.append(i)
                        indexes.append(j)
                        int_circles_list.append([x1, y1, r1])
                        int_circles_list.append([x2, y2, r2])
                        cv2.line(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)

            # Draw a ground marker under each person:
            # red for "too close", green otherwise
            rows, cols, _ = frame.shape
            for i in range(len(circles)):
                x, y, r = circles[i]
                color = (0, 0, 255) if i in indexes else (0, 200, 20)
                scale = r / 100
                transparentOverlay1(
                    frame, dst_circle, (x, y - 5), alphaVal=110, color=color, scale=scale)

            cv2.rectangle(frame, (0, rows - 80), (cols, rows), (0, 0, 0), -1)
            cv2.putText(frame,
                        "Total Persons : " + str(len(self.boxes)),
                        (20, rows - 40),
                        fontFace=cv2.QT_FONT_NORMAL,
                        fontScale=1,
                        color=(215, 220, 245))
            retDict['frame'] = frame
            return retDict

    def int_circle(x1, y1, x2, y2, r1, r2):
        # Returns 1 if the two circles touch, -1 if they are disjoint,
        # and 0 if they overlap
        distSq = (x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2)
        radSumSq = (r1 + r2) * (r1 + r2)
        if distSq == radSumSq:
            return 1
        elif distSq > radSumSq:
            return -1
        else:
            return 0

    def get_bounding_box(outs, height, width):
        # Leftover helper from the reference implementation for parsing raw
        # YOLO outputs; it is not used by PersonDistanceDetector above
        class_ids = []
        confidences = []
        boxes = []
        for out in outs:
            for detection in out:
                scores = detection[5:]
                class_id = np.argmax(scores)
                if class_id != 0:
                    continue  # 0 is the ID of the person class
                confidence = scores[class_id]
                if confidence > 0.3:
                    # Object detected; raw outputs are normalized center/size
                    center_x = int(detection[0] * width)
                    center_y = int(detection[1] * height)
                    w = int(detection[2] * width)
                    h = int(detection[3] * height)
                    # Rectangle coordinates
                    x = int(center_x - w / 2)
                    y = int(center_y - h / 2)
                    boxes.append([x, y, w, h])
                    confidences.append(float(confidence))
                    class_ids.append(class_id)
        return boxes, confidences, class_ids

    def transparentOverlay1(src, overlay, pos=(0, 0), scale=1, color=(0, 200, 100), alphaVal=255):
        # Alpha-blend a recolored copy of `overlay` onto `src`, centered at `pos`
        overlay = cv2.resize(overlay.copy(), (0, 0), fx=scale, fy=scale)
        h, w, _ = overlay.shape    # size of foreground
        rows, cols, _ = src.shape  # size of background image
        x, y = pos[0], pos[1]      # position of foreground/overlay image
        x -= w // 2
        background = src[y:min(y + h, rows), x:min(x + w, cols)]
        b_h, b_w, _ = background.shape
        if b_h <= 0 or b_w <= 0:
            return src
        foreground = overlay[0:b_h, 0:b_w]
        alpha = foreground[:, :, 1].astype(float)
        alpha[alpha > 235] = alphaVal
        alpha = cv2.merge([alpha, alpha, alpha])
        alpha = alpha / 255.0
        foreground = foreground.astype(float)
        background = background.astype(float)
        foreground = np.zeros_like(foreground) + color
        foreground = cv2.multiply(alpha, foreground[:, :, :3])
        background = cv2.multiply(1.0 - alpha, background)
        outImage = cv2.add(foreground, background).astype("uint8")
        src[y:y + b_h, x:x + b_w] = outImage
        return src

    # Perspective-warp a drawn circle into an ellipse used as the ground marker
    M3 = np.array([
        [0.8092, -0.2960, 11],
        [0.0131, 0.0910, 30],
        [0.0001, -0.0052, 1.0]
    ])
    circle_img = np.zeros((100, 100, 3))
    cv2.circle(circle_img, (50, 50), 40, (0, 240, 0), 4)
    dst_circle = cv2.warpPerspective(circle_img, M3, (100, 100))
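
To make the `int_circle` convention explicit, here is a small hand-checked example (illustrative; these values are not from the notebook):

    # int_circle: 1 = externally tangent, 0 = overlapping, -1 = disjoint
    print(int_circle(0, 0, 10, 0, 5, 5))  # distSq == radSumSq -> 1
    print(int_circle(0, 0, 6, 0, 5, 5))   # distSq <  radSumSq -> 0
    print(int_circle(0, 0, 30, 0, 5, 5))  # distSq >  radSumSq -> -1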

12. Test and save the detection video:

    !pip install imutils

Output:

    Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
    Collecting imutils
      Downloading https://mirrors.tuna.tsinghua.edu.cn/pypi/web/packages/b5/94/46dcae8c061e28be31bcaa55c560cb30ee9403c9a4bb2659768ec1b9eb7d/imutils-0.5.3.tar.gz
    Building wheels for collected packages: imutils
      Building wheel for imutils (setup.py) ... done
      Created wheel for imutils: filename=imutils-0.5.3-cp37-none-any.whl size=25850 sha256=3b998c0c60a0574128786ca35813d946e7df9163b8449a34a7f6a3a40245a6e1
      Stored in directory: /home/aistudio/.cache/pip/wheels/b0/2e/d2/0771cdc54b4c4f319b7ea5b09731992402961488e7a09f6a7a
    Successfully built imutils
    Installing collected packages: imutils
    Successfully installed imutils-0.5.3
    import imutils
    import cv2
    from tqdm import tqdm

    if __name__ == '__main__':
        path = 'p.mp4'
        det = PersonDistanceDetector()
        cap = cv2.VideoCapture(path)
        fps = int(cap.get(cv2.CAP_PROP_FPS))
        print('fps:', fps)
        frames_num = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        size = None
        videoWriter = None
        print('processing...')
        for _ in tqdm(range(frames_num)):
            _, im = cap.read()
            if im is None:
                break
            result = det.feedCap(im)['frame']
            result = imutils.resize(result, height=500)
            if size is None:
                # Create the writer once the output frame size is known
                size = (result.shape[1], result.shape[0])
                fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
                videoWriter = cv2.VideoWriter('result.mp4', fourcc, fps, size)
            videoWriter.write(result)
            cv2.waitKey(1)
        cap.release()
        videoWriter.release()
        cv2.destroyAllWindows()
        print('done')

Output:

    2020-07-20 22:24:24 [INFO] Model[YOLOv3] loaded.
    fps: 23
    processing...
    100%|██████████| 583/583 [00:39<00:00, 14.62it/s]
    done

Summary:

In this project we completed the following tasks:

1. Trained a YOLOv3 model on the Pascal VOC dataset using PaddleX;

2. Used the trained YOLOv3 model to detect pedestrians;

3. Used a clustering-style algorithm to group the detected pedestrians by proximity;

Conclusion:

The model training and deployment interfaces provided by PaddleX are very practical and help improve developers' productivity.

About the author:

Sophomore at Beijing Institute of Technology

Interests: object detection, face recognition, EEG recognition, etc.

I regularly share small projects; if you are interested, feel free to follow me: homepage link

Everyone is welcome to fork the project and leave comments.

Author's blog: https://blog.csdn.net/weixin_44936889