Project author: GHRik

Project description:
Automated Kubernetes cluster for distributed facial recognition
Primary language: Go
Repository: git://github.com/GHRik/Distributed-k8s-face-recognition.git
Created: 2021-06-01T20:38:32Z
Project page: https://github.com/GHRik/Distributed-k8s-face-recognition


Distributed face recognition

Using kubernetes cluster

Cuda Dlib k8s

Table of contents

  1. Quick Start
  2. Features
  3. Used technology
  4. Description
  5. Helpful Ansible tags
  6. CUDA Support
  7. Without CUDA Support
  8. Result from example
  9. Prepare your own face database
  10. Debug/Known Bugs
  11. License

Quick Start

To deploy:

  1. git clone https://github.com/GHRik/Distributed-k8s-face-recognition.git
  2. cd Distributed-k8s-face-recognition/ansible
  3. ansible-playbook -i inventory.yaml main.yaml

Features

Fully automated deployment.

Used technology:

  1. dlib - library used to recognize faces
  2. CUDA - GPU acceleration
  3. Ansible - automates cluster creation
  4. Kubernetes - runs the cluster
  5. my Docker Hub repo - stores the built images
  6. kubernetes-sample-cluster - code used as a template
  7. nvidia-docker - passes my GPU through to the containers
  8. Microsoft Azure cloud - for testing
  9. Calico - as the k8s CNI plugin

Description

This repo is a reworked version of [this repo](https://github.com/Skarlso/kube-cluster-sample). For details about the components and how they work together, see [this link](https://cheppers.com/deploying-distributed-face-recognition-application-kubernetes).

If it is still unclear how it works, this diagram may help ;)
[Diagram: Example]

Where is it distributed?

[dlib](http://dlib.net/) uses a thread pool to find faces.
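As a rough illustration of the thread-pool idea (not dlib's actual API and not the code in this repo), the Python sketch below fans a batch of image paths out to a pool of worker threads. `find_faces` is a hypothetical stand-in for a real detector, and the file names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def find_faces(image_path):
    """Hypothetical stand-in for a real face detector.

    It returns an empty face list so the sketch runs without dlib.
    """
    return {"image": image_path, "faces": []}


def process_batch(image_paths, workers=4):
    """Run find_faces on every path using a pool of worker threads.

    executor.map returns results in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(find_faces, image_paths))


if __name__ == "__main__":
    for result in process_batch(["unknown_01.PNG", "unknown_02.PNG"]):
        print(result)
```

Swapping `find_faces` for a real detector call would keep the rest of the structure unchanged.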


Helpful Ansible tags

To deploy this code you can use these Ansible tags, depending on your situation:

nvidia-docker and the Kubernetes packages are not installed yet

    ansible-playbook -i inventory.yaml main.yaml

You have a cluster, but face recognition from this repo is not deployed on it

    ansible-playbook -i inventory.yaml main.yaml --tags "deploy"

You have a cluster with face recognition from this repo deployed,
but you changed the kube files or the known/unknown people images

    ansible-playbook -i inventory.yaml main.yaml --tags "redeploy"

You have a cluster with face recognition deployed, but the images did not load
or there was an error in the "recognize" role

    ansible-playbook -i inventory.yaml main.yaml --tags "recognize"

You already had a cluster with face recognition deployed, but want to recreate the cluster

    ansible-playbook -i inventory.yaml main.yaml --tags "destroy_cluster"
    ansible-playbook -i inventory.yaml main.yaml

You have a deployed face recognition cluster, but want to clean it up:

    ansible-playbook -i inventory.yaml main.yaml --tags "destroy"
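The scenarios above can be summarized as a small lookup table. The Python sketch below is only an illustration: the scenario names are hypothetical labels, while the playbook arguments and tag names are the ones listed in this section.

```python
# Base command used by every scenario in this README.
PLAYBOOK = ["ansible-playbook", "-i", "inventory.yaml", "main.yaml"]

# Scenario names are hypothetical labels; None means "run without --tags".
SCENARIOS = {
    "fresh_install": [None],
    "deploy": ["deploy"],
    "redeploy": ["redeploy"],
    "recognize": ["recognize"],
    "recreate_cluster": ["destroy_cluster", None],
    "destroy": ["destroy"],
}


def commands_for(scenario):
    """Return the ansible-playbook argument lists for a scenario, in order."""
    commands = []
    for tag in SCENARIOS[scenario]:
        command = list(PLAYBOOK)
        if tag is not None:
            command += ["--tags", tag]
        commands.append(command)
    return commands


if __name__ == "__main__":
    for command in commands_for("recreate_cluster"):
        print(" ".join(command))
```

Each returned list could be passed to `subprocess.run` from the `ansible` directory, one after another.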

CUDA Support

This code supports CUDA. To deploy the cluster with CUDA support:

Check which CUDA version your GPU is using

    nvidia-smi

You will see output like this:

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  NVIDIA Tesla K80    Off  | 00000001:00:00.0 Off |                    0 |
    | N/A   34C    P8    32W / 149W |      0MiB / 11441MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

This cluster was tested with CUDA 11.3, but you can pull other versions from my Docker Hub. Only one pod, face_recognition, runs with CUDA support.
To use a different CUDA version, change the image tag on this line:

    face_recognition.yaml
    30: image: ghrik/face_recognition:cuda11.3

This script uses nvidia-docker to set up GPU scheduling on the k8s cluster, so you should uninstall Docker first if you already have it installed.

Without CUDA Support

You can run this cluster without CUDA.

In this case, change the image on this line:

    face_recognition.yaml
    30: image: ghrik/face_recognition:1.0
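The choice between the CUDA image and the plain image can be sketched in Python as follows. This is an illustration, not part of the repo: it assumes CUDA tags follow the `cuda<version>` pattern of the single tag shown in this README (`cuda11.3`), so check Docker Hub before relying on any other version. The function names are hypothetical.

```python
import re

IMAGE_REPO = "ghrik/face_recognition"
CPU_TAG = "1.0"


def cuda_version(nvidia_smi_output):
    """Extract the 'CUDA Version: X.Y' value from nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", nvidia_smi_output)
    return match.group(1) if match else None


def image_for(nvidia_smi_output=None):
    """Build the image line for face_recognition.yaml.

    Assumption: CUDA tags are named 'cuda<version>'; only cuda11.3 is
    confirmed by this README. Without a CUDA version, use the 1.0 image.
    """
    version = cuda_version(nvidia_smi_output or "")
    tag = f"cuda{version}" if version else CPU_TAG
    return f"image: {IMAGE_REPO}:{tag}"


if __name__ == "__main__":
    header = "| NVIDIA-SMI 465.19.01 Driver Version: 465.19.01 CUDA Version: 11.3 |"
    print(image_for(header))
    print(image_for())
```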

Result from example

Results are in two places:

results.txt - if Ansible finishes successfully, this file is filled with
the measured time it took to recognize each face

    $ cat results/results.txt
    Server is on: http://10.98.219.249:8081
    LOGS:
    Checking image: unknown_people/unknown_02.PNG
    Time: 0.4799957275390625 sec.
    Checking image: unknown_people/unknown_03.PNG
    Time: 0.6136119365692139 sec.
    Checking image: unknown_people/unknown_04.PNG
    Time: 0.5596208572387695 sec.
    Checking image: unknown_people/unknown_01.PNG
    Time: 0.46269893646240234 sec.

The first line of results.txt is the address of the frontend site.
On this site you can see which faces have been recognized.
[Image: Example]
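If you want to post-process the timings, the file format shown above is easy to parse. The Python sketch below is a minimal example that assumes the file always follows the `Server is on:` / `Checking image:` / `Time:` layout from the sample; `parse_results` is a hypothetical helper, not part of the repo.

```python
def parse_results(text):
    """Parse results.txt into the server URL and a list of (image, seconds)."""
    server = None
    timings = []
    current_image = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Server is on:"):
            server = line.split(":", 1)[1].strip()
        elif line.startswith("Checking image:"):
            current_image = line.split(":", 1)[1].strip()
        elif line.startswith("Time:") and current_image is not None:
            # "Time: 0.47 sec." -> 0.47
            seconds = float(line.split(":", 1)[1].replace("sec.", "").strip())
            timings.append((current_image, seconds))
            current_image = None
    return server, timings


if __name__ == "__main__":
    with open("results/results.txt") as handle:
        server, timings = parse_results(handle.read())
    print("Frontend:", server)
    for image, seconds in sorted(timings, key=lambda item: item[1]):
        print(f"{seconds:.3f}s  {image}")
```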

Prepare your own face database

As you can see, this cluster only checks faces in the ***unknown_people*** dir.

To build your own face database, make a small change in

    ansible/kube_files/database_setup.sql

The first step is to add each person

    insert into person (name) values('Damian');
    insert into person (name) values('Barack');
    insert into person (name) values('Duda');
    insert into person (name) values('Lewy');

It is that simple: just add rows like these.

The next step is to link each picture
from known_people to a person_id

    insert into person_images (image_name, person_id) values ('damian_01.PNG', 1);
    insert into person_images (image_name, person_id) values ('damian_02.PNG', 1);
    insert into person_images (image_name, person_id) values ('barack_01.jpg', 2);
    insert into person_images (image_name, person_id) values ('barack_02.PNG', 2);
    insert into person_images (image_name, person_id) values ('duda_01.PNG', 3);
    insert into person_images (image_name, person_id) values ('duda_02.PNG', 3);
    insert into person_images (image_name, person_id) values ('lewy_01.PNG', 4);
    insert into person_images (image_name, person_id) values ('lewy_02.PNG', 4);
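For a larger database, the two kinds of statements above can be generated instead of typed by hand. The Python sketch below is one possible approach; `build_inserts` is a hypothetical helper, and it assumes person ids are assigned 1, 2, 3, ... in insertion order on an empty table, as the example above implies.

```python
def sql_string(value):
    """Quote a value as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_inserts(people):
    """Build INSERT statements for person and person_images.

    `people` is a list of (name, [image file names]) pairs.
    Assumption: ids start at 1 and follow insertion order.
    """
    person_rows = []
    image_rows = []
    for person_id, (name, images) in enumerate(people, start=1):
        person_rows.append(
            f"insert into person (name) values({sql_string(name)});"
        )
        for image in images:
            image_rows.append(
                "insert into person_images (image_name, person_id) "
                f"values ({sql_string(image)}, {person_id});"
            )
    # All person rows first, then the image rows that reference them.
    return person_rows + image_rows


if __name__ == "__main__":
    people = [
        ("Damian", ["damian_01.PNG", "damian_02.PNG"]),
        ("Barack", ["barack_01.jpg", "barack_02.PNG"]),
    ]
    print("\n".join(build_inserts(people)))
```

The image file names still have to exist in the known_people dir; the sketch only produces the SQL.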

Debug / Known Bugs

If anything goes wrong, first check the logs of the ***image_processor*** pod:

    kubectl logs image_processor

  • List out of range

    Probably one of the images (from ***unknown_people*** or ***known_people***) does not contain a face
    to recognize. In that case image_processor cannot process the image.

  • Image_processor is not up

    Sometimes ***image_processor*** needs more time to start.
    You may see this on a new cluster, where pulling the image to the pod can take a long time.

  • No such file or directory on image processor pod

    Sometimes ***face_recog_unknown_pvc*** gets bound to ***face_recog_known_pv***.
    Rerun with the "redeploy" tag.

  • dont_delete dir in unknown_people

    Don't delete ***end.jpg***; it is used to show the total time for all recognized faces.

  • Sleep 60 in recognize

    Sometimes the other services need more time to start.
    For a faster deploy you can comment out "sleep 60" and, if the recognize step then fails,
    rerun with the "recognize" tag.

  • Circuitbreaker is engaged

    It means you have more than 5 images in the ***unknown_people*** dir.

    It will probably unfreeze on its own. If not, you can add a sleep to line 40 of ansible/roles/recognize/tasks/main.yaml:
    ```sh
    # ansible/roles/recognize/tasks/main.yaml, line 40
    shell: sleep 10 && curl -d '{"path":"{{ item.path }}"}' http://{{ receiver_ip.stdout }}:8000/image/post
    ```

    Or add fewer face pictures ;)

  • Core dump when using the image without CUDA

    ***ghrik/face_recognition:1.0*** was built with AVX acceleration.
    All of the CUDA images use SSE4 (not AVX).
    If you want to use dlib without AVX acceleration, check the flags in the dlib section of the first Dockerfile below and compare them with the second:
    ```sh
    images/face_recognitionGPU/Dockerfile
    images/face_recognition/Dockerfile
    ```
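The "sleep before each request" workaround for the circuit breaker can also be sketched in Python. This is an illustration only: the JSON body mirrors the curl call from the recognize role, while `post_throttled` and the injected `send` function are hypothetical and stand in for whatever performs the actual HTTP POST.

```python
import json
import time


def build_payload(path):
    """Build the JSON body used by the curl call: {"path": "<image path>"}."""
    return json.dumps({"path": path})


def post_throttled(paths, send, delay_seconds=10, sleep=time.sleep):
    """Send one payload per image, pausing before each request.

    `send` is a caller-supplied function (hypothetical here) that performs
    the real POST; injecting it keeps the sketch free of network code.
    """
    responses = []
    for path in paths:
        sleep(delay_seconds)
        responses.append(send(build_payload(path)))
    return responses


if __name__ == "__main__":
    # Print the payloads instead of posting, and skip the real wait.
    post_throttled(
        ["unknown_people/unknown_01.PNG"],
        send=print,
        sleep=lambda seconds: None,
    )
```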

License

Free to use ;)