Fixed merge conflicts

commit 072c89dd06
Author: Charles Joseph Pierre Beauville
Date: 2021-06-26 16:01:38 +02:00
15 changed files with 171 additions and 153 deletions

.github/workflows/tests.yml

@@ -5,7 +5,22 @@
 name: Tests
-on: [push, pull_request]
+on:
+  push:
+    paths:
+      - 'monoloco/**'
+      - 'test/**'
+      - 'docs/00*.png'
+      - 'docs/frame0032.jpg'
+      - '.github/workflows/tests.yml'
+  pull_request:
+    paths:
+      - 'monoloco/**'
+      - 'test/**'
+      - 'docs/00*.png'
+      - 'docs/frame0032.jpg'
+      - '.github/workflows/tests.yml'
 jobs:
   build:

README.md (128 changed lines)

@@ -2,12 +2,14 @@
 Continuously tested on Linux, MacOS and Windows: [![Tests](https://github.com/vita-epfl/monoloco/workflows/Tests/badge.svg)](https://github.com/vita-epfl/monoloco/actions?query=workflow%3ATests)
-<img src="docs/monoloco.gif" alt="gif" />
+<img src="docs/webcam.gif" width="700" alt="gif" />
 <br />
 This library is based on three research projects for monocular/stereo 3D human localization (detection), body orientation, and social distancing. Check the __video teaser__ of the library on [__YouTube__](https://www.youtube.com/watch?v=O5zhzi8mwJ4).
 ---
 > __MonStereo: When Monocular and Stereo Meet at the Tail of 3D Human Localization__<br />
@@ -35,6 +37,11 @@ __[Article](https://arxiv.org/abs/1906.06059)__
 <img src="docs/surf.jpg" width="700"/>
+## Library Overview
+Visual illustration of the library components:
+<img src="docs/monoloco.gif" width="700" alt="gif" />
 ## License
 All projects are built upon [Openpifpaf](https://github.com/vita-epfl/openpifpaf) for the 2D keypoints and share the AGPL Licence.
@@ -52,6 +59,7 @@ For quick installation, do not clone this repository, make sure there is no fold
 ```
 pip3 install monoloco
+pip3 install matplotlib
 ```
 For development of the source code itself, you need to clone this repository and then:
@@ -102,27 +110,6 @@ When processing KITTI images, the network uses the provided intrinsic matrix of
 In all the other cases, we use the parameters of nuScenes cameras, with "1/1.8'' CMOS sensors of size 7.2 x 5.4 mm.
 The default focal length is 5.7mm and this parameter can be modified using the argument `--focal`.
-## Webcam
-You can use the webcam as input by using the `--webcam` argument. By default the `--z_max` is set to 10 while using the webcam and the `--long-edge` is set to 144. If multiple webcams are plugged in you can choose between them using `--camera`, for instance to use the second camera you can add `--camera 1`.
-we can see a few examples below, obtained we the following commands :
-For the first and last visualization:
-```
-python -m monoloco.run predict \
-    --webcam \
-    --activities raise_hand
-```
-For the second one :
-```
-python -m monoloco.run predict \
-    --webcam \
-    --activities raise_hand social_distance
-```
-![webcam](docs/webcam.gif)
-With `social_distance` in `--activities`, only the keypoints will be shown, with no image, allowing total anonimity.
 ## A) 3D Localization
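Aside on the intrinsic-parameter context kept above: the focal length in millimetres and the sensor size determine the focal length in pixels used by a pinhole intrinsic matrix. A minimal sketch of that relation, not taken from the repository; the 1600x1200 resolution is an assumed example, and `--focal` corresponds to changing `focal_mm`:

```python
# Illustrative only: turn the quoted defaults (5.7 mm focal length,
# 7.2 x 5.4 mm sensor) into a pixel-space pinhole intrinsic matrix.
def intrinsic_matrix(focal_mm=5.7, sensor_mm=(7.2, 5.4), resolution_px=(1600, 1200)):
    w_px, h_px = resolution_px
    fx = focal_mm * w_px / sensor_mm[0]      # focal length in pixels (x)
    fy = focal_mm * h_px / sensor_mm[1]      # focal length in pixels (y)
    cx, cy = w_px / 2.0, h_px / 2.0          # principal point at the image centre
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]

print(intrinsic_matrix())  # fx = fy ≈ 1266.7 px with these example numbers
```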
@@ -138,7 +125,7 @@ If you provide a ground-truth json file to compare the predictions of the networ
 For an example image, run the following command:
 ```sh
-python -m monoloco.run predict docs/002282.png \
+python3 -m monoloco.run predict docs/002282.png \
 --path_gt names-kitti-200615-1022.json \
 -o <output directory> \
 --long-edge <rescale the image by providing dimension of long side>
@@ -153,7 +140,7 @@ To show all the instances estimated by MonoLoco add the argument `--show_all` to
 It is also possible to run [openpifpaf](https://github.com/vita-epfl/openpifpaf) directly
 by using `--mode keypoints`. All the other pifpaf arguments are also supported
-and can be checked with `python -m monoloco.run predict --help`.
+and can be checked with `python3 -m monoloco.run predict --help`.
 ![predict](docs/out_002282_pifpaf.jpg)
@@ -191,7 +178,7 @@ To visualize social distancing compliance, simply add the argument `social_dista
 Threshold distance and radii (for F-formations) can be set using `--threshold-dist` and `--radii`, respectively.
 For more info, run:
-`python -m monoloco.run predict --help`
+`python3 -m monoloco.run predict --help`
 **Examples** <br>
 An example from the Collective Activity Dataset is provided below.
@@ -201,66 +188,79 @@ An example from the Collective Activity Dataset is provided below.
 To visualize social distancing run the below, command:
 ```sh
-python -m monoloco.run predict docs/frame0032.jpg \
+pip3 install scipy
+```
+```sh
+python3 -m monoloco.run predict docs/frame0032.jpg \
 --activities social_distance --output_types front bird
 ```
 <img src="docs/out_frame0032_front_bird.jpg" width="700"/>
-## C) Raise hand detection
-To detect a risen hand, you can add `raise_hand` to `--activities`.
+## C) Hand-raising detection
+To detect raised hand, you can add the argument `--activities raise_hand` to the prediction command.
+For example, the below image is obtained with:
+```sh
+python3 -m monoloco.run predict docs/raising_hand.jpg \
+--activities raise_hand social_distance --output_types front
+```
+<img src="docs/out_raising_hand.jpg.front.jpg" width="500"/>
 For more info, run:
-`python -m monoloco.run predict --help`
-**Examples** <br>
-The command below:
-```
-python -m monoloco.run predict .\docs\raising_hand.jpg \
---output_types front \
---activities raise_hand
-```
-Yields the following:
-![raise_hand_taxi](docs/out_raising_hand.jpg.front.png)
+`python3 -m monoloco.run predict --help`
 ## D) Orientation and Bounding Box dimensions
 The network estimates orientation and box dimensions as well. Results are saved in a json file when using the command
 `--output_types json`. At the moment, the only visualization including orientation is the social distancing one.
 <br />
-## Training
+## E) Webcam
+You can use the webcam as input by using the `--webcam` argument. By default the `--z_max` is set to 10 while using the webcam and the `--long-edge` is set to 144. If multiple webcams are plugged in you can choose between them using `--camera`, for instance to use the second camera you can add `--camera 1`.
+You also need to install `opencv-python` to use this feature :
+```sh
+pip3 install opencv-python
+```
+Example command:
+```sh
+python3 -m monoloco.run predict --webcam \
+--activities raise_hand social_distance
+```
+# Training
 We train on the KITTI dataset (MonoLoco/Monoloco++/MonStereo) or the nuScenes dataset (MonoLoco) specifying the path of the json file containing the input joints. Please download them [here](https://drive.google.com/drive/folders/1j0riwbS9zuEKQ_3oIs_dWlYBnfuN2WVN?usp=sharing) or follow [preprocessing instructions](#Preprocessing).
 Results for [MonoLoco++](###Tables) are obtained with:
-```
-python -m monoloco.run train --joints data/arrays/joints-kitti-mono-210422-1600.json
+```sh
+python3 -m monoloco.run train --joints data/arrays/joints-kitti-mono-210422-1600.json
 ```
 While for the [MonStereo](###Tables) results run:
 ```sh
-python -m monoloco.run train --joints data/arrays/joints-kitti-stereo-210422-1601.json --lr 0.003 --mode stereo
+python3 -m monoloco.run train --joints data/arrays/joints-kitti-stereo-210422-1601.json \
+--lr 0.003 --mode stereo
 ```
 If you are interested in the original results of the MonoLoco ICCV article (now improved with MonoLoco++), please refer to the tag v0.4.9 in this repository.
 Finally, for a more extensive list of available parameters, run:
-`python -m monstereo.run train --help`
+`python3 -m monstereo.run train --help`
 <br />
-## Preprocessing
+# Preprocessing
 Preprocessing and training step are already fully supported by the code provided,
 but require first to run a pose detector over
 all the training images and collect the annotations.
 The code supports this option (by running the predict script and using `--mode keypoints`).
-### Data structure
+## Data structure
 data
 ├── outputs
@@ -275,7 +275,7 @@ mkdir outputs arrays kitti
 ```
-### Kitti Dataset
+## Kitti Dataset
 Download kitti images (from left and right cameras), ground-truth files (labels), and calibration files from their [website](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) and save them inside the `data` folder as shown below.
 data
@@ -289,7 +289,7 @@ Download kitti images (from left and right cameras), ground-truth files (labels)
 The network takes as inputs 2D keypoints annotations. To create them run PifPaf over the saved images:
 ```sh
-python -m openpifpaf.predict \
+python3 -m openpifpaf.predict \
 --glob "data/kitti/images/*.png" \
 --json-output <directory to contain predictions> \
 --checkpoint=shufflenetv2k30 \
@@ -311,15 +311,15 @@ Once this step is complete, the below commands transform all the annotations int
 For MonoLoco++:
 ```sh
-python -m monoloco.run prep --dir_ann <directory that contains annotations>
+python3 -m monoloco.run prep --dir_ann <directory that contains annotations>
 ```
 For MonStereo:
 ```sh
-python -m monoloco.run prep --mode stereo --dir_ann <directory that contains left annotations>
+python3 -m monoloco.run prep --mode stereo --dir_ann <directory that contains left annotations>
 ```
-### Collective Activity Dataset
+## Collective Activity Dataset
 To evaluate on of the [collective activity dataset](http://vhosts.eecs.umich.edu/vision//activity-dataset.html)
 (without any training) we selected 6 scenes that contain people talking to each other.
 This allows for a balanced dataset, but any other configuration will work.
@@ -346,7 +346,7 @@ which for example change the name of all the jpg images in that folder adding th
 Pifpaf annotations should also be saved in a single folder and can be created with:
 ```sh
-python -m openpifpaf.predict \
+python3 -m openpifpaf.predict \
 --glob "data/collective_activity/images/*.jpg" \
 --checkpoint=shufflenetv2k30 \
 --instance-threshold=0.05 --seed-threshold 0.05 --force-complete-pose \
@@ -354,9 +354,9 @@ python -m openpifpaf.predict \
 ```
-## Evaluation
-### 3D Localization
+# Evaluation
+## 3D Localization
 We provide evaluation on KITTI for models trained on nuScenes or KITTI. Download the ground-truths of KITTI dataset and the calibration files from their [website](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d). Save the training labels (one .txt file for each image) into the folder `data/kitti/gt` and the camera calibration matrices (one .txt file for each image) into `data/kitti/calib`.
 To evaluate a pre-trained model, download the latest models from [here](https://drive.google.com/drive/u/0/folders/1kQpaTcDsiNyY6eh1kUurcpptfAXkBjAJ) and save them into `data/outputs`.
@@ -386,7 +386,7 @@ To include also geometric baselines and MonoLoco, download a monoloco model, sav
 The evaluation file will run the model over all the annotations and compare the results with KITTI ground-truth and the downloaded baselines. For this run:
 ```sh
-python -m monoloco.run eval \
+python3 -m monoloco.run eval \
 --dir_ann <annotation directory> \
 --model data/outputs/monoloco_pp-210422-1601.pkl \
 --generate \
@@ -395,14 +395,14 @@ python -m monoloco.run eval \
 For stereo results add `--mode stereo` and select `--model=monstereo-210422-1620.pkl`. Below, the resulting table of results and an example of the saved figures.
-### Tables
+## Tables
 <img src="docs/quantitative.jpg" width="700"/>
 <img src="docs/results_monstereo.jpg" width="700"/>
-### Relative Average Precision Localization: RALP-5% (MonStereo)
+## Relative Average Precision Localization: RALP-5% (MonStereo)
 We modified the original C++ evaluation of KITTI to make it relative to distance. We use **cmake**.
 To run the evaluation, first generate the txt file with the standard command for evaluation (above).
@@ -410,20 +410,20 @@ Then follow the instructions of this [repository](https://github.com/cguindel/ev
 to prepare the folders accordingly (or follow kitti guidelines) and run evaluation.
 The modified file is called *evaluate_object.cpp* and runs exactly as the original kitti evaluation.
-### Activity Estimation (Talking)
+## Activity Estimation (Talking)
 Please follow preprocessing steps for Collective activity dataset and run pifpaf over the dataset images.
 Evaluation on this dataset is done with models trained on either KITTI or nuScenes.
 For optimal performances, we suggest the model trained on nuScenes teaser.
 ```sh
-python -m monstereo.run eval \
+python3 -m monstereo.run eval \
 --activity \
 --dataset collective \
 --model <path to the model> \
 --dir_ann <annotation directory>
 ```
-## Citation
+# Citation
 When using this library in your research, we will be happy if you cite us!
 ```

docs/000840.png — Executable file → Normal file (binary, 736 KiB; file mode change)

docs/000840_right.png — Executable file → Normal file (binary, 732 KiB; file mode change)

docs/002282.png — Executable file → Normal file (binary, 831 KiB; file mode change)

New binary image added (file not shown; 544 KiB)

View File

@@ -248,7 +248,7 @@ def is_raising_hand(kp):
     if is_right_risen:
         return 'right'
-    return 'none'
+    return None
 def check_f_formations(idx, idx_t, centers, angles, radii, social_distance=False):
@@ -308,8 +308,6 @@ def show_activities(args, image_t, output_path, annotations, dic_out):
     if 'social_distance' in args.activities:
         colors = social_distance_colors(colors, dic_out)
-    print("Size of the image :", image_t.size)
     angles = dic_out['angles']
     stds = dic_out['stds_ale']
     xz_centers = [[xx[0], xx[2]] for xx in dic_out['xyz_pred']]
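A note on the `return None` change in the first hunk above: with `None` instead of the string `'none'`, callers can branch on plain truthiness, whereas `'none'` is a truthy value. A tiny illustration; the stub stands in for the real keypoint logic, which is not reproduced here:

```python
def is_raising_hand(kp):
    """Stub standing in for monoloco's keypoint check: returns 'left', 'right', or None."""
    return None  # the real function inspects wrist/elbow/shoulder keypoints

def describe(kp):
    side = is_raising_hand(kp)
    if side:  # truthy only for 'left'/'right'; the old sentinel string 'none' was also truthy
        return f'{side} hand raised'
    return 'no hand raised'

print(describe(kp=[]))  # -> 'no hand raised'
```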

View File

@@ -18,7 +18,6 @@ import torch
 import PIL
 import openpifpaf
 import openpifpaf.datasets as datasets
-from openpifpaf.predict import processor_factory, preprocess_factory
 from openpifpaf import decoder, network, visualizer, show, logger
 try:
     import gdown
@@ -53,14 +52,17 @@ def get_torch_checkpoints_dir():
 def download_checkpoints(args):
     torch_dir = get_torch_checkpoints_dir()
-    os.makedirs(torch_dir, exist_ok=True)
     if args.checkpoint is None:
+        os.makedirs(torch_dir, exist_ok=True)
         pifpaf_model = os.path.join(torch_dir, 'shufflenetv2k30-201104-224654-cocokp-d75ed641.pkl')
         print(pifpaf_model)
     else:
         pifpaf_model = args.checkpoint
     dic_models = {'keypoints': pifpaf_model}
     if not os.path.exists(pifpaf_model):
-        assert DOWNLOAD is not None, "pip install gdown to download pifpaf model, or pass it as --checkpoint"
+        assert DOWNLOAD is not None, \
+            "pip install gdown to download a pifpaf model, or pass the model path as --checkpoint"
         LOG.info('Downloading OpenPifPaf model in %s', torch_dir)
         DOWNLOAD(OPENPIFPAF_MODEL, pifpaf_model, quiet=False)
@@ -74,7 +76,7 @@ def download_checkpoints(args):
         assert not args.social_distance, "Social distance not supported in stereo modality"
         path = MONSTEREO_MODEL
         name = 'monstereo-201202-1212.pkl'
-    elif (args.activities and 'social_distance' in args.activities) or args.webcam:
+    elif ('social_distance' in args.activities) or args.webcam:
         path = MONOLOCO_MODEL_NU
         name = 'monoloco_pp-201207-1350.pkl'
     else:
@@ -85,7 +87,9 @@ def download_checkpoints(args):
     print(name)
     dic_models[args.mode] = model
     if not os.path.exists(model):
-        assert DOWNLOAD is not None, "pip install gdown to download monoloco model, or pass it as --model"
+        os.makedirs(torch_dir, exist_ok=True)
+        assert DOWNLOAD is not None, \
+            "pip install gdown to download a monoloco model, or pass the model path as --model"
         LOG.info('Downloading model in %s', torch_dir)
         DOWNLOAD(path, model, quiet=False)
     return dic_models
@@ -166,12 +170,11 @@ def predict(args):
         casr=args.casr,
         casr_model=args.casr_model)
-    # data
-    processor, pifpaf_model = processor_factory(args)
-    preprocess = preprocess_factory(args)
+    # for openpifpaf predicitons
+    predictor = openpifpaf.Predictor(checkpoint=args.checkpoint)
     # data
-    data = datasets.ImageList(args.images, preprocess=preprocess)
+    data = datasets.ImageList(args.images, preprocess=predictor.preprocess)
     if args.mode == 'stereo':
         assert len(
             data.image_paths) % 2 == 0, "Odd number of images in a stereo setting"
@@ -180,22 +183,19 @@ def predict(args):
         data, batch_size=args.batch_size, shuffle=False,
         pin_memory=False, collate_fn=datasets.collate_images_anns_meta)
-    for batch_i, (image_tensors_batch, _, meta_batch) in enumerate(data_loader):
-        pred_batch = processor.batch(
-            pifpaf_model, image_tensors_batch, device=args.device)
+    for batch_i, (_, _, meta_batch) in enumerate(data_loader):
         # unbatch (only for MonStereo)
-        for idx, (pred, meta) in enumerate(zip(pred_batch, meta_batch)):
+        for idx, (preds, _, meta) in enumerate(predictor.dataset(data)):
             LOG.info('batch %d: %s', batch_i, meta['file_name'])
-            pred = [ann.inverse_transform(meta) for ann in pred]
             # Load image and collect pifpaf results
             if idx == 0:
                 with open(meta_batch[0]['file_name'], 'rb') as f:
                     cpu_image = PIL.Image.open(f).convert('RGB')
                 pifpaf_outs = {
-                    'pred': pred,
-                    'left': [ann.json_data() for ann in pred],
+                    'pred': preds,
+                    'left': [ann.json_data() for ann in preds],
                     'image': cpu_image}
             # Set output image name
@@ -212,7 +212,7 @@ def predict(args):
             # Only for MonStereo
             else:
-                pifpaf_outs['right'] = [ann.json_data() for ann in pred]
+                pifpaf_outs['right'] = [ann.json_data() for ann in preds]
             # 3D Predictions
             if args.mode != 'keypoints':
@@ -229,7 +229,6 @@ def predict(args):
                 dic_out = net.forward(keypoints, kk)
                 dic_out = net.post_process(
                     dic_out, boxes, keypoints, kk, dic_gt)
-                if args.activities:
                 if 'social_distance' in args.activities:
                     dic_out = net.social_distance(dic_out, args)
                 if 'raise_hand' in args.activities:
@@ -264,7 +263,7 @@ def factory_outputs(args, pifpaf_outs, dic_out, output_path, kk=None):
     else:
         assert 'json' in args.output_types or args.mode == 'keypoints', \
             "No output saved, please select one among front, bird, multi, json, or pifpaf arguments"
-    if args.activities and 'social_distance' in args.activities:
+    if 'social_distance' in args.activities:
        assert args.mode == 'mono', "Social distancing only works with monocular network"
     if args.mode == 'keypoints':
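The hunks above replace the removed `processor_factory`/`preprocess_factory` helpers with the `openpifpaf.Predictor` API. A minimal, self-contained sketch of that usage pattern, assuming openpifpaf ≥ 0.12 (the checkpoint name is one example value, and the image path reuses the README's sample image); it is not code from the repository:

```python
import openpifpaf

# checkpoint=None falls back to openpifpaf's default model; any pifpaf
# checkpoint name such as 'shufflenetv2k30' can be passed instead.
predictor = openpifpaf.Predictor(checkpoint='shufflenetv2k30')

# Predictor.images() yields (predictions, ground-truth annotations, meta)
# per image, the same triple unpacked as (preds, _, meta) in the diff
# above via Predictor.dataset().
for predictions, _, meta in predictor.images(['docs/002282.png']):
    print(meta['file_name'], '->', len(predictions), 'poses')
    for ann in predictions:
        print(round(ann.json_data()['score'], 3))  # per-instance confidence
```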

View File

@@ -51,7 +51,7 @@ def cli():
     # Monoloco
     predict_parser.add_argument('--activities', nargs='+', choices=['raise_hand', 'social_distance', 'using_phone', 'is_turning'],
-                                help='Choose activities to show: social_distance, raise_hand')
+                                help='Choose activities to show: social_distance, raise_hand', default=[])
     predict_parser.add_argument('--mode', help='keypoints, mono, stereo', default='mono')
     predict_parser.add_argument('--model', help='path of MonoLoco/MonStereo model to load')
     predict_parser.add_argument('--casr_model', help='path of casr model to load')
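The `default=[]` above is what allows the `if args.activities:` guards to be dropped elsewhere in this commit: without a default, argparse stores `None` when the flag is absent, and `'social_distance' in None` raises a `TypeError`. A small standalone illustration (the parser below is a toy, not the project's real CLI):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--activities', nargs='+',
                    choices=['raise_hand', 'social_distance'],
                    default=[])  # [] instead of the implicit None

args = parser.parse_args([])                    # flag not given on the command line
print('social_distance' in args.activities)     # False, no None-check needed

args = parser.parse_args(['--activities', 'raise_hand', 'social_distance'])
print('social_distance' in args.activities)     # True
```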

View File

@@ -16,7 +16,11 @@ import sys
 import time
 from itertools import chain
-import matplotlib.pyplot as plt
+try:
+    import matplotlib.pyplot as plt
+except ImportError:
+    plt = None
 import torch
 from torch.utils.data import DataLoader
 from torch.optim import lr_scheduler
@@ -328,6 +332,10 @@ class Trainer:
         if not self.print_loss:
             return
         os.makedirs(self.dir_figures, exist_ok=True)
+        if plt is None:
+            raise Exception('please install matplotlib')
         for idx, phase in enumerate(epoch_losses):
             for idx_2, el in enumerate(epoch_losses['train']):
                 plt.figure(idx + idx_2)

View File

@@ -125,7 +125,7 @@ def show_spread(dic_stats, clusters, net, dir_fig, show=False, save=False):
 def show_task_error(dir_fig, show, save):
     """Task error figure"""
     plt.figure(3, figsize=FIGSIZE)
-    xx = np.linspace(0.1, 50, 100)
+    xx = np.linspace(0.1, 40, 100)
     mu_men = 178
     mu_women = 165
     mu_child_m = 164
@@ -145,8 +145,9 @@ def show_task_error(dir_fig, show, save):
     plt.plot(xx, yy_gender, '--', color='lightgreen', linewidth=2.8, label='Generic adult (task error)')
     plt.plot(xx, yy_female, '-.', linewidth=1.7, color='darkorange', label='Adult female')
     plt.plot(xx, yy_male, '-.', linewidth=1.7, color='b', label='Adult male')
-    plt.plot(xx, yy_stereo, linewidth=1.7, color='k', label='Pixel error')
+    plt.plot(xx, yy_stereo, linewidth=2.5, color='k', label='Pixel error')
     plt.xlim(np.min(xx), np.max(xx))
+    plt.ylim(0, 5)
     plt.xlabel("Ground-truth distance from the camera $d_{gt}$ [m]")
     plt.ylabel("Localization error $\hat{e}$ due to human height variation [m]")  # pylint: disable=W1401
     plt.legend(loc=(0.01, 0.55))  # Location from 0 to 1 from lower left

View File

@@ -11,14 +11,10 @@ import math
 import numpy as np
 from PIL import Image
-try:
-    import matplotlib
-    import matplotlib.pyplot as plt
-    from matplotlib.patches import Circle, FancyArrow
-    import scipy.ndimage as ndimage
-except ImportError:
-    ndimage = None
-    plt = None
+import matplotlib
+import matplotlib.pyplot as plt
+from matplotlib.patches import Circle, FancyArrow
+import scipy.ndimage as ndimage
 COCO_PERSON_SKELETON = [
@@ -49,6 +45,10 @@ def image_canvas(image, fig_file=None, show=True, dpi_factor=1.0, fig_width=10.0
     if 'figsize' not in kwargs:
         kwargs['figsize'] = (fig_width, fig_width * image.size[1] / image.size[0])
+    if plt is None:
+        raise Exception('please install matplotlib')
+    if ndimage is None:
+        raise Exception('please install scipy')
     fig = plt.figure(**kwargs)
     ax = plt.Axes(fig, [0.0, 0.0, 1.0, 1.0])
     ax.set_axis_off()
@@ -128,7 +128,6 @@ class KeypointPainter:
             c = color
             linewidth = self.linewidth
-            if activities:
             if 'raise_hand' in activities:
                 c, linewidth = highlighted_arm(x, y, connection, c, linewidth,
                                                dic_out['raising_hand'][:][i], size=size)

View File

@@ -113,6 +113,10 @@ class Printer:
     def factory_axes(self, dic_out):
         """Create axes for figures: front bird multi"""
+        if self.webcam:
+            plt.style.use('dark_background')
         axes = []
         figures = []
@@ -190,15 +194,10 @@ class Printer:
             else:
                 scores=None
-            if activities:
             keypoint_painter.keypoints(
                 axis, keypoint_sets, size=self.im.size,
                 scores=scores, colors=colors, activities=activities, dic_out=dic_out)
-            else:
-                keypoint_painter.keypoints(
-                    axis, keypoint_sets, size=self.im.size, colors=colors, scores=scores)
             draw_orientation(axis, self.centers,
                              sizes, self.angles, colors, mode='front')
@@ -219,7 +218,8 @@ class Printer:
     def _bird_loop(self, iterator, axes, colors, number):
         for idx in iterator:
             if any(xx in self.output_types for xx in ['bird', 'multi']) and self.zz_pred[idx] > 0:
-                draw_orientation(axes[1], self.xz_centers, [], self.angles, colors, mode='bird')
+                draw_orientation(axes[1], self.xz_centers[:len(iterator)], [],
+                                 self.angles[:len(iterator)], colors, mode='bird')
                 # Draw ground truth and uncertainty
                 self._draw_uncertainty(axes, idx)
@@ -232,7 +232,6 @@ class Printer:
     def draw(self, figures, axes, image, dic_out=None, annotations=None):
         colors = ['deepskyblue' for _ in self.uv_heads]
-        if self.activities:
         if 'social_distance' in self.activities:
             colors = social_distance_colors(colors, dic_out)
@@ -246,7 +245,7 @@ class Printer:
         if any(xx in self.output_types for xx in ['front', 'multi']):
             number['flag'] = True  # add numbers
             # Remove image if social distance is activated
-            if not self.activities or 'social_distance' not in self.activities:
+            if 'social_distance' not in self.activities:
                 self.mpl_im0.set_data(image)
             self._front_loop(iterator, axes, number, colors, annotations, dic_out)
@@ -427,15 +426,17 @@ class Printer:
             ax.get_yaxis().set_visible(False)
         else:
+            line_style = 'w--' if self.webcam else 'k--'
             uv_max = [0., float(self.height)]
             xyz_max = pixel_to_camera(uv_max, self.kk, self.z_max)
             x_max = abs(xyz_max[0])  # shortcut to avoid oval circles in case of different kk
             corr = round(float(x_max / 3))
-            ax.plot([0, x_max], [0, self.z_max], 'k--')
-            ax.plot([0, -x_max], [0, self.z_max], 'k--')
+            ax.plot([0, x_max], [0, self.z_max], line_style)
+            ax.plot([0, -x_max], [0, self.z_max], line_style)
             ax.set_xlim(-x_max + corr, x_max - corr)
             ax.set_ylim(0, self.z_max + 1)
             ax.set_xlabel("X [m]")
-            ax.set_box_aspect(.8)
+            if self.webcam:
+                ax.set_box_aspect(.8)
             plt.xlim((-x_max, x_max))
             plt.xticks(fontsize=self.attr['fontsize_ax'])

View File

@@ -17,9 +17,9 @@ try:
 except ImportError:
     cv2 = None
+import openpifpaf
 from openpifpaf import decoder, network, visualizer, show, logger
 import openpifpaf.datasets as datasets
-from openpifpaf.predict import processor_factory, preprocess_factory
 from ..visuals import Printer
 from ..network import Loco
@@ -73,6 +73,7 @@ def factory_from_args(args):
 def webcam(args):
     assert args.mode in 'mono'
+    assert cv2
     args, dic_models = factory_from_args(args)
@@ -80,8 +81,8 @@ def webcam(args):
     net = Loco(model=dic_models[args.mode], mode=args.mode, device=args.device,
                n_dropout=args.n_dropout, p_dropout=args.dropout)
-    processor, pifpaf_model = processor_factory(args)
-    preprocess = preprocess_factory(args)
+    # for openpifpaf predicitons
+    predictor = openpifpaf.Predictor(checkpoint=args.checkpoint)
     # Start recording
     cam = cv2.VideoCapture(args.camera)
@@ -93,28 +94,25 @@ def webcam(args):
         scale = (args.long_edge)/frame.shape[0]
         image = cv2.resize(frame, None, fx=scale, fy=scale)
         height, width, _ = image.shape
-        print('resized image size: {}'.format(image.shape))
+        LOG.debug('resized image size: {}'.format(image.shape))
         image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        pil_image = Image.fromarray(image)
         data = datasets.PilImageList(
-            [pil_image], preprocess=preprocess)
+            [pil_image], preprocess=predictor.preprocess)
         data_loader = torch.utils.data.DataLoader(
             data, batch_size=1, shuffle=False,
             pin_memory=False, collate_fn=datasets.collate_images_anns_meta)
-        for (image_tensors_batch, _, meta_batch) in data_loader:
-            pred_batch = processor.batch(
-                pifpaf_model, image_tensors_batch, device=args.device)
-            for idx, (pred, meta) in enumerate(zip(pred_batch, meta_batch)):
-                pred = [ann.inverse_transform(meta) for ann in pred]
+        for (_, _, _) in data_loader:
+            for idx, (preds, _, _) in enumerate(predictor.dataset(data)):
                 if idx == 0:
                     pifpaf_outs = {
-                        'pred': pred,
-                        'left': [ann.json_data() for ann in pred],
+                        'pred': preds,
+                        'left': [ann.json_data() for ann in preds],
                         'image': image}
         if not ret:
@@ -122,7 +120,7 @@ def webcam(args):
             key = cv2.waitKey(1)
             if key % 256 == 27:
                 # ESC pressed
-                print("Escape hit, closing...")
+                LOG.info("Escape hit, closing...")
                 break
         kk, dic_gt = factory_for_gt(pil_image.size, focal_length=args.focal)
@@ -132,7 +130,6 @@ def webcam(args):
         dic_out = net.forward(keypoints, kk)
         dic_out = net.post_process(dic_out, boxes, keypoints, kk, dic_gt)
-        if args.activities:
         if 'social_distance' in args.activities:
             dic_out = net.social_distance(dic_out, args)
         if 'raise_hand' in args.activities:
@@ -141,11 +138,11 @@ def webcam(args):
         visualizer_mono = Visualizer(kk, args)(pil_image)  # create it with the first image
         visualizer_mono.send(None)
-        print(dic_out)
+        LOG.debug(dic_out)
         visualizer_mono.send((pil_image, dic_out, pifpaf_outs))
         end = time.time()
-        print("run-time: {:.2f} ms".format((end-start)*1000))
+        LOG.info("run-time: {:.2f} ms".format((end-start)*1000))
     cam.release()
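For reference, the webcam loop above boils down to: grab a BGR frame with OpenCV, convert it to a PIL image, and feed it to `openpifpaf.Predictor`. A stripped-down, standalone sketch of that idea, which bypasses monoloco entirely and only counts detected poses; the checkpoint name and camera index are assumptions, and `pil_image()` is openpifpaf's single-image convenience call:

```python
import cv2
import PIL.Image
import openpifpaf

# Assumed checkpoint name and camera index; monoloco exposes these
# through --checkpoint and --camera.
predictor = openpifpaf.Predictor(checkpoint='shufflenetv2k16')
cam = cv2.VideoCapture(0)

for _ in range(100):                              # a few frames instead of an endless loop
    ret, frame = cam.read()
    if not ret:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV delivers BGR, PIL expects RGB
    predictions, _, _ = predictor.pil_image(PIL.Image.fromarray(rgb))
    print(f'{len(predictions)} poses detected')

cam.release()
```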

View File

@@ -32,7 +32,7 @@ setup(
     zip_safe=False,
     install_requires=[
-        'openpifpaf>=v0.12.1',
+        'openpifpaf>=v0.12.10',
         'matplotlib',
     ],
     extras_require={