Packaging (#6)

* add box visualization

* add box visualization and change thresholds for pif preprocessing

* refactor printer

* change default values

* change confidence definition

* remove redundant function

* add debug plot in preprocessing

* add task error in evaluation

* add horizontal flipping

* add evaluation table

* add evaluation table with verbosity

* add tabulate requirement and command line option verbose

* refactor evaluate

* add task error with mean absolute deviation

* add stereo baseline

* integrate stereo baseline

* refactor factory preprocessing

* add stereo command for evaluation

* fix category bug

* add interquartile range for stereo

* use left tt for translation

* refactor stereo functions

* remove redundant functions

* change names of constants

* add pixel error as function of depth

* fix bug on output directory

* add current time at the moment of saving

* add person sitting category

* remove box in pifpaf predictions

* fix printing name

* add printing of number of matches

* add cyclist category

* fix assertion error

* add travis file

* working eval

* working eval

* change source file

* renaming

* add pylint file

* fix pylint

* fix import

* add pyc files in gitignore

* pylint fix

* pylint fix

* add pytest cache

* update readme

* fix pylint

* fix pylint

* add travis file

* add pylint in pip install

* fix pylint
Lorenzo Bertoni 2019-07-19 15:39:03 +02:00 committed by GitHub
parent 519de28f4e
commit 8968f3c8a2
45 changed files with 1282 additions and 950 deletions

.gitignore

@@ -2,3 +2,5 @@
data/ data/
.DS_store .DS_store
__pycache__ __pycache__
Monoloco/*.pyc
.pytest*

.pylintrc

@@ -0,0 +1,26 @@
[BASIC]
variable-rgx=[a-z0-9_]{1,30}$ # to accept 2 (different) letter variables
good-names=xx,dd,zz,hh,ww,pp,kk,lr,w1,w2,w3,mm,im,uv,ax,COV_MIN,CONF_MIN
[TYPECHECK]
disable=E1102,missing-docstring,useless-object-inheritance,duplicate-code,too-many-arguments,too-many-instance-attributes,too-many-locals,too-few-public-methods,arguments-differ,logging-format-interpolation
# List of members which are set dynamically and missed by pylint inference
# system, and so shouldn't trigger E1101 when accessed. Python regular
# expressions are accepted.
generated-members=numpy.*,torch.*,cv2.*
ignored-modules=nuscenes, tabulate, cv2
[FORMAT]
max-line-length=120

.travis.yml

@@ -0,0 +1,13 @@
dist: xenial
language: python
python:
- "3.6"
- "3.7"
install:
- pip install openpifpaf
- pip install nuscenes-devkit
- pip install tabulate
- pip install pylint
script:
- pylint monoloco --disable=unused-variable,fixme
- pytest -vv

README.md

@@ -31,7 +31,7 @@ All details for Pifpaf pose detector at [openpifpaf](https://github.com/vita-epf
``` ```
pip install nuscenes-devkit openpifpaf pip install openpifpaf nuscenes-devkit tabulate
``` ```
### Data structure ### Data structure
@@ -63,14 +63,14 @@ Alternatively, you can download a Pifpaf pre-trained model from [openpifpaf](htt
# Interfaces # Interfaces
All the commands are run through a main file called `main.py` using subparsers. All the commands are run through a main file called `main.py` using subparsers.
To check all the commands for the parser and the subparsers run: To check all the commands for the parser and the subparsers (including openpifpaf ones) run:
* `python3 src/main.py --help`
* `python3 src/main.py prep --help`
* `python3 src/main.py predict --help`
* `python3 src/main.py train --help`
* `python3 src/main.py eval --help`
* `python3 -m monoloco.run --help`
* `python3 -m monoloco.run predict --help`
* `python3 -m monoloco.run train --help`
* `python3 -m monoloco.run eval --help`
* `python3 -m monoloco.run prep --help`
or check the file `monoloco/run.py`
# Predict # Predict
The predict script receives an image (or an entire folder using glob expressions), The predict script receives an image (or an entire folder using glob expressions),
@@ -96,7 +96,7 @@ If it does not find the file, it will generate images
with all the predictions without ground-truth matching. with all the predictions without ground-truth matching.
Below an example with and without ground-truth matching. They have been created (adding or removing `--path_gt`) with: Below an example with and without ground-truth matching. They have been created (adding or removing `--path_gt`) with:
`python3 src/main.py predict --networks monoloco --glob docs/002282.png --output_types combined --scale 2 `python3 -m monoloco.run predict --networks monoloco --glob docs/002282.png --output_types combined --scale 2
--model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 30` --model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 30`
With ground truth matching (only matching people): With ground truth matching (only matching people):
@@ -110,7 +110,7 @@ To accurately estimate distance, the focal length is necessary.
However, it is still possible to test Monoloco on images where the calibration matrix is not available. However, it is still possible to test Monoloco on images where the calibration matrix is not available.
Absolute distances are not meaningful but relative distances still are. Absolute distances are not meaningful but relative distances still are.
Below an example on a generic image from the web, created with: Below an example on a generic image from the web, created with:
`python3 src/main.py predict --networks monoloco --glob docs/surf.jpg --output_types combined --model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 25` `python3 -m monoloco.run predict --networks monoloco --glob docs/surf.jpg --output_types combined --model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 25`
![no calibration](docs/surf.jpg.combined.png) ![no calibration](docs/surf.jpg.combined.png)
@@ -124,7 +124,7 @@ Multiple visualizations can be combined in different windows.
The above gif has been obtained running on a Macbook the command: The above gif has been obtained running on a Macbook the command:
`python src/main.py predict --webcam --scale 0.2 --output_types combined --z_max 10 --checkpoint resnet50` `python3 -m monoloco.run predict --webcam --scale 0.2 --output_types combined --z_max 10 --checkpoint resnet50`
# Preprocess # Preprocess
@@ -148,7 +148,7 @@ You can create them running the predict script and using `--networks pifpaf`.
### Inputs joints for training ### Inputs joints for training
MonoLoco is trained using 2D human pose joints matched with the ground truth location provided by MonoLoco is trained using 2D human pose joints matched with the ground truth location provided by
nuScenes or KITTI Dataset. To create the joints run: `python src/main.py prep` specifying: nuScenes or KITTI Dataset. To create the joints run: `python3 -m monoloco.run prep` specifying:
1. `--dir_ann` annotation directory containing Pifpaf joints of KITTI or nuScenes. 1. `--dir_ann` annotation directory containing Pifpaf joints of KITTI or nuScenes.
2. `--dataset` Which dataset to preprocess. For nuscenes, all three versions of the 2. `--dataset` Which dataset to preprocess. For nuscenes, all three versions of the
@@ -163,12 +163,12 @@ by the image name to easily access ground truth files for evaluation and predict
# Train # Train
Provide the json file containing the preprocessed joints as argument. Provide the json file containing the preprocessed joints as argument.
As simple as `python3 src/main.py train --joints <json file path>` As simple as `python3 -m monoloco.run train --joints <json file path>`
All the hyperparameters options can be checked at `python3 src/main.py train --help`. All the hyperparameters options can be checked at `python3 -m monoloco.run train --help`.
### Hyperparameters tuning ### Hyperparameters tuning
Random search in log space is provided. An example: `python3 src/main.py train --hyp --multiplier 10 --r_seed 1`. Random search in log space is provided. An example: `python3 -m monoloco.run train --hyp --multiplier 10 --r_seed 1`.
One iteration of the multiplier includes 6 runs. One iteration of the multiplier includes 6 runs.
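For reference, random search in log space simply draws exponents uniformly and exponentiates them. A minimal sketch of the idea (the parameter names and ranges here are illustrative, not the repository's actual ones):

```python
import random

def sample_log_space(r_seed, multiplier=10, runs_per_iter=6):
    """Draw hyperparameters uniformly in log10 space."""
    random.seed(r_seed)
    samples = []
    for _ in range(multiplier * runs_per_iter):  # one multiplier iteration = 6 runs
        lr = 10 ** random.uniform(-4, -1)   # learning rate in [1e-4, 1e-1]
        wd = 10 ** random.uniform(-5, -2)   # weight decay in [1e-5, 1e-2]
        samples.append({'lr': lr, 'weight_decay': wd})
    return samples
```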
@@ -176,7 +176,7 @@ One iteration of the multiplier includes 6 runs.
Evaluate performances of the trained model on KITTI or Nuscenes Dataset. Evaluate performances of the trained model on KITTI or Nuscenes Dataset.
### 1) nuScenes ### 1) nuScenes
Evaluation on nuScenes is already provided during training. It is also possible to evaluate an existing model running Evaluation on nuScenes is already provided during training. It is also possible to evaluate an existing model running
`python src/main.py eval --dataset nuscenes --model <model to evaluate>` `python3 -m monoloco.run eval --dataset nuscenes --model <model to evaluate>`
### 2) KITTI ### 2) KITTI
### Baselines ### Baselines
@@ -186,7 +186,7 @@ and stereo Baselines:
[Mono3D](https://www.cs.toronto.edu/~urtasun/publications/chen_etal_cvpr16.pdf), [Mono3D](https://www.cs.toronto.edu/~urtasun/publications/chen_etal_cvpr16.pdf),
[3DOP](https://xiaozhichen.github.io/papers/nips15chen.pdf), [3DOP](https://xiaozhichen.github.io/papers/nips15chen.pdf),
[MonoDepth](https://arxiv.org/abs/1609.03677) and our [MonoDepth](https://arxiv.org/abs/1609.03677) and our
[Geometrical Baseline](src/eval/geom_baseline.py). [Geometrical Baseline](monoloco/eval/geom_baseline.py).
* **Mono3D**: download validation files from [here](http://3dimage.ee.tsinghua.edu.cn/cxz/mono3d) * **Mono3D**: download validation files from [here](http://3dimage.ee.tsinghua.edu.cn/cxz/mono3d)
and save them into `data/kitti/m3d` and save them into `data/kitti/m3d`
@@ -196,7 +196,7 @@ and save them into `data/kitti/3dop`
[here](https://github.com/Parrotlife/pedestrianDepth-baseline/tree/master/MonoDepth-PyTorch) [here](https://github.com/Parrotlife/pedestrianDepth-baseline/tree/master/MonoDepth-PyTorch)
and save them into `data/kitti/monodepth` and save them into `data/kitti/monodepth`
* **GeometricalBaseline**: A geometrical baseline comparison is provided. * **GeometricalBaseline**: A geometrical baseline comparison is provided.
The best average value for comparison can be created running `python src/main.py eval --geometric` The best average value for comparison can be created running `python3 -m monoloco.run eval --geometric`
#### Evaluation #### Evaluation
First the model preprocesses the joints starting from json annotations predicted from pifpaf, First the model preprocesses the joints starting from json annotations predicted from pifpaf,
@@ -205,7 +205,7 @@ in txt file with format comparable to other baseline.
Then the model performs evaluation. Then the model performs evaluation.
The following graph is obtained running: The following graph is obtained running:
`python3 src/main.py eval --dataset kitti --generate --model data/models/monoloco-190513-1437.pkl `python3 -m monoloco.run eval --dataset kitti --generate --model data/models/monoloco-190513-1437.pkl
--dir_ann <folder containing pifpaf annotations of KITTI images>` --dir_ann <folder containing pifpaf annotations of KITTI images>`
![kitti_evaluation](docs/results.png) ![kitti_evaluation](docs/results.png)

monoloco/__init__.py


@@ -1,38 +1,45 @@
"""Evaluate Monoloco code on KITTI dataset using ALE and ALP metrics""" """Evaluate Monoloco code on KITTI dataset using ALE and ALP metrics with the following baselines:
import os
import math
import logging
from collections import defaultdict
import datetime
from utils.iou import get_iou_matches
from utils.misc import get_task_error
from utils.kitti import check_conditions, get_category, split_training, parse_ground_truth
from visuals.results import print_results
class KittiEval:
"""
Evaluate Monoloco code and compare it with the following baselines:
- Mono3D - Mono3D
- 3DOP - 3DOP
- MonoDepth - MonoDepth
""" """
import os
import math
import logging
import datetime
from collections import defaultdict
from itertools import chain
from tabulate import tabulate
from ..utils.iou import get_iou_matches
from ..utils.misc import get_task_error, get_pixel_error
from ..utils.kitti import check_conditions, get_category, split_training, parse_ground_truth
from ..visuals.results import print_results
class EvalKitti:
logging.basicConfig(level=logging.INFO) logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
CLUSTERS = ('easy', 'moderate', 'hard', 'all', '6', '10', '15', '20', '25', '30', '40', '50', '>50') CLUSTERS = ('easy', 'moderate', 'hard', 'all', '6', '10', '15', '20', '25', '30', '40', '50', '>50')
dic_stds = defaultdict(lambda: defaultdict(list)) METHODS = ['m3d', 'geom', 'task_error', '3dop', 'our']
dic_stats = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(float)))) HEADERS = ['method', '<0.5', '<1m', '<2m', 'easy', 'moderate', 'hard', 'all']
dic_cnt = defaultdict(int) CATEGORIES = ['pedestrian', 'cyclist']
errors = defaultdict(lambda: defaultdict(list))
def __init__(self, thresh_iou_our=0.3, thresh_iou_m3d=0.3, thresh_conf_m3d=0.3, thresh_conf_our=0.3,
verbose=False, stereo=False):
def __init__(self, thresh_iou_our=0.3, thresh_iou_m3d=0.5, thresh_conf_m3d=0.5, thresh_conf_our=0.3):
self.dir_gt = os.path.join('data', 'kitti', 'gt') self.dir_gt = os.path.join('data', 'kitti', 'gt')
self.dir_m3d = os.path.join('data', 'kitti', 'm3d') self.dir_m3d = os.path.join('data', 'kitti', 'm3d')
self.dir_3dop = os.path.join('data', 'kitti', '3dop') self.dir_3dop = os.path.join('data', 'kitti', '3dop')
self.dir_md = os.path.join('data', 'kitti', 'monodepth') self.dir_md = os.path.join('data', 'kitti', 'monodepth')
self.dir_our = os.path.join('data', 'kitti', 'monoloco') self.dir_our = os.path.join('data', 'kitti', 'monoloco')
self.stereo = stereo
if self.stereo:
self.dir_our_stereo = os.path.join('data', 'kitti', 'monoloco_stereo')
self.METHODS.extend(['our_stereo', 'pixel_error'])
path_train = os.path.join('splits', 'kitti_train.txt') path_train = os.path.join('splits', 'kitti_train.txt')
path_val = os.path.join('splits', 'kitti_val.txt') path_val = os.path.join('splits', 'kitti_val.txt')
dir_logs = os.path.join('data', 'logs') dir_logs = os.path.join('data', 'logs')
@@ -41,106 +48,101 @@ class KittiEval:
now = datetime.datetime.now() now = datetime.datetime.now()
now_time = now.strftime("%Y%m%d-%H%M")[2:] now_time = now.strftime("%Y%m%d-%H%M")[2:]
self.path_results = os.path.join(dir_logs, 'eval-' + now_time + '.json') self.path_results = os.path.join(dir_logs, 'eval-' + now_time + '.json')
self.verbose = verbose
assert os.path.exists(self.dir_m3d) and os.path.exists(self.dir_our) \ assert os.path.exists(self.dir_m3d) and os.path.exists(self.dir_our) \
and os.path.exists(self.dir_3dop) and os.path.exists(self.dir_3dop)
self.dic_thresh_iou = {'m3d': thresh_iou_m3d, '3dop': thresh_iou_m3d, self.dic_thresh_iou = {'m3d': thresh_iou_m3d, '3dop': thresh_iou_m3d,
'md': thresh_iou_our, 'our': thresh_iou_our} 'md': thresh_iou_our, 'our': thresh_iou_our, 'our_stereo': thresh_iou_our}
self.dic_thresh_conf = {'m3d': thresh_conf_m3d, '3dop': thresh_conf_m3d, 'our': thresh_conf_our} self.dic_thresh_conf = {'m3d': thresh_conf_m3d, '3dop': thresh_conf_m3d,
'our': thresh_conf_our, 'our_stereo': thresh_conf_our}
# Extract validation images for evaluation # Extract validation images for evaluation
names_gt = tuple(os.listdir(self.dir_gt)) names_gt = tuple(os.listdir(self.dir_gt))
_, self.set_val = split_training(names_gt, path_train, path_val) _, self.set_val = split_training(names_gt, path_train, path_val)
# Define variables to save statistics
self.errors = None
self.dic_stds = None
self.dic_stats = None
self.dic_cnt = None
self.cnt_stereo_error = None
self.cnt_gt = 0
def run(self): def run(self):
"""Evaluate Monoloco performances on ALP and ALE metrics""" """Evaluate Monoloco performances on ALP and ALE metrics"""
# Iterate over each ground truth file in the training set for category in self.CATEGORIES:
cnt_gt = 0
for name in self.set_val:
path_gt = os.path.join(self.dir_gt, name)
path_m3d = os.path.join(self.dir_m3d, name)
path_our = os.path.join(self.dir_our, name)
path_3dop = os.path.join(self.dir_3dop, name)
path_md = os.path.join(self.dir_md, name)
# Iterate over each line of the gt file and save box location and distances # Initialize variables
out_gt = parse_ground_truth(path_gt) self.errors = defaultdict(lambda: defaultdict(list))
cnt_gt += len(out_gt[0]) self.dic_stds = defaultdict(lambda: defaultdict(list))
self.dic_stats = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(float))))
self.dic_cnt = defaultdict(int)
self.cnt_gt = 0
self.cnt_stereo_error = 0
# Extract annotations for the same file # Iterate over each ground truth file in the training set
if out_gt[0]: for name in self.set_val:
out_m3d = self._parse_txts(path_m3d, method='m3d') path_gt = os.path.join(self.dir_gt, name)
out_3dop = self._parse_txts(path_3dop, method='3dop') path_m3d = os.path.join(self.dir_m3d, name)
out_md = self._parse_txts(path_md, method='md') path_our = os.path.join(self.dir_our, name)
out_our = self._parse_txts(path_our, method='our') if self.stereo:
path_our_stereo = os.path.join(self.dir_our_stereo, name)
path_3dop = os.path.join(self.dir_3dop, name)
path_md = os.path.join(self.dir_md, name)
# Compute the error with ground truth # Iterate over each line of the gt file and save box location and distances
self._estimate_error(out_gt, out_m3d, method='m3d') out_gt = parse_ground_truth(path_gt, category)
self._estimate_error(out_gt, out_3dop, method='3dop') self.cnt_gt += len(out_gt[0])
self._estimate_error(out_gt, out_md, method='md')
self._estimate_error(out_gt, out_our, method='our')
# Iterate over all the files together to find a pool of common annotations # Extract annotations for the same file
self._compare_error(out_gt, out_m3d, out_3dop, out_md, out_our) if out_gt[0]:
out_m3d = self._parse_txts(path_m3d, category, method='m3d')
out_3dop = self._parse_txts(path_3dop, category, method='3dop')
# out_md = self._parse_txts(path_md, category, method='md')
out_md = out_m3d
out_our = self._parse_txts(path_our, category, method='our')
out_our_stereo = self._parse_txts(path_our_stereo, category, method='our') if self.stereo else []
# Update statistics of errors and uncertainty # Compute the error with ground truth
for key in self.errors: self._estimate_error(out_gt, out_m3d, method='m3d')
add_true_negatives(self.errors[key], cnt_gt) self._estimate_error(out_gt, out_3dop, method='3dop')
for clst in self.CLUSTERS[:-2]: # M3d and pifpaf does not have annotations above 40 meters # self._estimate_error(out_gt, out_md, method='md')
get_statistics(self.dic_stats['test'][key][clst], self.errors[key][clst], self.dic_stds[clst], key) self._estimate_error(out_gt, out_our, method='our')
if self.stereo:
self._estimate_error(out_gt, out_our_stereo, method='our_stereo')
# Show statistics # Iterate over all the files together to find a pool of common annotations
print(" Number of GT annotations: {} ".format(cnt_gt)) self._compare_error(out_gt, out_m3d, out_3dop, out_md, out_our, out_our_stereo)
for key in self.errors:
if key in ['our', 'm3d', '3dop']:
print(" Number of {} annotations with confidence >= {} : {} "
.format(key, self.dic_thresh_conf[key], self.dic_cnt[key]))
for clst in self.CLUSTERS[:-9]: # Update statistics of errors and uncertainty
print(" {} Average error in cluster {}: {:.2f} with a max error of {:.1f}, " for key in self.errors:
"for {} annotations" add_true_negatives(self.errors[key], self.cnt_gt)
.format(key, clst, self.dic_stats['test'][key][clst]['mean'], for clst in self.CLUSTERS[:-2]: # M3d and pifpaf does not have annotations above 40 meters
self.dic_stats['test'][key][clst]['max'], get_statistics(self.dic_stats['test'][key][clst], self.errors[key][clst], self.dic_stds[clst], key)
self.dic_stats['test'][key][clst]['cnt']))
if key == 'our': # Show statistics
print("% of annotation inside the confidence interval: {:.1f} %, " print('\n' + category.upper() + ':')
"of which {:.1f} % at higher risk" self.show_statistics()
.format(100 * self.dic_stats['test'][key][clst]['interval'],
100 * self.dic_stats['test'][key][clst]['at_risk']))
for perc in ['<0.5m', '<1m', '<2m']:
print("{} Instances with error {}: {:.2f} %"
.format(key, perc, 100 * sum(self.errors[key][perc])/len(self.errors[key][perc])))
print("\n Number of matched annotations: {:.1f} %".format(self.errors[key]['matched']))
print("-"*100)
print("\n Annotations inside the confidence interval: {:.1f} %"
.format(100 * self.dic_stats['test']['our']['all']['interval']))
print("precision 1: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_1']))
print("precision 2: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_2']))
def printer(self, show): def printer(self, show):
print_results(self.dic_stats, show) print_results(self.dic_stats, show)
def _parse_txts(self, path, method): def _parse_txts(self, path, category, method):
boxes = [] boxes = []
dds = [] dds = []
stds_ale = [] stds_ale = []
stds_epi = [] stds_epi = []
dds_geom = [] dds_geom = []
# xyzs = []
# xy_kps = []
# Iterate over each line of the txt file # Iterate over each line of the txt file
if method in ['3dop', 'm3d']: if method in ['3dop', 'm3d']:
try: try:
with open(path, "r") as ff: with open(path, "r") as ff:
for line in ff: for line in ff:
if check_conditions(line, thresh=self.dic_thresh_conf[method], mode=method): if check_conditions(line, category, method=method, thresh=self.dic_thresh_conf[method]):
boxes.append([float(x) for x in line.split()[4:8]]) boxes.append([float(x) for x in line.split()[4:8]])
loc = ([float(x) for x in line.split()[11:14]]) loc = ([float(x) for x in line.split()[11:14]])
dds.append(math.sqrt(loc[0] ** 2 + loc[1] ** 2 + loc[2] ** 2)) dds.append(math.sqrt(loc[0] ** 2 + loc[1] ** 2 + loc[2] ** 2))
@@ -155,7 +157,7 @@ class KittiEval:
with open(path, "r") as ff: with open(path, "r") as ff:
for line in ff: for line in ff:
box = [float(x[:-1]) for x in line.split()[0:4]] box = [float(x[:-1]) for x in line.split()[0:4]]
delta_h = (box[3] - box[1]) / 10 delta_h = (box[3] - box[1]) / 10 # TODO Add new value
delta_w = (box[2] - box[0]) / 10 delta_w = (box[2] - box[0]) / 10
assert delta_h > 0 and delta_w > 0, "Bounding box <=0" assert delta_h > 0 and delta_w > 0, "Bounding box <=0"
box[0] -= delta_w box[0] -= delta_w
@@ -178,13 +180,14 @@ class KittiEval:
for line_our in file_lines[:-1]: for line_our in file_lines[:-1]:
line_list = [float(x) for x in line_our.split()] line_list = [float(x) for x in line_our.split()]
if check_conditions(line_list, thresh=self.dic_thresh_conf[method], mode=method): if check_conditions(line_list, category, method=method, thresh=self.dic_thresh_conf[method]):
boxes.append(line_list[:4]) boxes.append(line_list[:4])
dds.append(line_list[8]) dds.append(line_list[8])
stds_ale.append(line_list[9]) stds_ale.append(line_list[9])
stds_epi.append(line_list[10]) stds_epi.append(line_list[10])
dds_geom.append(line_list[11]) dds_geom.append(line_list[11])
self.dic_cnt[method] += 1 self.dic_cnt[method] += 1
self.dic_cnt['geom'] += 1
# kk_list = [float(x) for x in file_lines[-1].split()] # kk_list = [float(x) for x in file_lines[-1].split()]
@@ -196,8 +199,8 @@ class KittiEval:
def _estimate_error(self, out_gt, out, method): def _estimate_error(self, out_gt, out, method):
"""Estimate localization error""" """Estimate localization error"""
boxes_gt, _, dds_gt, truncs_gt, occs_gt = out_gt boxes_gt, _, dds_gt, zzs_gt, truncs_gt, occs_gt = out_gt
if method == 'our': if method[:3] == 'our':
boxes, dds, stds_ale, stds_epi, dds_geom = out boxes, dds, stds_ale, stds_epi, dds_geom = out
else: else:
boxes, dds = out boxes, dds = out
@@ -208,19 +211,28 @@ class KittiEval:
# Update error if match is found # Update error if match is found
cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt]) cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt])
self.update_errors(dds[idx], dds_gt[idx_gt], cat, self.errors[method]) self.update_errors(dds[idx], dds_gt[idx_gt], cat, self.errors[method])
if method == 'our': if method == 'our':
self.update_errors(dds_geom[idx], dds_gt[idx_gt], cat, self.errors['geom']) self.update_errors(dds_geom[idx], dds_gt[idx_gt], cat, self.errors['geom'])
self.update_uncertainty(stds_ale[idx], stds_epi[idx], dds[idx], dds_gt[idx_gt], cat) self.update_uncertainty(stds_ale[idx], stds_epi[idx], dds[idx], dds_gt[idx_gt], cat)
dd_task_error = dds_gt[idx_gt] + (get_task_error(dds_gt[idx_gt], mode='mad'))**2
self.update_errors(dd_task_error, dds_gt[idx_gt], cat, self.errors['task_error'])
def _compare_error(self, out_gt, out_m3d, out_3dop, out_md, out_our): elif method == 'our_stereo':
dd_pixel_error = get_pixel_error(dds_gt[idx_gt], zzs_gt[idx_gt])
self.update_errors(dd_pixel_error, dds_gt[idx_gt], cat, self.errors['pixel_error'])
def _compare_error(self, out_gt, out_m3d, out_3dop, out_md, out_our, out_our_stereo):
"""Compare the error for a pool of instances commonly matched by all methods""" """Compare the error for a pool of instances commonly matched by all methods"""
# Extract outputs of each method # Extract outputs of each method
boxes_gt, _, dds_gt, truncs_gt, occs_gt = out_gt boxes_gt, _, dds_gt, zzs_gt, truncs_gt, occs_gt = out_gt
boxes_m3d, dds_m3d = out_m3d boxes_m3d, dds_m3d = out_m3d
boxes_3dop, dds_3dop = out_3dop boxes_3dop, dds_3dop = out_3dop
boxes_md, dds_md = out_md boxes_md, dds_md = out_md
boxes_our, dds_our, _, _, dds_geom = out_our boxes_our, dds_our, _, _, dds_geom = out_our
if self.stereo:
boxes_our_stereo, dds_our_stereo, _, _, dds_geom_stereo = out_our_stereo
# Find IoU matches # Find IoU matches
matches_our = get_iou_matches(boxes_our, boxes_gt, self.dic_thresh_iou['our']) matches_our = get_iou_matches(boxes_our, boxes_gt, self.dic_thresh_iou['our'])
@@ -234,12 +246,25 @@ class KittiEval:
if check: if check:
cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt]) cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt])
dd_gt = dds_gt[idx_gt] dd_gt = dds_gt[idx_gt]
self.update_errors(dds_our[idx], dd_gt, cat, self.errors['our_merged']) self.update_errors(dds_our[idx], dd_gt, cat, self.errors['our_merged'])
self.update_errors(dds_geom[idx], dd_gt, cat, self.errors['geom_merged']) self.update_errors(dds_geom[idx], dd_gt, cat, self.errors['geom_merged'])
self.update_errors(dd_gt + get_task_error(dd_gt, mode='mad'),
dd_gt, cat, self.errors['task_error_merged'])
self.update_errors(dds_m3d[indices[0]], dd_gt, cat, self.errors['m3d_merged']) self.update_errors(dds_m3d[indices[0]], dd_gt, cat, self.errors['m3d_merged'])
self.update_errors(dds_3dop[indices[1]], dd_gt, cat, self.errors['3dop_merged']) self.update_errors(dds_3dop[indices[1]], dd_gt, cat, self.errors['3dop_merged'])
self.update_errors(dds_md[indices[2]], dd_gt, cat, self.errors['md_merged']) self.update_errors(dds_md[indices[2]], dd_gt, cat, self.errors['md_merged'])
self.dic_cnt['merged'] += 1 if self.stereo:
self.update_errors(dds_our_stereo[idx], dd_gt, cat, self.errors['our_stereo_merged'])
dd_pixel = get_pixel_error(dd_gt, zzs_gt[idx_gt])
self.update_errors(dd_pixel, dd_gt, cat, self.errors['pixel_error_merged'])
error = abs(dds_our[idx] - dd_gt)
error_stereo = abs(dds_our_stereo[idx] - dd_gt)
if error_stereo > (error + 0.1):
self.cnt_stereo_error += 1
for key in self.METHODS:
self.dic_cnt[key + '_merged'] += 1
def update_errors(self, dd, dd_gt, cat, errors): def update_errors(self, dd, dd_gt, cat, errors):
"""Compute and save errors between a single box and the gt box which match""" """Compute and save errors between a single box and the gt box which match"""
@@ -320,21 +345,74 @@ class KittiEval:
self.dic_stds[clst]['prec_2'].append(prec_2) self.dic_stds[clst]['prec_2'].append(prec_2)
self.dic_stds[cat]['prec_2'].append(prec_2) self.dic_stds[cat]['prec_2'].append(prec_2)
def show_statistics(self):
print('-'*90)
alp = [[str(100 * average(self.errors[key][perc]))[:4]
for perc in ['<0.5m', '<1m', '<2m']]
for key in self.METHODS]
ale = [[str(self.dic_stats['test'][key + '_merged'][clst]['mean'])[:4] + ' (' +
str(self.dic_stats['test'][key][clst]['mean'])[:4] + ')'
for clst in self.CLUSTERS[:4]]
for key in self.METHODS]
results = [[key] + alp[idx] + ale[idx] for idx, key in enumerate(self.METHODS)]
print(tabulate(results, headers=self.HEADERS))
print('-'*90 + '\n')
if self.verbose:
methods_all = list(chain.from_iterable((method, method + '_merged') for method in self.METHODS))
for key in methods_all:
for clst in self.CLUSTERS[:4]:
print(" {} Average error in cluster {}: {:.2f} with a max error of {:.1f}, "
"for {} annotations"
.format(key, clst, self.dic_stats['test'][key][clst]['mean'],
self.dic_stats['test'][key][clst]['max'],
self.dic_stats['test'][key][clst]['cnt']))
if key == 'our':
print("% of annotation inside the confidence interval: {:.1f} %, "
"of which {:.1f} % at higher risk"
.format(self.dic_stats['test'][key][clst]['interval'],
self.dic_stats['test'][key][clst]['at_risk']))
for perc in ['<0.5m', '<1m', '<2m']:
print("{} Instances with error {}: {:.2f} %"
.format(key, perc, 100 * average(self.errors[key][perc])))
print("\nMatched annotations: {:.1f} %".format(self.errors[key]['matched']))
print(" Detected annotations : {}/{} ".format(self.dic_cnt[key], self.cnt_gt))
print("-" * 100)
print("\n Annotations inside the confidence interval: {:.1f} %"
.format(self.dic_stats['test']['our']['all']['interval']))
print("precision 1: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_1']))
print("precision 2: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_2']))
if self.stereo:
print("Stereo error greater than mono: {:.1f} %"
.format(100 * self.cnt_stereo_error / self.dic_cnt['our_merged']))
def get_statistics(dic_stats, errors, dic_stds, key): def get_statistics(dic_stats, errors, dic_stds, key):
"""Update statistics of a cluster""" """Update statistics of a cluster"""
dic_stats['mean'] = sum(errors) / float(len(errors)) try:
dic_stats['max'] = max(errors) dic_stats['mean'] = average(errors)
dic_stats['cnt'] = len(errors) dic_stats['max'] = max(errors)
dic_stats['cnt'] = len(errors)
except (ZeroDivisionError, ValueError):
dic_stats['mean'] = 0.
dic_stats['max'] = 0.
dic_stats['cnt'] = 0.
if key == 'our': if key == 'our':
dic_stats['std_ale'] = sum(dic_stds['ale']) / float(len(dic_stds['ale'])) dic_stats['std_ale'] = average(dic_stds['ale'])
dic_stats['std_epi'] = sum(dic_stds['epi']) / float(len(dic_stds['epi'])) dic_stats['std_epi'] = average(dic_stds['epi'])
dic_stats['interval'] = sum(dic_stds['interval']) / float(len(dic_stds['interval'])) dic_stats['interval'] = average(dic_stds['interval'])
dic_stats['at_risk'] = sum(dic_stds['at_risk']) / float(len(dic_stds['at_risk'])) dic_stats['at_risk'] = average(dic_stds['at_risk'])
dic_stats['prec_1'] = sum(dic_stds['prec_1']) / float(len(dic_stds['prec_1'])) dic_stats['prec_1'] = average(dic_stds['prec_1'])
dic_stats['prec_2'] = sum(dic_stds['prec_2']) / float(len(dic_stds['prec_2'])) dic_stats['prec_2'] = average(dic_stds['prec_2'])
def add_true_negatives(err, cnt_gt): def add_true_negatives(err, cnt_gt):
@@ -379,3 +457,8 @@ def extract_indices(idx_to_check, *args):
checks[idx_method] = True checks[idx_method] = True
indices.append(idx_pred) indices.append(idx_pred)
return all(checks), indices return all(checks), indices
def average(my_list):
"""calculate mean of a list"""
return sum(my_list) / len(my_list)
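The summary that `show_statistics` prints is rendered with `tabulate`, combining ALP percentages and ALE means per cluster. A standalone sketch of the same layout (the numbers are invented placeholders; the ALE columns show the merged mean with the unmerged one in parentheses, mirroring the `key + '_merged'` lookup above):

```python
from tabulate import tabulate

HEADERS = ['method', '<0.5', '<1m', '<2m', 'easy', 'moderate', 'hard', 'all']
results = [
    ['m3d', '13.2', '35.7', '66.9', '2.13 (2.20)', '2.85 (2.91)', '3.68 (3.75)', '2.94 (3.01)'],
    ['our', '28.9', '58.2', '85.6', '0.75 (0.81)', '1.19 (1.27)', '2.24 (2.40)', '1.43 (1.52)'],
]
print(tabulate(results, headers=HEADERS))
```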


@@ -0,0 +1,234 @@
"""Run monoloco over all the pifpaf joints of KITTI images
and extract and save the annotations in txt files"""
import math
import os
import glob
import json
import shutil
import itertools
import copy
import numpy as np
import torch
from ..predict.network import MonoLoco
from ..eval.geom_baseline import compute_distance
from ..utils.kitti import get_calibration
from ..utils.pifpaf import preprocess_pif
from ..utils.camera import xyz_from_distance, get_keypoints, pixel_to_camera
from ..utils.stereo import depth_from_disparity
class GenerateKitti:
def __init__(self, model, dir_ann, p_dropout=0.2, n_dropout=0):
# Load monoloco
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
self.monoloco = MonoLoco(model_path=model, device=device, n_dropout=n_dropout, p_dropout=p_dropout)
self.dir_out = os.path.join('data', 'kitti', 'monoloco')
self.dir_ann = dir_ann
# List of images
self.list_basename = factory_basename(dir_ann)
self.dir_kk = os.path.join('data', 'kitti', 'calib')
def run_mono(self):
"""Run Monoloco and save txt files for KITTI evaluation"""
cnt_ann = cnt_file = cnt_no_file = 0
dir_out = os.path.join('data', 'kitti', 'monoloco')
# Remove the output directory if it already exists (avoid residual txt files)
if os.path.exists(dir_out):
shutil.rmtree(dir_out)
os.makedirs(dir_out)
print("\nCreated empty output directory for txt files")
# Run monoloco over the list of images
for basename in self.list_basename:
path_calib = os.path.join(self.dir_kk, basename + '.txt')
annotations, kk, tt = factory_file(path_calib, self.dir_ann, basename)
boxes, keypoints = preprocess_pif(annotations, im_size=(1242, 374))
if not keypoints:
cnt_no_file += 1
continue
else:
# Run the network and the geometric baseline
outputs, varss = self.monoloco.forward(keypoints, kk)
dds_geom = eval_geometric(keypoints, kk, average_y=0.48)
# Save the file
uv_centers = get_keypoints(keypoints, mode='bottom') # Kitti uses the bottom center to calculate depth
xy_centers = pixel_to_camera(uv_centers, kk, 1)
outputs = outputs.detach().cpu()
zzs = xyz_from_distance(outputs[:, 0:1], xy_centers)[:, 2].tolist()
all_outputs = [outputs.detach().cpu(), varss.detach().cpu(), dds_geom, zzs]
all_inputs = [boxes, xy_centers]
all_params = [kk, tt]
path_txt = os.path.join(dir_out, basename + '.txt')
save_txts(path_txt, all_inputs, all_outputs, all_params)
# Update counting
cnt_ann += len(boxes)
cnt_file += 1
print("Saved in {} txt {} annotations. Not found {} images\n".format(cnt_file, cnt_ann, cnt_no_file))
def run_stereo(self):
"""Run monoloco on left and right images and alculate disparity if a match is found"""
cnt_ann = cnt_file = cnt_no_file = cnt_no_stereo = cnt_disparity = 0
dir_out = os.path.join('data', 'kitti', 'monoloco_stereo')
# Remove the output directory if it already exists (avoid residual txt files)
if os.path.exists(dir_out):
shutil.rmtree(dir_out)
os.makedirs(dir_out)
print("Created empty output directory for txt STEREO files")
for basename in self.list_basename:
path_calib = os.path.join(self.dir_kk, basename + '.txt')
stereo = True
for mode in ['left', 'right']:
annotations, kk, tt = factory_file(path_calib, self.dir_ann, basename, mode=mode)
boxes, keypoints = preprocess_pif(annotations, im_size=(1242, 374))
if not keypoints and mode == 'left':
cnt_no_file += 1
break
elif not keypoints and mode == 'right':
stereo = False
else:
# Run the network and the geometric baseline
outputs, varss = self.monoloco.forward(keypoints, kk)
dds_geom = eval_geometric(keypoints, kk, average_y=0.48)
uv_centers = get_keypoints(keypoints, mode='bottom') # Kitti uses the bottom to calculate depth
xy_centers = pixel_to_camera(uv_centers, kk, 1)
if mode == 'left':
outputs_l = outputs.detach().cpu()
varss_l = varss.detach().cpu()
zzs_l = xyz_from_distance(outputs_l[:, 0:1], xy_centers)[:, 2].tolist()
kps_l = copy.deepcopy(keypoints)
boxes_l = boxes
xy_centers_l = xy_centers
dds_geom_l = dds_geom
kk_l = kk
tt_l = tt
else:
kps_r = copy.deepcopy(keypoints)
if stereo:
zzs, cnt = depth_from_disparity(zzs_l, kps_l, kps_r)
cnt_disparity += cnt
else:
zzs = zzs_l
# Save the file
all_outputs = [outputs_l, varss_l, dds_geom_l, zzs]
all_inputs = [boxes_l, xy_centers_l]
all_params = [kk_l, tt_l]
path_txt = os.path.join(dir_out, basename + '.txt')
save_txts(path_txt, all_inputs, all_outputs, all_params)
# Update counting
cnt_ann += len(boxes_l)
cnt_file += 1
# Print statistics
print("Saved in {} txt {} annotations. Not found {} images."
.format(cnt_file, cnt_ann, cnt_no_file))
print("Annotations corrected using stereo: {:.1f}%, not found {} stereo files"
.format(cnt_disparity / cnt_ann * 100, cnt_no_stereo))
def save_txts(path_txt, all_inputs, all_outputs, all_params):
outputs, varss, dds_geom, zzs = all_outputs[:]
uv_boxes, xy_centers = all_inputs[:]
kk, tt = all_params[:]
with open(path_txt, "w+") as ff:
for idx in range(outputs.shape[0]):
xx = float(xy_centers[idx][0]) * zzs[idx] + tt[0]
yy = float(xy_centers[idx][1]) * zzs[idx] + tt[1]
zz = zzs[idx] + tt[2]
dd = math.sqrt(xx ** 2 + yy ** 2 + zz ** 2)
cam_0 = [xx, yy, zz, dd]
for el in uv_boxes[idx][:]:
ff.write("%s " % el)
for el in cam_0:
ff.write("%s " % el)
ff.write("%s " % float(outputs[idx][1]))
ff.write("%s " % float(varss[idx]))
ff.write("%s " % dds_geom[idx])
ff.write("\n")
# Save intrinsic matrix in the last row
for kk_el in itertools.chain(*kk): # Flatten a list of lists
ff.write("%f " % kk_el)
ff.write("\n")
def factory_basename(dir_ann):
""" Return all the basenames in the annotations folder"""
list_ann = glob.glob(os.path.join(dir_ann, '*.json'))
list_basename = [os.path.basename(x).split('.')[0] for x in list_ann]
assert list_basename, " Missing json annotations file to create txt files for KITTI datasets"
return list_basename
def factory_file(path_calib, dir_ann, basename, mode='left'):
"""Choose the annotation and the calibration files. Stereo option with ite = 1"""
assert mode in ('left', 'right')
p_left, p_right = get_calibration(path_calib)
if mode == 'left':
kk, tt = p_left[:]
path_ann = os.path.join(dir_ann, basename + '.png.pifpaf.json')
else:
kk, tt = p_right[:]
path_ann = os.path.join(dir_ann + '_right', basename + '.png.pifpaf.json')
try:
with open(path_ann, 'r') as f:
annotations = json.load(f)
except FileNotFoundError:
annotations = []
return annotations, kk, tt
def eval_geometric(keypoints, kk, average_y=0.48):
""" Evaluate geometric distance"""
dds_geom = []
uv_centers = get_keypoints(keypoints, mode='center')
uv_shoulders = get_keypoints(keypoints, mode='shoulder')
uv_hips = get_keypoints(keypoints, mode='hip')
xy_centers = pixel_to_camera(uv_centers, kk, 1)
xy_shoulders = pixel_to_camera(uv_shoulders, kk, 1)
xy_hips = pixel_to_camera(uv_hips, kk, 1)
for idx, xy_center in enumerate(xy_centers):
zz = compute_distance(xy_shoulders[idx], xy_hips[idx], average_y)
xyz_center = np.array([xy_center[0], xy_center[1], zz])
dd_geom = float(np.linalg.norm(xyz_center))
dds_geom.append(dd_geom)
return dds_geom
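Neither `depth_from_disparity` nor `get_pixel_error` is shown in this diff; both presumably build on the pinhole-stereo relation z = f * B / d. A sketch under that assumption (the matching heuristic, the thresholds and the hardcoded focal length are illustrative, not the repository's):

```python
import numpy as np

FOCAL = 721.5    # px, a typical KITTI focal length (in practice read from the calib files)
BASELINE = 0.54  # m, the KITTI stereo rig baseline

def get_pixel_error_sketch(dd_gt, zz_gt):
    """Distance estimate corrupted by a one-pixel disparity error at depth zz_gt."""
    return dd_gt + zz_gt ** 2 / (FOCAL * BASELINE)

def depth_from_disparity_sketch(zzs_mono, kps_left, kps_right):
    """Replace monocular depths with stereo ones when a right-image pose matches."""
    zzs, cnt = [], 0
    for zz_mono, kp_l in zip(zzs_mono, kps_left):
        xs_l = np.array(kp_l[0])  # poses are [x_list, y_list, conf_list]
        best_z = None
        for kp_r in kps_right:
            disparity = float(np.median(xs_l - np.array(kp_r[0])))
            if disparity > 0.5:  # left x must exceed right x for a valid match
                zz_stereo = FOCAL * BASELINE / disparity
                if best_z is None or abs(zz_stereo - zz_mono) < abs(best_z - zz_mono):
                    best_z = zz_stereo
        if best_z is not None and abs(best_z - zz_mono) < 3.:  # keep plausible corrections only
            zzs.append(best_z)
            cnt += 1
        else:
            zzs.append(zz_mono)
    return zzs, cnt
```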


@@ -6,12 +6,10 @@ from collections import defaultdict
import numpy as np import numpy as np
from utils.camera import pixel_to_camera, get_keypoints from ..utils.camera import pixel_to_camera, get_keypoints
AVERAGE_Y = 0.48 AVERAGE_Y = 0.48
CLUSTERS = ['10', '20', '30', 'all'] CLUSTERS = ['10', '20', '30', 'all']
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def geometric_baseline(joints): def geometric_baseline(joints):
@@ -30,6 +28,8 @@ def geometric_baseline(joints):
'right_ankle'] 'right_ankle']
""" """
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
cnt_tot = 0 cnt_tot = 0
dic_dist = defaultdict(lambda: defaultdict(list)) dic_dist = defaultdict(lambda: defaultdict(list))
@@ -100,7 +100,7 @@ def compute_distance(xyz_norm_1, xyz_norm_2, average_y, mode='average', dy_met=0
1. knowing specific height of the annotation (head-ankle) dy_met 1. knowing specific height of the annotation (head-ankle) dy_met
2. using mean height of people (average_y) 2. using mean height of people (average_y)
""" """
assert mode == 'average' or mode == 'real' assert mode in ('average', 'real')
x1 = float(xyz_norm_1[0]) x1 = float(xyz_norm_1[0])
y1 = float(xyz_norm_1[1]) y1 = float(xyz_norm_1[1])
@@ -115,13 +115,13 @@
cc = -dy_met cc = -dy_met
# Solving the linear system Ax = b # Solving the linear system Ax = b
Aa = np.array([[y1, 0, -xx], matrix = np.array([[y1, 0, -xx],
[0, -y1, 1], [0, -y1, 1],
[y2, 0, -xx], [y2, 0, -xx],
[0, -y2, 1]]) [0, -y2, 1]])
bb = np.array([cc * xx, -cc, 0, 0]).reshape(4, 1) bb = np.array([cc * xx, -cc, 0, 0]).reshape(4, 1)
xx = np.linalg.lstsq(Aa, bb, rcond=None) xx = np.linalg.lstsq(matrix, bb, rcond=None)
z_met = abs(np.float(xx[0][1])) # Abs take into account specularity behind the observer z_met = abs(np.float(xx[0][1])) # Abs take into account specularity behind the observer
return z_met return z_met
@@ -160,7 +160,7 @@ def calculate_heights(heights, mode):
Compute statistics of heights based on the distance Compute statistics of heights based on the distance
""" """
assert mode == 'mean' or mode == 'std' or mode == 'max' assert mode in ('mean', 'std', 'max')
heights_fin = {} heights_fin = {}
head_shoulder = np.array(heights['shoulder']) - np.array(heights['head']) head_shoulder = np.array(heights['shoulder']) - np.array(heights['head'])
@@ -193,4 +193,3 @@ def calculate_error(dic_errors):
for clst in dic_errors: for clst in dic_errors:
errors[clst] = np.float(np.mean(np.array(dic_errors[clst]))) errors[clst] = np.float(np.mean(np.array(dic_errors[clst])))
return errors return errors
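A usage sketch for `compute_distance` above, with illustrative normalized coordinates of the kind `pixel_to_camera` produces:

```python
import numpy as np

# shoulder and hip of one person in normalized camera coordinates (x/z, y/z, 1)
xy_shoulder = np.array([0.05, -0.10, 1.])
xy_hip = np.array([0.05, -0.02, 1.])
zz = compute_distance(xy_shoulder, xy_hip, average_y=0.48)  # depth in meters, roughly 0.48 / 0.08
```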


@@ -2,7 +2,7 @@
import json import json
import os import os
from openpifpaf import show from openpifpaf import show
from visuals.printer import Printer from ..visuals.printer import Printer
def factory_for_gt(im_size, name=None, path_gt=None): def factory_for_gt(im_size, name=None, path_gt=None):
@@ -24,7 +24,7 @@ def factory_for_gt(im_size, name=None, path_gt=None):
dic_gt = None dic_gt = None
x_factor = im_size[0] / 1600 x_factor = im_size[0] / 1600
y_factor = im_size[1] / 900 y_factor = im_size[1] / 900
pixel_factor = (x_factor + y_factor) / 2 pixel_factor = (x_factor + y_factor) / 2 # TODO remove and check it
if im_size[0] / im_size[1] > 2.5: if im_size[0] / im_size[1] > 2.5:
kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]] # Kitti calibration kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]] # Kitti calibration
else: else:
@@ -45,7 +45,7 @@ def factory_outputs(args, images_outputs, output_path, pifpaf_outputs, dic_out=N
keypoint_sets, scores, pifpaf_out = pifpaf_outputs[:] keypoint_sets, scores, pifpaf_out = pifpaf_outputs[:]
# Visualizer # Visualizer
keypoint_painter = show.KeypointPainter(show_box=True) keypoint_painter = show.KeypointPainter(show_box=False)
skeleton_painter = show.KeypointPainter(show_box=False, color_connections=True, skeleton_painter = show.KeypointPainter(show_box=False, color_connections=True,
markersize=1, linewidth=4) markersize=1, linewidth=4)
@@ -79,7 +79,8 @@ def factory_outputs(args, images_outputs, output_path, pifpaf_outputs, dic_out=N
printer = Printer(images_outputs[1], output_path, kk, output_types=args.output_types printer = Printer(images_outputs[1], output_path, kk, output_types=args.output_types
, z_max=args.z_max, epistemic=epistemic) , z_max=args.z_max, epistemic=epistemic)
figures, axes = printer.factory_axes() figures, axes = printer.factory_axes()
printer.draw(figures, axes, dic_out, images_outputs[1], save=True, show=args.show) printer.draw(figures, axes, dic_out, images_outputs[1], draw_box=args.draw_box,
save=True, show=args.show)
if 'json' in args.output_types: if 'json' in args.output_types:
with open(os.path.join(output_path + '.monoloco.json'), 'w') as ff: with open(os.path.join(output_path + '.monoloco.json'), 'w') as ff:


@@ -8,10 +8,10 @@ from collections import defaultdict
import torch import torch
from utils.iou import get_iou_matches, reorder_matches from ..utils.iou import get_iou_matches, reorder_matches
from utils.camera import get_keypoints, pixel_to_camera, xyz_from_distance from ..utils.camera import get_keypoints, pixel_to_camera, xyz_from_distance
from utils.monoloco import get_monoloco_inputs, unnormalize_bi, laplace_sampling from ..utils.network import get_monoloco_inputs, unnormalize_bi, laplace_sampling
from models.architectures import LinearModel from ..train.architectures import LinearModel
class MonoLoco: class MonoLoco:
@@ -64,7 +64,7 @@ class MonoLoco:
return outputs, varss return outputs, varss
@staticmethod @staticmethod
def post_process(outputs, varss, boxes, keypoints, kk, dic_gt, iou_min=0.25): def post_process(outputs, varss, boxes, keypoints, kk, dic_gt, iou_min=0.3):
"""Post process monoloco to output final dictionary with all information for visualizations""" """Post process monoloco to output final dictionary with all information for visualizations"""
dic_out = defaultdict(list) dic_out = defaultdict(list)
@@ -74,6 +74,7 @@ class MonoLoco:
if dic_gt: if dic_gt:
boxes_gt, dds_gt = dic_gt['boxes'], dic_gt['dds'] boxes_gt, dds_gt = dic_gt['boxes'], dic_gt['dds']
matches = get_iou_matches(boxes, boxes_gt, thresh=iou_min) matches = get_iou_matches(boxes, boxes_gt, thresh=iou_min)
print("found {} matches with ground-truth".format(len(matches)))
else: else:
matches = [(idx, idx) for idx, _ in enumerate(boxes)] # Replicate boxes matches = [(idx, idx) for idx, _ in enumerate(boxes)] # Replicate boxes
@@ -98,6 +99,7 @@ class MonoLoco:
xyz_real = xyz_from_distance(dd_real, xy_centers[idx]) xyz_real = xyz_from_distance(dd_real, xy_centers[idx])
xyz_pred = xyz_from_distance(dd_pred, xy_centers[idx]) xyz_pred = xyz_from_distance(dd_pred, xy_centers[idx])
dic_out['boxes'].append(box) dic_out['boxes'].append(box)
dic_out['boxes_gt'].append(boxes_gt[idx_gt] if dic_gt else boxes[idx])
dic_out['dds_real'].append(dd_real) dic_out['dds_real'].append(dd_real)
dic_out['dds_pred'].append(dd_pred) dic_out['dds_pred'].append(dd_pred)
dic_out['stds_ale'].append(ale) dic_out['stds_ale'].append(ale)
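`get_iou_matches` is imported from `..utils.iou` and not shown in this diff; a minimal greedy version for reference (a sketch, not necessarily the repository's exact logic):

```python
def get_iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0., x2 - x1) * max(0., y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def get_iou_matches_sketch(boxes, boxes_gt, thresh=0.3):
    """Greedily match each predicted box to the unused gt box with highest IoU."""
    matches, used = [], set()
    for idx, box in enumerate(boxes):
        ious = [get_iou(box, box_gt) if ii not in used else 0.
                for ii, box_gt in enumerate(boxes_gt)]
        if ious and max(ious) >= thresh:
            idx_gt = ious.index(max(ious))
            matches.append((idx, idx_gt))
            used.add(idx_gt)
    return matches
```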


@@ -107,4 +107,3 @@ class PifPaf:
for kps in keypoint_sets for kps in keypoint_sets
] ]
return keypoint_sets, scores, pifpaf_out return keypoint_sets, scores, pifpaf_out


@@ -4,10 +4,10 @@ from PIL import Image
import torch import torch
from predict.pifpaf import PifPaf, ImageList from ..predict.pifpaf import PifPaf, ImageList
from predict.monoloco import MonoLoco from ..predict.network import MonoLoco
from predict.factory import factory_for_gt, factory_outputs from ..predict.factory import factory_for_gt, factory_outputs
from utils.pifpaf import preprocess_pif from ..utils.pifpaf import preprocess_pif
def predict(args): def predict(args):


@@ -8,11 +8,12 @@ from collections import defaultdict
import json import json
import datetime import datetime
from utils.kitti import get_calibration, split_training, parse_ground_truth from ..prep.transforms import transform_keypoints
from utils.monoloco import get_monoloco_inputs from ..utils.kitti import get_calibration, split_training, parse_ground_truth
from utils.pifpaf import preprocess_pif from ..utils.network import get_monoloco_inputs
from utils.iou import get_iou_matches from ..utils.pifpaf import preprocess_pif
from utils.misc import append_cluster from ..utils.iou import get_iou_matches
from ..utils.misc import append_cluster
class PreprocessKitti: class PreprocessKitti:
@@ -29,7 +30,7 @@ class PreprocessKitti:
clst=defaultdict(lambda: defaultdict(list)))} clst=defaultdict(lambda: defaultdict(list)))}
dic_names = defaultdict(lambda: defaultdict(list)) dic_names = defaultdict(lambda: defaultdict(list))
def __init__(self, dir_ann, iou_min=0.3): def __init__(self, dir_ann, iou_min):
self.dir_ann = dir_ann self.dir_ann = dir_ann
self.iou_min = iou_min self.iou_min = iou_min
@@ -52,10 +53,7 @@ class PreprocessKitti:
def run(self): def run(self):
"""Save json files""" """Save json files"""
cnt_gt = 0 cnt_gt = cnt_files = cnt_files_ped = cnt_fnf = 0
cnt_files = 0
cnt_files_ped = 0
cnt_fnf = 0
dic_cnt = {'train': 0, 'val': 0, 'test': 0} dic_cnt = {'train': 0, 'val': 0, 'test': 0}
for name in self.names_gt: for name in self.names_gt:
@@ -73,10 +71,7 @@ class PreprocessKitti:
kk = p_left[0] kk = p_left[0]
# Iterate over each line of the gt file and save box location and distances # Iterate over each line of the gt file and save box location and distances
if phase == 'train': boxes_gt, boxes_3d, dds_gt = parse_ground_truth(path_gt, category='all')[:3]
(boxes_gt, boxes_3d, dds_gt, _, _) = parse_ground_truth(path_gt, mode='gt_all') # Also cyclists
else:
(boxes_gt, boxes_3d, dds_gt, _, _) = parse_ground_truth(path_gt, mode='gt') # only pedestrians
self.dic_names[basename + '.png']['boxes'] = copy.deepcopy(boxes_gt) self.dic_names[basename + '.png']['boxes'] = copy.deepcopy(boxes_gt)
self.dic_names[basename + '.png']['dds'] = copy.deepcopy(dds_gt) self.dic_names[basename + '.png']['dds'] = copy.deepcopy(dds_gt)
@@ -90,7 +85,11 @@ class PreprocessKitti:
with open(os.path.join(self.dir_ann, basename + '.png.pifpaf.json'), 'r') as f: with open(os.path.join(self.dir_ann, basename + '.png.pifpaf.json'), 'r') as f:
annotations = json.load(f) annotations = json.load(f)
boxes, keypoints = preprocess_pif(annotations, im_size=(1238, 374)) boxes, keypoints = preprocess_pif(annotations, im_size=(1238, 374))
keypoints_hflip = transform_keypoints(keypoints, mode='flip')
inputs = get_monoloco_inputs(keypoints, kk).tolist() inputs = get_monoloco_inputs(keypoints, kk).tolist()
inputs_hflip = get_monoloco_inputs(keypoints_hflip, kk).tolist()
all_keypoints = [keypoints, keypoints_hflip]
all_inputs = [inputs, inputs_hflip]
except FileNotFoundError: except FileNotFoundError:
boxes = [] boxes = []
@@ -98,13 +97,15 @@ class PreprocessKitti:
# Match each set of keypoint with a ground truth # Match each set of keypoint with a ground truth
matches = get_iou_matches(boxes, boxes_gt, self.iou_min) matches = get_iou_matches(boxes, boxes_gt, self.iou_min)
for (idx, idx_gt) in matches: for (idx, idx_gt) in matches:
self.dic_jo[phase]['kps'].append(keypoints[idx]) for nn, keypoints in enumerate(all_keypoints):
self.dic_jo[phase]['X'].append(inputs[idx]) inputs = all_inputs[nn]
self.dic_jo[phase]['Y'].append([dds_gt[idx_gt]]) # Trick to make it (nn,1) self.dic_jo[phase]['kps'].append(keypoints[idx])
self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt]) self.dic_jo[phase]['X'].append(inputs[idx])
self.dic_jo[phase]['K'].append(kk) self.dic_jo[phase]['Y'].append([dds_gt[idx_gt]]) # Trick to make it (nn,1)
self.dic_jo[phase]['names'].append(name) # One image name for each annotation self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt])
append_cluster(self.dic_jo, phase, inputs[idx], dds_gt[idx_gt], keypoints[idx]) self.dic_jo[phase]['K'].append(kk)
self.dic_jo[phase]['names'].append(name) # One image name for each annotation
append_cluster(self.dic_jo, phase, inputs[idx], dds_gt[idx_gt], keypoints[idx])
dic_cnt[phase] += 1 dic_cnt[phase] += 1
with open(self.path_joints, 'w') as file: with open(self.path_joints, 'w') as file:
@@ -116,7 +117,8 @@ class PreprocessKitti:
.format(dic_cnt[phase], phase)) .format(dic_cnt[phase], phase))
print("Number of GT files: {}. Files with at least one pedestrian: {}. Files not found: {}" print("Number of GT files: {}. Files with at least one pedestrian: {}. Files not found: {}"
.format(cnt_files, cnt_files_ped, cnt_fnf)) .format(cnt_files, cnt_files_ped, cnt_fnf))
print("Number of GT annotations: {}".format(cnt_gt)) print("Matched : {:.1f} % of the ground truth instances"
.format(100 * (dic_cnt['train'] + dic_cnt['val']) / cnt_gt))
print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints)) print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints))
def _factory_phase(self, name): def _factory_phase(self, name):
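`transform_keypoints(keypoints, mode='flip')` comes from the new `prep/transforms.py` at the end of this diff; presumably it mirrors the x coordinates and swaps left/right joints through the `HFLIP` table defined there. A sketch under that assumption (the hardcoded KITTI image width is illustrative):

```python
def hflip_keypoints_sketch(keypoints, im_w=1242):
    """Mirror each pose horizontally: x -> W - x, then swap left/right joints."""
    flipped = []
    for kps in keypoints:  # each pose is [x_list, y_list, conf_list]
        xs = [im_w - xx for xx in kps[0]]
        # joint j of the flipped pose takes its values from joint HFLIP[j]
        idx_map = [COCO_KEYPOINTS.index(HFLIP[name]) for name in COCO_KEYPOINTS]
        flipped.append([[xs[ii] for ii in idx_map],
                        [kps[1][ii] for ii in idx_map],
                        [kps[2][ii] for ii in idx_map]])
    return flipped
```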


@@ -13,12 +13,13 @@ import numpy as np
from nuscenes.nuscenes import NuScenes from nuscenes.nuscenes import NuScenes
from nuscenes.utils import splits from nuscenes.utils import splits
from utils.iou import get_iou_matches
from utils.misc import append_cluster from ..utils.iou import get_iou_matches
from utils.nuscenes import select_categories from ..utils.misc import append_cluster
from utils.camera import project_3d from ..utils.nuscenes import select_categories
from utils.pifpaf import preprocess_pif from ..utils.camera import project_3d
from utils.monoloco import get_monoloco_inputs from ..utils.pifpaf import preprocess_pif
from ..utils.network import get_monoloco_inputs
class PreprocessNuscenes: class PreprocessNuscenes:
@@ -35,7 +36,7 @@ class PreprocessNuscenes:
} }
dic_names = defaultdict(lambda: defaultdict(list)) dic_names = defaultdict(lambda: defaultdict(list))
def __init__(self, dir_ann, dir_nuscenes, dataset, iou_min=0.3): def __init__(self, dir_ann, dir_nuscenes, dataset, iou_min):
logging.basicConfig(level=logging.INFO) logging.basicConfig(level=logging.INFO)
self.logger = logging.getLogger(__name__) self.logger = logging.getLogger(__name__)
@@ -58,21 +59,13 @@ class PreprocessNuscenes:
""" """
Prepare arrays for training Prepare arrays for training
""" """
cnt_scenes = 0 cnt_scenes = cnt_samples = cnt_sd = cnt_ann = 0
cnt_samples = 0
cnt_sd = 0
cnt_ann = 0
start = time.time() start = time.time()
for ii, scene in enumerate(self.scenes): for ii, scene in enumerate(self.scenes):
end_scene = time.time() end_scene = time.time()
current_token = scene['first_sample_token'] current_token = scene['first_sample_token']
cnt_scenes += 1 cnt_scenes += 1
if ii == 0: time_left = str((end_scene - start_scene) / 60 * (len(self.scenes) - ii))[:4] if ii != 0 else "NaN"
time_left = "Nan"
else:
time_left = str((end_scene-start_scene)/60 * (len(self.scenes) - ii))[:4]
sys.stdout.write('\r' + 'Elaborating scene {}, remaining time {} minutes' sys.stdout.write('\r' + 'Elaborating scene {}, remaining time {} minutes'
.format(cnt_scenes, time_left) + '\t\n') .format(cnt_scenes, time_left) + '\t\n')
@@ -93,29 +86,9 @@ class PreprocessNuscenes:
for cam in self.CAMERAS: for cam in self.CAMERAS:
sd_token = sample_dic['data'][cam] sd_token = sample_dic['data'][cam]
cnt_sd += 1 cnt_sd += 1
path_im, boxes_obj, kk = self.nusc.get_sample_data(sd_token, box_vis_level=1) # At least one corner
kk = kk.tolist()
# Extract all the annotations of the person # Extract all the annotations of the person
boxes_gt = [] name, boxes_gt, boxes_3d, dds, kk = self.extract_from_token(sd_token)
dds = []
boxes_3d = []
name = os.path.basename(path_im)
for box_obj in boxes_obj:
if box_obj.name[:6] != 'animal':
general_name = box_obj.name.split('.')[0] + '.' + box_obj.name.split('.')[1]
else:
general_name = 'animal'
if general_name in select_categories('all'):
box = project_3d(box_obj, kk)
dd = np.linalg.norm(box_obj.center)
boxes_gt.append(box)
dds.append(dd)
box_3d = box_obj.center.tolist() + box_obj.wlh.tolist()
boxes_3d.append(box_3d)
self.dic_names[name]['boxes'].append(box)
self.dic_names[name]['dds'].append(dd)
self.dic_names[name]['K'] = kk
# Run IoU with pifpaf detections and save # Run IoU with pifpaf detections and save
path_pif = os.path.join(self.dir_ann, name + '.pifpaf.json') path_pif = os.path.join(self.dir_ann, name + '.pifpaf.json')
@@ -124,23 +97,24 @@ class PreprocessNuscenes:
if exists: if exists:
with open(path_pif, 'r') as file: with open(path_pif, 'r') as file:
annotations = json.load(file) annotations = json.load(file)
boxes, keypoints = preprocess_pif(annotations, im_size=(1600, 900))
else:
continue
boxes, keypoints = preprocess_pif(annotations, im_size=(1600, 900)) if keypoints:
inputs = get_monoloco_inputs(keypoints, kk).tolist()
if keypoints: matches = get_iou_matches(boxes, boxes_gt, self.iou_min)
inputs = get_monoloco_inputs(keypoints, kk).tolist() for (idx, idx_gt) in matches:
self.dic_jo[phase]['kps'].append(keypoints[idx])
matches = get_iou_matches(boxes, boxes_gt, self.iou_min) self.dic_jo[phase]['X'].append(inputs[idx])
for (idx, idx_gt) in matches: self.dic_jo[phase]['Y'].append([dds[idx_gt]]) # Trick to make it (nn,1)
self.dic_jo[phase]['kps'].append(keypoints[idx]) self.dic_jo[phase]['names'].append(name) # One image name for each annotation
self.dic_jo[phase]['X'].append(inputs[idx]) self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt])
self.dic_jo[phase]['Y'].append([dds[idx_gt]]) # Trick to make it (nn,1) self.dic_jo[phase]['K'].append(kk)
self.dic_jo[phase]['names'].append(name) # One image name for each annotation append_cluster(self.dic_jo, phase, inputs[idx], dds[idx_gt], keypoints[idx])
self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt]) cnt_ann += 1
self.dic_jo[phase]['K'].append(kk) sys.stdout.write('\r' + 'Saved annotations {}'.format(cnt_ann) + '\t')
append_cluster(self.dic_jo, phase, inputs[idx], dds[idx_gt], keypoints[idx])
cnt_ann += 1
sys.stdout.write('\r' + 'Saved annotations {}'.format(cnt_ann) + '\t')
current_token = sample_dic['next'] current_token = sample_dic['next']
@@ -154,33 +128,55 @@ class PreprocessNuscenes:
.format(cnt_ann, cnt_samples, cnt_scenes, (end-start)/60)) .format(cnt_ann, cnt_samples, cnt_scenes, (end-start)/60))
print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints)) print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints))
def extract_from_token(self, sd_token):
boxes_gt = []
dds = []
boxes_3d = []
path_im, boxes_obj, kk = self.nusc.get_sample_data(sd_token, box_vis_level=1) # At least one corner
kk = kk.tolist()
name = os.path.basename(path_im)
for box_obj in boxes_obj:
if box_obj.name[:6] != 'animal':
general_name = box_obj.name.split('.')[0] + '.' + box_obj.name.split('.')[1]
else:
general_name = 'animal'
if general_name in select_categories('all'):
box = project_3d(box_obj, kk)
dd = np.linalg.norm(box_obj.center)
boxes_gt.append(box)
dds.append(dd)
box_3d = box_obj.center.tolist() + box_obj.wlh.tolist()
boxes_3d.append(box_3d)
self.dic_names[name]['boxes'].append(box)
self.dic_names[name]['dds'].append(dd)
self.dic_names[name]['K'] = kk
return name, boxes_gt, boxes_3d, dds, kk
def factory(dataset, dir_nuscenes):
"""Define dataset type and split training and validation"""
assert dataset in ['nuscenes', 'nuscenes_mini', 'nuscenes_teaser']
if dataset == 'nuscenes_mini':
if dataset == 'nuscenes': version = 'v1.0-mini'
nusc = NuScenes(version='v1.0-trainval', dataroot=dir_nuscenes, verbose=True)
scenes = nusc.scene
split_scenes = splits.create_splits_scenes()
split_train, split_val = split_scenes['train'], split_scenes['val']
elif dataset == 'nuscenes_mini':
nusc = NuScenes(version='v1.0-mini', dataroot=dir_nuscenes, verbose=True)
scenes = nusc.scene
split_scenes = splits.create_splits_scenes()
split_train, split_val = split_scenes['train'], split_scenes['val']
else:
nusc = NuScenes(version='v1.0-trainval', dataroot=dir_nuscenes, verbose=True) version = 'v1.0-trainval'
nusc = NuScenes(version=version, dataroot=dir_nuscenes, verbose=True)
scenes = nusc.scene
if dataset == 'nuscenes_teaser':
with open("splits/nuscenes_teaser_scenes.txt", "r") as file: with open("splits/nuscenes_teaser_scenes.txt", "r") as file:
teaser_scenes = file.read().splitlines() teaser_scenes = file.read().splitlines()
scenes = nusc.scene
scenes = [scene for scene in scenes if scene['token'] in teaser_scenes]
with open("splits/split_nuscenes_teaser.json", "r") as file:
dic_split = json.load(file)
split_train = [scene['name'] for scene in scenes if scene['token'] in dic_split['train']]
split_val = [scene['name'] for scene in scenes if scene['token'] in dic_split['val']]
else:
split_scenes = splits.create_splits_scenes()
split_train, split_val = split_scenes['train'], split_scenes['val']
return nusc, scenes, split_train, split_val


@ -0,0 +1,54 @@
import numpy as np
COCO_KEYPOINTS = [
'nose', # 1
'left_eye', # 2
'right_eye', # 3
'left_ear', # 4
'right_ear', # 5
'left_shoulder', # 6
'right_shoulder', # 7
'left_elbow', # 8
'right_elbow', # 9
'left_wrist', # 10
'right_wrist', # 11
'left_hip', # 12
'right_hip', # 13
'left_knee', # 14
'right_knee', # 15
'left_ankle', # 16
'right_ankle', # 17
]
HFLIP = {
'nose': 'nose',
'left_eye': 'right_eye',
'right_eye': 'left_eye',
'left_ear': 'right_ear',
'right_ear': 'left_ear',
'left_shoulder': 'right_shoulder',
'right_shoulder': 'left_shoulder',
'left_elbow': 'right_elbow',
'right_elbow': 'left_elbow',
'left_wrist': 'right_wrist',
'right_wrist': 'left_wrist',
'left_hip': 'right_hip',
'right_hip': 'left_hip',
'left_knee': 'right_knee',
'right_knee': 'left_knee',
'left_ankle': 'right_ankle',
'right_ankle': 'left_ankle',
}
def transform_keypoints(keypoints, mode):
assert mode == 'flip', "mode not recognized"
kps = np.array(keypoints)
dic_kps = {key: kps[:, :, idx] for idx, key in enumerate(COCO_KEYPOINTS)}
kps_hflip = np.array([dic_kps[value] for key, value in HFLIP.items()])
kps_hflip = np.transpose(kps_hflip, (1, 2, 0))
return kps_hflip.tolist()
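
Note that transform_keypoints only swaps the left/right joint channels; mirroring the x-coordinates themselves is left to the caller. A quick check of the swap (a sketch, assuming the function above is importable):

```
import numpy as np

kps = np.arange(51, dtype=float).reshape(1, 3, 17).tolist()  # 1 instance, (x, y, c) rows x 17 joints
flipped = transform_keypoints(kps, mode='flip')
# the left_eye slot (index 1) now holds the right_eye values (index 2) and vice versa
assert flipped[0][0][1] == kps[0][0][2]
assert flipped[0][0][2] == kps[0][0][1]
```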


@ -1,21 +1,19 @@
# pylint: skip-file
import argparse
import os
import sys
sys.path.insert(0, os.path.join('.', 'features'))
sys.path.insert(0, os.path.join('.', 'models'))
from openpifpaf.network import nets
from openpifpaf import decoder
from features.preprocess_nu import PreprocessNuscenes
from features.preprocess_ki import PreprocessKitti from .prep.preprocess_nu import PreprocessNuscenes
from predict.predict import predict from .prep.preprocess_ki import PreprocessKitti
from models.trainer import Trainer from .predict.predict import predict
from eval.generate_kitti import generate_kitti from .train.trainer import Trainer
from eval.geom_baseline import geometric_baseline from .eval.generate_kitti import GenerateKitti
from models.hyp_tuning import HypTuning from .eval.geom_baseline import geometric_baseline
from eval.kitti_eval import KittiEval from .train.hyp_tuning import HypTuning
from visuals.webcam import webcam from .eval.eval_kitti import EvalKitti
from .visuals.webcam import webcam
def cli():
@ -37,6 +35,7 @@ def cli():
default='nuscenes')
prep_parser.add_argument('--dir_nuscenes', help='directory of nuscenes devkit',
default='data/nuscenes/')
prep_parser.add_argument('--iou_min', help='minimum iou to match ground truth', type=float, default=0.3)
# Predict (2D pose and/or 3D location from images)
# General
@ -59,9 +58,9 @@ def cli():
default="data/models/monoloco-190513-1437.pkl") default="data/models/monoloco-190513-1437.pkl")
predict_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256) predict_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256)
predict_parser.add_argument('--path_gt', help='path of json file with gt 3d localization', predict_parser.add_argument('--path_gt', help='path of json file with gt 3d localization',
default='data/arrays/names-kitti-190513-1754.json') default='data/arrays/names-kitti-190710-1206.json')
predict_parser.add_argument('--transform', help='transformation for the pose', default='None')
predict_parser.add_argument('--draw_kps', help='to draw kps in the images', action='store_true') predict_parser.add_argument('--draw_box', help='to draw box in the images', action='store_true')
predict_parser.add_argument('--predict', help='whether to make prediction', action='store_true')
predict_parser.add_argument('--z_max', type=int, help='maximum meters distance for predictions', default=22)
predict_parser.add_argument('--n_dropout', type=int, help='Epistemic uncertainty evaluation', default=0)
@ -87,7 +86,7 @@ def cli():
# Evaluation
eval_parser.add_argument('--dataset', help='datasets to evaluate, kitti or nuscenes', default='kitti')
eval_parser.add_argument('--geometric', help='to evaluate geometric distance', action='store_true')
eval_parser.add_argument('--generate', help='create txt files for KITTI evaluation', action='store_true')
eval_parser.add_argument('--dir_ann', help='directory of annotations of 2d joints (for KITTI evaluation')
eval_parser.add_argument('--model', help='path of MonoLoco model to load', required=True)
@ -96,7 +95,9 @@ def cli():
eval_parser.add_argument('--dropout', type=float, help='dropout. Default no dropout', default=0.2)
eval_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256)
eval_parser.add_argument('--n_stage', type=int, help='Number of stages in the model', default=3)
eval_parser.add_argument('--show', help='whether to show eval statistics', action='store_true') eval_parser.add_argument('--show', help='whether to show statistic graphs', action='store_true')
eval_parser.add_argument('--verbose', help='verbosity of statistics', action='store_true')
eval_parser.add_argument('--stereo', help='include stereo baseline results', action='store_true')
args = parser.parse_args()
return args
@ -113,10 +114,10 @@ def main():
elif args.command == 'prep':
if 'nuscenes' in args.dataset:
prep = PreprocessNuscenes(args.dir_ann, args.dir_nuscenes, args.dataset) prep = PreprocessNuscenes(args.dir_ann, args.dir_nuscenes, args.dataset, args.iou_min)
prep.run()
if 'kitti' in args.dataset:
prep = PreprocessKitti(args.dir_ann) prep = PreprocessKitti(args.dir_ann, args.iou_min)
prep.run()
elif args.command == 'train':
@ -139,10 +140,13 @@ def main():
geometric_baseline(args.joints)
if args.generate:
generate_kitti(args.model, args.dir_ann, p_dropout=args.dropout, n_dropout=args.n_dropout) kitti_txt = GenerateKitti(args.model, args.dir_ann, p_dropout=args.dropout, n_dropout=args.n_dropout)
kitti_txt.run_mono()
if args.stereo:
kitti_txt.run_stereo()
if args.dataset == 'kitti':
kitti_eval = KittiEval() kitti_eval = EvalKitti(verbose=args.verbose, stereo=args.stereo)
kitti_eval.run()
kitti_eval.printer(show=args.show)
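
With the new package layout and the extra flags above, typical prep and eval calls could look as follows (a sketch: the exact entry point depends on how the monoloco package is invoked after this refactor, and the paths are illustrative):

```
python3 -m monoloco.main prep --dataset nuscenes_mini --dir_ann data/annotations --iou_min 0.3
python3 -m monoloco.main eval --dataset kitti --model data/models/monoloco-190513-1437.pkl \
        --dir_ann data/annotations --generate --stereo --verbose
```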


@ -3,47 +3,47 @@ import torch.nn as nn
class TriLinear(nn.Module):
"""
As Bilinear but without skip connection
"""
def __init__(self, input_size, output_size, p_dropout, linear_size=1024):
super(TriLinear, self).__init__()
self.input_size = input_size
self.output_size = output_size
self.l_size = linear_size
self.relu = nn.ReLU(inplace=True)
self.dropout = nn.Dropout(p_dropout)
self.w1 = nn.Linear(self.input_size, self.l_size)
self.batch_norm1 = nn.BatchNorm1d(self.l_size)
self.w2 = nn.Linear(self.l_size, self.l_size)
self.batch_norm2 = nn.BatchNorm1d(self.l_size)
self.w3 = nn.Linear(self.l_size, self.output_size)
def forward(self, x):
y = self.w1(x)
y = self.batch_norm1(y)
y = self.relu(y)
y = self.dropout(y)
y = self.w2(y)
y = self.batch_norm2(y)
y = self.relu(y)
y = self.dropout(y)
y = self.w3(y)
return y
def weight_init(m): def weight_init(batch):
"""TO initialize weights using kaiming initialization""" """TO initialize weights using kaiming initialization"""
if isinstance(m, nn.Linear): if isinstance(batch, nn.Linear):
nn.init.kaiming_normal_(m.weight) nn.init.kaiming_normal_(batch.weight)
class Linear(nn.Module):
@ -93,7 +93,7 @@ class LinearModel(nn.Module):
self.batch_norm1 = nn.BatchNorm1d(self.linear_size)
self.linear_stages = []
for l in range(num_stage): for _ in range(num_stage):
self.linear_stages.append(Linear(self.linear_size, self.p_dropout))
self.linear_stages = nn.ModuleList(self.linear_stages)
@ -109,11 +109,8 @@ class LinearModel(nn.Module):
y = self.batch_norm1(y)
y = self.relu(y)
y = self.dropout(y)
# linear layers
for i in range(self.num_stage):
y = self.linear_stages[i](y)
y = self.w2(y)
return y


@ -54,10 +54,3 @@ class KeypointsDataset(Dataset):
count = len(self.dic_clst[clst]['Y'])
return inputs, outputs, count


@ -1,13 +1,16 @@
import math
import os
import json
import time
import logging
import torch
import random
import datetime
import torch
import numpy as np
from models.trainer import Trainer
from .trainer import Trainer
class HypTuning:
@ -30,12 +33,10 @@ class HypTuning:
if not os.path.exists(dir_logs):
os.makedirs(dir_logs)
now = datetime.datetime.now()
now_time = now.strftime("%Y%m%d-%H%M")[2:]
name_out = 'hyp-baseline-' if baseline else 'hyp-monoloco-'
self.path_log = os.path.join(dir_logs, name_out + now_time) self.path_log = os.path.join(dir_logs, name_out)
self.path_model = os.path.join(dir_out, name_out + now_time + '.pkl') self.path_model = os.path.join(dir_out, name_out)
logging.basicConfig(level=logging.INFO)
self.logger = logging.getLogger(__name__)
@ -49,7 +50,7 @@ class HypTuning:
random.shuffle(self.sched_step)
self.bs_list = [64, 128, 256, 512, 1024, 2048] * multiplier
random.shuffle(self.bs_list)
self.hidden_list = [128, 256, 512, 128, 256, 512] * multiplier self.hidden_list = [256, 256, 256, 256, 256, 256] * multiplier
random.shuffle(self.hidden_list)
self.n_stage_list = [3, 3, 3, 3, 3, 3] * multiplier
random.shuffle(self.n_stage_list)
@ -104,11 +105,14 @@ class HypTuning:
dic_err_best = dic_err
best_acc_val = acc_val
model_best = model
torch.save(model_best.state_dict(), self.path_model)
with open(self.path_log, 'w') as f:
json.dump(dic_best, f)
# Save model and log
now = datetime.datetime.now()
now_time = now.strftime("%Y%m%d-%H%M")[2:]
self.path_model = self.path_model + now_time + '.pkl'
torch.save(model_best.state_dict(), self.path_model)
with open(self.path_log + now_time, 'w') as f:
json.dump(dic_best, f)
end = time.time()
print('\n\n\n')
self.logger.info(" Tried {} combinations".format(cnt))


@ -52,8 +52,6 @@ class CustomL1Loss(torch.nn.Module):
weights = torch.from_numpy(weights_np).float().to(self.device) # To make weights in the same cuda device
losses = torch.abs(output - target) * weights
loss = losses.mean() # Mean over the batch
# self.print_loss()
return loss
@ -66,7 +64,7 @@ class LaplacianLoss(torch.nn.Module):
self.reduce = reduce
self.evaluate = evaluate
def laplacian_1d(self, mu_si, xx):
"""
1D Gaussian Loss. f(x | mu, sigma). The network outputs mu and sigma. X is the ground truth distance.
This supports backward().
@ -84,8 +82,7 @@ class LaplacianLoss(torch.nn.Module):
if self.evaluate:
return norm_bi
else: return term_a + term_b
return term_a + term_b
def forward(self, outputs, targets):
@ -109,13 +106,12 @@ class GaussianLoss(torch.nn.Module):
self.evaluate = evaluate
self.device = device
def gaussian_1d(self, mu_si, xx):
"""
1D Gaussian Loss. f(x | mu, sigma). The network outputs mu and sigma. X is the ground truth distance.
This supports backward().
Inspired by
https://github.com/naba89/RNN-Handwriting-Generation-Pytorch/blob/master/loss_functions.py
"""
mu, si = mu_si[:, 0:1], mu_si[:, 1:2]
@ -129,8 +125,8 @@ class GaussianLoss(torch.nn.Module):
if self.evaluate:
return norm_si
else:
return term_a + term_b
def forward(self, outputs, targets):


@ -1,3 +1,9 @@
# pylint: skip-file # TODO
"""
Training and evaluation of a neural network which predicts 3D localization and confidence intervals
given 2d joints
"""
import copy
import os
@ -13,19 +19,14 @@ import torch.nn as nn
from torch.utils.data import DataLoader
from torch.optim import lr_scheduler
from models.datasets import KeypointsDataset from .datasets import KeypointsDataset
from models.architectures import LinearModel from .architectures import LinearModel
from models.losses import LaplacianLoss from .losses import LaplacianLoss
from utils.logs import set_logger from ..utils.logs import set_logger
from utils.monoloco import epistemic_variance, laplace_sampling, unnormalize_bi from ..utils.network import laplace_sampling, unnormalize_bi
class Trainer:
"""
Training and evaluation of a neural network which predicts 3D localization and confidence intervals
given 2d joints
"""
def __init__(self, joints, epochs=100, bs=256, dropout=0.2, lr=0.002,
sched_step=20, sched_gamma=1, hidden_size=256, n_stage=3, r_seed=1, n_dropout=0, n_samples=100,
baseline=False, save=False, print_loss=False):
@ -123,10 +124,7 @@ class Trainer:
best_model_wts = copy.deepcopy(self.model.state_dict())
best_acc = 1e6
best_epoch = 0
epoch_losses_tr = [] epoch_losses_tr = epoch_losses_val = epoch_norms = epoch_sis = []
epoch_losses_val = []
epoch_norms = []
epoch_sis = []
for epoch in range(self.num_epochs):
@ -138,10 +136,7 @@ class Trainer:
else:
self.model.eval() # Set model to evaluate mode
running_loss_tr = 0.0 running_loss_tr = running_loss_eval = norm_tr = bi_tr = 0.0
running_loss_eval = 0.0
norm_tr = 0.0
bi_tr = 0.0
# Iterate over data.
for inputs, labels, _, _ in self.dataloaders[phase]:
@ -156,10 +151,7 @@ class Trainer:
with torch.set_grad_enabled(phase == 'train'):
outputs = self.model(inputs)
if self.output_size == 2: outputs_eval = outputs[:, 0:1] if self.output_size == 2 else outputs
outputs_eval = outputs[:, 0:1] # Fundamental to put slices
else:
outputs_eval = outputs
loss = self.criterion(outputs, labels)
loss_eval = self.criterion_eval(outputs_eval, labels) # L1 loss to evaluation
@ -196,7 +188,8 @@ class Trainer:
time_elapsed = time.time() - since
print('\n\n' + '-'*120)
self.logger.info('Training:\nTraining complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60)) self.logger.info('Training:\nTraining complete in {:.0f}m {:.0f}s'
.format(time_elapsed // 60, time_elapsed % 60))
self.logger.info('Best validation Accuracy: {:.3f}'.format(best_acc))
self.logger.info('Saved weights of the model at epoch: {}'.format(best_epoch))
@ -251,7 +244,7 @@ class Trainer:
total_outputs = torch.empty((0, len(labels))).to(self.device)
if self.n_dropout > 0:
for ii in range(self.n_dropout): for _ in range(self.n_dropout):
outputs = self.model(inputs)
outputs = unnormalize_bi(outputs)
samples = laplace_sampling(outputs, self.n_samples)
@ -269,8 +262,6 @@ class Trainer:
if not self.baseline:
outputs = unnormalize_bi(outputs)
avg_distance = float(self.criterion_eval(outputs[:, 0:1], labels).item())
dic_err[phase]['all'] = self.compute_stats(outputs, labels, varss, dic_err[phase]['all'], size_eval)
print('-'*120)
@ -323,26 +314,25 @@ class Trainer:
if self.baseline:
return (mean_mu, max_mu), (0, 0, 0)
else: mean_bi = torch.mean(outputs[:, 1]).item()
mean_bi = torch.mean(outputs[:, 1]).item()
low_bound_bi = labels >= (outputs[:, 0] - outputs[:, 1])
up_bound_bi = labels <= (outputs[:, 0] + outputs[:, 1])
bools_bi = low_bound_bi & up_bound_bi
conf_bi = float(torch.sum(bools_bi)) / float(bools_bi.shape[0])
# if varss[0] >= 0:
# mean_var = torch.mean(varss).item()
# max_var = torch.max(varss).item()
#
# low_bound_var = labels >= (outputs[:, 0] - varss)
# up_bound_var = labels <= (outputs[:, 0] + varss)
# bools_var = low_bound_var & up_bound_var
# conf_var = float(torch.sum(bools_var)) / float(bools_var.shape[0])
dic_err['mean'] += mean_mu * (outputs.size(0) / size_eval)
dic_err['bi'] += mean_bi * (outputs.size(0) / size_eval)
dic_err['count'] += (outputs.size(0) / size_eval)
dic_err['conf_bi'] += conf_bi * (outputs.size(0) / size_eval)
return dic_err


@ -10,9 +10,9 @@ def pixel_to_camera(uv_tensor, kk, z_met):
It accepts lists or tensors of (m, 2) or (m, x, 2) or (m, 2, x)
where x is the number of keypoints
"""
if type(uv_tensor) == list: if isinstance(uv_tensor, list):
uv_tensor = torch.tensor(uv_tensor)
if type(kk) == list: if isinstance(kk, list):
kk = torch.tensor(kk)
if uv_tensor.size()[-1] != 2:
uv_tensor = uv_tensor.permute(0, 2, 1) # permute to have 2 as last dim to be padded
@ -42,7 +42,7 @@ def project_3d(box_obj, kk):
box_2d = []
# Obtain the 3d points of the box
xc, yc, zc = box_obj.center
ww, ll, hh, = box_obj.wlh ww, _, hh, = box_obj.wlh
# Points corresponding to a box at the z of the center
x1 = xc - ww/2
@ -70,7 +70,7 @@ def get_keypoints(keypoints, mode):
Input --> list or torch.tensor [(m, 3, 17) or (3, 17)]
Output --> torch.tensor [(m, 2)]
"""
if type(keypoints) == list: if isinstance(keypoints, list):
keypoints = torch.tensor(keypoints)
if len(keypoints.size()) == 2: # add batch dim
keypoints = keypoints.unsqueeze(0)
@ -109,17 +109,15 @@ def get_keypoints(keypoints, mode):
def transform_kp(kps, tr_mode):
"""Apply different transformations to the keypoints based on the tr_mode"""
assert tr_mode == "None" or tr_mode == "singularity" or tr_mode == "upper" or tr_mode == "lower" \ assert tr_mode in ("None", "singularity", "upper", "lower", "horizontal", "vertical", "lateral",
or tr_mode == "horizontal" or tr_mode == "vertical" or tr_mode == "lateral" \ 'shoulder', 'knee', 'upside', 'falling', 'random')
or tr_mode == 'shoulder' or tr_mode == 'knee' or tr_mode == 'upside' or tr_mode == 'falling' \
or tr_mode == 'random'
uu_c, vv_c = get_keypoints(kps, mode='center')
if tr_mode == "None":
return kps
elif tr_mode == "singularity": if tr_mode == "singularity":
uus = [uu_c for uu in kps[0]]
vvs = [vv_c for vv in kps[1]]
@ -131,23 +129,6 @@ def transform_kp(kps, tr_mode):
uus = kps[0]
vvs = [vv_c for vv in kps[1]]
elif tr_mode == 'lower':
uus = kps[0]
vvs = kps[1][:9] + [vv_c for vv in kps[1][9:]]
elif tr_mode == 'upper':
uus = kps[0]
vvs = [vv_c for vv in kps[1][:9]] + kps[1][9:]
elif tr_mode == 'lateral':
uus = []
for idx, kp in enumerate(kps[0]):
if idx % 2 == 1:
uus.append(kp)
else:
uus.append(uu_c)
vvs = kps[1]
elif tr_mode == 'shoulder':
uus = kps[0]
vvs = kps[1][:7] + [kps[1][6] for vv in kps[1][7:]]
@ -183,7 +164,7 @@ def xyz_from_distance(distances, xy_centers):
xy_centers --> tensor(m,3) or (3)
"""
if type(distances) == float: if isinstance(distances, float):
distances = torch.tensor(distances).unsqueeze(0)
if len(distances.size()) == 1:
distances = distances.unsqueeze(1)
@ -193,16 +174,3 @@ def xyz_from_distance(distances, xy_centers):
assert xy_centers.size()[-1] == 3 and distances.size()[-1] == 1, "Size of tensor not recognized"
return xy_centers * distances / torch.sqrt(1 + xy_centers[:, 0:1].pow(2) + xy_centers[:, 1:2].pow(2))
def pixel_to_camera_old(uv1, kk, z_met):
"""
(3,) array --> (3,) array
Convert a point in pixel coordinate to absolute camera coordinates
"""
if len(uv1) == 2:
uv1.append(1)
kk_1 = np.linalg.inv(kk)
xyz_met_norm = np.dot(kk_1, uv1)
xyz_met = xyz_met_norm * z_met
return xyz_met


@ -68,5 +68,3 @@ def reorder_matches(matches, boxes, mode='left_rigth'):
matches_left = [idx for (idx, _) in matches]
return [matches[matches_left.index(idx_boxes)] for idx_boxes in ordered_boxes if idx_boxes in matches_left]


@ -1,6 +1,7 @@
import math
import numpy as np
import math
def get_calibration(path_txt):
@ -69,28 +70,27 @@ def get_simplified_calibration(path_txt):
raise ValueError('Matrix K_02 not found in the file')
def check_conditions(line, mode, thresh=0.3): def check_conditions(line, category, method, thresh=0.3):
"""Check conditions of our or m3d txt file""" """Check conditions of our or m3d txt file"""
check = False check = False
assert mode in ['gt', 'gt_all', 'm3d', '3dop','our'], "Mode %r not recognized" % mode assert method in ['gt', 'm3d', '3dop', 'our'], "Method %r not recognized" % method
assert category in ['pedestrian', 'cyclist', 'all']
if mode == 'm3d' or mode == '3dop': if method in ('m3d', '3dop'):
conf = line.split()[15]
if line[:10] == 'pedestrian' and float(conf) >= thresh: if line.split()[0] == category and float(conf) >= thresh:
check = True
elif mode == 'gt': elif method == 'gt':
# if line[:10] == 'Pedestrian' or line[:10] == 'Person_sit': if category == 'all':
if line[:10] == 'Pedestrian': categories_gt = ['Pedestrian', 'Person_sitting', 'Cyclist']
else:
categories_gt = [category.upper()[0] + category[1:]] # Upper case names
if line.split()[0] in categories_gt:
check = True
# Consider also person sitting and cyclists categories elif method == 'our':
elif mode == 'gt_all':
if line[:10] == 'Pedestrian' or line[:10] == 'Person_sit' or line[:7] == 'Cyclist':
check = True
elif mode == 'our':
if line[4] >= thresh:
check = True
@ -130,23 +130,25 @@ def split_training(names_gt, path_train, path_val):
return set_train, set_val
def parse_ground_truth(path_gt, mode='gt'): def parse_ground_truth(path_gt, category):
"""Parse KITTI ground truth files""" """Parse KITTI ground truth files"""
boxes_gt = [] boxes_gt = []
dds_gt = [] dds_gt = []
zzs_gt = []
truncs_gt = [] # Float from 0 to 1
occs_gt = [] # Either 0,1,2,3 fully visible, partly occluded, largely occluded, unknown
boxes_3d = []
with open(path_gt, "r") as f_gt:
for line_gt in f_gt:
if check_conditions(line_gt, mode=mode): if check_conditions(line_gt, category, method='gt'):
truncs_gt.append(float(line_gt.split()[1]))
occs_gt.append(int(line_gt.split()[2]))
boxes_gt.append([float(x) for x in line_gt.split()[4:8]])
loc_gt = [float(x) for x in line_gt.split()[11:14]]
wlh = [float(x) for x in line_gt.split()[8:11]]
boxes_3d.append(loc_gt + wlh)
zzs_gt.append(loc_gt[2])
dds_gt.append(math.sqrt(loc_gt[0] ** 2 + loc_gt[1] ** 2 + loc_gt[2] ** 2))
return boxes_gt, boxes_3d, dds_gt, truncs_gt, occs_gt return boxes_gt, boxes_3d, dds_gt, zzs_gt, truncs_gt, occs_gt
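
For reference, the slices above follow the standard KITTI label layout (type, truncation, occlusion, alpha, 2D box, dimensions, location, rotation). A minimal worked example with a representative pedestrian line (values are illustrative):

```
import math

line_gt = "Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01"
fields = line_gt.split()
trunc, occ = float(fields[1]), int(fields[2])   # truncation in [0, 1], occlusion in {0, 1, 2, 3}
box_gt = [float(x) for x in fields[4:8]]        # 2D box: left, top, right, bottom
loc_gt = [float(x) for x in fields[11:14]]      # camera coordinates x, y, z
zz_gt = loc_gt[2]                               # depth, now also returned for the stereo baseline
dd_gt = math.sqrt(sum(x ** 2 for x in loc_gt))  # 3D distance, ~8.73 m here
```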


@ -1,4 +1,6 @@
import random
def append_cluster(dic_jo, phase, xx, dd, kps):
"""Append the annotation based on its distance"""
@ -24,11 +26,21 @@ def append_cluster(dic_jo, phase, xx, dd, kps):
dic_jo[phase]['clst']['>30']['Y'].append([dd])
def get_task_error(dd): def get_task_error(dd, mode='std'):
"""Get target error not knowing the gender""" """Get target error not knowing the gender"""
mm_gender = 0.0556 assert mode in ('std', 'mad')
if mode == 'std':
mm_gender = 0.0557
elif mode == 'mad': # mean absolute deviation
mm_gender = 0.0457
return mm_gender * dd
def get_pixel_error(dd_gt, zz_gt):
"""calculate error in stereo distance due to +-1 pixel mismatch (function of depth)"""
disp = 0.54 * 721 / zz_gt
random.seed(1)
sign = random.choice((-1, 1))
delta_z = zz_gt - 0.54 * 721 / (disp + sign)
return dd_gt + delta_z
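
As a quick numeric check of the relation encoded in get_pixel_error (KITTI-like stereo baseline B = 0.54 m and focal length f = 721 px), the depth error for a 1-pixel disparity mismatch grows roughly as z²/(f·B):

```
f_b = 0.54 * 721                   # f * B, ~389.3 px * m
for zz in (10., 20., 30.):
    disp = f_b / zz                # expected disparity at depth zz
    err = zz - f_b / (disp + 1)    # depth error for a +1 px mismatch
    print("z = {:2.0f} m -> error = {:.2f} m (z^2/fB = {:.2f})".format(zz, err, zz ** 2 / f_b))
```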


@ -1,7 +1,7 @@
import numpy as np
import torch
from utils.camera import get_keypoints, pixel_to_camera from ..utils.camera import get_keypoints, pixel_to_camera
def get_monoloco_inputs(keypoints, kk):
@ -16,8 +16,9 @@ def get_monoloco_inputs(keypoints, kk):
kk = torch.tensor(kk)
# Projection in normalized image coordinates and zero-center with the center of the bounding box
uv_center = get_keypoints(keypoints, mode='center')
xy1_center = pixel_to_camera(uv_center, kk, 1) * 10 xy1_center = pixel_to_camera(uv_center, kk, 10)
xy1_all = pixel_to_camera(keypoints[:, 0:2, :], kk, 1) * 10 xy1_all = pixel_to_camera(keypoints[:, 0:2, :], kk, 10)
# xy1_center[:, 1].fill_(0) #TODO
kps_norm = xy1_all - xy1_center.unsqueeze(1) # (m, 17, 3) - (m, 1, 3)
kps_out = kps_norm[:, :, 0:2].reshape(kps_norm.size()[0], -1) # no contiguous for view
return kps_out
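
A minimal shape check of the preprocessing above (a sketch, assuming get_monoloco_inputs and its camera utilities are importable): each instance is flattened into 34 zero-centered (x, y) values in normalized image coordinates.

```
import torch

kk = [[721., 0., 610.], [0., 721., 172.], [0., 0., 1.]]  # illustrative intrinsics
keypoints = torch.ones((2, 3, 17))   # 2 instances, (u, v, confidence) x 17 COCO joints
inputs = get_monoloco_inputs(keypoints, kk)
print(inputs.shape)                  # torch.Size([2, 34])
```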


@ -23,7 +23,7 @@ def get_unique_tokens(list_fin):
return list_token_scene
def split_scenes(list_token_scene, tr, val, dir_main, save=False, load=True): def split_scenes(list_token_scene, train, val, dir_main, save=False, load=True):
""" """
Split the list according tr, val percentages (test percentage is a consequence) after shuffling the order Split the list according tr, val percentages (test percentage is a consequence) after shuffling the order
""" """
@ -34,7 +34,7 @@ def split_scenes(list_token_scene, tr, val, dir_main, save=False, load=True):
random.seed(1)
random.shuffle(list_token_scene) # it shuffles in place
n_scenes = len(list_token_scene)
n_train = round(n_scenes * tr / 100) n_train = round(n_scenes * train / 100)
n_val = round(n_scenes * val / 100)
list_train = list_token_scene[0: n_train]
list_val = list_token_scene[n_train: n_train + n_val]
@ -55,18 +55,16 @@ def select_categories(cat):
""" """
Choose the categories to extract annotations from Choose the categories to extract annotations from
""" """
assert cat == 'person' or cat == 'all' or cat == 'car' assert cat in ['person', 'all', 'car', 'cyclist']
if cat == 'person':
categories = ['human.pedestrian']
elif cat == 'all':
categories = ['human.pedestrian', categories = ['human.pedestrian', 'vehicle.bicycle', 'vehicle.motorcycle']
'vehicle.bicycle', 'vehicle.motorcycle'] elif cat == 'cyclist':
categories = ['vehicle.bicycle']
elif cat == 'car':
categories = ['vehicle']
return categories

54 monoloco/utils/pifpaf.py Normal file

@ -0,0 +1,54 @@
import numpy as np
def preprocess_pif(annotations, im_size=None):
"""
Preprocess pif annotations:
1. enlarge the box of 10%
2. Constraint it inside the image (if image_size provided)
"""
boxes = []
keypoints = []
for dic in annotations:
box = dic['bbox']
if box[3] < 0.5: # Check for no detections (boxes 0,0,0,0)
return [], []
kps = prepare_pif_kps(dic['keypoints'])
conf = float(np.sort(np.array(kps[2]))[-3]) # The confidence is the 3rd highest value for the keypoints
# Add 15% for y and 20% for x
delta_h = (box[3] - box[1]) / 7
delta_w = (box[2] - box[0]) / 3.5
assert delta_h > -5 and delta_w > -5, "Bounding box <=0"
box[0] -= delta_w
box[1] -= delta_h
box[2] += delta_w
box[3] += delta_h
# Put the box inside the image
if im_size is not None:
box[0] = max(0, box[0])
box[1] = max(0, box[1])
box[2] = min(box[2], im_size[0])
box[3] = min(box[3], im_size[1])
box.append(conf)
boxes.append(box)
keypoints.append(kps)
return boxes, keypoints
def prepare_pif_kps(kps_in):
"""Convert from a list of 51 to a list of 3, 17"""
assert len(kps_in) % 3 == 0, "keypoints expected as a multiple of 3"
xxs = kps_in[0:][::3]
yys = kps_in[1:][::3] # from offset 1 every 3
ccs = kps_in[2:][::3]
return [xxs, yys, ccs]
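
A small sanity check for the conversion above: pifpaf stores each annotation as a flat list of 51 floats (x, y, confidence per joint), which prepare_pif_kps regroups into three lists of 17.

```
kps_flat = list(range(51))          # stand-in for a real pifpaf 'keypoints' entry
xxs, yys, ccs = prepare_pif_kps(kps_flat)
assert len(xxs) == len(yys) == len(ccs) == 17
assert (xxs[0], yys[0], ccs[0]) == (0, 1, 2)   # first joint's (x, y, confidence)
```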

87 monoloco/utils/stereo.py Normal file

@ -0,0 +1,87 @@
import copy
import warnings
import numpy as np
def depth_from_disparity(zzs, kps, kps_right):
"""Associate instances in left and right images and compute disparity"""
zzs_stereo = []
zzs = np.array(zzs)
kps = np.array(kps)
kps_right_list = copy.deepcopy(kps_right)
cnt_stereo = 0
expected_disps = 0.54 * 721 / np.array(zzs)
for idx, zz_mono in enumerate(zzs):
if kps_right_list:
zz_stereo, disparity_x, disparity_y, idx_min = filter_disparities(kps, kps_right_list, idx, expected_disps)
if verify_stereo(zz_stereo, zz_mono, disparity_x, disparity_y):
zzs_stereo.append(zz_stereo)
cnt_stereo += 1
kps_right_list.pop(idx_min)
else:
zzs_stereo.append(zz_mono)
else:
zzs_stereo.append(zz_mono)
return zzs_stereo, cnt_stereo
def filter_disparities(kps, kps_right_list, idx, expected_disps):
"""filter joints based on confidence and interquartile range of the distribution"""
CONF_MIN = 0.3
kps_right = np.array(kps_right_list)
with warnings.catch_warnings() and np.errstate(invalid='ignore'):
try:
disparity_x = kps[idx, 0, :] - kps_right[:, 0, :]
disparity_y = kps[idx, 1, :] - kps_right[:, 1, :]
# Mask for low confidence
mask_conf_left = kps[idx, 2, :] > CONF_MIN
mask_conf_right = kps_right[:, 2, :] > CONF_MIN
mask_conf = mask_conf_left & mask_conf_right
disparity_x_conf = np.where(mask_conf, disparity_x, np.nan)
disparity_y_conf = np.where(mask_conf, disparity_y, np.nan)
# Mask outliers using iqr
mask_outlier = get_iqr_mask(disparity_x_conf)
disparity_x_mask = np.where(mask_outlier, disparity_x_conf, np.nan)
disparity_y_mask = np.where(mask_outlier, disparity_y_conf, np.nan)
avg_disparity_x = np.nanmedian(disparity_x_mask, axis=1) # ignore the nan
diffs_x = [abs(expected_disps[idx] - real) for real in avg_disparity_x]
idx_min = diffs_x.index(min(diffs_x))
zz_stereo = 0.54 * 721. / float(avg_disparity_x[idx_min])
except ZeroDivisionError:
zz_stereo = - 100
return zz_stereo, disparity_x_mask[idx_min], disparity_y_mask[idx_min], idx_min
def verify_stereo(zz_stereo, zz_mono, disparity_x, disparity_y):
COV_MIN = 0.1
y_max_difference = (50 / zz_mono)
z_max_difference = 0.6 * zz_mono
cov = float(np.nanstd(disparity_x) / np.abs(np.nanmean(disparity_x))) # Coefficient of variation
avg_disparity_y = np.nanmedian(disparity_y)
if abs(zz_stereo - zz_mono) < z_max_difference and \
avg_disparity_y < y_max_difference and \
cov < COV_MIN:
return True
return False
def get_iqr_mask(distribution):
quartile_1, quartile_3 = np.nanpercentile(distribution, [25, 75], axis=1)
iqr = quartile_3 - quartile_1
lower_bound = quartile_1 - (iqr * 1.5)
upper_bound = quartile_3 + (iqr * 1.5)
return (distribution < upper_bound.reshape(-1, 1)) & (distribution > lower_bound.reshape(-1, 1))
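
For intuition, get_iqr_mask keeps only disparities within 1.5 IQR of the quartiles, computed row-wise while ignoring NaNs. A toy example (assuming the function above is importable) with one obvious outlier:

```
import numpy as np

# One candidate instance, eight disparity values, one outlier among values near 20 px
disps = np.array([[19.5, 20.0, 20.5, 19.8, 20.2, 90.0, 19.9, 20.1]])
mask = get_iqr_mask(disps)
print(disps[mask])   # the 90.0 px outlier is filtered out, the rest survive
```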


@ -1,15 +1,15 @@
# pylint: skip-file
import numpy as np
import os
import math
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
from visuals.printer import get_angle
from visuals.printer import get_confidence
def paper():
"""Print paper figures"""
dir_out = os.path.join('data', 'all_images', 'paper')
method = True
task_error = True
@ -75,7 +75,7 @@ def paper():
plt.yticks([])
plt.xlabel('X [m]')
plt.ylabel('Z [m]')
plt.savefig(os.path.join(dir_out, fig_name)) # plt.savefig(os.path.join('docs', fig_name))
plt.show()
plt.close()
@ -107,7 +107,7 @@ def paper():
plt.xlabel("Distance from the camera [m]") plt.xlabel("Distance from the camera [m]")
plt.ylabel("Localization error due to human height variation [m]") plt.ylabel("Localization error due to human height variation [m]")
plt.legend(loc=(0.01, 0.55)) # Location from 0 to 1 from lower left plt.legend(loc=(0.01, 0.55)) # Location from 0 to 1 from lower left
plt.savefig(os.path.join(dir_out, fig_name)) # plt.savefig(os.path.join(dir_out, fig_name))
plt.show()
plt.close()
@ -121,11 +121,21 @@ def gmm():
std_men = 7
mu_women = 165
std_women = 7
N_men = np.random.normal(mu_men, std_men, 100000) N_men_1 = np.random.normal(mu_men, std_men, 1000000)
N_women = np.random.normal(mu_women, std_women, 100000) N_men_2 = np.random.normal(mu_men, std_men, 1000000)
N_gmm = np.concatenate((N_men, N_women)) N_women_1 = np.random.normal(mu_women, std_women, 1000000)
mu_gmm = np.mean(N_gmm) N_women_2 = np.random.normal(mu_women, std_women, 1000000)
std_gmm = np.std(N_gmm) N_gmm_1 = np.concatenate((N_men_1, N_women_1))
N_gmm_2 = np.concatenate((N_men_2, N_women_2))
mu_gmm_1 = np.mean(N_gmm_1)
mu_gmm_2 = np.mean(N_gmm_2)
std_gmm = np.std(N_gmm_1)
mm_gender = std_gmm / mu_gmm_1
var_gmm = np.var(N_gmm_1)
abs_diff_1 = np.abs(mu_gmm_1 - N_gmm_1)
abs_diff_2 = np.mean(np.abs(N_gmm_1 - N_gmm_2))
mean_deviation_1 = np.mean(abs_diff_1)
mean_deviation_2 = np.mean(abs_diff_2)
# sns.distplot(N_men, hist=False, rug=False, label="Men")
# sns.distplot(N_women, hist=False, rug=False, label="Women")
# sns.distplot(N_gmm, hist=False, rug=False, label="GMM")
@ -133,7 +143,21 @@ def gmm():
# plt.ylabel("Height distributions of men and women") # plt.ylabel("Height distributions of men and women")
# plt.legend() # plt.legend()
# plt.show() # plt.show()
print("Variace of GMM distribution: {:.2f}".format(std_gmm)) print("Mean of GMM distribution: {:.2f}".format(mu_gmm_1))
mm_gender = std_gmm / mu_gmm print("Standard deviation: {:.2f}".format(std_gmm))
print("Relative error (standard deviation) {:.3f} %".format(mm_gender * 100))
print("Variance: {:.2f}".format(var_gmm))
print("Mean deviation: {:.2f}".format(mean_deviation_1))
print("Mean deviation 2: {:.2f}".format(mean_deviation_2))
print("Relative error (mean absolute deviation): {:.3f} %".format((mean_deviation_1 / mu_gmm_1) * 100))
return mm_gender
def get_confidence(xx, zz, std):
theta = math.atan2(zz, xx)
delta_x = std * math.cos(theta)
delta_z = std * math.sin(theta)
return (xx - delta_x, xx + delta_x), (zz - delta_z, zz + delta_z)
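
The sampled statistics above also have a closed form: an equal-weight mixture of two Gaussians with common standard deviation sigma and mean gap delta has variance sigma² + (delta/2)². A three-line check recovers the 0.0557 constant used in get_task_error:

```
mu_men, mu_women, std = 178, 165, 7
mu_gmm = (mu_men + mu_women) / 2                      # 171.5 cm
var_gmm = std ** 2 + ((mu_men - mu_women) / 2) ** 2   # 49 + 42.25 = 91.25
print(var_gmm ** 0.5 / mu_gmm)                        # ~0.0557, the 'std' task error
```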

278 monoloco/visuals/printer.py Normal file

@ -0,0 +1,278 @@
import math
from collections import OrderedDict
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.patches import Ellipse, Circle, Rectangle
from mpl_toolkits.axes_grid1 import make_axes_locatable
from ..utils.camera import pixel_to_camera
from ..utils.misc import get_task_error
class Printer:
"""
Print results on images: birds eye view and computed distance
"""
FONTSIZE_BV = 16
FONTSIZE = 18
TEXTCOLOR = 'darkorange'
COLOR_KPS = 'yellow'
def __init__(self, image, output_path, kk, output_types, epistemic=False, z_max=30, fig_width=10):
self.im = image
self.kk = kk
self.output_types = output_types
self.epistemic = epistemic
self.z_max = z_max # To include ellipses in the image
self.y_scale = 1
self.width = self.im.size[0]
self.height = self.im.size[1]
self.fig_width = fig_width
# Define the output dir
self.path_out = output_path
self.cmap = cm.get_cmap('jet')
self.extensions = []
# Define variables of the class to change for every image
self.mpl_im0 = self.stds_ale = self.stds_epi = self.xx_gt = self.zz_gt = self.xx_pred = self.zz_pred =\
self.dds_real = self.uv_centers = self.uv_shoulders = self.uv_kps = self.boxes = self.boxes_gt = \
self.uv_camera = self.radius = None
def _process_results(self, dic_ann):
# Include the vectors inside the interval given by z_max
self.stds_ale = dic_ann['stds_ale']
self.stds_epi = dic_ann['stds_epi']
self.xx_gt = [xx[0] for xx in dic_ann['xyz_real']]
self.zz_gt = [xx[2] if xx[2] < self.z_max - self.stds_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_real'])]
self.xx_pred = [xx[0] for xx in dic_ann['xyz_pred']]
self.zz_pred = [xx[2] if xx[2] < self.z_max - self.stds_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_pred'])]
self.dds_real = dic_ann['dds_real']
self.uv_centers = dic_ann['uv_centers']
self.uv_shoulders = dic_ann['uv_shoulders']
self.uv_kps = dic_ann['uv_kps']
self.boxes = dic_ann['boxes']
self.boxes_gt = dic_ann['boxes_gt']
self.uv_camera = (int(self.im.size[0] / 2), self.im.size[1])
self.radius = 11 / 1600 * self.width
def factory_axes(self):
"""Create axes for figures: front bird combined"""
axes = []
figures = []
# Initialize combined figure, resizing it for aesthetic proportions
if 'combined' in self.output_types:
assert 'bird' and 'front' not in self.output_types, \
"combined figure cannot be print together with front or bird ones"
self.y_scale = self.width / (self.height * 1.8) # Defined proportion
if self.y_scale < 0.95 or self.y_scale > 1.05: # allows more variation without resizing
self.im = self.im.resize((self.width, round(self.height * self.y_scale)))
self.width = self.im.size[0]
self.height = self.im.size[1]
fig_width = self.fig_width + 0.6 * self.fig_width
fig_height = self.fig_width * self.height / self.width
# Distinguish between KITTI images and general images
fig_ar_1 = 1.7 if self.y_scale > 1.7 else 1.3
width_ratio = 1.9
self.extensions.append('.combined.png')
fig, (ax1, ax0) = plt.subplots(1, 2, sharey=False, gridspec_kw={'width_ratios': [1, width_ratio]},
figsize=(fig_width, fig_height))
ax1.set_aspect(fig_ar_1)
fig.set_tight_layout(True)
fig.subplots_adjust(left=0.02, right=0.98, bottom=0, top=1, hspace=0, wspace=0.02)
figures.append(fig)
assert 'front' not in self.output_types and 'bird' not in self.output_types, \
"--combined arguments is not supported with other visualizations"
# Initialize front figure
elif 'front' in self.output_types:
width = self.fig_width
height = self.fig_width * self.height / self.width
self.extensions.append(".front.png")
plt.figure(0)
fig0, ax0 = plt.subplots(1, 1, figsize=(width, height))
fig0.set_tight_layout(True)
figures.append(fig0)
# Create front figure axis
if any(xx in self.output_types for xx in ['front', 'combined']):
ax0 = self.set_axes(ax0, axis=0)
divider = make_axes_locatable(ax0)
cax = divider.append_axes('right', size='3%', pad=0.05)
bar_ticks = self.z_max // 5 + 1
norm = matplotlib.colors.Normalize(vmin=0, vmax=self.z_max)
scalar_mappable = plt.cm.ScalarMappable(cmap=self.cmap, norm=norm)
scalar_mappable.set_array([])
plt.colorbar(scalar_mappable, ticks=np.linspace(0, self.z_max, bar_ticks),
boundaries=np.arange(- 0.05, self.z_max + 0.1, .1), cax=cax, label='Z [m]')
axes.append(ax0)
if not axes:
axes.append(None)
# Initialize bird-eye-view figure
if 'bird' in self.output_types:
self.extensions.append(".bird.png")
fig1, ax1 = plt.subplots(1, 1)
fig1.set_tight_layout(True)
figures.append(fig1)
if any(xx in self.output_types for xx in ['bird', 'combined']):
ax1 = self.set_axes(ax1, axis=1) # Adding field of view
axes.append(ax1)
return figures, axes
def draw(self, figures, axes, dic_out, image, draw_text=True, legend=True, draw_box=False,
save=False, show=False):
# Process the annotation dictionary of monoloco
self._process_results(dic_out)
# Draw the front figure
num = 0
self.mpl_im0.set_data(image)
for idx, uv in enumerate(self.uv_shoulders):
if any(xx in self.output_types for xx in ['front', 'combined']) and \
min(self.zz_pred[idx], self.zz_gt[idx]) > 0:
color = self.cmap((self.zz_pred[idx] % self.z_max) / self.z_max)
self.draw_circle(axes, uv, color)
if draw_box:
self.draw_boxes(axes, idx, color)
if draw_text:
self.draw_text_front(axes, uv, num)
num += 1
# Draw the bird figure
num = 0
for idx, _ in enumerate(self.xx_pred):
if any(xx in self.output_types for xx in ['bird', 'combined']) and self.zz_gt[idx] > 0:
# Draw ground truth and predicted ellipses
self.draw_ellipses(axes, idx)
# Draw bird eye view text
if draw_text:
self.draw_text_bird(axes, idx, num)
num += 1
# Add the legend
if legend:
draw_legend(axes)
# Draw, save or/and show the figures
for idx, fig in enumerate(figures):
fig.canvas.draw()
if save:
fig.savefig(self.path_out + self.extensions[idx], bbox_inches='tight')
if show:
fig.show()
def draw_ellipses(self, axes, idx):
"""draw uncertainty ellipses"""
target = get_task_error(self.dds_real[idx])
angle_gt = get_angle(self.xx_gt[idx], self.zz_gt[idx])
ellipse_real = Ellipse((self.xx_gt[idx], self.zz_gt[idx]), width=target * 2, height=1,
angle=angle_gt, color='lightgreen', fill=True, label="Task error")
axes[1].add_patch(ellipse_real)
if abs(self.zz_gt[idx] - self.zz_pred[idx]) > 0.001:
axes[1].plot(self.xx_gt[idx], self.zz_gt[idx], 'kx', label="Ground truth", markersize=3)
angle = get_angle(self.xx_pred[idx], self.zz_pred[idx])
ellipse_ale = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale[idx] * 2,
height=1, angle=angle, color='b', fill=False, label="Aleatoric Uncertainty",
linewidth=1.3)
ellipse_var = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_epi[idx] * 2,
height=1, angle=angle, color='r', fill=False, label="Uncertainty",
linewidth=1, linestyle='--')
axes[1].add_patch(ellipse_ale)
if self.epistemic:
axes[1].add_patch(ellipse_var)
axes[1].plot(self.xx_pred[idx], self.zz_pred[idx], 'ro', label="Predicted", markersize=3)
def draw_boxes(self, axes, idx, color):
ww_box = self.boxes[idx][2] - self.boxes[idx][0]
hh_box = (self.boxes[idx][3] - self.boxes[idx][1]) * self.y_scale
ww_box_gt = self.boxes_gt[idx][2] - self.boxes_gt[idx][0]
hh_box_gt = (self.boxes_gt[idx][3] - self.boxes_gt[idx][1]) * self.y_scale
rectangle = Rectangle((self.boxes[idx][0], self.boxes[idx][1] * self.y_scale),
width=ww_box, height=hh_box, fill=False, color=color, linewidth=3)
rectangle_gt = Rectangle((self.boxes_gt[idx][0], self.boxes_gt[idx][1] * self.y_scale),
width=ww_box_gt, height=hh_box_gt, fill=False, color='g', linewidth=2)
axes[0].add_patch(rectangle_gt)
axes[0].add_patch(rectangle)
def draw_text_front(self, axes, uv, num):
axes[0].text(uv[0] + self.radius, uv[1] * self.y_scale - self.radius, str(num),
fontsize=self.FONTSIZE, color=self.TEXTCOLOR, weight='bold')
def draw_text_bird(self, axes, idx, num):
"""Plot the number in the bird eye view map"""
std = self.stds_epi[idx] if self.stds_epi[idx] > 0 else self.stds_ale[idx]
theta = math.atan2(self.zz_pred[idx], self.xx_pred[idx])
delta_x = std * math.cos(theta)
delta_z = std * math.sin(theta)
axes[1].text(self.xx_pred[idx] + delta_x, self.zz_pred[idx] + delta_z,
str(num), fontsize=self.FONTSIZE_BV, color='darkorange')
def draw_circle(self, axes, uv, color):
circle = Circle((uv[0], uv[1] * self.y_scale), radius=self.radius, color=color, fill=True)
axes[0].add_patch(circle)
def set_axes(self, ax, axis):
assert axis in (0, 1)
if axis == 0:
ax.set_axis_off()
ax.set_xlim(0, self.width)
ax.set_ylim(self.height, 0)
self.mpl_im0 = ax.imshow(self.im)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
else:
uv_max = [0., float(self.height)]
xyz_max = pixel_to_camera(uv_max, self.kk, self.z_max)
x_max = abs(xyz_max[0]) # shortcut to avoid oval circles in case of different kk
ax.plot([0, x_max], [0, self.z_max], 'k--')
ax.plot([0, -x_max], [0, self.z_max], 'k--')
ax.set_ylim(0, self.z_max+1)
ax.set_xlabel("X [m]")
ax.set_ylabel("Z [m]")
return ax
def draw_legend(axes):
handles, labels = axes[1].get_legend_handles_labels()
by_label = OrderedDict(zip(labels, handles))
axes[1].legend(by_label.values(), by_label.keys())
def get_angle(xx, zz):
"""Obtain the points to plot the confidence of each annotation"""
theta = math.atan2(zz, xx)
angle = theta * (180 / math.pi)
return angle
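
A minimal usage sketch for the new Printer (paths and the annotation dictionary are illustrative; in the pipeline dic_out comes from the prediction step):

```
from PIL import Image

im = Image.open('docs/frame.png')                        # any input image
kk = [[721., 0., 610.], [0., 721., 172.], [0., 0., 1.]]  # illustrative intrinsics
dic_out = {'stds_ale': [0.5], 'stds_epi': [0.], 'dds_real': [8.1],
           'xyz_real': [[1.0, 0.5, 8.0]], 'xyz_pred': [[1.1, 0.5, 8.4]],
           'uv_centers': [[600, 250]], 'uv_shoulders': [[600, 200]],
           'uv_kps': [[[600] * 17, [200] * 17, [1] * 17]],
           'boxes': [[550, 150, 650, 400, 0.9]], 'boxes_gt': [[555, 155, 655, 405]]}
printer = Printer(im, 'out/frame', kk, output_types=['combined'], z_max=30)
figures, axes = printer.factory_axes()
printer.draw(figures, axes, dic_out, im, save=True)      # writes out/frame.combined.png
```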


@ -1,3 +1,4 @@
# pylint: disable=R0915
import os
import numpy as np
@ -5,7 +6,7 @@ import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
def print_results(dic_stats, show=False, save=False): def print_results(dic_stats, show=False):
""" """
Visualize error as function of the distance on the test set and compare it with target errors based on human Visualize error as function of the distance on the test set and compare it with target errors based on human
@ -67,7 +68,7 @@ def print_results(dic_stats, show=False, save=False):
xxs = get_distances(clusters)
yys = target_error(np.array(xxs), mm_gender)
ax[1].plot(xxs, bbs, marker='s', color='b', label="Spread b")
ax[1].plot(xxs, yys, '--', color='lightgreen', label="Task error", linewidth=2.5)
yys_up = [rec_c + ar/2 * scale * yy for yy in yys]
bbs_up = [rec_c + ar/2 * scale * bb for bb in bbs]
yys_down = [rec_c - ar/2 * scale * yy for yy in yys]
@ -81,7 +82,7 @@ def print_results(dic_stats, show=False, save=False):
for idx, xx in enumerate(xxs):
te = Ellipse((xx, rec_c), width=yys[idx]*ar*scale, height=scale, angle=90, color='lightgreen', fill=True)
bi = Ellipse((xx, rec_c), width=bbs[idx]*ar*scale, height=scale, angle=90, color='b',linewidth=1.8, bi = Ellipse((xx, rec_c), width=bbs[idx]*ar*scale, height=scale, angle=90, color='b', linewidth=1.8,
fill=False)
ax[0].add_patch(te)


@ -1,3 +1,4 @@
# pylint: disable=W0212
""" """
Webcam demo application Webcam demo application
@ -14,11 +15,11 @@ from openpifpaf import transforms
import cv2
from visuals.printer import Printer from ..visuals.printer import Printer
from utils.pifpaf import preprocess_pif from ..utils.pifpaf import preprocess_pif
from predict.pifpaf import PifPaf from ..predict.pifpaf import PifPaf
from predict.monoloco import MonoLoco from ..predict.network import MonoLoco
from predict.factory import factory_for_gt from ..predict.factory import factory_for_gt
def webcam(args):
@ -107,7 +108,7 @@ class VisualizerMonoloco:
del axes[1].patches[0] # the one became the 0
if len(axes[1].lines) > 2:
del axes[1].lines[2]
if len(axes[1].texts) > 0: # in case of no text if axes[1].texts: # in case of no text
del axes[1].texts[0] del axes[1].texts[0]
printer.draw(figures, axes, dict_ann, image) printer.draw(figures, axes, dict_ann, image)
mypause(0.01) mypause(0.01)
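The import rewrite above is the core of the packaging change: the code now lives in a proper `monoloco` package, so sibling subpackages are reached with relative imports, and `predict.monoloco` is renamed to `predict.network`. A hypothetical layout consistent with these imports:

```
# monoloco/
#     predict/
#         factory.py      # factory_for_gt
#         network.py      # MonoLoco (was predict/monoloco.py)
#         pifpaf.py       # PifPaf
#     utils/
#         pifpaf.py       # preprocess_pif
#     visuals/
#         printer.py      # Printer
#
# From a module one level below the package root, siblings are imported as:
#     from ..visuals.printer import Printer
```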


@@ -1,153 +0,0 @@
"""Run monoloco over all the pifpaf joints of KITTI images
and extract and save the annotations in txt files"""
import math
import os
import glob
import json
import shutil
import itertools
import numpy as np
import torch
from predict.monoloco import MonoLoco
from eval.geom_baseline import compute_distance
from utils.kitti import get_calibration
from utils.pifpaf import preprocess_pif
from utils.camera import xyz_from_distance, get_keypoints, pixel_to_camera
def generate_kitti(model, dir_ann, p_dropout=0.2, n_dropout=0):
cnt_ann = 0
cnt_file = 0
cnt_no_file = 0
dir_kk = os.path.join('data', 'kitti', 'calib')
dir_out = os.path.join('data', 'kitti', 'monoloco')
# Remove the output directory if it already exists (avoid residual txt files)
if os.path.exists(dir_out):
shutil.rmtree(dir_out)
os.makedirs(dir_out)
print("Created empty output directory for txt files")
# Load monoloco
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
monoloco = MonoLoco(model_path=model, device=device, n_dropout=n_dropout, p_dropout=p_dropout)
# Run monoloco over the list of images
list_basename = factory_basename(dir_ann)
for basename in list_basename:
path_calib = os.path.join(dir_kk, basename + '.txt')
annotations, kk, tt = factory_file(path_calib, dir_ann, basename)
boxes, keypoints = preprocess_pif(annotations, im_size=(1242, 374))
if not keypoints:
cnt_no_file += 1
continue
else:
# Run the network and the geometric baseline
outputs, varss = monoloco.forward(keypoints, kk)
dds_geom = eval_geometric(keypoints, kk, average_y=0.48)
# Save the file
all_outputs = [outputs.detach().cpu(), varss.detach().cpu(), dds_geom]
all_inputs = [boxes, keypoints]
all_params = [kk, tt]
path_txt = os.path.join(dir_out, basename + '.txt')
save_txts(path_txt, all_inputs, all_outputs, all_params)
# Update counting
cnt_ann += len(boxes)
cnt_file += 1
# Print statistics
print("Saved in {} txt {} annotations. Not found {} images"
.format(cnt_file, cnt_ann, cnt_no_file))
def save_txts(path_txt, all_inputs, all_outputs, all_params):
outputs, varss, dds_geom = all_outputs[:]
uv_boxes, keypoints = all_inputs[:]
kk, tt = all_params[:]
uv_centers = get_keypoints(keypoints, mode='bottom') # Kitti uses the bottom center to calculate depth
xy_centers = pixel_to_camera(uv_centers, kk, 1)
zzs = xyz_from_distance(outputs[:, 0:1], xy_centers)[:, 2].tolist()
with open(path_txt, "w+") as ff:
for idx in range(outputs.shape[0]):
xx = float(xy_centers[idx][0]) * zzs[idx] + tt[0]
yy = float(xy_centers[idx][1]) * zzs[idx] + tt[1]
zz = zzs[idx] + tt[2]
dd = math.sqrt(xx ** 2 + yy ** 2 + zz ** 2)
cam_0 = [xx, yy, zz, dd]
for el in uv_boxes[idx][:]:
ff.write("%s " % el)
for el in cam_0:
ff.write("%s " % el)
ff.write("%s " % float(outputs[idx][1]))
ff.write("%s " % float(varss[idx]))
ff.write("%s " % dds_geom[idx])
ff.write("\n")
# Save intrinsic matrix in the last row
for kk_el in itertools.chain(*kk): # Flatten a list of lists
ff.write("%f " % kk_el)
ff.write("\n")
def factory_basename(dir_ann):
""" Return all the basenames in the annotations folder"""
list_ann = glob.glob(os.path.join(dir_ann, '*.json'))
list_basename = [os.path.basename(x).split('.')[0] for x in list_ann]
assert list_basename, " Missing json annotations file to create txt files for KITTI datasets"
return list_basename
def factory_file(path_calib, dir_ann, basename):
"""Choose the annotation and the calibration files. Stereo option with ite = 1"""
p_left, p_right = get_calibration(path_calib)
kk, tt = p_left[:]
path_ann = os.path.join(dir_ann, basename + '.png.pifpaf.json')
try:
with open(path_ann, 'r') as f:
annotations = json.load(f)
except FileNotFoundError:
annotations = None
return annotations, kk, tt
def eval_geometric(keypoints, kk, average_y=0.48):
""" Evaluate geometric distance"""
dds_geom = []
uv_centers = get_keypoints(keypoints, mode='center')
uv_shoulders = get_keypoints(keypoints, mode='shoulder')
uv_hips = get_keypoints(keypoints, mode='hip')
xy_centers = pixel_to_camera(uv_centers, kk, 1)
xy_shoulders = pixel_to_camera(uv_shoulders, kk, 1)
xy_hips = pixel_to_camera(uv_hips, kk, 1)
for idx, xy_center in enumerate(xy_centers):
zz = compute_distance(xy_shoulders[idx], xy_hips[idx], average_y)
xyz_center = np.array([xy_center[0], xy_center[1], zz])
dd_geom = float(np.linalg.norm(xyz_center))
dds_geom.append(dd_geom)
return dds_geom


@@ -1,37 +0,0 @@
import glob
import logging
import os
import cv2
import sys
def resize(input_glob, output_dir, factor=2):
"""
Resize images using multiplicative factor
"""
list_im = glob.glob(input_glob)
for idx, path_in in enumerate(list_im):
basename, _ = os.path.splitext(os.path.basename(path_in))
im = cv2.imread(path_in)
assert im is not None, "Image not found"
# Resize the image to the new dimensions given by the multiplicative factor
h_im = im.shape[0]
w_im = im.shape[1]
w_new = round(factor * w_im)
h_new = round(factor * h_im)
print("resizing image {} to: {} x {}".format(basename, w_new, h_new))
im_new = cv2.resize(im, (w_new, h_new))
# Save the image
name_im = basename + '.png'
path_out = os.path.join(output_dir, name_im)
cv2.imwrite(path_out, im_new)
sys.stdout.write('\r' + 'Saving image number: {}'.format(idx) + '\t')
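A hypothetical invocation of this (now removed) helper, doubling every png in a folder (paths are illustrative):

```
import os
os.makedirs('data/images_2x', exist_ok=True)
resize('data/images/*.png', 'data/images_2x', factor=2)
```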


@@ -1,57 +0,0 @@
import numpy as np
def preprocess_pif(annotations, im_size=None):
"""
Preprocess pif annotations:
1. Enlarge the box by 10% in height and 20% in width
2. Constrain it inside the image (if im_size is provided)
"""
boxes = []
keypoints = []
for dic in annotations:
box = dic['bbox']
if box[3] < 0.5: # Check for no detections (boxes 0,0,0,0)
return [], []
else:
kps = prepare_pif_kps(dic['keypoints'])
conf = float(np.mean(np.array(kps[2])))
# Add 10% for y and 20% for x
delta_h = (box[3] - box[1]) / 10
delta_w = (box[2] - box[0]) / 5
assert delta_h > -5 and delta_w > -5, "Bounding box <=0"
box[0] -= delta_w
box[1] -= delta_h
box[2] += delta_w
box[3] += delta_h
# Put the box inside the image
if im_size is not None:
box[0] = max(0, box[0])
box[1] = max(0, box[1])
box[2] = min(box[2], im_size[0])
box[3] = min(box[3], im_size[1])
box.append(conf)
boxes.append(box)
keypoints.append(kps)
return boxes, keypoints
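A quick numerical check of the enlargement above (box values are made up):

```
box = [100., 50., 200., 250.]       # x1, y1, x2, y2
delta_h = (box[3] - box[1]) / 10    # 20.0 -> 10% of the height per side
delta_w = (box[2] - box[0]) / 5     # 20.0 -> 20% of the width per side
# enlarged box: [80.0, 30.0, 220.0, 270.0], then clipped to the image
```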
def prepare_pif_kps(kps_in):
"""Convert from a list of 51 to a list of 3, 17"""
assert len(kps_in) % 3 == 0, "keypoints expected as a multiple of 3"
xxs = kps_in[0:][::3]
yys = kps_in[1:][::3] # from offset 1 every 3
ccs = kps_in[2:][::3]
return [xxs, yys, ccs]
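For instance, with two detected joints and the remaining fifteen empty (toy values):

```
kps_in = [10., 20., 0.9, 11., 22., 0.8] + [0.] * 45   # 51 values
xxs = kps_in[0:][::3]   # [10.0, 11.0, 0.0, ...]
yys = kps_in[1:][::3]   # [20.0, 22.0, 0.0, ...]
ccs = kps_in[2:][::3]   # [0.9, 0.8, 0.0, ...]
assert len(xxs) == len(yys) == len(ccs) == 17
```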


@@ -1,243 +0,0 @@
import math
from collections import OrderedDict
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.patches import Ellipse, Circle
from mpl_toolkits.axes_grid1 import make_axes_locatable
from utils.camera import pixel_to_camera
from utils.misc import get_task_error
class Printer:
"""
Print results on images: bird's-eye view and computed distance
"""
RADIUS_KPS = 6
FONTSIZE_BV = 16
FONTSIZE = 18
TEXTCOLOR = 'darkorange'
COLOR_KPS = 'yellow'
def __init__(self, image, output_path, kk, output_types, text=True, legend=True, epistemic=False,
z_max=30, fig_width=10):
self.im = image
self.kk = kk
self.output_types = output_types
self.text = text
self.epistemic = epistemic
self.legend = legend
self.z_max = z_max # To include ellipses in the image
self.y_scale = 1
self.width = self.im.size[0]
self.height = self.im.size[1]
self.fig_width = fig_width
# Define the output dir
self.path_out = output_path
self.cmap = cm.get_cmap('jet')
self.extensions = []
self.mpl_im0 = None
def _process_results(self, dic_ann):
# Include the vectors inside the interval given by z_max
self.stds_ale = dic_ann['stds_ale']
self.stds_ale_epi = dic_ann['stds_epi']
self.xx_gt = [xx[0] for xx in dic_ann['xyz_real']]
self.zz_gt = [xx[2] if xx[2] < self.z_max - self.stds_ale_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_real'])]
self.xx_pred = [xx[0] for xx in dic_ann['xyz_pred']]
self.zz_pred = [xx[2] if xx[2] < self.z_max - self.stds_ale_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_pred'])]
self.dds_real = dic_ann['dds_real']
self.uv_centers = dic_ann['uv_centers']
self.uv_shoulders = dic_ann['uv_shoulders']
self.uv_kps = dic_ann['uv_kps']
self.uv_camera = (int(self.im.size[0] / 2), self.im.size[1])
self.radius = 14 / 1600 * self.width
def factory_axes(self):
"""Create axes for figures: front bird combined"""
axes = []
figures = []
# Initialize combined figure, resizing it for aesthetic proportions
if 'combined' in self.output_types:
assert 'bird' not in self.output_types and 'front' not in self.output_types, \
"combined figure cannot be printed together with front or bird ones"
self.y_scale = self.width / (self.height * 1.8) # Defined proportion
if self.y_scale < 0.95 or self.y_scale > 1.05: # resize only when proportions deviate by more than 5%
self.im = self.im.resize((self.width, round(self.height * self.y_scale)))
self.width = self.im.size[0]
self.height = self.im.size[1]
fig_width = self.fig_width + 0.6 * self.fig_width
fig_height = self.fig_width * self.height / self.width
# Distinguish between KITTI images and general images
if self.y_scale > 1.7:
fig_ar_1 = 1.7
else:
fig_ar_1 = 1.3
width_ratio = 1.9
self.extensions.append('.combined.png')
fig, (ax1, ax0) = plt.subplots(1, 2, sharey=False, gridspec_kw={'width_ratios': [1, width_ratio]},
figsize=(fig_width, fig_height))
ax1.set_aspect(fig_ar_1)
fig.set_tight_layout(True)
fig.subplots_adjust(left=0.02, right=0.98, bottom=0, top=1, hspace=0, wspace=0.02)
figures.append(fig)
assert 'front' not in self.output_types and 'bird' not in self.output_types, \
"--combined arguments is not supported with other visualizations"
# Initialize front figure
elif 'front' in self.output_types:
width = self.fig_width
height = self.fig_width * self.height / self.width
self.extensions.append(".front.png")
plt.figure(0)
fig0, ax0 = plt.subplots(1, 1, figsize=(width, height))
fig0.set_tight_layout(True)
figures.append(fig0)
# Create front figure axis
if any(xx in self.output_types for xx in ['front', 'combined']):
ax0.set_axis_off()
ax0.set_xlim(0, self.width)
ax0.set_ylim(self.height, 0)
self.mpl_im0 = ax0.imshow(self.im)
z_min = 0
bar_ticks = self.z_max // 5 + 1
ax0.get_xaxis().set_visible(False)
ax0.get_yaxis().set_visible(False)
divider = make_axes_locatable(ax0)
cax = divider.append_axes('right', size='3%', pad=0.05)
norm = matplotlib.colors.Normalize(vmin=z_min, vmax=self.z_max)
scalar_mappable = plt.cm.ScalarMappable(cmap=self.cmap, norm=norm)
scalar_mappable.set_array([])
plt.colorbar(scalar_mappable, ticks=np.linspace(z_min, self.z_max, bar_ticks),
boundaries=np.arange(z_min - 0.05, self.z_max + 0.1, .1), cax=cax, label='Z [m]')
axes.append(ax0)
if not axes:
axes.append(None)
if 'bird' in self.output_types:
self.extensions.append(".bird.png")
fig1, ax1 = plt.subplots(1, 1)
fig1.set_tight_layout(True)
figures.append(fig1)
if any(xx in self.output_types for xx in ['bird', 'combined']):
uv_max = [0., float(self.height)]
xyz_max = pixel_to_camera(uv_max, self.kk, self.z_max)
x_max = abs(xyz_max[0]) # shortcut to avoid oval circles in case of different kk
# Adding field of view
ax1.plot([0, x_max], [0, self.z_max], 'k--')
ax1.plot([0, -x_max], [0, self.z_max], 'k--')
ax1.set_ylim(0, self.z_max+1)
ax1.set_xlabel("X [m]")
ax1.set_ylabel("Z [m]")
axes.append(ax1)
return figures, axes
def draw(self, figures, axes, dic_out, image, save=False, show=False):
self._process_results(dic_out)
num = 0
if any(xx in self.output_types for xx in ['front', 'combined']):
self.mpl_im0.set_data(image)
for idx, uv in enumerate(self.uv_shoulders):
if min(self.zz_pred[idx], self.zz_gt[idx]) > 0:
color = self.cmap((self.zz_pred[idx] % self.z_max) / self.z_max)
circle = Circle((uv[0], uv[1] * self.y_scale), radius=self.radius, color=color, fill=True)
axes[0].add_patch(circle)
if self.text:
axes[0].text(uv[0]+self.radius, uv[1] * self.y_scale - self.radius, str(num),
fontsize=self.FONTSIZE, color=self.TEXTCOLOR, weight='bold')
num += 1
if any(xx in self.output_types for xx in ['bird', 'combined']):
for idx, _ in enumerate(self.xx_gt):
if self.zz_gt[idx] > 0:
target = get_task_error(self.dds_real[idx])
angle = get_angle(self.xx_gt[idx], self.zz_gt[idx])
ellipse_real = Ellipse((self.xx_gt[idx], self.zz_gt[idx]), width=target * 2, height=1,
angle=angle, color='lightgreen', fill=True, label="Task error")
axes[1].add_patch(ellipse_real)
if abs(self.zz_gt[idx] - self.zz_pred[idx]) > 0.001:
axes[1].plot(self.xx_gt[idx], self.zz_gt[idx], 'kx', label="Ground truth", markersize=3)
# Print prediction and the real ground truth.
num = 0
for idx, _ in enumerate(self.xx_pred):
if self.zz_gt[idx] > 0: # only the merging ones and inside the interval
angle = get_angle(self.xx_pred[idx], self.zz_pred[idx])
ellipse_ale = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale[idx] * 2,
height=1, angle=angle, color='b', fill=False, label="Aleatoric Uncertainty",
linewidth=1.3)
ellipse_var = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale_epi[idx] * 2,
height=1, angle=angle, color='r', fill=False, label="Uncertainty",
linewidth=1, linestyle='--')
axes[1].add_patch(ellipse_ale)
if self.epistemic:
axes[1].add_patch(ellipse_var)
axes[1].plot(self.xx_pred[idx], self.zz_pred[idx], 'ro', label="Predicted", markersize=3)
# Setup the legend to avoid repetitions
if self.legend:
handles, labels = axes[1].get_legend_handles_labels()
by_label = OrderedDict(zip(labels, handles))
axes[1].legend(by_label.values(), by_label.keys())
# Plot the number
(_, x_pos), (_, z_pos) = get_confidence(self.xx_pred[idx], self.zz_pred[idx],
self.stds_ale_epi[idx])
if self.text:
axes[1].text(x_pos, z_pos, str(num), fontsize=self.FONTSIZE_BV, color='darkorange')
num += 1
for idx, fig in enumerate(figures):
fig.canvas.draw()
if save:
fig.savefig(self.path_out + self.extensions[idx], bbox_inches='tight')
if show:
fig.show()
def get_confidence(xx, zz, std):
"""Obtain the points to plot the confidence of each annotation"""
theta = math.atan2(zz, xx)
delta_x = std * math.cos(theta)
delta_z = std * math.sin(theta)
return (xx - delta_x, xx + delta_x), (zz - delta_z, zz + delta_z)
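For intuition, `get_confidence` splits the uncertainty along the viewing ray; a worked case with toy values:

```
import math
xx, zz, std = 0., 10., 1.        # pedestrian straight ahead
theta = math.atan2(zz, xx)       # pi/2
delta_x = std * math.cos(theta)  # ~0: no lateral component
delta_z = std * math.sin(theta)  # 1.0: all uncertainty on depth
# endpoints: x in (~0, ~0), z in (9.0, 11.0)
```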
def get_angle(xx, zz):
"""Obtain the points to plot the confidence of each annotation"""
theta = math.atan2(zz, xx)
angle = theta * (180 / math.pi)
return angle


@@ -1,10 +1,12 @@
+import os
+import sys
+
+# Python does not consider the current directory to be a package
+sys.path.insert(0, os.path.join('..', 'monoloco'))
-from utils.iou import get_iou_matrix
-from utils.camera import pixel_to_camera
 def test_iou():
+    from monoloco.utils.iou import get_iou_matrix
     boxes_pred = [[1, 100, 1, 200]]
     boxes_gt = [[100., 120., 150., 160.], [12, 110, 130., 160.]]
     iou_matrix = get_iou_matrix(boxes_pred, boxes_gt)
@@ -12,6 +14,7 @@ def test_iou():
 def test_pixel_to_camera():
+    from monoloco.utils.camera import pixel_to_camera
     kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]]
     zz = 10
     uv_vector = [1000., 400.]
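The fixture values above follow the standard pinhole back-projection; assuming that is what `pixel_to_camera` computes (a sketch, not the repository's implementation), the expected coordinates are:

```
kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]]
uu, vv, zz = 1000., 400., 10
xx = (uu - kk[0][2]) * zz / kk[0][0]  # ~5.56 m right of the optical axis
yy = (vv - kk[1][2]) * zz / kk[1][1]  # ~3.04 m below the optical axis
```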