first commit
commit 28faf348c5

.gitignore (vendored, new file)
@@ -0,0 +1,4 @@
.idea/
data/
.DS_store
__pycache__

README.md (new file)
@@ -0,0 +1,140 @@
# Monoloco

> We tackle the fundamentally ill-posed problem of 3D human localization from monocular RGB images.
Driven by the limitation of neural networks that output only point estimates,
we address the ambiguity in the task with a new neural network that predicts confidence intervals through a
loss function based on the Laplace distribution.
Our architecture is a lightweight feed-forward neural network that predicts 3D coordinates given a 2D human pose.
The design is particularly well suited for small training data and cross-dataset generalization.
Our experiments show that (i) we outperform state-of-the-art results on the KITTI and nuScenes datasets,
(ii) we even outperform stereo methods for far-away pedestrians, and (iii) we estimate meaningful confidence intervals.
We further share insights on our model of uncertainty in the case of limited observations and out-of-distribution samples.

```
@article{bertoni2019monoloco,
  title={MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation},
  author={Bertoni, Lorenzo and Kreiss, Sven and Alahi, Alexandre},
  journal={arXiv preprint arXiv:xxxx},
  year={2019}
}
```

Add link to paper

![pull](docs/pull.png)

# Setup

### Install
Python 3.6 or 3.7 is required by the nuScenes development kit; Python 3 is required by openpifpaf.
All details for the PifPaf pose detector are at [openpifpaf](https://github.com/vita-epfl/openpifpaf).

```
pip install nuscenes-devkit openpifpaf
```
### Data structure

    Data
    ├── arrays
    ├── models
    ├── baselines
    ├── logs

Run the following to create the folders:
```
mkdir data
cd data
mkdir arrays models baselines logs
```

### Pre-trained Models
* Download a MonoLoco pre-trained model from Google Drive: ADD LINK, and save it in `data/models`
* Download a PifPaf pre-trained model from the [openpifpaf](https://github.com/vita-epfl/openpifpaf) project
and save it in `data/models`

### Baselines
Download the KITTI ground-truth txt files from Google Drive (ADD link) and unzip them in `data/baselines`.
The zip file also contains the nuScenes and KITTI splits for training and validation, as well as detections
from other baselines (more details in the Eval section).

# Interfaces
All the commands are run through a main file called `main.py`, using subparsers.
To check all the commands for the parser and the subparsers, run:

* `python3 src/main.py --help`
* `python3 src/main.py prep --help`
* `python3 src/main.py predict --help`
* `python3 src/main.py train --help`
* `python3 src/main.py eval --help`
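
As a minimal sketch of this dispatch pattern (not the project's actual `main.py`, whose full argument list appears later in this commit), the subparsers work roughly like:

```
import argparse

parser = argparse.ArgumentParser(description='MonoLoco')
subparsers = parser.add_subparsers(dest='command')

# One subparser per action; each one owns its own arguments
predict_parser = subparsers.add_parser('predict')
predict_parser.add_argument('--networks', nargs='+', default=['monoloco'])

train_parser = subparsers.add_parser('train')
train_parser.add_argument('--joints', help='json file with input joints')

args = parser.parse_args()
if args.command == 'predict':
    print('running prediction with networks:', args.networks)
elif args.command == 'train':
    print('training with joints file:', args.joints)
```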

# Predict
The predict script receives an image (or an entire folder, using glob expressions),
calls PifPaf for 2D human pose detection over the image,
and runs MonoLoco for 3D localization of the detected poses.
The argument `--networks` defines whether to save the PifPaf outputs, the MonoLoco outputs, or both.
You can check all commands for PifPaf at [openpifpaf](https://github.com/vita-epfl/openpifpaf).

Output options include json files and/or visualization of the predictions on the image in *frontal mode*,
*birds-eye-view mode* or *combined mode*, and can be specified with `--output_types`.

### Ground truth matching
* If you provide a ground-truth json file to compare the predictions of MonoLoco against,
the script will match every detection using the Intersection over Union (IoU) metric (see the sketch below).
The ground-truth file can be generated using the subparser `prep` and passed with the argument `--path_gt`.
Check the Preprocess section for more details.

* If you don't provide a ground-truth file, the script will look for a predefined path.
If it does not find the file, it will generate images
with all the predictions, without ground-truth matching.
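
As an illustration of the matching criterion (a minimal sketch, not the project's own `calculate_iou`/`get_idx_max` utilities), the IoU of two boxes in `[x1, y1, x2, y2]` format is:

```
def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

Each detection is matched to the ground-truth box with the highest IoU above a threshold.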

Below is an example with and without ground-truth matching. The images have been created (adding or removing `--path_gt`) with:
`python src/main.py predict --networks monoloco --glob docs/002282.png --output_types combined --scale 2 --y_scale 1.85
--model data/models/base_model.pickle --n_dropout 100 --z_max 30`

With ground-truth matching (only matched people):
![predict](docs/002282.png.combined_1.png)

Without ground-truth matching (all the detected people):
![predict_no_matching](docs/002282.png.combined_2.png)

# Preprocess
### Input joints for training
MonoLoco is trained using 2D human pose joints detected by PifPaf and matched with the ground-truth locations provided by
the nuScenes or KITTI datasets. To create the joints, run `python src/main.py prep` (see the example after this list), specifying:
1. `--dir_ann`: the annotation directory containing the PifPaf joints. You can create them by running the predict script with
`--networks pifpaf`.
2. `--dataset`: which dataset to preprocess. For nuScenes, all three versions of the
dataset are supported: nuscenes_mini, nuscenes, nuscenes_teaser.
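
For example (the annotation directory below is a placeholder for wherever the PifPaf json files were saved):

```
python src/main.py prep --dataset kitti --dir_ann data/annotations
```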

### Ground truth file for evaluation
The preprocessing script also outputs a second json file, called **names.json**, which provides a dictionary indexed
by the image name, to easily access ground-truth files for evaluation purposes.
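
Based on how the preprocessing scripts fill this dictionary, each entry maps an image name to its ground-truth boxes and distances, roughly (values shown for illustration only):

```
{
  "002282.png": {
    "boxes": [[x1, y1, x2, y2], ...],
    "dds": [12.3, ...]
  }
}
```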

### Train
Provide the json file containing the preprocessed joints (**joints.json**) as an argument.

It is as simple as `python3 src/main.py train --joints data/arrays/joints.json`

All the hyperparameter options can be checked with `python3 src/main.py train --help`.

### Hyperparameter tuning
Random search in log space is provided. An example: `python3 src/main.py train --hyp --multiplier 10 --r_seed 1`.
One iteration of the multiplier includes 6 runs.
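
To illustrate what random search in log space means (a generic sketch, not the project's `HypTuning` code), a hyperparameter such as the learning rate can be drawn uniformly between two exponents:

```
import random

def sample_log_uniform(low_exp, high_exp):
    """Sample 10**u with u drawn uniformly in [low_exp, high_exp]."""
    return 10 ** random.uniform(low_exp, high_exp)

random.seed(1)  # analogous to --r_seed
lr = sample_log_uniform(-4, -2)  # somewhere between 1e-4 and 1e-2
```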

# Eval
Evaluate the performance of the trained model on the KITTI or nuScenes datasets, and compare it with other monocular
and stereo baselines:
[Mono3D](http://3dimage.ee.tsinghua.edu.cn/cxz/mono3d),
[3DOP](https://xiaozhichen.github.io/papers/nips15chen.pdf),
[MonoDepth](https://arxiv.org/abs/1609.03677) and our
[Geometrical Baseline](src/eval/geom_baseline.py).

The following graph is obtained by running:
`python3 src/main.py eval --dataset kitti --model data/models/base_model.pickle`
![results](docs/results.png)

BIN docs/002282.png (new executable file, 831 KiB)
BIN docs/002282.png.combined_1.png (new file, 697 KiB)
BIN docs/002282.png.combined_2.png (new file, 711 KiB)
BIN docs/pull.png (new file, 1.4 MiB)
BIN docs/results.png (new file, 54 KiB)

src/eval/geom_baseline.py (new file)
@@ -0,0 +1,246 @@

import json
import logging
import math
from collections import defaultdict

import numpy as np

from utils.camera import pixel_to_camera

class GeomBaseline:

    def __init__(self, joints):

        # Initialize clusters and parameters
        self.clusters = ['10', '20', '30', '>30', 'all']
        self.average_y = 0.48
        self.joints = joints

        from utils.misc import calculate_iou
        self.calculate_iou = calculate_iou
        from utils.nuscenes import get_unique_tokens, split_scenes
        self.get_unique_tokens = get_unique_tokens
        self.split_scenes = split_scenes

        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)

    def run(self):
        """
        List of json files --> 2 lists with the mean and std for each segment, plus the total count of instances

        For each annotation:
        1. From gt boxes, calculate the height (delta-Y) of the segments head, shoulder, hip, ankle
        2. From mask boxes, calculate the distance of people using the average height of people and the real pixel height

        For left-right ambiguities we always choose the average of the two joints.

        The joints are mapped from 0 to 16 in the following order:
        ['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder', 'right_shoulder', 'left_elbow',
        'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle',
        'right_ankle']
        """
        cnt_tot = 0
        # Access the joints file
        with open(self.joints, 'r') as ff:
            dic_joints = json.load(ff)

        dic_dist = defaultdict(lambda: defaultdict(list))

        # Calculate the distances for all the segments
        for phase in ['train', 'val']:
            cnt = update_distances(dic_joints[phase], dic_dist, phase, self.average_y)
            cnt_tot += cnt

        dic_h_means = calculate_heights(dic_dist['heights'], mode='mean')
        dic_h_stds = calculate_heights(dic_dist['heights'], mode='std')

        self.logger.info("Computed the distance of {} annotations".format(cnt_tot))

        for key in dic_h_means:
            self.logger.info("Average height of segment {} is {:.2f} with a std of {:.2f}"
                             .format(key, dic_h_means[key], dic_h_stds[key]))

        errors = calculate_error(dic_dist['error'])

        for clst in self.clusters:
            self.logger.info("Average distance error over the val set for cluster {}: {:.2f}".format(clst, errors[clst]))

        self.logger.info("Joints used: {}".format(self.joints))


def update_distances(dic_fin, dic_dist, phase, average_y):

    # Loop over each annotation in the json file corresponding to the image
    cnt = 0
    for idx, kps in enumerate(dic_fin['kps']):
        # Extract the pixel coordinates of head, shoulder, hip, ankle and save them
        dic_uv = extract_pixel_coord(kps)

        # Convert segments from pixel coordinates to camera coordinates
        kk = dic_fin['K'][idx]
        z_met = dic_fin['boxes_3d'][idx][2]

        # Create a dict with all annotations in meters
        dic_xyz = {key: pixel_to_camera(dic_uv[key], kk, z_met) for key in dic_uv}

        # Compute the real height
        dy_met = abs(dic_xyz['hip'][1] - dic_xyz['shoulder'][1])

        # Estimate the distance for a single annotation
        z_met_real, _ = compute_distance_single(dic_uv['shoulder'], dic_uv['hip'], kk, average_y,
                                                mode='real', dy_met=dy_met)
        z_met_approx, _ = compute_distance_single(dic_uv['shoulder'], dic_uv['hip'], kk, average_y,
                                                  mode='average')

        # Compute the distance with respect to the center of the 3D bounding box
        d_real = math.sqrt(z_met_real ** 2 + dic_fin['boxes_3d'][idx][0] ** 2 + dic_fin['boxes_3d'][idx][1] ** 2)
        d_approx = math.sqrt(z_met_approx ** 2 +
                             dic_fin['boxes_3d'][idx][0] ** 2 + dic_fin['boxes_3d'][idx][1] ** 2)

        # Update the dictionary with the distance and height metrics
        dic_dist = update_dic_dist(dic_dist, dic_xyz, d_real, d_approx, phase)
        cnt += 1

    return cnt


def compute_distance_single(uv_1, uv_2, kk, average_y, mode='average', dy_met=0):
    """
    Compute the distance Z of a mask annotation (solving a linear system) for 2 possible cases:
    1. knowing the specific height of the annotation (head-ankle) dy_met
    2. using the mean height of people (average_y)
    """
    assert mode in ('average', 'real')

    # Transform into normalized camera coordinates (plane at 1 m)
    xyz_met_norm_1 = pixel_to_camera(uv_1, kk, 1)
    xyz_met_norm_2 = pixel_to_camera(uv_2, kk, 1)

    x1 = xyz_met_norm_1[0]
    y1 = xyz_met_norm_1[1]
    x2 = xyz_met_norm_2[0]
    y2 = xyz_met_norm_2[1]
    xx = (x1 + x2) / 2

    # Choose whether to solve for the provided height or the average one
    if mode == 'average':
        cc = -average_y  # Y axis points down
    else:
        cc = -dy_met

    # Solve the overdetermined linear system A x = b in the least-squares sense
    Aa = np.array([[y1, 0, -xx],
                   [0, -y1, 1],
                   [y2, 0, -xx],
                   [0, -y2, 1]])

    bb = np.array([cc * xx, -cc, 0, 0]).reshape(4, 1)
    sol = np.linalg.lstsq(Aa, bb, rcond=None)
    z_met = abs(float(sol[0][1]))  # abs accounts for specularity behind the observer

    # Compute the absolute x and y coordinates in meters
    xyz_met_1 = xyz_met_norm_1 * z_met
    xyz_met_2 = xyz_met_norm_2 * z_met

    return z_met, (xyz_met_1, xyz_met_2)
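
# A minimal usage sketch for compute_distance_single (illustrative numbers; kk is a
# hypothetical intrinsic matrix, not one shipped with the repo):
#
#     kk = np.array([[720., 0., 610.], [0., 720., 175.], [0., 0., 1.]])
#     uv_shoulder = np.array([640., 160., 1.])
#     uv_hip = np.array([640., 230., 1.])
#     z_met, _ = compute_distance_single(uv_shoulder, uv_hip, kk, average_y=0.48)
#
# With mode='average' the solver assumes the shoulder-hip segment is average_y meters
# tall and returns the depth at which that assumption matches the observed pixels.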


def extract_pixel_coord(kps):
    """Extract the uv coordinates from the keypoints and save them in a dict"""

    # For each level of height (e.g. 5 points in the head), take their average
    uv_head = np.array([np.average(kps[0][0:5]), np.average(kps[1][0:5]), 1])
    uv_shoulder = np.array([np.average(kps[0][5:7]), np.average(kps[1][5:7]), 1])
    uv_hip = np.array([np.average(kps[0][11:13]), np.average(kps[1][11:13]), 1])
    uv_ankle = np.array([np.average(kps[0][15:17]), np.average(kps[1][15:17]), 1])

    dic_uv = {'head': uv_head, 'shoulder': uv_shoulder, 'hip': uv_hip, 'ankle': uv_ankle}

    return dic_uv


def update_dic_dist(dic_dist, dic_xyz, d_real, d_approx, phase):
    """For every annotation in a single image, update the final dictionary"""

    # Update the dict with the height metrics
    if phase == 'train':
        dic_dist['heights']['head'].append(float(dic_xyz['head'][1]))
        dic_dist['heights']['shoulder'].append(float(dic_xyz['shoulder'][1]))
        dic_dist['heights']['hip'].append(float(dic_xyz['hip'][1]))
        dic_dist['heights']['ankle'].append(float(dic_xyz['ankle'][1]))

    # Update the dict with the distance metrics for the validation phase
    if phase == 'val':
        error = abs(d_real - d_approx)

        if d_real <= 10:
            dic_dist['error']['10'].append(error)
        elif d_real <= 20:
            dic_dist['error']['20'].append(error)
        elif d_real <= 30:
            dic_dist['error']['30'].append(error)
        else:
            dic_dist['error']['>30'].append(error)

        dic_dist['error']['all'].append(error)

    return dic_dist


def calculate_heights(heights, mode):
    """
    Compute statistics of the heights of each body segment
    """
    assert mode in ('mean', 'std', 'max')
    heights_fin = {}

    head_shoulder = np.array(heights['shoulder']) - np.array(heights['head'])
    shoulder_hip = np.array(heights['hip']) - np.array(heights['shoulder'])
    hip_ankle = np.array(heights['ankle']) - np.array(heights['hip'])

    if mode == 'mean':
        heights_fin['head_shoulder'] = float(np.mean(head_shoulder)) * 100
        heights_fin['shoulder_hip'] = float(np.mean(shoulder_hip)) * 100
        heights_fin['hip_ankle'] = float(np.mean(hip_ankle)) * 100

    elif mode == 'std':
        heights_fin['head_shoulder'] = float(np.std(head_shoulder)) * 100
        heights_fin['shoulder_hip'] = float(np.std(shoulder_hip)) * 100
        heights_fin['hip_ankle'] = float(np.std(hip_ankle)) * 100

    elif mode == 'max':
        heights_fin['head_shoulder'] = float(np.max(head_shoulder)) * 100
        heights_fin['shoulder_hip'] = float(np.max(shoulder_hip)) * 100
        heights_fin['hip_ankle'] = float(np.max(hip_ankle)) * 100

    return heights_fin


def calculate_error(dic_errors):
    """
    Compute the mean error for each distance cluster
    """
    errors = {}
    for clst in dic_errors:
        errors[clst] = float(np.mean(np.array(dic_errors[clst])))

    return errors

src/eval/kitti_eval.py (new file)
@@ -0,0 +1,464 @@

import os
import math
import logging
from collections import defaultdict
import json
import copy
import datetime

class KittiEval:
    """
    Evaluate MonoLoco code on the KITTI dataset and compare it with:
    - Mono3D
    - 3DOP
    - MonoDepth
    """

    def __init__(self, show=False, thresh_iou_our=0.3, thresh_iou_m3d=0.5, thresh_conf_m3d=0.5, thresh_conf_our=0.3):

        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)

        self.show = show
        from utils.misc import get_idx_max
        self.get_idx_max = get_idx_max
        from utils.kitti import check_conditions, get_category
        self.check_conditions = check_conditions
        self.get_category = get_category
        from visuals.results import print_results
        self.print_results = print_results

        self.dir_gt = os.path.join('data', 'baselines', 'gt')
        self.dir_m3d = os.path.join('data', 'baselines', 'm3d')
        self.dir_3dop = os.path.join('data', 'baselines', '3dop')
        self.dir_md = os.path.join('data', 'baselines', 'monodepth')
        self.dir_psm = os.path.join('data', 'baselines', 'psm')
        self.dir_our = os.path.join('data', 'baselines', 'monoloco')
        path_val = os.path.join('data', 'baselines', 'val.txt')
        dir_logs = os.path.join('data', 'logs')
        assert os.path.exists(dir_logs), "No directory to save final statistics"

        now = datetime.datetime.now()
        now_time = now.strftime("%Y%m%d-%H%M")[2:]
        self.path_results = os.path.join(dir_logs, 'eval-' + now_time + '.json')

        assert os.path.exists(self.dir_m3d) and os.path.exists(self.dir_our) \
            and os.path.exists(self.dir_3dop)

        self.clusters = ['easy', 'moderate', 'hard', 'all', '6', '10', '15', '20', '25', '30', '40', '50', '>50']

        self.dic_thresh_iou = {'m3d': thresh_iou_m3d, '3dop': thresh_iou_m3d, 'md': thresh_iou_our,
                               'psm': thresh_iou_our, 'our': thresh_iou_our}
        self.dic_thresh_conf = {'m3d': thresh_conf_m3d, '3dop': thresh_conf_m3d, 'our': thresh_conf_our}

        self.dic_cnt = defaultdict(int)
        self.errors = defaultdict(lambda: defaultdict(list))

        # Only consider validation images
        set_gt = set(os.listdir(self.dir_gt))
        set_val = set()

        with open(path_val, "r") as f_val:
            for line in f_val:
                set_val.add(line[:-1] + '.txt')
        self.list_gt = list(set_gt.intersection(set_val))
        assert self.list_gt, "No images in the folder"

    def run(self):

        self.dic_stds = defaultdict(lambda: defaultdict(list))
        dic_stats = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(float))))

        cnt_gt = 0

        # Iterate over each ground-truth file in the validation set
        for name in self.list_gt:
            path_gt = os.path.join(self.dir_gt, name)
            path_m3d = os.path.join(self.dir_m3d, name)
            path_our = os.path.join(self.dir_our, name)
            path_3dop = os.path.join(self.dir_3dop, name)
            path_md = os.path.join(self.dir_md, name)
            path_psm = os.path.join(self.dir_psm, name)
            boxes_gt = []
            truncs_gt = []  # Float from 0 to 1
            occs_gt = []  # Either 0, 1, 2, 3: fully visible, partly occluded, largely occluded, unknown
            dds_gt = []
            dic_fin = defaultdict(list)

            # Iterate over each line of the gt file and save the box locations and distances
            # (KITTI label columns: [1] truncation, [2] occlusion, [4:8] 2D box, [11:14] 3D location)
            with open(path_gt, "r") as f_gt:
                for line_gt in f_gt:
                    if self.check_conditions(line_gt, mode='gt'):
                        truncs_gt.append(float(line_gt.split()[1]))
                        occs_gt.append(int(line_gt.split()[2]))
                        boxes_gt.append([float(x) for x in line_gt.split()[4:8]])
                        loc_gt = [float(x) for x in line_gt.split()[11:14]]
                        dds_gt.append(math.sqrt(loc_gt[0] ** 2 + loc_gt[1] ** 2 + loc_gt[2] ** 2))
                        cnt_gt += 1

            # Extract the annotations of each method for the same file
            if len(boxes_gt) > 0:
                boxes_m3d, dds_m3d = self.parse_txts(path_m3d, method='m3d')
                boxes_3dop, dds_3dop = self.parse_txts(path_3dop, method='3dop')
                boxes_md, dds_md = self.parse_txts(path_md, method='md')
                boxes_psm, dds_psm = self.parse_txts(path_psm, method='psm')
                boxes_our, dds_our, stds_ale, stds_epi, kk_list, dds_geom, xyzs, xy_kps = \
                    self.parse_txts(path_our, method='our')

                # Compute the error with respect to the ground truth
                self.estimate_error_base(boxes_m3d, dds_m3d, boxes_gt, dds_gt, truncs_gt, occs_gt, method='m3d')
                self.estimate_error_base(boxes_3dop, dds_3dop, boxes_gt, dds_gt, truncs_gt, occs_gt, method='3dop')
                self.estimate_error_base(boxes_md, dds_md, boxes_gt, dds_gt, truncs_gt, occs_gt, method='md')
                self.estimate_error_base(boxes_psm, dds_psm, boxes_gt, dds_gt, truncs_gt, occs_gt, method='psm')
                self.estimate_error_our(boxes_our, dds_our, stds_ale, stds_epi, kk_list, dds_geom, xyzs, xy_kps,
                                        boxes_gt, dds_gt, truncs_gt, occs_gt, dic_fin, name)

                # Iterate over all the files together to find a pool of common annotations
                self.compare_error(boxes_m3d, dds_m3d, boxes_3dop, dds_3dop, boxes_md, dds_md, boxes_our, dds_our,
                                   boxes_psm, dds_psm, boxes_gt, dds_gt, truncs_gt, occs_gt, dds_geom)

        # Save statistics
        for key in self.errors:
            for clst in self.clusters[:-2]:  # M3D and PifPaf do not have annotations above 40 meters
                dic_stats['test'][key]['mean'][clst] = sum(self.errors[key][clst]) / float(len(self.errors[key][clst]))
                dic_stats['test'][key]['max'][clst] = max(self.errors[key][clst])
                dic_stats['test'][key]['cnt'][clst] = len(self.errors[key][clst])

            if key == 'our':
                for clst in self.clusters[:-2]:
                    dic_stats['test'][key]['std_ale'][clst] = \
                        sum(self.dic_stds['ale'][clst]) / float(len(self.dic_stds['ale'][clst]))
                    dic_stats['test'][key]['std_epi'][clst] = \
                        sum(self.dic_stds['epi'][clst]) / float(len(self.dic_stds['epi'][clst]))
                    dic_stats['test'][key]['interval'][clst] = \
                        sum(self.dic_stds['interval'][clst]) / float(len(self.dic_stds['interval'][clst]))
                    dic_stats['test'][key]['at_risk'][clst] = \
                        sum(self.dic_stds['at_risk'][clst]) / float(len(self.dic_stds['at_risk'][clst]))

        # Print statistics
        print(" Number of GT annotations: {} ".format(cnt_gt))
        for key in self.errors:
            if key in ['our', 'm3d', '3dop']:
                print(" Number of {} annotations with confidence >= {} : {} "
                      .format(key, self.dic_thresh_conf[key], self.dic_cnt[key]))

                # Also include the missed annotations in the statistics
                matched = len(self.errors[key]['all'])
                missed = cnt_gt - matched
                zeros = [0] * missed
                self.errors[key]['<0.5m'].extend(zeros)
                self.errors[key]['<1m'].extend(zeros)
                self.errors[key]['<2m'].extend(zeros)

            for clst in self.clusters[:-9]:
                print(" {} Average error in cluster {}: {:.2f} with a max error of {:.1f}, "
                      "for {} annotations"
                      .format(key, clst, dic_stats['test'][key]['mean'][clst], dic_stats['test'][key]['max'][clst],
                              dic_stats['test'][key]['cnt'][clst]))

                if key == 'our':
                    print("% of annotations inside the confidence interval: {:.1f} %, "
                          "of which {:.1f} % at higher risk"
                          .format(100 * dic_stats['test'][key]['interval'][clst],
                                  100 * dic_stats['test'][key]['at_risk'][clst]))

            for perc in ['<0.5m', '<1m', '<2m']:
                print("{} Instances with error {}: {:.2f} %"
                      .format(key, perc, 100 * sum(self.errors[key][perc]) / len(self.errors[key][perc])))

            print("\n Number of matched annotations: {:.1f} %".format(100 * matched / cnt_gt))
            print("-" * 100)

        # Print images
        self.print_results(dic_stats, self.show)

    def parse_txts(self, path, method):
        boxes = []
        dds = []
        stds_ale = []
        stds_epi = []
        dds_geom = []
        xyzs = []
        xy_kps = []

        # Iterate over each line of the txt file
        if method == '3dop' or method == 'm3d':
            try:
                with open(path, "r") as ff:
                    for line in ff:
                        if self.check_conditions(line, thresh=self.dic_thresh_conf[method], mode=method):
                            boxes.append([float(x) for x in line.split()[4:8]])
                            loc = [float(x) for x in line.split()[11:14]]
                            dds.append(math.sqrt(loc[0] ** 2 + loc[1] ** 2 + loc[2] ** 2))
                            self.dic_cnt[method] += 1
                return boxes, dds

            except FileNotFoundError:
                return [], []

        elif method == 'md':
            try:
                with open(path, "r") as ff:
                    for line in ff:
                        box = [float(x[:-1]) for x in line.split()[0:4]]
                        delta_h = (box[3] - box[1]) / 10
                        delta_w = (box[2] - box[0]) / 10
                        assert delta_h > 0 and delta_w > 0, "Bounding box <= 0"
                        box[0] -= delta_w
                        box[1] -= delta_h
                        box[2] += delta_w
                        box[3] += delta_h
                        boxes.append(box)
                        dds.append(float(line.split()[5][:-1]))
                        self.dic_cnt[method] += 1
                return boxes, dds

            except FileNotFoundError:
                return [], []

        elif method == 'psm':
            try:
                with open(path, "r") as ff:
                    for line in ff:
                        box = [float(x[:-1]) for x in line[1:-1].split(',')[0:4]]
                        delta_h = (box[3] - box[1]) / 10
                        delta_w = (box[2] - box[0]) / 10
                        assert delta_h > 0 and delta_w > 0, "Bounding box <= 0"
                        box[0] -= delta_w
                        box[1] -= delta_h
                        box[2] += delta_w
                        box[3] += delta_h
                        boxes.append(box)
                        dds.append(float(line.split()[5][:-1]))
                        self.dic_cnt[method] += 1
                return boxes, dds

            except FileNotFoundError:
                return [], []

        elif method == 'our':
            try:
                with open(path, "r") as ff:
                    file_lines = ff.readlines()
                for line_our in file_lines[:-1]:
                    line_list = [float(x) for x in line_our.split()]
                    if self.check_conditions(line_list, thresh=self.dic_thresh_conf[method], mode=method):
                        boxes.append(line_list[:4])
                        xyzs.append(line_list[4:7])
                        dds.append(line_list[7])
                        stds_ale.append(line_list[8])
                        stds_epi.append(line_list[9])
                        dds_geom.append(line_list[11])
                        xy_kps.append(line_list[12:])

                        self.dic_cnt[method] += 1

                # The last line of each txt file contains the intrinsic matrix
                kk_list = [float(x) for x in file_lines[-1].split()]

                return boxes, dds, stds_ale, stds_epi, kk_list, dds_geom, xyzs, xy_kps

            except FileNotFoundError:
                return [], [], [], [], [], [], [], []

    def estimate_error_base(self, boxes, dds, boxes_gt, dds_gt, truncs_gt, occs_gt, method):

        # Compute the error (distance) and save it
        boxes_gt = copy.deepcopy(boxes_gt)
        dds_gt = copy.deepcopy(dds_gt)
        truncs_gt = copy.deepcopy(truncs_gt)
        occs_gt = copy.deepcopy(occs_gt)

        for idx, box in enumerate(boxes):
            if len(boxes_gt) >= 1:
                dd = dds[idx]
                idx_max, iou_max = self.get_idx_max(box, boxes_gt)
                cat = self.get_category(boxes_gt[idx_max], truncs_gt[idx_max], occs_gt[idx_max])
                # Update the error if a match is found
                if iou_max > self.dic_thresh_iou[method]:
                    dd_gt = dds_gt[idx_max]
                    self.update_errors(dd, dd_gt, cat, self.errors[method])

                    # Remove the matched ground truth so it cannot be matched twice
                    boxes_gt.pop(idx_max)
                    dds_gt.pop(idx_max)
                    truncs_gt.pop(idx_max)
                    occs_gt.pop(idx_max)
            else:
                break

    def estimate_error_our(self, boxes, dds, stds_ale, stds_epi, kk_list, dds_geom, xyzs, xy_kps,
                           boxes_gt, dds_gt, truncs_gt, occs_gt, dic_fin, name):

        # Compute the error (distance) and save it
        boxes_gt = copy.deepcopy(boxes_gt)
        dds_gt = copy.deepcopy(dds_gt)
        truncs_gt = copy.deepcopy(truncs_gt)
        occs_gt = copy.deepcopy(occs_gt)

        for idx, box in enumerate(boxes):
            if len(boxes_gt) >= 1:
                dd = dds[idx]
                dd_geom = dds_geom[idx]
                ale = stds_ale[idx]
                epi = stds_epi[idx]
                xyz = xyzs[idx]
                xy_kp = xy_kps[idx]
                idx_max, iou_max = self.get_idx_max(box, boxes_gt)
                cat = self.get_category(boxes_gt[idx_max], truncs_gt[idx_max], occs_gt[idx_max])

                # Update the error if a match is found
                if iou_max > self.dic_thresh_iou['our']:
                    dd_gt = dds_gt[idx_max]
                    self.update_errors(dd, dd_gt, cat, self.errors['our'])
                    self.update_errors(dd_geom, dd_gt, cat, self.errors['geom'])
                    self.update_uncertainty(ale, epi, dd, dd_gt, cat)

                    boxes_gt.pop(idx_max)
                    dds_gt.pop(idx_max)
                    truncs_gt.pop(idx_max)
                    occs_gt.pop(idx_max)

                    # Save all the quantities of the matched annotation
                    dic_fin['boxes'].append(box)
                    dic_fin['dds_gt'].append(dd_gt)
                    dic_fin['dds_pred'].append(dd)
                    dic_fin['stds_ale'].append(ale)
                    dic_fin['stds_epi'].append(epi)
                    dic_fin['dds_geom'].append(dd_geom)
                    dic_fin['xyz'].append(xyz)
                    dic_fin['xy_kps'].append(xy_kp)
            else:
                break

        # Saving K and the matched annotations to a json file is currently disabled
        # kk_fin = np.array(kk_list).reshape(3, 3).tolist()
        # dic_fin['K'] = kk_fin
        # path_json = os.path.join(self.dir_fin, name[:-4] + '.json')
        # with open(path_json, 'w') as ff:
        #     json.dump(dic_fin, ff)

    def compare_error(self, boxes_m3d, dds_m3d, boxes_3dop, dds_3dop, boxes_md, dds_md, boxes_our, dds_our,
                      boxes_psm, dds_psm, boxes_gt, dds_gt, truncs_gt, occs_gt, dds_geom):

        boxes_gt = copy.deepcopy(boxes_gt)
        dds_gt = copy.deepcopy(dds_gt)
        truncs_gt = copy.deepcopy(truncs_gt)
        occs_gt = copy.deepcopy(occs_gt)

        for idx, box in enumerate(boxes_our):
            if len(boxes_gt) >= 1:
                dd_our = dds_our[idx]
                dd_geom = dds_geom[idx]
                idx_max, iou_max = self.get_idx_max(box, boxes_gt)
                cat = self.get_category(boxes_gt[idx_max], truncs_gt[idx_max], occs_gt[idx_max])

                idx_max_3dop, iou_max_3dop = self.get_idx_max(box, boxes_3dop)
                idx_max_m3d, iou_max_m3d = self.get_idx_max(box, boxes_m3d)
                idx_max_md, iou_max_md = self.get_idx_max(box, boxes_md)
                # idx_max_psm, iou_max_psm = self.get_idx_max(box, boxes_psm)
                iou_max_psm = 1

                iou_min = min(iou_max_3dop, iou_max_m3d, iou_max_md, iou_max_psm)

                if iou_max >= self.dic_thresh_iou['our'] and iou_min >= self.dic_thresh_iou['m3d']:

                    dd_gt = dds_gt[idx_max]
                    dd_3dop = dds_3dop[idx_max_3dop]
                    dd_m3d = dds_m3d[idx_max_m3d]
                    dd_md = dds_md[idx_max_md]
                    # dd_psm = dds_psm[idx_max_psm]

                    self.update_errors(dd_3dop, dd_gt, cat, self.errors['3dop_merged'])
                    self.update_errors(dd_our, dd_gt, cat, self.errors['our_merged'])
                    self.update_errors(dd_m3d, dd_gt, cat, self.errors['m3d_merged'])
                    self.update_errors(dd_geom, dd_gt, cat, self.errors['geom_merged'])
                    self.update_errors(dd_md, dd_gt, cat, self.errors['md_merged'])
                    # self.update_errors(dd_psm, dd_gt, cat, self.errors['psm_merged'])
                    self.dic_cnt['merged'] += 1

                    boxes_gt.pop(idx_max)
                    dds_gt.pop(idx_max)
                    truncs_gt.pop(idx_max)
                    occs_gt.pop(idx_max)
            else:
                break

    def update_errors(self, dd, dd_gt, cat, errors):
        """Compute and save the errors between a single box and the matching gt box"""

        diff = abs(dd - dd_gt)
        clst = self.find_cluster(dd_gt, self.clusters)
        errors['all'].append(diff)
        errors[cat].append(diff)
        errors[clst].append(diff)

        # Check whether the error is below 0.5, 1 or 2 meters
        if diff <= 0.5:
            errors['<0.5m'].append(1)
        else:
            errors['<0.5m'].append(0)

        if diff <= 1:
            errors['<1m'].append(1)
        else:
            errors['<1m'].append(0)

        if diff <= 2:
            errors['<2m'].append(1)
        else:
            errors['<2m'].append(0)

    def update_uncertainty(self, std_ale, std_epi, dd, dd_gt, cat):

        clst = self.find_cluster(dd_gt, self.clusters)
        self.dic_stds['ale']['all'].append(std_ale)
        self.dic_stds['ale'][clst].append(std_ale)
        self.dic_stds['ale'][cat].append(std_ale)
        self.dic_stds['epi']['all'].append(std_epi)
        self.dic_stds['epi'][clst].append(std_epi)
        self.dic_stds['epi'][cat].append(std_epi)

        # Number of annotations inside the confidence interval
        if dd_gt <= dd:  # Particularly dangerous instances: the person is closer than predicted
            self.dic_stds['at_risk']['all'].append(1)
            self.dic_stds['at_risk'][clst].append(1)
            self.dic_stds['at_risk'][cat].append(1)

            if abs(dd - dd_gt) <= std_epi:
                self.dic_stds['interval']['all'].append(1)
                self.dic_stds['interval'][clst].append(1)
                self.dic_stds['interval'][cat].append(1)

            else:
                self.dic_stds['interval']['all'].append(0)
                self.dic_stds['interval'][clst].append(0)
                self.dic_stds['interval'][cat].append(0)

        else:
            self.dic_stds['at_risk']['all'].append(0)
            self.dic_stds['at_risk'][clst].append(0)
            self.dic_stds['at_risk'][cat].append(0)

    @staticmethod
    def find_cluster(dd, clusters):
        """Find the correct distance cluster. The first four entries and the last one are not numeric"""

        for clst in clusters[4:-1]:
            if dd <= int(clst):
                return clst

        return clusters[-1]
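
# Layout of each annotation row in the monoloco txt files parsed above, as we infer
# it from the indices used in parse_txts() and from the writer in run_kitti.py
# below (the final row of each file holds the flattened 3x3 intrinsic matrix K):
#   [0:4]  2D box (x1, y1, x2, y2)    [4:7]   xyz location (meters)
#   [7]    predicted distance d       [8]     aleatoric std
#   [9]    epistemic std              [10]    detection confidence
#   [11]   geometric-baseline d       [12:]   flattened 2D keypoints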

src/eval/run_kitti.py (new file)
@@ -0,0 +1,154 @@

import os
import glob
import json
import logging
import math

import numpy as np
import torch

from models.architectures import LinearModel

class RunKitti:

    def __init__(self, model, dir_ann, dropout, hidden_size, n_stage, n_dropout):

        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)

        # Set the directories
        self.dir_ann = dir_ann
        self.average_y = 0.48
        self.n_dropout = n_dropout
        self.n_samples = 100

        list_ann = glob.glob(os.path.join(dir_ann, '*.json'))
        self.dir_kk = os.path.join('data', 'baselines', 'calib')
        self.dir_out = os.path.join('data', 'baselines', 'monoloco')
        if not os.path.exists(self.dir_out):
            os.makedirs(self.dir_out)
            print("Created output directory for txt files")

        self.list_basename = [os.path.basename(x).split('.')[0] for x in list_ann]
        assert self.list_basename, "Missing json annotation files to create txt files for the KITTI dataset"

        # Load the model
        input_size = 17 * 2
        use_cuda = torch.cuda.is_available()
        self.device = torch.device("cuda:0" if use_cuda else "cpu")
        self.model = LinearModel(input_size=input_size, output_size=2, linear_size=hidden_size,
                                 p_dropout=dropout, num_stage=n_stage)
        self.model.load_state_dict(torch.load(model, map_location=lambda storage, loc: storage))
        self.model.eval()  # Default is train
        self.model.to(self.device)

        # Import utility functions
        from utils.misc import epistemic_variance, laplace_sampling
        self.epistemic_variance = epistemic_variance
        self.laplace_sampling = laplace_sampling
        from visuals.printer import Printer
        self.Printer = Printer
        from utils.kitti import eval_geometric, get_calibration
        self.eval_geometric = eval_geometric
        self.get_calibration = get_calibration

        from utils.normalize import unnormalize_bi
        self.unnormalize_bi = unnormalize_bi

        from utils.pifpaf import get_input_data, preprocess_pif
        self.get_input_data = get_input_data
        self.preprocess_pif = preprocess_pif

        # Counters
        self.cnt_ann = 0
        self.cnt_file = 0
        self.cnt_no_file = 0

    def run(self):

        # Run inference
        for basename in self.list_basename:
            path_calib = os.path.join(self.dir_kk, basename + '.txt')
            kk, tt = self.get_calibration(path_calib)

            path_ann = os.path.join(self.dir_ann, basename + '.png.pifpaf.json')
            with open(path_ann, 'r') as f:
                annotations = json.load(f)

            boxes, keypoints = self.preprocess_pif(annotations)
            (inputs, xy_kps), (uv_kps, uv_boxes, uv_centers, uv_shoulders) = self.get_input_data(boxes, keypoints, kk)

            dds_geom, xy_centers = self.eval_geometric(uv_kps, uv_centers, uv_shoulders, kk, average_y=0.48)

            self.cnt_ann += len(boxes)

            inputs = torch.from_numpy(np.array(inputs)).float().to(self.device)

            if len(inputs) == 0:
                self.cnt_no_file += 1

            else:
                self.cnt_file += 1

                # Epistemic uncertainty through MC dropout: sample the network n_dropout times
                if self.n_dropout > 0:
                    total_outputs = torch.empty((0, len(uv_boxes))).to(self.device)
                    self.model.dropout.training = True
                    for _ in range(self.n_dropout):
                        outputs = self.model(inputs)
                        outputs = self.unnormalize_bi(outputs)
                        samples = self.laplace_sampling(outputs, self.n_samples)
                        total_outputs = torch.cat((total_outputs, samples), 0)
                    varss = total_outputs.std(0)

                else:
                    varss = [0] * len(uv_boxes)

                # Don't use dropout for the mean prediction and the aleatoric uncertainty
                self.model.dropout.training = False
                outputs_net = self.model(inputs)
                outputs = outputs_net.cpu().detach().numpy()

                path_txt = os.path.join(self.dir_out, basename + '.txt')
                with open(path_txt, "w+") as ff:
                    for idx in range(outputs.shape[0]):
                        xx_1 = float(xy_centers[idx][0])
                        yy_1 = float(xy_centers[idx][1])
                        xy_kp = xy_kps[idx]
                        dd = float(outputs[idx][0])
                        std_ale = math.exp(float(outputs[idx][1])) * dd

                        # Recover the 3D coordinates from the predicted distance and the normalized center
                        zz = dd / math.sqrt(1 + xx_1 ** 2 + yy_1 ** 2)
                        xx_cam_0 = xx_1 * zz + tt[0]  # Still to verify the sign, but negligible
                        yy_cam_0 = yy_1 * zz + tt[1]
                        zz_cam_0 = zz + tt[2]
                        dd_cam_0 = math.sqrt(xx_cam_0 ** 2 + yy_cam_0 ** 2 + zz_cam_0 ** 2)

                        uv_box = uv_boxes[idx]

                        twodecimals = ["%.3f" % vv for vv in [uv_box[0], uv_box[1], uv_box[2], uv_box[3],
                                                              xx_cam_0, yy_cam_0, zz_cam_0, dd_cam_0,
                                                              std_ale, varss[idx], uv_box[4], dds_geom[idx]]]

                        keypoints_str = ["%.5f" % vv for vv in xy_kp]
                        for item in twodecimals:
                            ff.write("%s " % item)
                        for item in keypoints_str:
                            ff.write("%s " % item)
                        ff.write("\n")

                    # Save the intrinsic matrix in the last row
                    kk_list = kk.reshape(-1,).tolist()
                    for kk_el in kk_list:
                        ff.write("%f " % kk_el)
                    ff.write("\n")

        # Print statistics
        print("Saved {} txt files with {} annotations. Not found {} images"
              .format(self.cnt_file, self.cnt_ann, self.cnt_no_file))

src/features/preprocess_ki.py (new file)
@@ -0,0 +1,151 @@

import os
import glob
import math
import logging
from collections import defaultdict
import json
import datetime

class PreprocessKitti:
    """Prepare arrays with the same format as the nuScenes preprocessing, but using ground-truth txt files"""

    def __init__(self, dir_ann, iou_thresh=0.3):

        self.dir_ann = dir_ann
        self.iou_thresh = iou_thresh
        self.dir_gt = os.path.join('data', 'baselines', 'gt')
        self.names_gt = os.listdir(self.dir_gt)
        self.dir_kk = os.path.join('data', 'baselines', 'calib')
        self.list_gt = glob.glob(self.dir_gt + '/*.txt')
        assert os.path.exists(self.dir_gt), "Ground truth dir does not exist"
        assert os.path.exists(self.dir_ann), "Annotation dir does not exist"

        now = datetime.datetime.now()
        now_time = now.strftime("%Y%m%d-%H%M")[2:]
        dir_out = os.path.join('data', 'arrays')
        self.path_joints = os.path.join(dir_out, 'joints-kitti-' + now_time + '.json')
        self.path_names = os.path.join(dir_out, 'names-kitti-' + now_time + '.json')
        path_train = os.path.join('data', 'baselines', 'train.txt')
        path_val = os.path.join('data', 'baselines', 'val.txt')

        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)

        from utils.kitti import get_calibration, check_conditions
        self.get_calibration = get_calibration
        self.check_conditions = check_conditions

        from utils.pifpaf import get_input_data, preprocess_pif
        self.get_input_data = get_input_data
        self.preprocess_pif = preprocess_pif

        from utils.misc import get_idx_max, append_cluster
        self.get_idx_max = get_idx_max
        self.append_cluster = append_cluster

        # self.clusters = ['all', '6', '10', '15', '20', '25', '30', '40', '50', '>50']
        self.cnt_gt = 0
        self.cnt_fnf = 0
        self.dic_cnt = {'train': 0, 'val': 0, 'test': 0}

        # Split training and validation images
        set_gt = set(self.names_gt)
        set_train = set()
        set_val = set()

        with open(path_train, "r") as f_train:
            for line in f_train:
                set_train.add(line[:-1] + '.txt')
        with open(path_val, "r") as f_val:
            for line in f_val:
                set_val.add(line[:-1] + '.txt')

        self.set_train = set_gt.intersection(set_train)
        self.set_val = set_gt.intersection(set_val)
        assert self.set_train and self.set_val, "No validation or training annotations"

        self.dic_jo = {'train': dict(X=[], Y=[], names=[], kps=[],
                                     clst=defaultdict(lambda: defaultdict(list))),
                       'val': dict(X=[], Y=[], names=[], kps=[],
                                   clst=defaultdict(lambda: defaultdict(list))),
                       'test': dict(X=[], Y=[], names=[], kps=[],
                                    clst=defaultdict(lambda: defaultdict(list)))}

        self.dic_names = defaultdict(lambda: defaultdict(list))

    def run(self):

        for name in self.names_gt:
            # Extract the ground truth
            path_gt = os.path.join(self.dir_gt, name)
            basename, _ = os.path.splitext(name)
            boxes_gt = []
            dds = []

            if name in self.set_train:
                phase = 'train'
            elif name in self.set_val:
                phase = 'val'
            else:
                self.cnt_fnf += 1
                continue

            # Iterate over each line of the gt file and save the box locations and distances
            with open(path_gt, "r") as f_gt:
                for line_gt in f_gt:
                    if self.check_conditions(line_gt, mode='gt'):
                        box = [float(x) for x in line_gt.split()[4:8]]
                        boxes_gt.append(box)
                        loc_gt = [float(x) for x in line_gt.split()[11:14]]
                        dd = math.sqrt(loc_gt[0] ** 2 + loc_gt[1] ** 2 + loc_gt[2] ** 2)
                        dds.append(dd)
                        self.dic_names[basename + '.png']['boxes'].append(box)
                        self.dic_names[basename + '.png']['dds'].append(dd)
                        self.cnt_gt += 1

            # Extract the calibration
            path_txt = os.path.join(self.dir_kk, basename + '.txt')
            kk, tt = self.get_calibration(path_txt)

            # Find the annotations, if they exist
            try:
                with open(os.path.join(self.dir_ann, basename + '.png.pifpaf.json'), 'r') as f:
                    annotations = json.load(f)
                boxes, keypoints = self.preprocess_pif(annotations)
                (inputs, xy_kps), (uv_kps, uv_boxes, _, _) = self.get_input_data(boxes, keypoints, kk)

            except FileNotFoundError:
                uv_boxes = []

            # Match each set of keypoints with a ground truth
            for ii, box in enumerate(uv_boxes):
                idx_max, iou_max = self.get_idx_max(box, boxes_gt)

                if iou_max >= self.iou_thresh:
                    self.dic_jo[phase]['kps'].append(uv_kps[ii])
                    self.dic_jo[phase]['X'].append(inputs[ii])
                    self.dic_jo[phase]['Y'].append([dds[idx_max]])  # Trick to make it (nn, 1)
                    self.dic_jo[phase]['names'].append(name)  # One image name for each annotation
                    self.append_cluster(self.dic_jo, phase, inputs[ii], dds[idx_max], uv_kps[ii])
                    self.dic_cnt[phase] += 1
                    boxes_gt.pop(idx_max)
                    dds.pop(idx_max)

        with open(self.path_joints, 'w') as f:
            json.dump(self.dic_jo, f)
        with open(os.path.join(self.path_names), 'w') as f:
            json.dump(self.dic_names, f)
        for phase in ['train', 'val', 'test']:
            print("Saved {} annotations for phase {}"
                  .format(self.dic_cnt[phase], phase))
        print("Number of GT annotations: {}. Files not found: {}"
              .format(self.cnt_gt, self.cnt_fnf))
        print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints))
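
# Structure of the output joints file, inferred from the dictionaries above (a
# sketch, not a formal schema): for each phase 'train'/'val'/'test',
#   {"X": [input joints], "Y": [[dd], ...], "names": [image names],
#    "kps": [uv keypoints], "clst": {per-distance-cluster lists}}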

src/features/preprocess_nu.py (new file)
@@ -0,0 +1,186 @@
import numpy as np
import os
import sys
import time
import json
import logging
from collections import defaultdict
import datetime

class PreprocessNuscenes:
    """
    Preprocess the nuScenes dataset
    """
    def __init__(self, dir_ann, dir_nuscenes, dataset, iou_min=0.3):

        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)

        self.dir_ann = dir_ann
        dir_out = os.path.join('data', 'arrays')
        assert os.path.exists(dir_nuscenes), "nuScenes directory does not exist"
        assert os.path.exists(self.dir_ann), "The annotations directory does not exist"
        assert os.path.exists(dir_out), "Joints directory does not exist"

        now = datetime.datetime.now()
        now_time = now.strftime("%Y%m%d-%H%M")[2:]
        self.path_joints = os.path.join(dir_out, 'joints-' + dataset + '-' + now_time + '.json')
        self.path_names = os.path.join(dir_out, 'names-' + dataset + '-' + now_time + '.json')

        self.iou_min = iou_min

        # Import functions
        from utils.misc import get_idx_max, append_cluster
        self.get_idx_max = get_idx_max
        self.append_cluster = append_cluster
        from utils.nuscenes import select_categories
        self.select_categories = select_categories
        from utils.camera import project_3d
        self.project_3d = project_3d
        from utils.pifpaf import get_input_data, preprocess_pif
        self.get_input_data = get_input_data
        self.preprocess_pif = preprocess_pif
        from nuscenes.nuscenes import NuScenes
        from nuscenes.utils import splits
        self.splits = splits

        # Initialize dicts to save the joints for training
        self.dic_jo = {'train': dict(X=[], Y=[], names=[], kps=[], boxes_3d=[], K=[],
                                     clst=defaultdict(lambda: defaultdict(list))),
                       'val': dict(X=[], Y=[], names=[], kps=[], boxes_3d=[], K=[],
                                   clst=defaultdict(lambda: defaultdict(list))),
                       'test': dict(X=[], Y=[], names=[], kps=[], boxes_3d=[], K=[],
                                    clst=defaultdict(lambda: defaultdict(list)))
                       }
        # Image names as keys, to retrieve the ground truth easily
        self.dic_names = defaultdict(lambda: defaultdict(list))

        self.cameras = ['CAM_FRONT', 'CAM_FRONT_LEFT', 'CAM_FRONT_RIGHT', 'CAM_BACK', 'CAM_BACK_LEFT', 'CAM_BACK_RIGHT']

        # Split training and validation based on the dataset type
        if dataset == 'nuscenes':
            self.nusc = NuScenes(version='v1.0-trainval', dataroot=dir_nuscenes, verbose=True)
            self.scenes = self.nusc.scene
            split_scenes = self.splits.create_splits_scenes()
            self.split_train, self.split_val = split_scenes['train'], split_scenes['val']

        elif dataset == 'nuscenes_mini':
            self.nusc = NuScenes(version='v1.0-mini', dataroot=dir_nuscenes, verbose=True)
            self.scenes = self.nusc.scene
            split_scenes = self.splits.create_splits_scenes()
            self.split_train, self.split_val = split_scenes['train'], split_scenes['val']

        elif dataset == 'nuscenes_teaser':
            self.nusc = NuScenes(version='v1.0-trainval', dataroot=dir_nuscenes, verbose=True)
            with open("data/baselines/teaser_scenes.txt", "r") as ff:
                teaser_scenes = ff.read().splitlines()
            self.scenes = self.nusc.scene
            self.scenes = [scene for scene in self.scenes if scene['token'] in teaser_scenes]
            with open("data/baselines/split_teaser_scenes.json", "r") as ff:
                dic_split = json.load(ff)
            self.split_train = [scene['name'] for scene in self.scenes if scene['token'] in dic_split['train']]
            self.split_val = [scene['name'] for scene in self.scenes if scene['token'] in dic_split['val']]

    def run(self):
        """
        Prepare the arrays for training
        """
        cnt_scenes = 0
        cnt_samples = 0
        cnt_sd = 0
        cnt_ann = 0
        start = time.time()

        for ii, scene in enumerate(self.scenes):
            end_scene = time.time()
            current_token = scene['first_sample_token']
            cnt_scenes += 1
            if ii == 0:
                time_left = "Nan"
            else:
                time_left = str((end_scene - start_scene) / 60 * (len(self.scenes) - ii))[:4]

            sys.stdout.write('\r' + 'Elaborating scene {}, remaining time {} minutes'
                             .format(cnt_scenes, time_left) + '\t\n')
            start_scene = time.time()
            if scene['name'] in self.split_train:
                phase = 'train'
            elif scene['name'] in self.split_val:
                phase = 'val'
            else:
                print("phase name not in the training or validation split")
                continue

            # The samples of a scene form a linked list traversed through the 'next' token
            while not current_token == "":
                sample_dic = self.nusc.get('sample', current_token)
                cnt_samples += 1

                # Extract all the sample_data tokens for each sample
                for cam in self.cameras:
                    sd_token = sample_dic['data'][cam]
                    cnt_sd += 1
                    path_im, boxes_obj, kk = self.nusc.get_sample_data(sd_token, box_vis_level=1)  # At least one corner visible

                    # Extract all the annotations of the people
                    boxes_gt = []
                    dds = []
                    boxes_3d = []
                    name = os.path.basename(path_im)
                    for box_obj in boxes_obj:
                        if box_obj.name[:6] != 'animal':
                            general_name = box_obj.name.split('.')[0] + '.' + box_obj.name.split('.')[1]
                        else:
                            general_name = 'animal'
                        if general_name in self.select_categories('all'):
                            box = self.project_3d(box_obj, kk)
                            dd = np.linalg.norm(box_obj.center)
                            boxes_gt.append(box)
                            dds.append(dd)
                            box_3d = box_obj.center.tolist() + box_obj.wlh.tolist()
                            boxes_3d.append(box_3d)
                            self.dic_names[name]['boxes'].append(box)
                            self.dic_names[name]['dds'].append(dd)

                    # Run IoU with the PifPaf detections and save
                    path_pif = os.path.join(self.dir_ann, name + '.pifpaf.json')
                    exists = os.path.isfile(path_pif)

                    if exists:
                        with open(path_pif, 'r') as f:
                            annotations = json.load(f)

                        boxes, keypoints = self.preprocess_pif(annotations, im_size=None)
                        (inputs, xy_kps), (uv_kps, uv_boxes, _, _) = self.get_input_data(boxes, keypoints, kk)

                        for ii, box in enumerate(uv_boxes):
                            idx_max, iou_max = self.get_idx_max(box, boxes_gt)

                            if iou_max > self.iou_min:
                                self.dic_jo[phase]['kps'].append(uv_kps[ii])
                                self.dic_jo[phase]['X'].append(inputs[ii])
                                self.dic_jo[phase]['Y'].append([dds[idx_max]])  # Trick to make it (nn, 1)
                                self.dic_jo[phase]['names'].append(name)  # One image name for each annotation
                                self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_max])
                                self.dic_jo[phase]['K'].append(kk.tolist())
                                self.append_cluster(self.dic_jo, phase, inputs[ii], dds[idx_max], uv_kps[ii])
                                boxes_gt.pop(idx_max)
                                dds.pop(idx_max)
                                boxes_3d.pop(idx_max)
                                cnt_ann += 1
                                sys.stdout.write('\r' + 'Saved annotations {}'.format(cnt_ann) + '\t')

                current_token = sample_dic['next']

        with open(os.path.join(self.path_joints), 'w') as f:
            json.dump(self.dic_jo, f)
        with open(os.path.join(self.path_names), 'w') as f:
            json.dump(self.dic_names, f)
        end = time.time()

        print("\nSaved {} annotations for {} samples in {} scenes. Total time: {:.1f} minutes"
              .format(cnt_ann, cnt_samples, cnt_scenes, (end - start) / 60))
        print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints))

src/features/resize_images.py (new file)
@@ -0,0 +1,37 @@

import glob
import os
import sys

import cv2


def resize(input_glob, output_dir, factor=2):
    """
    Resize images by a multiplicative factor
    """
    list_im = glob.glob(input_glob)

    for idx, path_in in enumerate(list_im):

        basename, _ = os.path.splitext(os.path.basename(path_in))
        im = cv2.imread(path_in)
        assert im is not None, "Image not found"

        # Compute the new size from the multiplicative factor
        h_im = im.shape[0]
        w_im = im.shape[1]
        w_new = round(factor * w_im)
        h_new = round(factor * h_im)

        print("resizing image {} to: {} x {}".format(basename, w_new, h_new))
        im_new = cv2.resize(im, (w_new, h_new))

        # Save the image
        name_im = basename + '.png'
        path_out = os.path.join(output_dir, name_im)
        cv2.imwrite(path_out, im_new)
        sys.stdout.write('\r' + 'Saving image number: {}'.format(idx) + '\t')
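
# Example usage (hypothetical paths, shown for illustration only):
#     resize('data/images/*.png', output_dir='data/images_x2', factor=2)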

src/main.py (new file)
@@ -0,0 +1,166 @@

import argparse
import os
import sys

sys.path.insert(0, os.path.join('.', 'features'))
sys.path.insert(0, os.path.join('.', 'models'))
from openpifpaf.network import nets
from openpifpaf import decoder
from features.preprocess_nu import PreprocessNuscenes
from features.preprocess_ki import PreprocessKitti
from predict.predict_2d_3d import predict
from features.trial import trials
from models.trainer import Trainer
from eval.run_kitti import RunKitti
from eval.geom_baseline import GeomBaseline
from models.hyp_tuning import HypTuning
from eval.kitti_eval import KittiEval

def cli():

    parser = argparse.ArgumentParser(
        description=__doc__, formatter_class=argparse.ArgumentDefaultsHelpFormatter)

    # Subparser definition
    subparsers = parser.add_subparsers(help='Different parsers for the main actions', dest='command')
    predict_parser = subparsers.add_parser("predict")
    prep_parser = subparsers.add_parser("prep")
    training_parser = subparsers.add_parser("train")
    eval_parser = subparsers.add_parser("eval")

    # Preprocess input data
    prep_parser.add_argument('--dataset',
                             help='datasets to preprocess: nuscenes, nuscenes_teaser, nuscenes_mini, kitti',
                             default='nuscenes')
    prep_parser.add_argument('--dir_ann', help='directory of annotations of 2d joints',
                             default='/data/lorenzo-data/nuscenes_new/annotations/scale_1_april_new')
    prep_parser.add_argument('--dir_nuscenes', help='directory of the nuscenes devkit',
                             default='/data/lorenzo-data/nuscenes_new/')

    # Predict (2D pose and/or 3D location from images)
    # 0) General arguments
    predict_parser.add_argument('--networks', nargs='+', help='Run pifpaf and/or monoloco', default=['monoloco'])
    predict_parser.add_argument('images', nargs='*', help='input images')
    predict_parser.add_argument('--glob', help='glob expression for input images (for many images)')
    predict_parser.add_argument('-o', '--output-directory', help='Output directory')
    predict_parser.add_argument('--output_types', nargs='+', default=['json'],
                                help='what to output: json, keypoints, skeleton for Pifpaf; '
                                     'json, bird, front, combined for Monoloco')
    predict_parser.add_argument('--show', help='to show images', action='store_true')

    # 1) Pifpaf arguments
    nets.cli(predict_parser)
    decoder.cli(predict_parser, force_complete_pose=True, instance_threshold=0.1)
    predict_parser.add_argument('--model_pifpaf', help='pifpaf model to load',
                                default="data/models/resnet152-190412.pkl")
    predict_parser.add_argument('--scale', default=1.0, type=float, help='change the scale of the image to preprocess')

    # 2) Monoloco arguments
    predict_parser.add_argument('--model', help='path of the MonoLoco model to load',
                                default="data/models/best_model__seed_2_.pickle")
    predict_parser.add_argument('--path_gt', help='path of the json file with gt 3d localization',
                                default='data/arrays/names-kitti-000.json')
    predict_parser.add_argument('--dir_calib', default='data/baselines/calib/', help='directory of calib files')
    predict_parser.add_argument('--transform', help='transformation for the pose', default='None')
    predict_parser.add_argument('--draw_kps', help='to draw kps in the images', action='store_true')
    predict_parser.add_argument('--predict', help='whether to make prediction', action='store_true')
    predict_parser.add_argument('--no_calib', help='use a standard intrinsic matrix', action='store_true')
    predict_parser.add_argument('--z_max', type=int, help='', default=22)
    predict_parser.add_argument('--y_scale', type=float, help='', default=1)
    predict_parser.add_argument('--n_dropout', type=int, help='Epistemic uncertainty evaluation', default=0)
    predict_parser.add_argument('--combined', help='to print combined images', action='store_true')

    # Training
    training_parser.add_argument('--joints', help='Json file with input joints',
                                 default='data/arrays/joints-nuscenes-190507-0852.json')
    training_parser.add_argument('--save', help='whether to save the model and the log file', action='store_false')
    training_parser.add_argument('--hyp', help='run hyperparameter tuning', action='store_true')
    training_parser.add_argument('-e', '--epochs', type=int, help='number of epochs to train for', default=150)
    training_parser.add_argument('--bs', type=int, default=256, help='input batch size')
    training_parser.add_argument('--baseline', help='whether to train using the baseline', action='store_true')
    training_parser.add_argument('--dropout', type=float, help='dropout probability', default=0.2)
|
||||
training_parser.add_argument('--lr', type=float, help='learning rate', default=0.002)
|
||||
training_parser.add_argument('--sched_step', type=float, help='scheduler step time (epochs)', default=20)
|
||||
training_parser.add_argument('--sched_gamma', type=float, help='Scheduler multiplication every step', default=0.9)
|
||||
training_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256)
|
||||
training_parser.add_argument('--n_stage', type=int, help='Number of stages in the model', default=3)
|
||||
training_parser.add_argument('--multiplier', type=int, help='Size of the grid of hyp search', default=1)
|
||||
training_parser.add_argument('--r_seed', type=int, help='specify the seed for training and hyp tuning', default=1)
|
||||
|
||||
# Evaluation
|
||||
eval_parser.add_argument('--dataset', help='datasets to evaluate, kitti or nuscenes', default='kitti')
|
||||
eval_parser.add_argument('--geometric', help='to evaluate geometric distance', action='store_true')
|
||||
eval_parser.add_argument('--dir_ann', help='directory of annotations of 2d joints',
|
||||
default='/data/lorenzo-data/kitti/annotations/pif_factor_2_april')
|
||||
eval_parser.add_argument('--model', help='path of MonoLoco model to load',
|
||||
default="data/models/monoloco-190507-0943.pkl")
|
||||
eval_parser.add_argument('--joints', help='Json file with input joints to evaluate (for nuscenes)',
|
||||
default='data/arrays/joints-nuscenes-190508-1102.json')
|
||||
eval_parser.add_argument('--n_dropout', type=int, help='Epistemic uncertainty evaluation', default=0)
|
||||
eval_parser.add_argument('--run_kitti', help='create txt files for validation of kitti', action='store_true')
|
||||
eval_parser.add_argument('--dropout', type=float, help='dropout. Default no dropout', default=0.2)
|
||||
eval_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256)
|
||||
eval_parser.add_argument('--n_stage', type=int, help='Number of stages in the model', default=3)
|
||||
eval_parser.add_argument('--show', help='whether to show eval statistics', action='store_true')
|
||||
eval_parser.add_argument('--trial', help='just to test', action='store_true')
|
||||
|
||||
args = parser.parse_args()
|
||||
return args
|
||||
|
||||
|
||||
def main():
|
||||
args = cli()
|
||||
|
||||
if args.command == 'predict':
|
||||
_ = predict(args)
|
||||
|
||||
elif args.command == 'prep':
|
||||
if 'nuscenes' in args.dataset:
|
||||
prep = PreprocessNuscenes(args.dir_ann, args.dir_nuscenes, args.dataset)
|
||||
prep.run()
|
||||
if 'kitti' in args.dataset:
|
||||
prep = PreprocessKitti(args.dir_ann)
|
||||
prep.run()
|
||||
|
||||
elif args.command == 'train':
|
||||
if args.hyp:
|
||||
hyp_tuning = HypTuning(joints=args.joints, epochs=args.epochs,
|
||||
baseline=args.baseline, dropout=args.dropout,
|
||||
multiplier=args.multiplier, r_seed=args.r_seed)
|
||||
hyp_tuning.train()
|
||||
else:
|
||||
training = Trainer(joints=args.joints, epochs=args.epochs, bs=args.bs,
|
||||
baseline=args.baseline, dropout=args.dropout, lr=args.lr, sched_step=args.sched_step,
|
||||
n_stage=args.n_stage, sched_gamma=args.sched_gamma, hidden_size=args.hidden_size,
|
||||
r_seed=args.r_seed, save=args.save)
|
||||
|
||||
_ = training.train()
|
||||
_ = training.evaluate()
|
||||
|
||||
elif args.command == 'eval':
|
||||
if args.geometric:
|
||||
geometric_baseline = GeomBaseline(args.joints)
|
||||
geometric_baseline.run()
|
||||
|
||||
if args.run_kitti:
|
||||
run_kitti = RunKitti(model=args.model, dir_ann=args.dir_ann,
|
||||
dropout=args.dropout, hidden_size=args.hidden_size, n_stage=args.n_stage,
|
||||
n_dropout=args.n_dropout)
|
||||
run_kitti.run()
|
||||
|
||||
if args.dataset == 'kitti':
|
||||
kitti_eval = KittiEval(show=args.show)
|
||||
kitti_eval.run()
|
||||
|
||||
if 'nuscenes' in args.dataset:
|
||||
training = Trainer(joints=args.joints)
|
||||
_ = training.evaluate(load=True, model=args.model, debug=False)
|
||||
|
||||
else:
|
||||
raise ValueError("Main subparser not recognized or not provided")
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
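# Editor's sketch (not part of the original commit): typical invocations of the
# subparsers defined above; file and directory names are illustrative only.
#   python3 src/main.py prep --dataset kitti --dir_ann <annotation dir>
#   python3 src/main.py train --joints data/arrays/joints-nuscenes-190507-0852.json --epochs 150
#   python3 src/main.py predict --networks pifpaf monoloco --glob 'docs/*.png' --output_types combined
#   python3 src/main.py eval --dataset kitti --geometric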
120
src/models/architectures.py
Normal file
@ -0,0 +1,120 @@
import numpy as np
import torch
import torch.nn as nn


class TriLinear(nn.Module):
    """
    As Bilinear but without skip connection
    """
    def __init__(self, input_size, output_size, p_dropout, linear_size=1024):
        super(TriLinear, self).__init__()

        self.input_size = input_size
        self.output_size = output_size
        self.l_size = linear_size

        self.relu = nn.ReLU(inplace=True)
        self.dropout = nn.Dropout(p_dropout)

        self.w1 = nn.Linear(self.input_size, self.l_size)
        self.batch_norm1 = nn.BatchNorm1d(self.l_size)

        self.w2 = nn.Linear(self.l_size, self.l_size)
        self.batch_norm2 = nn.BatchNorm1d(self.l_size)

        self.w3 = nn.Linear(self.l_size, self.output_size)

    def forward(self, x):
        y = self.w1(x)
        y = self.batch_norm1(y)
        y = self.relu(y)
        y = self.dropout(y)

        y = self.w2(y)
        y = self.batch_norm2(y)
        y = self.relu(y)
        y = self.dropout(y)

        y = self.w3(y)

        return y


def weight_init(m):
    """Initialize weights using Kaiming initialization"""
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight)


class Linear(nn.Module):
    def __init__(self, linear_size, p_dropout=0.5):
        super(Linear, self).__init__()
        self.l_size = linear_size

        self.relu = nn.ReLU(inplace=True)
        self.dropout = nn.Dropout(p_dropout)

        self.w1 = nn.Linear(self.l_size, self.l_size)
        self.batch_norm1 = nn.BatchNorm1d(self.l_size)

        self.w2 = nn.Linear(self.l_size, self.l_size)
        self.batch_norm2 = nn.BatchNorm1d(self.l_size)

    def forward(self, x):
        y = self.w1(x)
        y = self.batch_norm1(y)
        y = self.relu(y)
        y = self.dropout(y)

        y = self.w2(y)
        y = self.batch_norm2(y)
        y = self.relu(y)
        y = self.dropout(y)

        out = x + y

        return out


class LinearModel(nn.Module):

    """Class from Simple yet effective baseline"""
    def __init__(self, input_size, output_size, linear_size=256, p_dropout=0.2, num_stage=3):
        super(LinearModel, self).__init__()

        self.input_size = input_size
        self.output_size = output_size
        self.linear_size = linear_size
        self.p_dropout = p_dropout
        self.num_stage = num_stage

        # process input to linear size
        self.w1 = nn.Linear(self.input_size, self.linear_size)
        self.batch_norm1 = nn.BatchNorm1d(self.linear_size)

        self.linear_stages = []
        for _ in range(num_stage):
            self.linear_stages.append(Linear(self.linear_size, self.p_dropout))
        self.linear_stages = nn.ModuleList(self.linear_stages)

        # post processing
        self.w2 = nn.Linear(self.linear_size, self.output_size)

        self.relu = nn.ReLU(inplace=True)
        self.dropout = nn.Dropout(self.p_dropout)

    def forward(self, x):
        # pre-processing
        y = self.w1(x)
        y = self.batch_norm1(y)
        y = self.relu(y)
        y = self.dropout(y)

        # linear layers
        for i in range(self.num_stage):
            y = self.linear_stages[i](y)

        y = self.w2(y)

        return y
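# Editor's sketch (not part of the original commit): a forward pass of LinearModel on
# random inputs, with the sizes used elsewhere in the repo (17 joints -> 34 inputs,
# 2 outputs: predicted distance and spread).
if __name__ == '__main__':
    model = LinearModel(input_size=34, output_size=2)
    model.eval()  # BatchNorm in inference mode
    with torch.no_grad():
        out = model(torch.randn(8, 34))
    print(out.shape)  # torch.Size([8, 2])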
66
src/models/datasets.py
Normal file
@ -0,0 +1,66 @@
import json
import numpy as np
import torch
import os
from collections import defaultdict

from torch.utils.data import Dataset


class NuScenesDataset(Dataset):
    """
    Get mask joints or ground truth joints and transform into tensors
    """

    def __init__(self, joints, phase):
        """
        Load inputs and outputs from the pickles files from gt joints, mask joints or both
        """
        assert phase in ['train', 'val', 'test']

        with open(joints, 'r') as f:
            dic_jo = json.load(f)

        # Define input and output for normal training and inference
        self.inputs = np.array(dic_jo[phase]['X'])
        self.outputs = np.array(dic_jo[phase]['Y']).reshape(-1, 1)
        self.names = dic_jo[phase]['names']
        self.kps = np.array(dic_jo[phase]['kps'])

        # Extract annotations divided in clusters
        self.dic_clst = dic_jo[phase]['clst']

    def __len__(self):
        """
        :return: number of samples (m)
        """
        return self.inputs.shape[0]

    def __getitem__(self, idx):
        """
        Reading the tensors when required. E.g. retrieving one element or one batch at a time
        :param idx: corresponding to m
        """
        inputs = torch.from_numpy(self.inputs[idx, :]).float()
        outputs = torch.from_numpy(np.array(self.outputs[idx])).float()
        names = self.names[idx]
        kps = self.kps[idx, :]

        return inputs, outputs, names, kps

    def get_cluster_annotations(self, clst):
        """Return normalized annotations corresponding to a certain cluster
        """
        inputs = torch.from_numpy(np.array(self.dic_clst[clst]['X'])).float()
        outputs = torch.from_numpy(np.array(self.dic_clst[clst]['Y'])).float()
        count = len(self.dic_clst[clst]['Y'])

        return inputs, outputs, count
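# Editor's sketch (not part of the original commit): wrapping the dataset in a
# DataLoader, as the trainer does; the json path is illustrative only.
if __name__ == '__main__':
    from torch.utils.data import DataLoader
    dataset = NuScenesDataset('data/arrays/joints-nuscenes-190507-0852.json', phase='train')
    loader = DataLoader(dataset, batch_size=256, shuffle=True)
    inputs, outputs, names, kps = next(iter(loader))
    print(inputs.shape, outputs.shape)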
130
src/models/hyp_tuning.py
Normal file
@ -0,0 +1,130 @@
import math
import os
import json
import time
import logging
import torch
import random
import datetime
import numpy as np
from models.trainer import Trainer


class HypTuning:

    def __init__(self, joints, epochs, baseline, dropout, multiplier=1, r_seed=1):
        """
        Initialize directories, load the data and parameters for the training
        """

        # Initialize directories
        self.joints = joints
        self.baseline = baseline
        self.dropout = dropout
        self.num_epochs = epochs
        self.r_seed = r_seed
        dir_out = os.path.join('data', 'models')
        dir_logs = os.path.join('data', 'logs')
        assert os.path.exists(dir_out), "Output directory not found"
        if not os.path.exists(dir_logs):
            os.makedirs(dir_logs)

        now = datetime.datetime.now()
        now_time = now.strftime("%Y%m%d-%H%M")[2:]

        if baseline:
            name_out = 'hyp-baseline-' + now_time
        else:
            name_out = 'hyp-monoloco-' + now_time

        # name_out already contains the timestamp
        self.path_log = os.path.join(dir_logs, name_out)
        self.path_model = os.path.join(dir_out, name_out + '.pkl')

        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)

        # Initialize grid of parameters
        random.seed(r_seed)
        np.random.seed(r_seed)
        self.sched_gamma_list = [0.8, 0.9, 1, 0.8, 0.9, 1] * multiplier
        random.shuffle(self.sched_gamma_list)
        self.sched_step = [10, 20, 30, 40, 50, 60] * multiplier
        random.shuffle(self.sched_step)
        self.bs_list = [64, 128, 256, 512, 1024, 2048] * multiplier
        random.shuffle(self.bs_list)
        self.hidden_list = [128, 256, 512, 128, 256, 512] * multiplier
        random.shuffle(self.hidden_list)
        self.n_stage_list = [3, 3, 3, 3, 3, 3] * multiplier
        random.shuffle(self.n_stage_list)
        # Learning rate, sampled log-uniformly in [0.001, 0.03]
        aa = math.log(0.001, 10)
        bb = math.log(0.03, 10)
        log_lr_list = np.random.uniform(aa, bb, int(6 * multiplier)).tolist()
        self.lr_list = [10 ** xx for xx in log_lr_list]
        # plt.hist(self.lr_list, bins=50)
        # plt.show()

    def train(self):
        """Train multiple times using log-space random search"""

        best_acc_val = 20  # initialize with a high average error (meters)
        dic_best = {}
        dic_err_best = {}
        start = time.time()
        cnt = 0
        for idx, lr in enumerate(self.lr_list):
            bs = self.bs_list[idx]
            sched_gamma = self.sched_gamma_list[idx]
            sched_step = self.sched_step[idx]
            hidden_size = self.hidden_list[idx]
            n_stage = self.n_stage_list[idx]

            training = Trainer(joints=self.joints, epochs=self.num_epochs,
                               bs=bs, baseline=self.baseline, dropout=self.dropout, lr=lr, sched_step=sched_step,
                               sched_gamma=sched_gamma, hidden_size=hidden_size, n_stage=n_stage,
                               save=False, print_loss=False, r_seed=self.r_seed)

            best_epoch = training.train()
            dic_err, model = training.evaluate()
            acc_val = dic_err['val']['all']['mean']
            cnt += 1
            print("Combination number: {}".format(cnt))

            if acc_val < best_acc_val:
                dic_best['lr'] = lr
                dic_best['joints'] = self.joints
                dic_best['bs'] = bs
                dic_best['baseline'] = self.baseline
                dic_best['sched_gamma'] = sched_gamma
                dic_best['sched_step'] = sched_step
                dic_best['hidden_size'] = hidden_size
                dic_best['n_stage'] = n_stage
                dic_best['acc_val'] = dic_err['val']['all']['mean']
                dic_best['best_epoch'] = best_epoch
                dic_best['random_seed'] = self.r_seed
                # dic_best['acc_test'] = dic_err['test']['all']['mean']

                dic_err_best = dic_err
                best_acc_val = acc_val
                model_best = model
                torch.save(model_best.state_dict(), self.path_model)

        with open(self.path_log, 'w') as f:
            json.dump(dic_best, f)

        end = time.time()
        print('\n\n\n')
        self.logger.info(" Tried {} combinations".format(cnt))
        self.logger.info(" Total time for hyperparameters search: {:.2f} minutes".format((end - start) / 60))
        self.logger.info(" Best hyperparameters are:")
        for key, value in dic_best.items():
            self.logger.info(" {}: {}".format(key, value))

        print()
        self.logger.info("Accuracy in each cluster:")

        for key in dic_err_best['val']:
            self.logger.info("Val: error in cluster {} = {} ".format(key, dic_err_best['val'][key]['mean']))
        self.logger.info("Final accuracy Val: {:.2f}".format(dic_best['acc_val']))
        self.logger.info("\nSaved the model: {}".format(self.path_model))
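# Editor's sketch (not part of the original commit): a random search over 12
# combinations (multiplier=2 doubles the 6-entry grids defined above); the json
# path is illustrative only.
if __name__ == '__main__':
    hyp = HypTuning(joints='data/arrays/joints-nuscenes-190507-0852.json',
                    epochs=150, baseline=False, dropout=0.2, multiplier=2, r_seed=1)
    hyp.train()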
144
src/models/losses.py
Normal file
@ -0,0 +1,144 @@
import math
import torch
import numpy as np
import matplotlib.pyplot as plt


class CustomL1Loss(torch.nn.Module):
    """
    L1 loss with more weight given to errors at a shorter distance
    It inherits from nn.Module so it supports backward
    """

    def __init__(self, dic_norm, device, beta=1):
        super(CustomL1Loss, self).__init__()

        self.dic_norm = dic_norm
        self.device = device
        self.beta = beta

    @staticmethod
    def compute_weights(xx, beta=1):
        """
        Return the appropriate weight depending on the distance and the hyperparameter chosen
        alpha = 1 refers to the curve of "A Photogrammetric Approach for Real-time..."
        It is made for unnormalized outputs (to be more understandable)
        From 70 meters on, every value is weighted the same (0.1**beta)
        Alpha is an optional value from the focal loss. Yet to be analyzed
        """
        # alpha = np.maximum(1, 10 ** (beta - 1))
        alpha = 1
        ww = np.maximum(0.1, 1 - xx / 78)**beta

        return alpha * ww

    def print_loss(self):
        xx = np.linspace(0, 80, 100)
        y1 = self.compute_weights(xx, beta=1)
        y2 = self.compute_weights(xx, beta=2)
        y3 = self.compute_weights(xx, beta=3)
        plt.plot(xx, y1)
        plt.plot(xx, y2)
        plt.plot(xx, y3)
        plt.xlabel("Distance [m]")
        plt.ylabel("Loss function Weight")
        plt.legend(("Beta = 1", "Beta = 2", "Beta = 3"))
        plt.show()

    def forward(self, output, target):

        unnormalized_output = output.cpu().detach().numpy() * self.dic_norm['std']['Y'] + self.dic_norm['mean']['Y']
        weights_np = self.compute_weights(unnormalized_output, self.beta)
        weights = torch.from_numpy(weights_np).float().to(self.device)  # To put weights on the same cuda device
        losses = torch.abs(output - target) * weights
        loss = losses.mean()  # Mean over the batch
        # self.print_loss()

        return loss


class LaplacianLoss(torch.nn.Module):
    """1D Laplace negative log-likelihood with scale depending on the absolute distance
    """
    def __init__(self, size_average=True, reduce=True, evaluate=False):
        super(LaplacianLoss, self).__init__()
        self.size_average = size_average
        self.reduce = reduce
        self.evaluate = evaluate

    def laplacian_1d(self, mu_si, xx):
        """
        1D Laplace loss. f(x | mu, si). The network outputs mu and si (log of the scale). xx is the ground-truth distance.
        This supports backward().
        Inspired by
        https://github.com/naba89/RNN-Handwriting-Generation-Pytorch/blob/master/loss_functions.py
        """
        mu, si = mu_si[:, 0:1], mu_si[:, 1:2]
        # norm = xx - mu
        norm = 1 - mu / xx  # Relative error

        term_a = torch.abs(norm) * torch.exp(-si)
        term_b = si
        norm_bi = (np.mean(np.abs(norm.cpu().detach().numpy())), np.mean(torch.exp(si).cpu().detach().numpy()))

        if self.evaluate:
            return norm_bi
        else:
            return term_a + term_b

    def forward(self, outputs, targets):

        values = self.laplacian_1d(outputs, targets)

        if not self.reduce or self.evaluate:
            return values
        if self.size_average:
            mean_values = torch.mean(values)
            return mean_values
        return torch.sum(values)


class GaussianLoss(torch.nn.Module):
    """1D Gaussian negative log-likelihood with std depending on the absolute distance
    """
    def __init__(self, device, size_average=True, reduce=True, evaluate=False):
        super(GaussianLoss, self).__init__()
        self.size_average = size_average
        self.reduce = reduce
        self.evaluate = evaluate
        self.device = device

    def gaussian_1d(self, mu_si, xx):
        """
        1D Gaussian loss. f(x | mu, sigma). The network outputs mu and sigma. xx is the ground-truth distance.
        This supports backward().
        Inspired by
        https://github.com/naba89/RNN-Handwriting-Generation-Pytorch/blob/master/loss_functions.py
        """
        mu, si = mu_si[:, 0:1], mu_si[:, 1:2]

        min_si = torch.ones(si.size()).cuda(self.device) * 0.1
        si = torch.max(min_si, si)
        norm = xx - mu
        term_a = (norm / si)**2 / 2
        term_b = torch.log(si * math.sqrt(2 * math.pi))

        norm_si = (np.mean(np.abs(norm.cpu().detach().numpy())), np.mean(si.cpu().detach().numpy()))

        if self.evaluate:
            return norm_si
        else:
            return term_a + term_b

    def forward(self, outputs, targets):

        values = self.gaussian_1d(outputs, targets)

        if not self.reduce or self.evaluate:
            return values
        if self.size_average:
            mean_values = torch.mean(values)
            return mean_values
        return torch.sum(values)
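# Editor's sketch (not part of the original commit): sanity-checking the Laplace
# loss on dummy CPU tensors. Column 0 is the predicted distance mu, column 1 the
# log-scale si; targets are ground-truth distances in meters.
if __name__ == '__main__':
    outputs = torch.tensor([[10.0, 0.0], [20.0, 0.5]], requires_grad=True)
    targets = torch.tensor([[11.0], [19.0]])
    loss = LaplacianLoss()(outputs, targets)
    loss.backward()  # the loss supports autograd, as stated in the docstring
    print(float(loss))  # mean of |1 - mu/x| * exp(-si) + si over the batch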
358
src/models/trainer.py
Normal file
@ -0,0 +1,358 @@
import torch
import torch.nn as nn
import copy
import numpy as np
import matplotlib.pyplot as plt
import os
import datetime
import logging
from collections import defaultdict
import json
import sys
import time

from torch.utils.data import DataLoader
from torch.optim import lr_scheduler

from models.datasets import NuScenesDataset
from models.architectures import LinearModel
from utils.misc import set_logger
from models.losses import LaplacianLoss


class Trainer:
    """
    Training and evaluation of a neural network which predicts 3D localization and confidence intervals
    given 2d joints
    """

    def __init__(self, joints, epochs=100, bs=256, dropout=0.2, lr=0.002,
                 sched_step=20, sched_gamma=1, hidden_size=256, n_stage=3, r_seed=1, n_dropout=0, n_samples=100,
                 baseline=False, save=False, print_loss=False):
        """
        Initialize directories, load the data and parameters for the training
        """

        # Initialize directories and parameters
        dir_out = os.path.join('data', 'models')
        assert os.path.exists(dir_out), "Output directory not found"
        dir_logs = os.path.join('data', 'logs')
        if not os.path.exists(dir_logs):
            os.makedirs(dir_logs)
        assert os.path.exists(joints), "Input file not found"

        self.joints = joints
        self.num_epochs = epochs
        self.save = save
        self.print_loss = print_loss
        self.baseline = baseline
        self.lr = lr
        self.sched_step = sched_step
        self.sched_gamma = sched_gamma
        n_joints = 17
        input_size = n_joints * 2
        self.output_size = 2
        self.clusters = ['10', '20', '30', '>30']
        self.hidden_size = hidden_size
        self.n_stage = n_stage
        self.dir_out = dir_out
        self.n_dropout = n_dropout
        self.n_samples = n_samples
        self.r_seed = r_seed

        from utils.normalize import unnormalize_bi
        self.unnormalize_bi = unnormalize_bi
        from utils.misc import epistemic_variance, laplace_sampling
        self.epistemic_variance = epistemic_variance
        self.laplace_sampling = laplace_sampling

        # Loss functions and output names
        now = datetime.datetime.now()
        now_time = now.strftime("%Y%m%d-%H%M")[2:]

        if baseline:
            name_out = 'baseline-' + now_time
            self.criterion = nn.L1Loss().cuda()
            self.output_size = 1
        else:
            name_out = 'monoloco-' + now_time
            self.criterion = LaplacianLoss().cuda()
            self.output_size = 2
        self.criterion_eval = nn.L1Loss().cuda()

        if self.save:
            self.path_model = os.path.join(dir_out, name_out + '.pkl')
            self.logger = set_logger(os.path.join(dir_logs, name_out))
            self.logger.info("Training arguments: \nepochs: {} \nbatch_size: {} \ndropout: {}"
                             "\nbaseline: {} \nlearning rate: {} \nscheduler step: {} \nscheduler gamma: {} "
                             "\ninput_size: {} \nhidden_size: {} \nn_stages: {} \nr_seed: {}"
                             "\ninput_file: {}"
                             .format(epochs, bs, dropout, baseline, lr, sched_step, sched_gamma, input_size,
                                     hidden_size, n_stage, r_seed, self.joints))
        else:
            logging.basicConfig(level=logging.INFO)
            self.logger = logging.getLogger(__name__)

        # Select the device and load the data
        use_cuda = torch.cuda.is_available()
        self.device = torch.device("cuda:0" if use_cuda else "cpu")
        print('Device: ', self.device)

        # Set the seed for random initialization
        torch.manual_seed(r_seed)
        if use_cuda:
            torch.cuda.manual_seed(r_seed)

        # Dataloader
        self.dataloaders = {phase: DataLoader(NuScenesDataset(self.joints, phase=phase),
                                              batch_size=bs, shuffle=True) for phase in ['train', 'val']}

        self.dataset_sizes = {phase: len(NuScenesDataset(self.joints, phase=phase))
                              for phase in ['train', 'val', 'test']}

        # Define the model
        self.logger.info('Sizes of the dataset: {}'.format(self.dataset_sizes))
        print(">>> creating model")
        self.model = LinearModel(input_size=input_size, output_size=self.output_size, linear_size=hidden_size,
                                 p_dropout=dropout, num_stage=self.n_stage)
        self.model.to(self.device)
        print(">>> total params: {:.2f}M".format(sum(p.numel() for p in self.model.parameters()) / 1000000.0))

        # Optimizer and scheduler
        self.optimizer = torch.optim.Adam(params=self.model.parameters(), lr=lr)
        self.scheduler = lr_scheduler.StepLR(self.optimizer, step_size=self.sched_step, gamma=self.sched_gamma)

    def train(self):

        # Initialize the variable containing model weights
        since = time.time()
        best_model_wts = copy.deepcopy(self.model.state_dict())
        best_acc = 1e6
        best_epoch = 0
        epoch_losses_tr = []
        epoch_losses_val = []
        epoch_norms = []
        epoch_sis = []

        for epoch in range(self.num_epochs):

            # Each epoch has a training and validation phase
            for phase in ['train', 'val']:
                if phase == 'train':
                    self.scheduler.step()
                    self.model.train()  # Set model to training mode
                else:
                    self.model.eval()  # Set model to evaluate mode

                running_loss_tr = 0.0
                running_loss_eval = 0.0
                norm_tr = 0.0
                bi_tr = 0.0

                # Iterate over data.
                for inputs, labels, _, _ in self.dataloaders[phase]:
                    inputs = inputs.to(self.device)
                    labels = labels.to(self.device)

                    # zero the parameter gradients
                    self.optimizer.zero_grad()

                    # forward
                    # track history only if in train
                    with torch.set_grad_enabled(phase == 'train'):
                        outputs = self.model(inputs)

                        if self.output_size == 2:
                            outputs_eval = outputs[:, 0:1]  # Fundamental to use slices
                        else:
                            outputs_eval = outputs

                        loss = self.criterion(outputs, labels)
                        loss_eval = self.criterion_eval(outputs_eval, labels)  # L1 loss for evaluation

                        # backward + optimize only if in training phase
                        if phase == 'train':
                            loss.backward()
                            self.optimizer.step()

                    # statistics
                    running_loss_tr += loss.item() * inputs.size(0)
                    running_loss_eval += loss_eval.item() * inputs.size(0)

                epoch_loss = running_loss_tr / self.dataset_sizes[phase]
                epoch_acc = running_loss_eval / self.dataset_sizes[phase]  # Average distance in meters
                epoch_norm = float(norm_tr / self.dataset_sizes[phase])
                epoch_si = float(bi_tr / self.dataset_sizes[phase])
                if phase == 'train':
                    epoch_losses_tr.append(epoch_loss)
                    epoch_norms.append(epoch_norm)
                    epoch_sis.append(epoch_si)
                else:
                    epoch_losses_val.append(epoch_acc)

                    if epoch % 5 == 1:
                        sys.stdout.write('\r' + 'Epoch: {:.0f} Training Loss: {:.3f} Val Loss {:.3f}'
                                         .format(epoch, epoch_losses_tr[-1], epoch_losses_val[-1]) + '\t')

                # deep copy the model
                if phase == 'val' and epoch_acc < best_acc:
                    best_acc = epoch_acc
                    best_epoch = epoch
                    best_model_wts = copy.deepcopy(self.model.state_dict())

        time_elapsed = time.time() - since
        self.logger.info('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
        self.logger.info('Best val Acc: {:.3f}'.format(best_acc))
        self.logger.info('Saved weights of the model at epoch: {}\n'.format(best_epoch))

        if self.print_loss:
            epoch_losses_val_scaled = [x - 4 for x in epoch_losses_val]  # to compare with L1 loss
            plt.plot(epoch_losses_tr[10:], label='Training Loss')
            plt.plot(epoch_losses_val_scaled[10:], label='Validation Loss')
            plt.legend()
            plt.show()

        # load best model weights
        self.model.load_state_dict(best_model_wts)

        return best_epoch

    def evaluate(self, load=False, model=None, debug=False):

        # To load a model instead of using the trained one
        if load:
            # self.path_out = os.path.join(self.dir_out, 'best_model_paper.pickle')
            self.model.load_state_dict(torch.load(model, map_location=lambda storage, loc: storage))

        # Average distance on training and test set after unnormalizing
        self.model.eval()
        dic_err = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: 0)))  # initialized to zero
        phase = 'val'
        dataloader_eval = DataLoader(NuScenesDataset(self.joints, phase=phase),
                                     batch_size=5000, shuffle=True)

        size_eval = len(NuScenesDataset(self.joints, phase=phase))

        with torch.no_grad():
            for inputs, labels, _, _ in dataloader_eval:
                inputs = inputs.to(self.device)
                labels = labels.to(self.device)

                # Debug plot for input-output distributions
                if debug:
                    inputs_shoulder = inputs.cpu().numpy()[:, 5]
                    inputs_hip = inputs.cpu().numpy()[:, 11]
                    labels = labels.cpu().numpy()
                    heights = inputs_hip - inputs_shoulder
                    plt.figure(1)
                    plt.hist(heights, bins='auto')
                    plt.show()
                    plt.figure(2)
                    plt.hist(labels, bins='auto')
                    plt.show()
                    exit()

                # Manually reactivate dropout in eval
                self.model.dropout.training = True
                total_outputs = torch.empty((0, len(labels))).to(self.device)

                if self.n_dropout > 0:
                    for ii in range(self.n_dropout):
                        outputs = self.model(inputs)
                        outputs = self.unnormalize_bi(outputs)
                        samples = self.laplace_sampling(outputs, self.n_samples)
                        total_outputs = torch.cat((total_outputs, samples), 0)
                    varss = total_outputs.std(0)
                else:
                    varss = [0]

                # Don't use dropout for the mean prediction
                self.model.dropout.training = False

                # Forward pass
                outputs = self.model(inputs)

                if not self.baseline:
                    outputs = self.unnormalize_bi(outputs)

                avg_distance = float(self.criterion_eval(outputs[:, 0:1], labels).item())

                dic_err[phase]['all'] = self.compute_stats(outputs, labels, varss, dic_err[phase]['all'], size_eval)

        print('-' * 120)
        self.logger.info("\nAverage distance on the {} set: {:.2f}"
                         .format(phase, dic_err[phase]['all']['mean']))

        self.logger.info("Aleatoric Uncertainty: {:.2f}, inside the interval: {:.1f}%"
                         .format(dic_err[phase]['all']['bi'], dic_err[phase]['all']['conf_bi'] * 100))

        # TODO Add evaluation variance
        # self.logger.info("Combined Uncertainty: {:.2f} with a max of {:.2f}, inside the interval: {:.1f}%\n"
        #                  .format(stat_var[0], stat_var[1], stat_var[2]*100))

        # Evaluate performances on different clusters and save statistics
        nuscenes = NuScenesDataset(self.joints, phase=phase)
        for clst in self.clusters:
            inputs, labels, size_eval = nuscenes.get_cluster_annotations(clst)
            inputs, labels = inputs.to(self.device), labels.to(self.device)

            # Forward pass on each cluster
            outputs = self.model(inputs)
            if not self.baseline:
                outputs = self.unnormalize_bi(outputs)

            dic_err[phase][clst] = self.compute_stats(outputs, labels, [0], dic_err[phase][clst], size_eval)

            self.logger.info("{} error in cluster {} = {:.2f} for {} instances. "
                             "Aleatoric of {:.2f} with {:.1f}% inside the interval"
                             .format(phase, clst, dic_err[phase][clst]['mean'], size_eval,
                                     dic_err[phase][clst]['bi'], dic_err[phase][clst]['conf_bi'] * 100))

        # Save the model and the results
        if self.save and not load:
            torch.save(self.model.state_dict(), self.path_model)
            print('-' * 120)
            self.logger.info("model saved: {} \n".format(self.path_model))
        else:
            self.logger.info("model not saved\n")

        return dic_err, self.model

    def compute_stats(self, outputs, labels_orig, varss, dic_err, size_eval):
        """Compute mean, std (aleatoric) and max of torch tensors"""

        labels = labels_orig.view(-1, )
        mean_mu = float(self.criterion_eval(outputs[:, 0], labels).item())
        max_mu = float(torch.max(torch.abs((outputs[:, 0] - labels))).item())

        if self.baseline:
            return (mean_mu, max_mu), (0, 0, 0)

        else:
            mean_bi = torch.mean(outputs[:, 1]).item()
            max_bi = torch.max(outputs[:, 1]).item()

            low_bound_bi = labels >= (outputs[:, 0] - outputs[:, 1])
            up_bound_bi = labels <= (outputs[:, 0] + outputs[:, 1])
            bools_bi = low_bound_bi & up_bound_bi
            conf_bi = float(torch.sum(bools_bi)) / float(bools_bi.shape[0])

            if varss[0] == 0:
                pass  # epistemic uncertainty not computed (n_dropout = 0)

            else:
                mean_var = torch.mean(varss).item()
                max_var = torch.max(varss).item()

                low_bound_var = labels >= (outputs[:, 0] - varss)
                up_bound_var = labels <= (outputs[:, 0] + varss)
                bools_var = low_bound_var & up_bound_var
                conf_var = float(torch.sum(bools_var)) / float(bools_var.shape[0])

            dic_err['mean'] += mean_mu * (outputs.size(0) / size_eval)
            dic_err['bi'] += mean_bi * (outputs.size(0) / size_eval)
            dic_err['count'] += (outputs.size(0) / size_eval)
            dic_err['conf_bi'] += conf_bi * (outputs.size(0) / size_eval)

            return dic_err
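# Editor's note (sketch, not part of the original commit): the n_dropout loop in
# evaluate() is Monte-Carlo dropout: with dropout kept active at test time, each
# forward pass samples one set of weights, and the std over all Laplace samples
# approximates the spread of the prediction. A toy illustration, using
# torch.distributions in place of the project's laplace_sampling helper:
if __name__ == '__main__':
    mu, si = torch.tensor([10.0]), torch.tensor([1.0])
    samples = torch.distributions.Laplace(mu, si).sample((100,))
    print(float(samples.std(0)))  # spread of the sampled distances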
168
src/predict/predict_2d_3d.py
Normal file
@ -0,0 +1,168 @@
"""Predict poses for given images."""

import argparse
import glob
import json
import os

import numpy as np
import torch
import torchvision

from openpifpaf.network import nets
from openpifpaf import decoder, show
from openpifpaf import transforms
from predict.predict_monoloco import PredictMonoLoco
from utils.pifpaf import preprocess_pif

from PIL import Image, ImageFile


class ImageList(torch.utils.data.Dataset):
    def __init__(self, image_paths, scale, image_transform=None):
        self.image_paths = image_paths
        self.image_transform = image_transform or transforms.image_transform
        self.scale = scale

        # data = datasets.ImageList(args.images, preprocess=transforms.RescaleRelative(2.0))

    def __getitem__(self, index):
        image_path = self.image_paths[index]
        ImageFile.LOAD_TRUNCATED_IMAGES = True
        with open(image_path, 'rb') as f:
            image = Image.open(f).convert('RGB')

        if self.scale > 1.01 or self.scale < 0.99:
            image = torchvision.transforms.functional.resize(image,
                                                             (round(self.scale * image.size[1]),
                                                              round(self.scale * image.size[0])),
                                                             interpolation=Image.BICUBIC)
        original_image = torchvision.transforms.functional.to_tensor(image)
        image = self.image_transform(image)

        return image_path, original_image, image

    def __len__(self):
        return len(self.image_paths)


def elaborate_cli(args):

    # Merge the model_pifpaf argument
    if not args.checkpoint:
        args.checkpoint = args.model_pifpaf
    # glob
    if args.glob:
        args.images += glob.glob(args.glob)
    if not args.images:
        raise Exception("no image files given")

    # add args.device
    args.device = torch.device('cpu')
    args.pin_memory = False
    if torch.cuda.is_available():
        args.device = torch.device('cuda')
        args.pin_memory = True

    # Add num_workers
    args.loader_workers = 8

    # Add visualization defaults
    args.figure_width = 10
    args.dpi_factor = 1.0

    return args


def predict(args):

    elaborate_cli(args)

    # load model
    model, _ = nets.factory_from_args(args)
    model = model.to(args.device)
    processor = decoder.factory_from_args(args, model)

    # data
    data = ImageList(args.images, scale=args.scale)
    data_loader = torch.utils.data.DataLoader(
        data, batch_size=1, shuffle=False,
        pin_memory=args.pin_memory, num_workers=args.loader_workers)

    # Visualizer
    keypoint_painter = show.KeypointPainter(show_box=True)
    skeleton_painter = show.KeypointPainter(show_box=False, color_connections=True,
                                            markersize=1, linewidth=4)

    keypoints_whole = []
    for idx, (image_paths, image_tensors, processed_images_cpu) in enumerate(data_loader):
        images = image_tensors.permute(0, 2, 3, 1)

        processed_images = processed_images_cpu.to(args.device, non_blocking=True)
        fields_batch = processor.fields(processed_images)

        # unbatch
        for image_path, image, processed_image_cpu, fields in zip(
                image_paths,
                images,
                processed_images_cpu,
                fields_batch):

            if args.output_directory is None:
                output_path = image_path
            else:
                file_name = os.path.basename(image_path)
                output_path = os.path.join(args.output_directory, file_name)
            print('image', idx, image_path, output_path)

            processor.set_cpu_image(image, processed_image_cpu)
            keypoint_sets, scores = processor.keypoint_sets(fields)

            # Rescale the keypoints back without changing the confidence
            scale_np = np.array([args.scale, args.scale, 1] * 17).reshape(17, 3)

            pifpaf_out = [
                {'keypoints': np.around(kps / scale_np, 1).reshape(-1).tolist(),
                 'bbox': [np.min(kps[:, 0]) / args.scale, np.min(kps[:, 1]) / args.scale,
                          np.max(kps[:, 0]) / args.scale, np.max(kps[:, 1]) / args.scale]}
                for kps in keypoint_sets
            ]

            # Save json file
            if 'pifpaf' in args.networks:

                if 'json' in args.output_types and keypoint_sets.size > 0:
                    with open(output_path + '.pifpaf.json', 'w') as f:
                        json.dump(pifpaf_out, f)

                if keypoint_sets.size > 0:
                    keypoints_whole.append(np.around((keypoint_sets / scale_np), 1)
                                           .reshape(keypoint_sets.shape[0], -1).tolist())

                if 'keypoints' in args.output_types:
                    with show.image_canvas(image,
                                           output_path + '.keypoints.png',
                                           show=args.show,
                                           fig_width=args.figure_width,
                                           dpi_factor=args.dpi_factor) as ax:
                        keypoint_painter.keypoints(ax, keypoint_sets)

                if 'skeleton' in args.output_types:
                    with show.image_canvas(image,
                                           output_path + '.skeleton.png',
                                           show=args.show,
                                           fig_width=args.figure_width,
                                           dpi_factor=args.dpi_factor) as ax:
                        skeleton_painter.keypoints(ax, keypoint_sets, scores=scores)

            if 'monoloco' in args.networks:
                im_size = (float(image.size()[1] / args.scale),
                           float(image.size()[0] / args.scale))  # Width, Height (original)
                boxes, keypoints = preprocess_pif(pifpaf_out, im_size)
                predict_monoloco = PredictMonoLoco(boxes, keypoints, image_path, output_path, args)
                predict_monoloco.run()

    return keypoints_whole
196
src/predict/predict_monoloco.py
Normal file
@ -0,0 +1,196 @@
"""
From a json file, output images and json annotations
"""

import torch
import sys
import numpy as np
from collections import defaultdict
import os
import json
import logging
import time
from models.architectures import LinearModel


class PredictMonoLoco:

    def __init__(self, boxes, keypoints, image_path, output_path, args):

        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger(__name__)
        basename, _ = os.path.splitext(os.path.basename(image_path))

        # Check for ground-truth file
        try:
            with open(args.path_gt, 'r') as f:
                self.dic_names = json.load(f)
        except FileNotFoundError:
            self.dic_names = None

            print('-' * 120)
            print("Monoloco: ground truth file not found")
            print('-' * 120)

        # Check for calibration file
        if not args.no_calib:
            self.path_calib = os.path.join(args.dir_calib, basename + '.txt')
            assert os.path.isfile(self.path_calib), "Calibration file not found. Change dir or use --no_calib"

        self.boxes = boxes
        self.keypoints = keypoints
        self.image_path = image_path
        self.output_path = output_path
        self.device = args.device
        self.draw_kps = args.draw_kps
        self.y_scale = args.y_scale  # y_scale = 1.85 for kitti combined
        self.no_calib = args.no_calib
        self.z_max = args.z_max
        self.output_types = args.output_types
        self.show = args.show
        output_size = 2
        self.n_samples = 100
        self.n_dropout = args.n_dropout
        if self.n_dropout > 0:
            self.epistemic = True
        else:
            self.epistemic = False
        self.iou_min = 0.25

        # load the model
        input_size = 17 * 2

        # self.model = TriLinear(input_size=input_size, output_size=output_size, p_dropout=dropout)
        self.model = LinearModel(input_size=input_size, output_size=output_size)
        self.model.load_state_dict(torch.load(args.model, map_location=lambda storage, loc: storage))
        self.model.eval()  # Default is train
        self.model.to(self.device)

        # Import
        from utils.camera import preprocess_single, get_keypoints, get_depth
        self.preprocess_single = preprocess_single
        self.get_keypoints = get_keypoints
        self.get_depth = get_depth

        from utils.misc import epistemic_variance, laplace_sampling, get_idx_max
        self.epistemic_variance = epistemic_variance
        self.laplace_sampling = laplace_sampling
        self.get_idx_max = get_idx_max
        from visuals.printer import Printer
        self.Printer = Printer

        from utils.normalize import unnormalize_bi
        self.unnormalize_bi = unnormalize_bi

        from utils.kitti import get_simplified_calibration, get_calibration
        self.get_simplified_calibration = get_simplified_calibration
        self.get_calibration = get_calibration

        from utils.pifpaf import get_input_data
        self.get_input_data = get_input_data

    def run(self):

        cnt = 0

        if self.no_calib:
            # kk = [[1266.4, 0., 816.27], [0, 1266.4, 491.5], [0., 0., 1.]]
            kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]]

        else:
            kk, tt = self.get_calibration(self.path_calib)
            # kk = self.get_simplified_calibration(self.path_calib)

        # kk = [[1266.4, 0., 816.27], [0, 1266.4, 491.5], [0., 0., 1.]]
        # self.factor = 1
        # self.y_scale = 1
        (inputs_norm, xy_kps), (uv_kps, uv_boxes, uv_centers, uv_shoulders) = \
            self.get_input_data(self.boxes, self.keypoints, kk, left_to_right=True)

        # Conversion into torch tensor
        if len(inputs_norm) > 0:
            with torch.no_grad():
                inputs = torch.from_numpy(np.array(inputs_norm)).float()
                inputs = inputs.to(self.device)
                # self.model.to("cpu")
                start = time.time()
                # Manually reactivate dropout in eval
                self.model.dropout.training = True
                total_outputs = torch.empty((0, len(xy_kps))).to(self.device)

                if self.n_dropout > 0:
                    for ii in range(self.n_dropout):
                        outputs = self.model(inputs)
                        outputs = self.unnormalize_bi(outputs)
                        samples = self.laplace_sampling(outputs, self.n_samples)
                        total_outputs = torch.cat((total_outputs, samples), 0)
                    varss = total_outputs.std(0)
                else:
                    varss = [0] * len(inputs_norm)

                # Don't use dropout for the mean prediction
                start_single = time.time()
                self.model.dropout.training = False
                outputs = self.model(inputs)
                outputs = self.unnormalize_bi(outputs)
                end = time.time()
                print("Total Forward pass time = {:.2f} ms".format((end - start) * 1000))
                print("Single pass time = {:.2f} ms".format((end - start_single) * 1000))

            # Print image and save json
            dic_out = defaultdict(list)
            name = os.path.basename(self.image_path)
            if self.dic_names:
                boxes_gt, dds_gt = self.dic_names[name]['boxes'], self.dic_names[name]['dds']

            for idx, box in enumerate(uv_boxes):
                dd_pred = float(outputs[idx][0])
                ale = float(outputs[idx][1])
                var_y = float(varss[idx])

                # Find the corresponding ground truth if available
                if self.dic_names:

                    idx_max, iou_max = self.get_idx_max(box, boxes_gt)
                    if iou_max > self.iou_min:
                        dd_real = dds_gt[idx_max]
                        boxes_gt.pop(idx_max)
                        dds_gt.pop(idx_max)

                    # In case of no matching
                    else:
                        dd_real = 0
                # In case of no ground truth
                else:
                    dd_real = dd_pred

                uv_center = uv_centers[idx]
                xyz_real = self.get_depth(uv_center, kk, dd_real)
                xyz_pred = self.get_depth(uv_center, kk, dd_pred)

                dic_out['boxes'].append(box)
                dic_out['dds_real'].append(dd_real)
                dic_out['dds_pred'].append(dd_pred)
                dic_out['stds_ale'].append(ale)
                dic_out['stds_epi'].append(var_y)
                dic_out['xyz_real'].append(xyz_real)
                dic_out['xyz_pred'].append(xyz_pred)
                dic_out['xy_kps'].append(xy_kps[idx])
                dic_out['uv_kps'].append(uv_kps[idx])
                dic_out['uv_centers'].append(uv_center)
                dic_out['uv_shoulders'].append(uv_shoulders[idx])

            if any((xx in self.output_types for xx in ['front', 'bird', 'combined'])):

                printer = self.Printer(self.image_path, self.output_path, dic_out, kk,
                                       y_scale=self.y_scale, output_types=self.output_types,
                                       show=self.show, z_max=self.z_max, epistemic=self.epistemic)
                printer.print()

            if 'json' in self.output_types:

                with open(self.output_path + '.monoloco.json', 'w') as ff:
                    json.dump(dic_out, ff)

            sys.stdout.write('\r' + 'Saving image {}'.format(cnt) + '\t')
202
src/utils/camera.py
Normal file
@ -0,0 +1,202 @@
import numpy as np
import math


def pixel_to_camera(uv1, kk, z_met):
    """
    (3,) array --> (3,) array
    Convert a point from pixel coordinates to absolute camera coordinates
    """

    kk_1 = np.linalg.inv(kk)
    xyz_met_norm = np.dot(kk_1, uv1)
    xyz_met = xyz_met_norm * z_met
    return xyz_met


def project_to_pixels(xyz, kk):
    """Project a single point in space into the image"""
    xx, yy, zz = np.dot(kk, xyz)
    uu = int(xx / zz)
    vv = int(yy / zz)

    return uu, vv


def project_3d(box_obj, kk):
    """
    Project a 3D bounding box into the image plane using the central corners
    """

    box_2d = []

    # Obtain the 3d points of the box
    xc, yc, zc = box_obj.center
    ww, ll, hh = box_obj.wlh

    # Points corresponding to a box at the z of the center
    x1 = xc - ww / 2
    y1 = yc - hh / 2  # Y axis directed below
    x2 = xc + ww / 2
    y2 = yc + hh / 2
    xyz1 = np.array([x1, y1, zc])
    xyz2 = np.array([x2, y2, zc])
    corners_3d = np.array([xyz1, xyz2])

    # Project them and convert into pixel coordinates
    for xyz in corners_3d:
        xx, yy, zz = np.dot(kk, xyz)
        uu = xx / zz
        vv = yy / zz
        box_2d.append(uu)
        box_2d.append(vv)

    return box_2d


def preprocess_single(kps, kk):

    """ Preprocess input of a single annotation
    Input_kps = list of 4 elements with 0=x, 1=y, 2= confidence, 3 = ? in pixels
    Output_kps = [x0, y0, x1,...x15, y15] in meters normalized (z=1) and zero-centered using the center of the box
    """

    kps_uv = []
    kps_0c = []
    kps_orig = []

    # Create center of the bounding box using min max of the keypoints
    uu_c, vv_c = get_keypoints(kps[0], kps[1], mode='center')
    uv_center = np.array([uu_c, vv_c, 1])

    # Create a list of single arrays of (u, v, 1)
    for idx, _ in enumerate(kps[0]):
        uv_kp = np.array([kps[0][idx], kps[1][idx], 1])
        kps_uv.append(uv_kp)

    # Take the ground joint
    vv_gr = max(kps[1])

    # Projection in normalized image coordinates and zero-center with the center of the bounding box
    xy1_center = pixel_to_camera(uv_center, kk, 1) * 10
    for idx, kp in enumerate(kps_uv):
        kp_proj = pixel_to_camera(kp, kk, 1) * 10
        kp_proj_0c = kp_proj - xy1_center
        kps_0c.append(float(kp_proj_0c[0]))
        kps_0c.append(float(kp_proj_0c[1]))

        kp_orig = pixel_to_camera(kp, kk, 1)
        kps_orig.append(float(kp_orig[0]))
        kps_orig.append(float(kp_orig[1]))

    # Append the y of the ground foot to the keypoints
    # kp_gr = np.array([0, vv_gr, 1])
    # xy1_gr = pixel_to_camera(kp_gr, kk, 1)
    # kps_0c.append(float(xy1_gr[1]))

    return kps_0c, kps_orig


def get_keypoints(kps_0, kps_1, mode):
    """Get the center, shoulder or hip point from the u and v keypoint lists"""

    assert mode in ('center', 'shoulder', 'hip')

    if mode == 'center':
        uu = (max(kps_0) - min(kps_0)) / 2 + min(kps_0)
        vv = (max(kps_1) - min(kps_1)) / 2 + min(kps_1)

    elif mode == 'shoulder':
        uu = float(np.average(kps_0[5:7]))
        vv = float(np.average(kps_1[5:7]))

    elif mode == 'hip':
        uu = float(np.average(kps_0[11:13]))
        vv = float(np.average(kps_1[11:13]))

    return uu, vv


def transform_kp(kps, tr_mode):
    """Apply different transformations to the keypoints based on the tr_mode"""

    # 'up' added to the allowed modes to match the branch below
    assert tr_mode in ("None", "singularity", "upper", "lower", "horizontal", "vertical", "lateral",
                       "shoulder", "knee", "up", "upside", "falling", "random")

    uu_c, vv_c = get_keypoints(kps[0], kps[1], mode='center')

    if tr_mode == "None":
        return kps

    elif tr_mode == "singularity":
        uus = [uu_c for uu in kps[0]]
        vvs = [vv_c for vv in kps[1]]

    elif tr_mode == "vertical":
        uus = [uu_c for uu in kps[0]]
        vvs = kps[1]

    elif tr_mode == 'horizontal':
        uus = kps[0]
        vvs = [vv_c for vv in kps[1]]

    elif tr_mode == 'lower':
        uus = kps[0]
        vvs = kps[1][:9] + [vv_c for vv in kps[1][9:]]

    elif tr_mode == 'upper':
        uus = kps[0]
        vvs = [vv_c for vv in kps[1][:9]] + kps[1][9:]

    elif tr_mode == 'lateral':
        uus = []
        for idx, kp in enumerate(kps[0]):
            if idx % 2 == 1:
                uus.append(kp)
            else:
                uus.append(uu_c)
        vvs = kps[1]

    elif tr_mode == 'shoulder':
        uus = kps[0]
        vvs = kps[1][:7] + [kps[1][6] for vv in kps[1][7:]]

    elif tr_mode == 'knee':
        uus = kps[0]
        vvs = [kps[1][14] for vv in kps[1][:13]] + kps[1][13:]

    elif tr_mode == 'up':
        uus = kps[0]
        vvs = [kp - 300 for kp in kps[1]]

    elif tr_mode == 'falling':
        uus = [kps[0][16] - kp + kps[1][16] for kp in kps[1]]
        vvs = [kps[1][16] - kp + kps[0][16] for kp in kps[0]]

    elif tr_mode == 'random':
        uu_min = min(kps[0])
        uu_max = max(kps[0])
        vv_min = min(kps[1])
        vv_max = max(kps[1])
        np.random.seed(6)
        uus = np.random.uniform(uu_min, uu_max, len(kps[0])).tolist()
        vvs = np.random.uniform(vv_min, vv_max, len(kps[1])).tolist()

    return [uus, vvs, kps[2], []]


def get_depth(uv_center, kk, dd):

    if len(uv_center) == 2:
        uv_center.extend([1])
    uv_center_np = np.array(uv_center)
    xyz_norm = pixel_to_camera(uv_center, kk, 1)
    zz = dd / math.sqrt(1 + xyz_norm[0] ** 2 + xyz_norm[1] ** 2)

    xyz = pixel_to_camera(uv_center_np, kk, zz).tolist()

    return xyz
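# Editor's sketch (not part of the original commit): back-projecting an image point
# with a toy intrinsic matrix, then recovering its 3D position from an absolute
# distance dd. All values are illustrative only.
if __name__ == '__main__':
    kk = [[718.0, 0., 600.0], [0., 718.0, 180.0], [0., 0., 1.]]
    xyz = get_depth([640, 200], kk, dd=15.0)
    print(xyz)  # [x, y, z] whose Euclidean norm equals dd = 15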
125
src/utils/kitti.py
Normal file
@ -0,0 +1,125 @@
import numpy as np
import copy

from utils.camera import pixel_to_camera, get_keypoints
from eval.geom_baseline import compute_distance_single


def eval_geometric(uv_kps, uv_centers, uv_shoulders, kk, average_y=0.48):
    """
    Evaluate geometric distance
    """
    xy_centers = []
    dds_geom = []
    for idx, _ in enumerate(uv_centers):
        uv_center = copy.deepcopy(uv_centers[idx])
        uv_center.append(1)
        uv_shoulder = copy.deepcopy(uv_shoulders[idx])
        uv_shoulder.append(1)
        uv_kp = uv_kps[idx]
        xy_center = pixel_to_camera(uv_center, kk, 1)
        xy_centers.append(xy_center.tolist())

        uu_2, vv_2 = get_keypoints(uv_kp[0], uv_kp[1], mode='hip')
        uv_hip = [uu_2, vv_2, 1]

        zz, _ = compute_distance_single(uv_shoulder, uv_hip, kk, average_y)
        xyz_center = np.array([xy_center[0], xy_center[1], zz])
        dd_geom = float(np.linalg.norm(xyz_center))
        dds_geom.append(dd_geom)

    return dds_geom, xy_centers


def get_calibration(path_txt):
    """Read calibration parameters from txt file:
    For the left color camera we use P2 which is K * [I|t]

    P = [fu, 0, x0, fu*t1+x0*t3
         0, fv, y0, fv*t2+y0*t3
         0,  0,  1, t3]

    check also http://ksimek.github.io/2013/08/13/intrinsic/

    Simple case test:
    xyz = np.array([2, 3, 30, 1]).reshape(4, 1)
    xyz_2 = xyz[0:-1] + tt
    uv_temp = np.dot(kk, xyz_2)
    uv_1 = uv_temp / uv_temp[-1]
    kk_1 = np.linalg.inv(kk)
    xyz_temp2 = np.dot(kk_1, uv_1)
    xyz_new_2 = xyz_temp2 * xyz_2[2]
    xyz_fin_2 = xyz_new_2 - tt
    """

    with open(path_txt, "r") as ff:
        file = ff.readlines()
    p2_str = file[2].split()[1:]
    p2_list = [float(xx) for xx in p2_str]
    p2 = np.array(p2_list).reshape(3, 4)

    kk = p2[:, :-1]
    f_x = kk[0, 0]
    f_y = kk[1, 1]
    x0 = kk[0, 2]  # principal point: last column of the first two rows
    y0 = kk[1, 2]
    aa = p2[0, 3]
    bb = p2[1, 3]
    t3 = p2[2, 3]
    t1 = (aa - x0 * t3) / f_x
    t2 = (bb - y0 * t3) / f_y
    tt = np.array([t1, t2, t3]).reshape(3, 1)
    return kk, tt


def get_simplified_calibration(path_txt):

    with open(path_txt, "r") as ff:
        file = ff.readlines()

    for line in file:
        if line[:4] == 'K_02':
            kk_str = line[4:].split()[1:]
            kk_list = [float(xx) for xx in kk_str]
            kk = np.array(kk_list).reshape(3, 3).tolist()
            return kk

    raise ValueError('Matrix K_02 not found in the file')


def check_conditions(line, mode, thresh=0.5):

    """Check conditions of our or m3d txt file"""

    check = False
    assert mode in ('gt', 'm3d', '3dop', 'our'), "Type not recognized"

    if mode == 'm3d' or mode == '3dop':
        conf = line.split()[15]
        if line[:10] == 'pedestrian' and float(conf) >= thresh:
            check = True

    elif mode == 'gt':
        if line[:10] == 'Pedestrian':
            check = True

    elif mode == 'our':
        if line[10] >= thresh:  # here line is expected to be already parsed into values
            check = True

    return check


def get_category(box, trunc, occ):

    hh = box[3] - box[1]

    if hh >= 40 and trunc <= 0.15 and occ <= 0:
        cat = 'easy'
    elif trunc <= 0.3 and occ <= 1:
        cat = 'moderate'
    else:
        cat = 'hard'

    return cat
207
src/utils/misc.py
Normal file
207
src/utils/misc.py
Normal file
@ -0,0 +1,207 @@
|
||||
|
||||
import numpy as np
|
||||
import torch
|
||||
import time
|
||||
import logging
|
||||
# from shapely.geometry import box as Sbox
|
||||
|
||||
def set_logger(log_path):
|
||||
"""Set the logger to log info in terminal and file `log_path`.
|
||||
```
|
||||
logging.info("Starting training...")
|
||||
```
|
||||
Args:
|
||||
log_path: (string) where to log
|
||||
"""
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.setLevel(logging.INFO)
|
||||
|
||||
# Logging to a file
|
||||
file_handler = logging.FileHandler(log_path)
|
||||
file_handler.setFormatter(logging.Formatter('%(asctime)s:%(levelname)s: %(message)s'))
|
||||
logger.addHandler(file_handler)
|
||||
|
||||
# Logging to console
|
||||
stream_handler = logging.StreamHandler()
|
||||
stream_handler.setFormatter(logging.Formatter('%(message)s'))
|
||||
logger.addHandler(stream_handler)
|
||||
|
||||
return logger
|
||||
|
||||
|
||||
def calculate_iou(box1, box2):
|
||||
|
||||
# Calculate the (x1, y1, x2, y2) coordinates of the intersection of box1 and box2. Calculate its Area.
|
||||
xi1 = max(box1[0], box2[0])
|
||||
yi1 = max(box1[1], box2[1])
|
||||
xi2 = min(box1[2], box2[2])
|
||||
yi2 = min(box1[3], box2[3])
|
||||
inter_area = max((xi2 - xi1), 0) * max((yi2 - yi1), 0) # Max keeps into account not overlapping box
|
||||
|
||||
# Calculate the Union area by using Formula: Union(A,B) = A + B - Inter(A,B)
|
||||
box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
|
||||
box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
|
||||
union_area = box1_area + box2_area - inter_area
|
||||
|
||||
# compute the IoU
|
||||
iou = inter_area / union_area
|
||||
|
||||
return iou
|
||||
|
||||
|
||||
def get_idx_max(box, boxes_gt):
|
||||
|
||||
"""Compute and save errors between a single box and the gt box which match"""
|
||||
|
||||
iou_max = 0
|
||||
idx_max = 0
|
||||
for idx_gt, box_gt in enumerate(boxes_gt):
|
||||
iou = calculate_iou(box, box_gt)
|
||||
if iou > iou_max:
|
||||
idx_max = idx_gt
|
||||
iou_max = iou
|
||||
|
||||
return idx_max, iou_max
|
||||
|
||||
|
||||
def reparametrize_box3d(box):
|
||||
"""Reparametrized 3D box in the XZ plane and add the height"""
|
||||
|
||||
hh, ww, ll = box[0:3]
|
||||
x_c, y_c, z_c = box[3:6]
|
||||
|
||||
x1 = x_c - ll/2
|
||||
z1 = z_c - ww/2
|
||||
x2 = x_c + ll/2
|
||||
z2 = z_c + ww / 2
|
||||
|
||||
return [x1, z1, x2, z2, hh]
|
||||
|
||||
|
||||
# def calculate_iou3d(box3d_1, box3d_2):
|
||||
# """3D intersection over union. Boxes are parametrized as x1, z1, x2, z2, hh
|
||||
# We compute 2d iou in the birds plane and then add a factor for height differences (0-1)"""
|
||||
#
|
||||
# poly1 = Sbox(box3d_1[0], box3d_1[1], box3d_1[2], box3d_1[3])
|
||||
# poly2 = Sbox(box3d_2[0], box3d_2[1], box3d_2[2], box3d_2[3])
|
||||
#
|
||||
# inter_2d = poly1.intersection(poly2).area
|
||||
# union_2d = poly1.area + poly2.area - inter_2d
|
||||
#
|
||||
# # height_factor = 1 - abs(box3d_1[4] - box3d_2[4]) / max(box3d_1[4], box3d_2[4])
|
||||
#
|
||||
# #
|
||||
# iou_3d = inter_2d / union_2d # * height_factor
|
||||
#
|
||||
# return iou_3d
|
||||
|
||||
|
||||
def laplace_sampling(outputs, n_samples):
|
||||
|
||||
# np.random.seed(1)
|
||||
t0 = time.time()
|
||||
mu = outputs[:, 0]
|
||||
bi = torch.abs(outputs[:, 1])
|
||||
|
||||
|
||||
# Analytical
|
||||
# uu = np.random.uniform(low=-0.5, high=0.5, size=mu.shape[0])
|
||||
# xx = mu - bi * np.sign(uu) * np.log(1 - 2 * np.abs(uu))
|
||||
|
||||
# Sampling
|
||||
cuda_check = outputs.is_cuda
|
||||
|
||||
if cuda_check:
|
||||
get_device = outputs.get_device()
|
||||
device = torch.device(type="cuda", index=get_device)
|
||||
else:
|
||||
device = torch.device("cpu")
|
||||
t1 = time.time()
|
||||
|
||||
xxs = torch.empty((0, mu.shape[0])).to(device)
|
||||
t2 = time.time()
|
||||
|
||||
laplace = torch.distributions.Laplace(mu, bi)
|
||||
t3 = time.time()
|
||||
for ii in range(1):
|
||||
xx = laplace.sample((n_samples,))
|
||||
t4a = time.time()
|
||||
xxs = torch.cat((xxs, xx.view(n_samples, -1)), 0)
|
||||
t4 = time.time()
|
||||
|
||||
# time_tot = t4 - t0
|
||||
# time_1 = t1 - t0
|
||||
# time_2 = t2 - t1
|
||||
# time_3 = t3 - t2
|
||||
# time_4a = t4a - t3
|
||||
# time_4 = t4 - t3
|
||||
# print("Time 1: {:.1f}%".format(time_1 / time_tot * 100))
|
||||
# print("Time 2: {:.1f}%".format(time_2 / time_tot * 100))
|
||||
# print("Time 3: {:.1f}%".format(time_3 / time_tot * 100))
|
||||
# print("Time 4a: {:.1f}%".format(time_4a / time_tot * 100))
|
||||
# print("Time 4: {:.1f}%".format(time_4 / time_tot * 100))
|
||||
return xxs
|
||||
|
||||
|
||||
def epistemic_variance(total_outputs):
|
||||
"""Compute epistemic variance"""
|
||||
|
||||
# var_y = np.sum(total_outputs**2, axis=0) / total_outputs.shape[0] - (np.mean(total_outputs, axis=0))**2
|
||||
var_y = np.var(total_outputs, axis=0)
|
||||
lb = np.quantile(a=total_outputs, q=0.25, axis=0)
|
||||
up = np.quantile(a=total_outputs, q=0.75, axis=0)
|
||||
var_new = (up-lb)
|
||||
|
||||
return var_y, var_new
|
||||
|
||||
|
||||
def append_cluster(dic_jo, phase, xx, dd, kps):
|
||||
|
||||
"""Append the annotation based on its distance"""
|
||||
|
||||
# if dd <= 6:
|
||||
# dic_jo[phase]['clst']['6']['kps'].append(kps)
|
||||
# dic_jo[phase]['clst']['6']['X'].append(xx)
|
||||
# dic_jo[phase]['clst']['6']['Y'].append([dd]) # Trick to make it (nn,1) instead of (nn, )
|
||||
|
||||
if dd <= 10:
|
||||
dic_jo[phase]['clst']['10']['kps'].append(kps)
|
||||
dic_jo[phase]['clst']['10']['X'].append(xx)
|
||||
dic_jo[phase]['clst']['10']['Y'].append([dd])
|
||||
|
||||
# elif dd <= 15:
|
||||
# dic_jo[phase]['clst']['15']['kps'].append(kps)
|
||||
# dic_jo[phase]['clst']['15']['X'].append(xx)
|
||||
# dic_jo[phase]['clst']['15']['Y'].append([dd])
|
||||
|
||||
elif dd <= 20:
|
||||
dic_jo[phase]['clst']['20']['kps'].append(kps)
|
||||
dic_jo[phase]['clst']['20']['X'].append(xx)
|
||||
dic_jo[phase]['clst']['20']['Y'].append([dd])
|
||||
|
||||
# elif dd <= 25:
|
||||
# dic_jo[phase]['clst']['25']['kps'].append(kps)
|
||||
# dic_jo[phase]['clst']['25']['X'].append(xx)
|
||||
# dic_jo[phase]['clst']['25']['Y'].append([dd])
|
||||
|
||||
elif dd <= 30:
|
||||
dic_jo[phase]['clst']['30']['kps'].append(kps)
|
||||
dic_jo[phase]['clst']['30']['X'].append(xx)
|
||||
dic_jo[phase]['clst']['30']['Y'].append([dd])
|
||||
|
||||
# elif dd <= 40:
|
||||
# dic_jo[phase]['clst']['40']['kps'].append(kps)
|
||||
# dic_jo[phase]['clst']['40']['X'].append(xx)
|
||||
# dic_jo[phase]['clst']['40']['Y'].append([dd])
|
||||
#
|
||||
# elif dd <= 50:
|
||||
# dic_jo[phase]['clst']['50']['kps'].append(kps)
|
||||
# dic_jo[phase]['clst']['50']['X'].append(xx)
|
||||
# dic_jo[phase]['clst']['50']['Y'].append([dd])
|
||||
|
||||
else:
|
||||
dic_jo[phase]['clst']['>30']['kps'].append(kps)
|
||||
dic_jo[phase]['clst']['>30']['X'].append(xx)
|
||||
dic_jo[phase]['clst']['>30']['Y'].append([dd])
|
||||
|
||||
|
||||
107
src/utils/normalize.py
Normal file
107
src/utils/normalize.py
Normal file
@ -0,0 +1,107 @@
|
||||
|
||||
import torch
|
||||
import numpy as np
|
||||
|
||||
|
||||
def normalize_labels(labels, dic_norm):
|
||||
"""Apply the log to remove right-skewness and then apply Gaussian Normalization"""
|
||||
|
||||
labels_new = torch.div(torch.log(labels) - dic_norm['mean']['Y'], dic_norm['std']['Y'])
|
||||
return labels_new
|
||||
|
||||
|
||||
def unnormalize_labels(labels, dic_norm):
|
||||
"""Apply the log to remove right-skewness and then apply Gaussian Normalization"""
|
||||
|
||||
return torch.exp(labels * dic_norm['std']['Y'] + dic_norm['mean']['Y'])
|
||||
|
||||
|
||||
def unnormalize_outputs(outputs, dic_norm, dim, normalize):
|
||||
"""Unnnormalize outputs, with an option to only pass from si to bi"""
|
||||
|
||||
if dim == 1:
|
||||
assert normalize, "Inconsistency"
|
||||
return unnormalize_labels(outputs, dic_norm)
|
||||
|
||||
else:
|
||||
mu = outputs[:, 0:1]
|
||||
si = outputs[:, 1:2]
|
||||
bi = torch.exp(si)
|
||||
|
||||
if normalize:
|
||||
mu = unnormalize_labels(mu, dic_norm)
|
||||
bi = torch.exp(bi * dic_norm['std']['Y'])
|
||||
|
||||
return torch.cat((mu, bi), 1)
|
||||
|
||||
|
||||
def unnormalize_bi(outputs):
|
||||
"""Unnormalize relative bi of a nunmpy array"""
|
||||
|
||||
outputs[:, 1] = torch.exp(outputs[:, 1]) * outputs[:, 0]
|
||||
return outputs
|
||||
|
||||
|
||||
# def normalize_arrays_jo(dic_jo):
|
||||
# """Normalize according to the mean and std of each keypoint in the training dataset
|
||||
# PS normalization of training also for test and val"""
|
||||
#
|
||||
# # Normalize
|
||||
# phase = 'train'
|
||||
# kps_orig_tr = np.array(dic_jo[phase]['X'])
|
||||
# # dd_orig_tr = np.array(dic_jo[phase]['Y']).reshape(-1, 1)
|
||||
# kps_mean = np.mean(kps_orig_tr, axis=0)
|
||||
# plt.hist(kps_orig_tr, bins=100)
|
||||
# plt.show()
|
||||
# kps_std = np.std(kps_orig_tr, axis=0)
|
||||
# # dd_mean = np.mean(dd_orig_tr, axis=0)
|
||||
# # dd_std = np.std(dd_orig_tr, axis=0)
|
||||
#
|
||||
# for phase in dic_jo:
|
||||
#
|
||||
# # Compute the normalized arrays
|
||||
# kps_orig = np.array(dic_jo[phase]['X'])
|
||||
# dd_orig = np.array(dic_jo[phase]['Y']).reshape(-1, 1)
|
||||
# kps_norm = np.divide((kps_orig - kps_mean), kps_std)
|
||||
#
|
||||
# # dd_norm = np.divide((dd_orig - dd_mean), dd_std) # ! No normalization on the output
|
||||
#
|
||||
# # Substitute the new values in the dictionary and save the mean and std
|
||||
# dic_jo[phase]['X'] = kps_norm.tolist()
|
||||
# dic_jo[phase]['mean']['X'] = kps_mean.tolist()
|
||||
# dic_jo[phase]['std']['X'] = kps_std.tolist()
|
||||
#
|
||||
# dic_jo[phase]['Y'] = dd_orig.tolist()
|
||||
# # dic_jo[phase]['mean']['Y'] = float(dd_mean)
|
||||
# # dic_jo[phase]['std']['Y'] = float(dd_std)
|
||||
#
|
||||
# # Normalize all the clusters
|
||||
# for clst in dic_jo[phase]['clst']:
|
||||
#
|
||||
# # Extract
|
||||
# kps_orig = np.array(dic_jo[phase]['clst'][clst]['X'])
|
||||
# dd_orig = np.array(dic_jo[phase]['clst'][clst]['Y']).reshape(-1, 1)
|
||||
# # Normalize
|
||||
# kps_norm = np.divide((kps_orig - kps_mean), kps_std)
|
||||
#
|
||||
# # dd_norm = np.divide((dd_orig - dd_mean), dd_std) #! No normalization on the output
|
||||
#
|
||||
# # Put back
|
||||
# dic_jo[phase]['clst'][clst]['X'] = kps_norm.tolist()
|
||||
# dic_jo[phase]['clst'][clst]['Y'] = dd_orig.tolist()
|
||||
#
|
||||
# return dic_jo
|
||||
#
|
||||
#
|
||||
# def check_cluster_dim(dic_jo):
|
||||
# """ Check that the sum of the clusters corresponds to all annotations"""
|
||||
#
|
||||
# for phase in ['train', 'val', 'test']:
|
||||
# cnt_clst = 0
|
||||
# cnt_all = len(dic_jo[phase]['X'])
|
||||
# for clst in dic_jo[phase]['clst']:
|
||||
# cnt_clst += len(dic_jo[phase]['clst'][clst]['X'])
|
||||
# assert cnt_all == cnt_clst
|
||||
|
||||
|
||||
|
||||
110
src/utils/nuscenes.py
Normal file
110
src/utils/nuscenes.py
Normal file
@ -0,0 +1,110 @@
|
||||
|
||||
import random
|
||||
import json
|
||||
import os
|
||||
import numpy as np
|
||||
|
||||
|
||||
def get_unique_tokens(list_fin):
|
||||
"""
|
||||
list of json files --> list of unique scene tokens
|
||||
"""
|
||||
list_token_scene = []
|
||||
|
||||
# Open one json file at a time
|
||||
for name_fin in list_fin:
|
||||
with open(name_fin, 'r') as f:
|
||||
dict_fin = json.load(f)
|
||||
|
||||
# Check if the token scene is already in the list and if not add it
|
||||
if dict_fin['token_scene'] not in list_token_scene:
|
||||
list_token_scene.append(dict_fin['token_scene'])
|
||||
|
||||
return list_token_scene
|
||||
|
||||
|
||||
def split_scenes(list_token_scene, tr, val, dir_main, save=False, load=True):
|
||||
"""
|
||||
Split the list according tr, val percentages (test percentage is a consequence) after shuffling the order
|
||||
"""
|
||||
|
||||
path_split = os.path.join(dir_main, 'scenes', 'split_scenes.json')
|
||||
|
||||
if save:
|
||||
random.seed(1)
|
||||
random.shuffle(list_token_scene) # it shuffles in place
|
||||
n_scenes = len(list_token_scene)
|
||||
n_train = round(n_scenes * tr / 100)
|
||||
n_val = round(n_scenes * val / 100)
|
||||
list_train = list_token_scene[0: n_train]
|
||||
list_val = list_token_scene[n_train: n_train + n_val]
|
||||
list_test = list_token_scene[n_train + n_val:]
|
||||
|
||||
dic_split = {'train': list_train, 'val': list_val, 'test': list_test}
|
||||
with open(path_split, 'w') as f:
|
||||
json.dump(dic_split, f)
|
||||
|
||||
if load:
|
||||
with open(path_split, 'r') as f:
|
||||
dic_split = json.load(f)
|
||||
|
||||
return dic_split
|
||||
|
||||
|
||||
def select_categories(cat):
|
||||
"""
|
||||
Choose the categories to extract annotations from
|
||||
"""
|
||||
assert cat == 'person' or cat == 'all' or cat == 'car'
|
||||
|
||||
if cat == 'person':
|
||||
categories = ['human.pedestrian']
|
||||
|
||||
elif cat == 'all':
|
||||
categories = ['human.pedestrian',
|
||||
'vehicle.bicycle', 'vehicle.motorcycle']
|
||||
|
||||
elif cat == 'car':
|
||||
categories = ['vehicle']
|
||||
|
||||
return categories
|
||||
|
||||
|
||||
def update_with_tokens(dict_gt, nusc, token_sd):
|
||||
"""
|
||||
Update with tokens corresponding to the token_sd
|
||||
"""
|
||||
|
||||
table_sample_data = nusc.get('sample_data', token_sd) # Extract the whole record to get the sample token
|
||||
token_sample = table_sample_data['sample_token'] # Extract the sample_token from the table
|
||||
table_sample = nusc.get('sample', token_sample) # Get the record of the sample
|
||||
token_scene = table_sample['scene_token']
|
||||
dict_gt['token_sample_data'] = token_sd
|
||||
dict_gt['token_sample'] = token_sample
|
||||
dict_gt['token_scene'] = token_scene
|
||||
return dict_gt
|
||||
|
||||
|
||||
def update_with_box(dict_gt, box):
|
||||
|
||||
bbox = np.zeros(7, )
|
||||
flag_child = False
|
||||
|
||||
# Save the 3D bbox
|
||||
bbox[0:3] = box.center
|
||||
bbox[3:6] = box.wlh
|
||||
bbox[6] = box.orientation.degrees
|
||||
dict_gt['boxes'].append(bbox.tolist()) # Save as list to be serializable by a json file
|
||||
if box.name == 'human.pedestrian.child':
|
||||
flag_child = True
|
||||
|
||||
return dict_gt, flag_child
|
||||
|
||||
# def merge_gt_pif(dic_gt, im_path):
|
||||
#
|
||||
# # Extract ground truth boxes
|
||||
# boxes_2d, _ = self.project_3d(dic_gt, mode='box_center') # Project the 3D gt into the image plane
|
||||
# boxes_gt = [np.array([box[0][0], box[0][1], box[1][0], box[1][1]]) for box in boxes_2d]
|
||||
# # Match each annotation and save them in the json file
|
||||
# self.merge_gt_pif_single(dic_gt, boxes_gt, boxes_pif, keypoints, basename,
|
||||
# iou_thresh=self.iou_thresh)
|
||||
100
src/utils/pifpaf.py
Normal file
100
src/utils/pifpaf.py
Normal file
@ -0,0 +1,100 @@
|
||||
|
||||
import numpy as np
|
||||
from utils.camera import preprocess_single, get_keypoints, pixel_to_camera
|
||||
|
||||
|
||||
def preprocess_pif(annotations, im_size=None):
|
||||
"""
|
||||
Preprocess pif annotations:
|
||||
1. enlarge the box of 10%
|
||||
2. Constraint it inside the image (if image_size provided)
|
||||
"""
|
||||
|
||||
boxes = []
|
||||
keypoints = []
|
||||
|
||||
for dic in annotations:
|
||||
box = dic['bbox']
|
||||
if box[3] < 0.5: # Check for no detections (boxes 0,0,0,0)
|
||||
return [], []
|
||||
|
||||
else:
|
||||
kps = prepare_pif_kps(dic['keypoints'])
|
||||
conf = float(np.mean(np.array(kps[2])))
|
||||
|
||||
# Add 10% for y
|
||||
delta_h = (box[3] - box[1]) / 10
|
||||
delta_w = (box[2] - box[0]) / 10
|
||||
assert delta_h > 0 and delta_w > 0, "Bounding box <=0"
|
||||
box[0] -= delta_w
|
||||
box[1] -= delta_h
|
||||
box[2] += delta_w
|
||||
box[3] += delta_h
|
||||
|
||||
# Put the box inside the image
|
||||
if im_size is not None:
|
||||
box[0] = max(0, box[0])
|
||||
box[1] = max(0, box[1])
|
||||
box[2] = min(box[2], im_size[0])
|
||||
box[3] = min(box[3], im_size[1])
|
||||
|
||||
box.append(conf)
|
||||
boxes.append(box)
|
||||
keypoints.append(kps)
|
||||
|
||||
return boxes, keypoints
|
||||
|
||||
|
||||
def get_input_data(boxes, keypoints, kk, left_to_right=False):
|
||||
inputs = []
|
||||
xy_centers = []
|
||||
uv_boxes = []
|
||||
uv_centers = []
|
||||
uv_shoulders = []
|
||||
uv_kps = []
|
||||
xy_kps = []
|
||||
|
||||
if left_to_right: # Order boxes from left to right
|
||||
ordered = np.argsort([xx[0] for xx in boxes])
|
||||
|
||||
else: # Order boxes from most to less confident
|
||||
confs = []
|
||||
for idx, box in enumerate(boxes):
|
||||
confs.append(box[4])
|
||||
ordered = np.argsort(confs).tolist()[::-1]
|
||||
|
||||
for idx in ordered:
|
||||
kps = keypoints[idx]
|
||||
uv_kps.append(kps)
|
||||
uv_boxes.append(boxes[idx])
|
||||
|
||||
uu_c, vv_c = get_keypoints(kps[0], kps[1], "center")
|
||||
uv_centers.append([round(uu_c), round(vv_c)])
|
||||
xy_center = pixel_to_camera(np.array([uu_c, vv_c, 1]), kk, 1)
|
||||
xy_centers.append(xy_center)
|
||||
|
||||
uu_1, vv_1 = get_keypoints(kps[0], kps[1], "shoulder")
|
||||
uv_shoulders.append([round(uu_1), round(vv_1)])
|
||||
|
||||
# 2 steps of input normalization for each instance
|
||||
kps_prep, kps_orig = preprocess_single(kps, kk)
|
||||
inputs.append(kps_prep)
|
||||
xy_kps.append(kps_orig)
|
||||
|
||||
return (inputs, xy_kps), (uv_kps, uv_boxes, uv_centers, uv_shoulders)
|
||||
|
||||
|
||||
def prepare_pif_kps(kps_in):
|
||||
"""Convert from a list of 51 to a list of 3, 17"""
|
||||
|
||||
keypoints = np.array(kps_in).reshape(-1, 3).tolist()
|
||||
xxs = []
|
||||
yys = []
|
||||
ccs = []
|
||||
|
||||
for kp_triple in keypoints:
|
||||
xxs.append(kp_triple[0])
|
||||
yys.append(kp_triple[1])
|
||||
ccs.append(kp_triple[2])
|
||||
|
||||
return [xxs, yys, ccs]
|
||||
139
src/visuals/paper.py
Normal file
139
src/visuals/paper.py
Normal file
@ -0,0 +1,139 @@
|
||||
|
||||
import numpy as np
|
||||
import os
|
||||
import matplotlib.pyplot as plt
|
||||
from matplotlib.patches import Ellipse
|
||||
from visuals.printer import get_angle
|
||||
from visuals.printer import get_confidence
|
||||
|
||||
|
||||
def paper():
|
||||
"""Print paper figures"""
|
||||
dir_out = os.path.join('data', 'all_images', 'paper')
|
||||
|
||||
method = True
|
||||
task_error = True
|
||||
|
||||
# Pull figure
|
||||
if method:
|
||||
fig_name = 'output_method.png'
|
||||
z_max = 5
|
||||
x_max = z_max * 25 / 30
|
||||
|
||||
z_1 = 2
|
||||
z_2 = 4
|
||||
x_1 = 1
|
||||
x_2 = -2
|
||||
std_1 = 0.75
|
||||
std_2 = 1.5
|
||||
angle_1 = get_angle(x_1, z_1)
|
||||
angle_2 = get_angle(x_2, z_2)
|
||||
|
||||
(x_1_down, x_1_up), (z_1_down, z_1_up) = get_confidence(x_1, z_1, std_1)
|
||||
(x_2_down, x_2_up), (z_2_down, z_2_up) = get_confidence(x_2, z_2, std_2)
|
||||
#
|
||||
# fig = plt.figure(0)
|
||||
# ax = fig.add_subplot(1, 1, 1)
|
||||
#
|
||||
# ell_1 = Ellipse((x_1, z_1), width=std_1 * 2, height=0.3, angle=angle_1, color='b', fill=False)
|
||||
#
|
||||
# ell_2 = Ellipse((x_2, z_2), width=std_2 * 2, height=0.3, angle=angle_2, color='b', fill=False)
|
||||
#
|
||||
# ax.add_patch(ell_1)
|
||||
# ax.add_patch(ell_2)
|
||||
# plt.plot(x_1_down, z_1_down, marker='o', markersize=8, color='salmon')
|
||||
# plt.plot(x_1, z_1, 'go', markersize=8)
|
||||
# plt.plot(x_1_up, z_1_up, 'o', markersize=8, color='cornflowerblue')
|
||||
#
|
||||
# plt.plot(x_2_down, z_2_down, marker='o', markersize=8, color='salmon')
|
||||
# plt.plot(x_2, z_2, 'go', markersize=8)
|
||||
# plt.plot(x_2_up, z_2_up, 'bo', markersize=8, color='cornflowerblue')
|
||||
#
|
||||
# plt.plot([0, x_max], [0, z_max], 'k--')
|
||||
# plt.plot([0, -x_max], [0, z_max], 'k--')
|
||||
# plt.xticks([])
|
||||
# plt.yticks([])
|
||||
# plt.xlabel('X [m]')
|
||||
# plt.ylabel('Z [m]')
|
||||
# plt.show()
|
||||
# plt.close()
|
||||
|
||||
fig = plt.figure(1)
|
||||
ax = fig.add_subplot(1, 1, 1)
|
||||
|
||||
ell_3 = Ellipse((0, 2), width=std_1 * 2, height=0.3, angle=-90, color='b', fill=False, linewidth=2.5)
|
||||
ell_4 = Ellipse((0, 2), width=std_1 * 3, height=0.3, angle=-90, color='r', fill=False,
|
||||
linestyle='dashed', linewidth=2.5)
|
||||
ax.add_patch(ell_4)
|
||||
ax.add_patch(ell_3)
|
||||
plt.plot(0, 2, marker='o', color='skyblue', markersize=9)
|
||||
plt.plot([0, 3], [0, 4], 'k--')
|
||||
plt.plot([0, -3], [0, 4], 'k--')
|
||||
plt.xlim(-3, 3)
|
||||
plt.ylim(0, 3.5)
|
||||
plt.xticks([])
|
||||
plt.yticks([])
|
||||
plt.xlabel('X [m]')
|
||||
plt.ylabel('Z [m]')
|
||||
plt.savefig(os.path.join(dir_out, fig_name))
|
||||
plt.show()
|
||||
plt.close()
|
||||
|
||||
# Task error figure
|
||||
if task_error:
|
||||
plt.figure(2)
|
||||
fig_name = 'task_error.png'
|
||||
xx = np.linspace(0, 40, 100)
|
||||
mm_male = 7 / 178
|
||||
mm_female = 7 / 165
|
||||
mm_young_male = mm_male + (178-164) / 178
|
||||
mm_young_female = mm_female + (165-156) / 165
|
||||
mm_gender = gmm()
|
||||
yy_male = target_error(xx, mm_male)
|
||||
yy_female = target_error(xx, mm_female)
|
||||
yy_young_male = target_error(xx, mm_young_male)
|
||||
yy_young_female = target_error(xx, mm_young_female)
|
||||
yy_gender = target_error(xx, mm_gender)
|
||||
yy_gps = np.linspace(5., 5., xx.shape[0])
|
||||
plt.grid(linewidth=0.3)
|
||||
plt.plot(xx, yy_gps, color='y', label='GPS')
|
||||
plt.plot(xx, yy_young_male, linestyle='dotted',linewidth=2.1, color='b', label='Adult/young male')
|
||||
plt.plot(xx, yy_young_female, linestyle='dotted',linewidth=2.1, color='darkorange', label='Adult/young female')
|
||||
plt.plot(xx, yy_gender, '--', color='lightgreen', linewidth=2.8, label='Generic adult (task error)')
|
||||
plt.plot(xx, yy_female, '-.', linewidth=1.7, color='darkorange', label='Adult female')
|
||||
plt.plot(xx, yy_male, '-.', linewidth=1.7, color='b', label='Adult male')
|
||||
|
||||
plt.xlim(np.min(xx), np.max(xx))
|
||||
plt.xlabel("Distance from the camera [m]")
|
||||
plt.ylabel("Localization error due to human height variation [m]")
|
||||
plt.legend(loc=(0.01, 0.55)) # Location from 0 to 1 from lower left
|
||||
plt.savefig(os.path.join(dir_out, fig_name))
|
||||
plt.show()
|
||||
plt.close()
|
||||
|
||||
|
||||
|
||||
def target_error(xx, mm):
|
||||
return mm * xx
|
||||
|
||||
def gmm():
|
||||
mu_men = 178
|
||||
std_men = 7
|
||||
mu_women = 165
|
||||
std_women = 7
|
||||
N_men = np.random.normal(mu_men, std_men, 100000)
|
||||
N_women = np.random.normal(mu_women, std_women, 100000)
|
||||
N_gmm = np.concatenate((N_men, N_women))
|
||||
mu_gmm = np.mean(N_gmm)
|
||||
std_gmm = np.std(N_gmm)
|
||||
# sns.distplot(N_men, hist=False, rug=False, label="Men")
|
||||
# sns.distplot(N_women, hist=False, rug=False, label="Women")
|
||||
# sns.distplot(N_gmm, hist=False, rug=False, label="GMM")
|
||||
# plt.xlabel("X [cm]")
|
||||
# plt.ylabel("Height distributions of men and women")
|
||||
# plt.legend()
|
||||
# plt.show()
|
||||
print("Variace of GMM distribution: {:.2f}".format(std_gmm))
|
||||
mm_gender = std_gmm / mu_gmm
|
||||
|
||||
return mm_gender
|
||||
281
src/visuals/printer.py
Normal file
281
src/visuals/printer.py
Normal file
@ -0,0 +1,281 @@
|
||||
|
||||
import os
|
||||
import math
|
||||
import numpy as np
|
||||
import matplotlib.pyplot as plt
|
||||
import matplotlib.cm as cm
|
||||
import matplotlib
|
||||
from mpl_toolkits.axes_grid1 import make_axes_locatable
|
||||
from matplotlib.patches import Ellipse, Circle
|
||||
import cv2
|
||||
from collections import OrderedDict
|
||||
from PIL import Image
|
||||
|
||||
|
||||
class Printer:
|
||||
"""
|
||||
Print results on images: birds eye view and computed distance
|
||||
"""
|
||||
|
||||
def __init__(self, image_path, output_path, dic_ann, kk, output_types, y_scale=1, show=False,
|
||||
draw_kps=False, text=True, legend=True, epistemic=False, z_max=30, fig_width=10):
|
||||
|
||||
self.kk = kk
|
||||
self.y_scale = y_scale # 1.85 for kitti combined
|
||||
self.output_types = output_types
|
||||
self.show = show
|
||||
self.draw_kps = draw_kps
|
||||
self.text = text
|
||||
self.epistemic = epistemic
|
||||
self.legend = legend
|
||||
self.z_max = z_max
|
||||
self.fig_width = fig_width
|
||||
|
||||
from utils.camera import pixel_to_camera, get_depth
|
||||
self.pixel_to_camera = pixel_to_camera
|
||||
self.get_depth = get_depth
|
||||
|
||||
# Define the output dir
|
||||
self.path_out = output_path
|
||||
|
||||
# Define the birds eye view
|
||||
self.xx_gt = [xx[0] for xx in dic_ann['xyz_real']]
|
||||
self.zz_gt = [xx[2] for xx in dic_ann['xyz_real']]
|
||||
self.xx_pred = [xx[0] for xx in dic_ann['xyz_pred']]
|
||||
self.zz_pred = [xx[2] for xx in dic_ann['xyz_pred']]
|
||||
self.dds_real = dic_ann['dds_real']
|
||||
self.stds_ale = dic_ann['stds_ale']
|
||||
self.stds_ale_epi = dic_ann['stds_epi']
|
||||
self.uv_centers = dic_ann['uv_centers']
|
||||
self.uv_shoulders = dic_ann['uv_shoulders']
|
||||
self.uv_kps = dic_ann['uv_kps']
|
||||
|
||||
# Load the image
|
||||
with open(image_path, 'rb') as f:
|
||||
self.im = Image.open(f).convert('RGB')
|
||||
|
||||
# Resize image for aesthetic proportions
|
||||
if y_scale >= 1.02 or y_scale <= 0.98:
|
||||
ww = self.im.size[0]
|
||||
hh = self.im.size[1]
|
||||
self.im = self.im.resize((ww, round(hh * y_scale)))
|
||||
|
||||
self.uv_camera = (int(self.im.size[0] / 2), self.im.size[1])
|
||||
|
||||
def print(self):
|
||||
"""
|
||||
Show frontal, birds-eye-view or combined visualization
|
||||
With or without ground truth
|
||||
Either front and/or bird visualization or combined one
|
||||
"""
|
||||
# Parameters
|
||||
radius = 14
|
||||
radius_kps = 6
|
||||
fontsize_bv = 16
|
||||
fontsize = 18
|
||||
textcolor = 'darkorange'
|
||||
color_kps = 'yellow'
|
||||
|
||||
# Initialize combined
|
||||
if 'combined' in self.output_types:
|
||||
width = self.fig_width + 0.6 * self.fig_width
|
||||
height = self.fig_width * self.im.size[1] / self.im.size[0]
|
||||
fig_ar_1 = 1.7 # 1.3 before
|
||||
width_ratio = 1.9
|
||||
ext = '.combined.png'
|
||||
|
||||
fig, (ax1, ax0) = plt.subplots(1, 2, sharey=False, gridspec_kw={'width_ratios': [1, width_ratio]},
|
||||
figsize=(width, height))
|
||||
ax1.set_aspect(fig_ar_1)
|
||||
fig.set_tight_layout(True)
|
||||
fig.subplots_adjust(left=0.02, right=0.98, bottom=0, top=1, hspace=0, wspace=0.02)
|
||||
|
||||
assert 'front' not in self.output_types and 'bird' not in self.output_types, \
|
||||
"--combined arguments is not supported with other visualizations"
|
||||
|
||||
# Initialize front
|
||||
elif 'front' in self.output_types:
|
||||
width = self.fig_width
|
||||
height = self.fig_width * self.im.size[1] / self.im.size[0]
|
||||
|
||||
plt.figure(0)
|
||||
fig0, ax0 = plt.subplots(1, 1, figsize=(width, height))
|
||||
fig0.set_tight_layout(True)
|
||||
|
||||
# Create front
|
||||
if any(xx in self.output_types for xx in ['front', 'combined']):
|
||||
|
||||
ax0.set_axis_off()
|
||||
ax0.set_xlim(0, self.im.size[0])
|
||||
ax0.set_ylim(self.im.size[1], 0)
|
||||
ax0.imshow(self.im)
|
||||
z_min = 0
|
||||
bar_ticks = self.z_max // 5 + 1
|
||||
|
||||
cmap = cm.get_cmap('jet')
|
||||
num = 0
|
||||
for idx, uv in enumerate(self.uv_shoulders):
|
||||
|
||||
if self.draw_kps:
|
||||
ax0 = self.show_kps(ax0, self.uv_kps[idx], self.y_scale, radius_kps, color_kps)
|
||||
|
||||
elif self.zz_pred[idx] < self.z_max and 0 < self.zz_gt[idx] < self.z_max:
|
||||
color = cmap((self.zz_pred[idx] % self.z_max) / self.z_max)
|
||||
circle = Circle((uv[0], uv[1] * self.y_scale), radius=radius, color=color, fill=True)
|
||||
ax0.add_patch(circle)
|
||||
|
||||
if self.text:
|
||||
ax0.text(uv[0]+radius, uv[1] * self.y_scale - radius, str(num),
|
||||
fontsize=fontsize, color=textcolor, weight='bold')
|
||||
num += 1
|
||||
|
||||
ax0.get_xaxis().set_visible(False)
|
||||
ax0.get_yaxis().set_visible(False)
|
||||
|
||||
divider = make_axes_locatable(ax0)
|
||||
cax = divider.append_axes('right', size='3%', pad=0.05)
|
||||
|
||||
norm = matplotlib.colors.Normalize(vmin=z_min, vmax=self.z_max)
|
||||
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
|
||||
sm.set_array([])
|
||||
plt.colorbar(sm, ticks=np.linspace(z_min, self.z_max, bar_ticks),
|
||||
boundaries=np.arange(z_min - 0.05, self.z_max + 0.1, .1), cax=cax, label='Z [m]')
|
||||
# Save Front
|
||||
if 'front' in self.output_types:
|
||||
if self.show:
|
||||
plt.show()
|
||||
else:
|
||||
plt.savefig(self.path_out + '.front.png', bbox_inches='tight')
|
||||
|
||||
# Initialize Bird
|
||||
if 'bird' in self.output_types:
|
||||
plt.close()
|
||||
plt.figure(1)
|
||||
ext = '.bird.png'
|
||||
fig1, ax1 = plt.subplots(1, 1)
|
||||
fig1.set_tight_layout(True)
|
||||
|
||||
# Create bird or combine it with front)
|
||||
if any(xx in self.output_types for xx in ['bird', 'combined']):
|
||||
im_height = 375 # TODO change it
|
||||
uv_max = np.array([0, im_height, 1])
|
||||
xyz_max = self.pixel_to_camera(uv_max, self.kk, self.z_max)
|
||||
x_max = abs(xyz_max[0]) # shortcut to avoid oval circles in case of different kk
|
||||
|
||||
for idx, _ in enumerate(self.xx_gt):
|
||||
if 0 < self.zz_gt[idx] < self.z_max:
|
||||
target = get_target_error(self.dds_real[idx])
|
||||
|
||||
angle = get_angle(self.xx_gt[idx], self.zz_gt[idx])
|
||||
ellipse_real = Ellipse((self.xx_gt[idx], self.zz_gt[idx]), width=target * 2, height=1,
|
||||
angle=angle, color='lightgreen', fill=True, label="Task error")
|
||||
ax1.add_patch(ellipse_real)
|
||||
ax1.plot(self.xx_gt[idx], self.zz_gt[idx], 'kx', label="Ground truth", markersize=3)
|
||||
|
||||
# Print prediction and the real ground truth. Color of prediction depends if ground truth exists
|
||||
num = 0
|
||||
for idx, _ in enumerate(self.xx_pred):
|
||||
if 0 < self.zz_gt[idx] < self.z_max: # only the merging ones and inside the interval
|
||||
|
||||
angle = get_angle(self.xx_pred[idx], self.zz_pred[idx])
|
||||
ellipse_ale = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale[idx] * 2,
|
||||
height=1, angle=angle, color='b', fill=False, label="Aleatoric Uncertainty",
|
||||
linewidth=1.3)
|
||||
ellipse_var = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale_epi[idx] * 2,
|
||||
height=1, angle=angle, color='r', fill=False, label="Uncertainty", linewidth=1,
|
||||
linestyle='--')
|
||||
|
||||
ax1.add_patch(ellipse_ale)
|
||||
if self.epistemic:
|
||||
ax1.add_patch(ellipse_var)
|
||||
|
||||
ax1.plot(self.xx_pred[idx], self.zz_pred[idx], 'ro', label="Predicted", markersize=3)
|
||||
|
||||
# Plot the number
|
||||
if not self.draw_kps:
|
||||
(_, x_pos), (_, z_pos) = get_confidence(self.xx_pred[idx], self.zz_pred[idx], self.stds_ale_epi[idx])
|
||||
|
||||
if self.text:
|
||||
ax1.text(x_pos, z_pos, str(num), fontsize=fontsize_bv, color='darkorange')
|
||||
num += 1
|
||||
|
||||
# To avoid repetitions in the legend
|
||||
if self.legend:
|
||||
handles, labels = ax1.get_legend_handles_labels()
|
||||
by_label = OrderedDict(zip(labels, handles))
|
||||
ax1.legend(by_label.values(), by_label.keys())
|
||||
|
||||
# Adding field of view
|
||||
ax1.plot([0, x_max], [0, self.z_max], 'k--')
|
||||
ax1.plot([0, -x_max], [0, self.z_max], 'k--')
|
||||
ax1.set_ylim(0, self.z_max+1)
|
||||
# ax1.set_xmin([-10, 10])
|
||||
# ax1.set_ymin([0, 20])
|
||||
# x_max_field = min(x_max, z_max / 1.55) # ! Shortcut to avoid oval circles in case of different k
|
||||
# plt.axis([-x_max_field, x_max_field, 0, z_max])
|
||||
# plt.xticks([])
|
||||
# plt.title('Birds eye view')
|
||||
ax1.set_xlabel("X [m]")
|
||||
ax1.set_ylabel("Z [m]")
|
||||
|
||||
if self.show:
|
||||
plt.show()
|
||||
else:
|
||||
plt.savefig(self.path_out + ext, bbox_inches='tight')
|
||||
|
||||
if self.draw_kps:
|
||||
im = cv2.imread(self.path_out + ext)
|
||||
im = self.increase_brightness(im, value=30)
|
||||
hh = im.size[1]
|
||||
ww = im.size[0]
|
||||
im_new = im[0:hh, 0:round(ww/1.7)]
|
||||
cv2.imwrite(self.path_out, im_new)
|
||||
|
||||
plt.close('all')
|
||||
|
||||
def show_kps(self, ax0, uv_kp_single, y_scale, radius, color):
|
||||
|
||||
for idx, uu in enumerate(uv_kp_single[0]):
|
||||
vv = uv_kp_single[1][idx]
|
||||
circle = Circle((uu, vv * y_scale), radius=radius, color=color, fill=True)
|
||||
ax0.add_patch(circle)
|
||||
|
||||
return ax0
|
||||
|
||||
def increase_brightness(self, img, value=30):
|
||||
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
|
||||
h, s, v = cv2.split(hsv)
|
||||
|
||||
lim = 255 - value
|
||||
v[v > lim] = 255
|
||||
v[v <= lim] += value
|
||||
|
||||
final_hsv = cv2.merge((h, s, v))
|
||||
img = cv2.cvtColor(final_hsv, cv2.COLOR_HSV2BGR)
|
||||
return img
|
||||
|
||||
|
||||
def get_confidence(xx, zz, std):
|
||||
"""Obtain the points to plot the confidence of each annotation"""
|
||||
|
||||
theta = math.atan2(zz, xx)
|
||||
|
||||
delta_x = std * math.cos(theta)
|
||||
delta_z = std * math.sin(theta)
|
||||
|
||||
return (xx - delta_x, xx + delta_x), (zz - delta_z, zz + delta_z)
|
||||
|
||||
|
||||
def get_angle(xx, zz):
|
||||
"""Obtain the points to plot the confidence of each annotation"""
|
||||
|
||||
theta = math.atan2(zz, xx)
|
||||
angle = theta * (180 / math.pi)
|
||||
|
||||
return angle
|
||||
|
||||
|
||||
def get_target_error(dd):
|
||||
"""Get target error not knowing the gender"""
|
||||
mm_gender = 0.0556
|
||||
return mm_gender * dd
|
||||
260
src/visuals/results.py
Normal file
260
src/visuals/results.py
Normal file
@ -0,0 +1,260 @@
|
||||
|
||||
import os
|
||||
import time
|
||||
import numpy as np
|
||||
import matplotlib.pyplot as plt
|
||||
from matplotlib.patches import Ellipse
|
||||
|
||||
|
||||
def print_results(dic_stats, show=False, save=False):
|
||||
|
||||
"""
|
||||
Visualize error as function of the distance on the test set and compare it with target errors based on human
|
||||
height analyses. We consider:
|
||||
Position error in meters due to a height variation of 7 cm (Standard deviation already knowing the sex)
|
||||
Position error not knowing the gender (13cm as average difference --> 7.5cm of error to add)
|
||||
"""
|
||||
dir_out = 'docs'
|
||||
phase = 'test'
|
||||
x_min = 0
|
||||
x_max = 38
|
||||
xx = np.linspace(0, 60, 100)
|
||||
mm_std = 0.04
|
||||
mm_gender = 0.0556
|
||||
excl_clusters = ['all', '50', '>50', 'easy', 'moderate', 'hard']
|
||||
clusters = tuple([clst for clst in dic_stats[phase]['our']['mean'] if clst not in excl_clusters])
|
||||
yy_gender = target_error(xx, mm_gender)
|
||||
yy_gps = np.linspace(5., 5., xx.shape[0])
|
||||
|
||||
# # Recall results
|
||||
# plt.figure(1)
|
||||
# plt.xlabel("Distance [meters]")
|
||||
# plt.ylabel("Detected instances")
|
||||
# plt.xlim(x_min, x_max)
|
||||
# for method in ['our', 'm3d', '3dop']:
|
||||
# xxs = get_distances(clusters)
|
||||
# dic_cnt = dic_stats[phase][method]['cnt']
|
||||
# cnts = get_values(dic_cnt, clusters)
|
||||
# plt.plot(xxs, cnts, marker='s', label=method + '_method')
|
||||
# plt.legend()
|
||||
# plt.show()
|
||||
#
|
||||
# Precision on same instances
|
||||
fig_name = 'results.png'
|
||||
plt.xlabel("Distance [meters]")
|
||||
plt.ylabel("Average localization error [m]")
|
||||
plt.xlim(x_min, x_max)
|
||||
labels = ['Mono3D', 'Geometric Baseline', 'MonoDepth', 'Our MonoLoco', '3DOP']
|
||||
mks = ['*', '^', 'p', 's', 'o']
|
||||
mksizes = [6, 6, 6, 6, 6]
|
||||
lws = [1.5, 1.5, 1.5, 2.2, 1.6]
|
||||
colors = ['r', 'deepskyblue', 'grey', 'b', 'darkorange']
|
||||
lstyles = ['solid', 'solid', 'solid', 'solid', 'dashdot']
|
||||
|
||||
plt.plot(xx, yy_gps, '-', label="GPS Error", color='y')
|
||||
for idx, method in enumerate(['m3d_merged', 'geom_merged', 'md_merged', 'our_merged', '3dop_merged']):
|
||||
dic_errs = dic_stats[phase][method]['mean']
|
||||
errs = get_values(dic_errs, clusters)
|
||||
xxs = get_distances(clusters)
|
||||
|
||||
plt.plot(xxs, errs, marker=mks[idx], markersize=mksizes[idx], linewidth=lws[idx], label=labels[idx],
|
||||
linestyle=lstyles[idx], color=colors[idx])
|
||||
plt.plot(xx, yy_gender, '--', label="Task error", color='lightgreen', linewidth=2.5)
|
||||
plt.legend(loc='upper left')
|
||||
if show:
|
||||
plt.show()
|
||||
else:
|
||||
plt.savefig(os.path.join(dir_out, fig_name))
|
||||
plt.close()
|
||||
|
||||
# FIGURE SPREAD
|
||||
fig_name = 'spread.png'
|
||||
# fig = plt.figure(3)
|
||||
# ax = fig.add_subplot(1, 1, 1)
|
||||
fig, ax = plt.subplots(2, sharex=True)
|
||||
plt.xlabel("Distance [m]")
|
||||
plt.ylabel("Aleatoric uncertainty [m]")
|
||||
ar = 0.5 # Change aspect ratio of ellipses
|
||||
scale = 1.5 # Factor to scale ellipses
|
||||
rec_c = 0 # Center of the rectangle
|
||||
# rec_h = 2.8 # Height of the rectangle
|
||||
plots_line = True
|
||||
# ax[0].set_ylim([-3, 3])
|
||||
# ax[1].set_ylim([0, 3])
|
||||
# ax[1].set_ylabel("Aleatoric uncertainty [m]")
|
||||
# ax[0].set_ylabel("Confidence intervals")
|
||||
|
||||
dic_ale = dic_stats[phase]['our']['std_ale']
|
||||
bbs = np.array(get_values(dic_ale, clusters))
|
||||
xxs = get_distances(clusters)
|
||||
yys = target_error(np.array(xxs), mm_gender)
|
||||
# ale_het = tuple(bbs - yys)
|
||||
# plt.plot(xxs, ale_het, marker='s', label=method)
|
||||
ax[1].plot(xxs, bbs, marker='s', color='b', label="Spread b")
|
||||
ax[1].plot(xxs, yys, '--', color='lightgreen', label="Task error", linewidth=2.5)
|
||||
yys_up = [rec_c + ar/2 * scale * yy for yy in yys]
|
||||
bbs_up = [rec_c + ar/2 * scale * bb for bb in bbs]
|
||||
yys_down = [rec_c - ar/2 * scale * yy for yy in yys]
|
||||
bbs_down = [rec_c - ar/2 * scale * bb for bb in bbs]
|
||||
|
||||
if plots_line:
|
||||
ax[0].plot(xxs, yys_up, '--', color='lightgreen', markersize=5, linewidth=1)
|
||||
ax[0].plot(xxs, yys_down, '--', color='lightgreen', markersize=5, linewidth=1)
|
||||
ax[0].plot(xxs, bbs_up, marker='s', color='b', markersize=5, linewidth=0.7)
|
||||
ax[0].plot(xxs, bbs_down, marker='s', color='b', markersize=5, linewidth=0.7)
|
||||
|
||||
# rectangle = Rectangle((2, rec_c - (rec_h/2)), width=34, height=rec_h, color='black', fill=False)
|
||||
|
||||
for idx, xx in enumerate(xxs):
|
||||
te = Ellipse((xx, rec_c), width=yys[idx]*ar*scale, height=scale, angle=90, color='lightgreen', fill=True)
|
||||
bi = Ellipse((xx, rec_c), width=bbs[idx]*ar*scale, height=scale, angle=90, color='b',linewidth=1.8,
|
||||
fill=False)
|
||||
|
||||
# ax[0].add_patch(rectangle)
|
||||
ax[0].add_patch(te)
|
||||
ax[0].add_patch(bi)
|
||||
|
||||
fig.subplots_adjust(hspace=0.1)
|
||||
plt.setp([aa.get_yticklabels() for aa in fig.axes[:-1]], visible=False)
|
||||
plt.legend()
|
||||
if show:
|
||||
plt.show()
|
||||
plt.close()
|
||||
|
||||
|
||||
def target_error(xx, mm):
|
||||
return mm * xx
|
||||
|
||||
|
||||
def get_distances(clusters):
|
||||
|
||||
clusters_ext = list(clusters)
|
||||
clusters_ext.insert(0, str(0))
|
||||
distances = []
|
||||
for idx, _ in enumerate(clusters_ext[:-1]):
|
||||
clst_0 = float(clusters_ext[idx])
|
||||
clst_1 = float(clusters_ext[idx + 1])
|
||||
distances.append((clst_1 - clst_0) / 2 + clst_0)
|
||||
return tuple(distances)
|
||||
|
||||
|
||||
def get_values(dic_err, clusters):
|
||||
|
||||
errs = []
|
||||
for key in clusters:
|
||||
errs.append(dic_err[key])
|
||||
return errs
|
||||
|
||||
|
||||
def get_confidence_points(confidences, distances, errors):
|
||||
|
||||
confidence_points = []
|
||||
distance_points = []
|
||||
for idx, dd in enumerate(distances):
|
||||
|
||||
conf_perc = confidences[idx]
|
||||
confidence_points.append(errors[idx] + conf_perc)
|
||||
confidence_points.append(errors[idx] - conf_perc)
|
||||
distance_points.append(dd)
|
||||
distance_points.append(dd)
|
||||
|
||||
return distance_points, confidence_points
|
||||
|
||||
|
||||
|
||||
# def histogram(self):
|
||||
# """
|
||||
# Visualize histograms to compare error performances of net and baseline
|
||||
# """
|
||||
# # for mode in ['mean', 'std']:
|
||||
# for mode in ['mean']:
|
||||
#
|
||||
# # for phase in ['train', 'val', 'test']:
|
||||
# for phase in ['test']:
|
||||
#
|
||||
# err_geom = []
|
||||
# err_net = []
|
||||
# counts = []
|
||||
# clusters = ('10', '20', '30', '40', '50', '>50', 'all')
|
||||
# for clst in clusters:
|
||||
#
|
||||
# err_geom.append(dic_geom[phase]['error'][clst][mode])
|
||||
# counts.append(dic_geom[phase]['error'][clst]['count'])
|
||||
# err_net.append(dic_net[phase][clst][mode])
|
||||
#
|
||||
# nn = len(clusters)
|
||||
#
|
||||
# # TODO Shortcut to print 2 models in a single histogram
|
||||
# err_net_l1 = [1.1, 1.32, 2.19, 3.29, 4.38, 6.58, 2.21]
|
||||
#
|
||||
# ind = np.arange(nn) # the x locations for the groups
|
||||
# width = 0.30 # the width of the bars
|
||||
#
|
||||
# fig, ax = plt.subplots()
|
||||
# rects1 = ax.bar(ind, err_geom, width, color='b')
|
||||
# rects2 = ax.bar(ind + width, err_net_l1, width, color='r')
|
||||
# rects3 = ax.bar(ind + 2 * width, err_net, width, color='g')
|
||||
#
|
||||
# # add some text for labels, title and axes ticks
|
||||
# ax.set_ylabel('Distance error [m]')
|
||||
# ax.set_title(mode + ' of errors in ' + phase)
|
||||
# ax.set_xticks(ind + width / 2)
|
||||
# ax.set_xticklabels(clusters)
|
||||
#
|
||||
# ax.legend((rects1[0], rects2[0], rects3[0]), ('Geometrical', 'L1 Loss', 'Laplacian Loss'))
|
||||
#
|
||||
# # Attach a text label above each bar displaying number of annotations
|
||||
# for idx, rect in enumerate(rects1):
|
||||
# height = rect.get_height()
|
||||
# count = counts[idx]
|
||||
# ax.text(rect.get_x() + rect.get_width() / 2., 1.05 * height,
|
||||
# '%d' % int(count), ha='center', va='bottom')
|
||||
#
|
||||
# plt.show()
|
||||
|
||||
# def error(self):
|
||||
# """
|
||||
# Visualize error as function of the distance on the test set and compare it with target errors based on human
|
||||
# height analyses. We consider:
|
||||
# Position error in meters due to a height variation of 7 cm (Standard deviation already knowing the sex)
|
||||
# Position error not knowing the gender (13cm as average difference --> 7.5cm of error to add)
|
||||
# """
|
||||
# phase = 'test'
|
||||
# xx = np.linspace(0, 60, 100)
|
||||
# mm_std = 0.04
|
||||
# mm_gender = 0.0556
|
||||
# clusters = tuple([clst for clst in dic_net[phase] if clst not in ['all', '>50']])
|
||||
# # errors_geom = [dic_geom[phase]['error'][clst][mode] for clst in clusters]
|
||||
# errors_net = [dic_net[phase][clst]['mean'] for clst in clusters]
|
||||
# confidences = [dic_net[phase][clst]['std'][0] for clst in clusters]
|
||||
# distances = get_distances(clusters)
|
||||
# dds, zzs = get_confidence_points(confidences, distances, errors_net)
|
||||
#
|
||||
# # Set the plot
|
||||
# fig = plt.figure()
|
||||
# ax = plt.subplot(111)
|
||||
# plt.xlim(0, 50)
|
||||
# plt.xlabel("Distance [meters]")
|
||||
# plt.ylabel("Error [meters]")
|
||||
# plt.title("Error on Z Position Estimate due to height variation")
|
||||
#
|
||||
# # Plot the target errors
|
||||
# yy_std = target_error(xx, mm_std)
|
||||
# yy_gender = target_error(xx, mm_gender)
|
||||
# # yy_child = target_error(xx, mm_child)
|
||||
# # yy_test = target_error(xx, mm_test)
|
||||
# yy_gps = np.linspace(5., 5., xx.shape[0])
|
||||
# plt.plot(xx, yy_std, '--', label="Knowing the gender", color='g')
|
||||
# plt.plot(xx, yy_gender, '--', label="NOT knowing the gender", color='b')
|
||||
# plt.plot(xx, yy_gps, '-', label="GPS Error", color='y')
|
||||
#
|
||||
# for idx in range(0, len(dds), 2):
|
||||
#
|
||||
# plt.plot(dds[idx: idx+2], zzs[idx: idx+2], color='k')
|
||||
#
|
||||
# # Plot the geometric and net errors as dots
|
||||
# _ = ax.plot(distances, errors_net, 'ro', marker='s', label="Network Error (test)", markersize='8')
|
||||
# # _ = ax.plot(distances, errors_geom, 'ko', marker='o', label="Baseline Error", markersize='5')
|
||||
#
|
||||
# ax.legend(loc='upper left')
|
||||
# plt.show()
|
||||
Loading…
Reference in New Issue
Block a user