update readme
parent cea055bb7d, commit 2c97f20fe9

docs/MonoLoco++.md (new file, 75 lines)
@ -0,0 +1,75 @@

# Perceiving Humans: from Monocular 3D Localization to Social Distancing

> Perceiving humans in the context of Intelligent Transportation Systems (ITS)
> often relies on multiple cameras or expensive LiDAR sensors.
> In this work, we present a new cost-effective vision-based method that perceives humans’ locations in 3D
> and their body orientation from a single image.
> We address the challenges related to the ill-posed monocular 3D tasks by proposing a deep learning method
> that predicts confidence intervals in contrast to point estimates. Our neural network architecture estimates
> humans’ 3D body locations and their orientation with a measure of uncertainty.
> Our vision-based system (i) is privacy-safe, (ii) works with any fixed or moving cameras,
> and (iii) does not rely on ground plane estimation.
> We demonstrate the performance of our method with respect to three applications:
> locating humans in 3D, detecting social interactions,
> and verifying the compliance of recent safety measures due to the COVID-19 outbreak.
> Indeed, we show that we can rethink the concept of “social distancing” as a form of social interaction
> in contrast to a simple location-based rule. We publicly share the source code towards an open science mission.

## Preprocessing

### KITTI

Annotations from a pose detector need to be stored in a folder.
For example, by using [openpifpaf](https://github.com/vita-epfl/openpifpaf):

```
python -m openpifpaf.predict \
--glob "<kitti images directory>/*.png" \
--checkpoint=shufflenetv2k30 \
--instance-threshold=0.05 --seed-threshold 0.05 --force-complete-pose \
--json-output <directory to contain predictions>
```

Once the step is complete:

`python -m monstereo.run prep --dir_ann <directory that contains predictions> --monocular`
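
openpifpaf's `--json-output` writes one JSON file per image, each containing a list of per-person annotation dicts. Before running the prep step, it can help to sanity-check the predictions folder; the sketch below (the helper name is illustrative, not part of the repo, and assumes the one-file-per-image layout) counts detections per file:

```python
import json
from pathlib import Path

def summarize_annotations(dir_ann):
    """Count pifpaf detections per JSON prediction file in a folder.

    Assumes each file holds a list of per-person annotation dicts,
    as produced by openpifpaf's --json-output.
    """
    counts = {}
    for path in sorted(Path(dir_ann).glob("*.json")):
        with open(path) as f:
            annotations = json.load(f)
        counts[path.name] = len(annotations)
    return counts
```

Images with zero detections show up as empty lists, which is a quick way to spot thresholds set too high.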

### Collective Activity Dataset

To evaluate on the [collective activity dataset](http://vhosts.eecs.umich.edu/vision//activity-dataset.html)
(without any training), we selected 6 scenes that contain people talking to each other.
This allows for a balanced dataset, but any other configuration will work.

The expected structure for the dataset is the following:

```
collective_activity
├── images
├── annotations
```

where images and annotations inside follow the naming convention:

- IMAGES: `seq<sequence_name>_frame<frame_name>.jpg`
- ANNOTATIONS: `seq<sequence_name>_annotations.txt`

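This convention makes the sequence and frame recoverable from the filename alone. A small hedged sketch (the helper names are illustrative, not part of the repo) that parses an image name and derives its matching annotation file:

```python
import re

def parse_image_name(filename):
    """Split 'seq<sequence>_frame<frame>.jpg' into (sequence, frame)."""
    match = re.fullmatch(r"seq(.+)_frame(.+)\.jpg", filename)
    if match is None:
        raise ValueError(f"unexpected name: {filename}")
    return match.groups()

def annotation_name(filename):
    """Annotation file paired with an image, per the convention above."""
    sequence, _ = parse_image_name(filename)
    return f"seq{sequence}_annotations.txt"
```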
With respect to the original dataset, the images and annotations are moved to a single folder
and the sequence is added to their names. One command to do this is:

`rename -v -n 's/frame/seq14_frame/' f*.jpg`

which, for example, renames all the jpg images in that folder by adding the sequence prefix
(remove `-n`, which makes it a dry run, after checking that it works).
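
The `rename` utility above is the Perl-style one shipped with Debian/Ubuntu and differs across platforms; an equivalent, hedged Python sketch (the function name is illustrative) that adds the sequence prefix is:

```python
from pathlib import Path

def add_sequence_prefix(folder, seq="seq14", dry_run=True):
    """Prefix every frame*.jpg in `folder` with the sequence name,
    e.g. frame0001.jpg -> seq14_frame0001.jpg."""
    for path in sorted(Path(folder).glob("frame*.jpg")):
        target = path.with_name(f"{seq}_{path.name}")
        print(f"{path.name} -> {target.name}")
        if not dry_run:
            path.rename(target)
```

As with `-n`, the default `dry_run=True` only prints the planned renames; pass `dry_run=False` once the output looks right.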

Pifpaf annotations should also be saved in a single folder and can be created with:

```
python -m openpifpaf.predict \
--glob "data/collective_activity/images/*.jpg" \
--checkpoint=shufflenetv2k30 \
--instance-threshold=0.05 --seed-threshold 0.05 --force-complete-pose \
--json-output /data/lorenzo-data/annotations/collective_activity/v012
```

Finally, to evaluate activity using a MonoLoco++ model pre-trained either on nuScenes or KITTI:

```
python -m monstereo.run eval --activity \
--net monoloco_pp --dataset collective \
--model <MonoLoco++ model path> --dir_ann <pifpaf annotations directory>
```
@ -1,18 +0,0 @@
|
|||||||
|
|
||||||
# Perceiving Humans: from Monocular 3D Localization to Social Distancing
|
|
||||||
|
|
||||||
> Perceiving humans in the context of Intelligent Transportation Systems (ITS)
|
|
||||||
often relies on multiple cameras or expensive LiDAR sensors.
|
|
||||||
In this work, we present a new cost- effective vision-based method that perceives humans’ locations in 3D
|
|
||||||
and their body orientation from a single image.
|
|
||||||
We address the challenges related to the ill-posed monocular 3D tasks by proposing a deep learning method
|
|
||||||
that predicts confidence intervals in contrast to point estimates. Our neural network architecture estimates
|
|
||||||
humans 3D body locations and their orientation with a measure of uncertainty.
|
|
||||||
Our vision-based system (i) is privacy-safe, (ii) works with any fixed or moving cameras,
|
|
||||||
and (iii) does not rely on ground plane estimation.
|
|
||||||
We demonstrate the performance of our method with respect to three applications:
|
|
||||||
locating humans in 3D, detecting social interactions,
|
|
||||||
and verifying the compliance of recent safety measures due to the COVID-19 outbreak.
|
|
||||||
Indeed, we show that we can rethink the concept of “social distancing” as a form of social interaction
|
|
||||||
in contrast to a simple location-based rule. We publicly share the source code towards an open science mission.
|
|
||||||
|
|
||||||
@ -38,7 +38,8 @@ class GenerateKitti:
|
|||||||
linear_size=hidden_size)
|
linear_size=hidden_size)
|
||||||
# model_mono_pp = 'data/models/monoloco-191122-1122.pkl' # KITTI_p
|
# model_mono_pp = 'data/models/monoloco-191122-1122.pkl' # KITTI_p
|
||||||
# model_mono_pp = 'data/models/monoloco-191018-1459.pkl' # nuScenes_p
|
# model_mono_pp = 'data/models/monoloco-191018-1459.pkl' # nuScenes_p
|
||||||
model_mono_pp = 'data/models/stereoloco-200604-0949.pkl' # KITTI_pp
|
# model_mono_pp = 'data/models/stereoloco-200604-0949.pkl' # KITTI_pp
|
||||||
|
model_mono_pp = 'data/models/monstereo-201202-1745.pkl'
|
||||||
# model_mono_pp = 'data/models/stereoloco-200608-1550.pkl' # nuScenes_pp
|
# model_mono_pp = 'data/models/stereoloco-200608-1550.pkl' # nuScenes_pp
|
||||||
|
|
||||||
if 'monoloco_pp' in self.METHODS:
|
if 'monoloco_pp' in self.METHODS:
|
||||||
|
|||||||
```diff
@ -166,7 +166,7 @@ class PreprocessKitti:
 self.dic_jo[phase]['X'].append(inp)
 self.dic_jo[phase]['Y'].append(lab)
 self.dic_jo[phase]['names'].append(name)  # One image name for each annotation
-append_cluster(self.dic_jo, phase, inp, lab, keypoint)
+append_cluster(self.dic_jo, phase, inp, lab, keypoint.tolist())
 cnt_mono[phase] += 1
 cnt_tot += 1
```