<img src="docs/monoloco.gif" alt="gif" />
This library is based on three research projects:
> __MonStereo: When Monocular and Stereo Meet at the Tail of 3D Human Localization__<br />
> _[L. Bertoni](https://scholar.google.com/citations?user=f-4YHeMAAAAJ&hl=en), [S. Kreiss](https://www.svenkreiss.com),
[T. Mordan](https://people.epfl.ch/taylor.mordan/?lang=en), [A. Alahi](https://scholar.google.com/citations?user=UIhXQ64AAAAJ&hl=en)_, ICRA 2021 --> [Article](https://arxiv.org/abs/2008.10913), [Video](#Todo)
<img src="docs/000840_multi.jpg" width="420"/>
---
> __Perceiving Humans: from Monocular 3D Localization to Social Distancing__<br />
> _[L. Bertoni](https://scholar.google.com/citations?user=f-4YHeMAAAAJ&hl=en), [S. Kreiss](https://www.svenkreiss.com),
[A. Alahi](https://scholar.google.com/citations?user=UIhXQ64AAAAJ&hl=en)_, T-ITS 2021 --> [Article](https://arxiv.org/abs/2009.00984), [Video](https://www.youtube.com/watch?v=r32UxHFAJ2M)
<img src="docs/social_distancing.jpg" width="420"/>
---
> __MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation__<br />
> _[L. Bertoni](https://scholar.google.com/citations?user=f-4YHeMAAAAJ&hl=en), [S. Kreiss](https://www.svenkreiss.com), [A. Alahi](https://scholar.google.com/citations?user=UIhXQ64AAAAJ&hl=en)_, ICCV 2019 --> [Article](https://arxiv.org/abs/1906.06059), [Video](https://www.youtube.com/watch?v=ii0fqerQrec)
<img src="docs/surf.jpg.combined.png" width="420"/>
All projects are built upon [Openpifpaf](https://github.com/vita-epfl/openpifpaf) for the 2D keypoints and share the AGPL Licence.
# Setup
### Install
A GPU is not required, yet highly recommended for real-time performance.
The installation has been tested on OSX and Linux operating systems, with Python 3.6, 3.7 and 3.8.
Packages have been installed with pip and virtual environments.

MonoLoco++ and MonStereo are installed as a single package. For quick installation, do not clone this repository, make sure there is no folder named monoloco in your current directory, and run:

```
pip3 install monoloco
```
For development of the source code itself, you need to clone this repository and then:
```
pip3 install sdist
cd monoloco
python3 setup.py sdist bdist_wheel
pip3 install -e .
```
### Interfaces
All the commands are run through the main file `run.py` using subparsers.
To check all the options (including the openpifpaf ones), run:

* `python3 -m monoloco.run --help`
* `python3 -m monoloco.run predict --help`
* `python3 -m monoloco.run train --help`
* `python3 -m monoloco.run eval --help`
* `python3 -m monoloco.run prep --help`
or check the file `monoloco/run.py`
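
As a reference for the subparser pattern, here is a minimal, self-contained sketch of how such a command-line interface is typically wired together with `argparse` (the options below are illustrative assumptions, not the library's actual interface):

```
# Minimal sketch of the argparse subparser pattern used by command-line tools
# like run.py; the option names are illustrative, not the actual interface.
import argparse

def cli():
    parser = argparse.ArgumentParser(prog='monoloco.run')
    subparsers = parser.add_subparsers(dest='command', required=True)

    predict = subparsers.add_parser('predict', help='3D localization on images')
    predict.add_argument('--model', help='path to a trained model')

    train = subparsers.add_parser('train', help='train a new model')
    train.add_argument('--lr', type=float, default=0.001, help='learning rate')

    return parser.parse_args()

if __name__ == '__main__':
    args = cli()
    print(args.command, vars(args))
```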
# Predictions
For a quick setup, download the pifpaf and MonoLoco++ / MonStereo models from
[here](https://drive.google.com/drive/folders/1jZToVMBEZQMdLB5BAIq2CdCLP5kzNo9t?usp=sharing) and save them into `data/models`.

## Monocular 3D Localization
The predict script receives an image (or an entire folder using glob expressions),
calls PifPaf for 2D human pose detection over the image,
and runs MonoLoco++ for 3D localization of the detected poses.
The argument `--net` selects whether to save the pifpaf, MonoLoco++ or MonStereo outputs.
You can check all the pifpaf commands at [openpifpaf](https://github.com/vita-epfl/openpifpaf).

Output options include json files and/or visualization of the predictions on the image in *frontal mode*,
*birds-eye-view mode* or *combined mode*, and can be specified with `--output_types`.

Ground-truth KITTI files for comparing results can be downloaded from
[here](https://drive.google.com/drive/folders/1jZToVMBEZQMdLB5BAIq2CdCLP5kzNo9t?usp=sharing)
(the file called *names-kitti*) and should be saved into `data/arrays`.
Ground-truth files can also be generated; more info in the preprocessing section.

For an example image, run the following command:

```
python -m monoloco.run predict \
docs/002282.png \
--net monoloco_pp \
--output_types multi \
--model data/models/monoloco_pp-201203-1424.pkl \
--path_gt data/arrays/names-kitti-200615-1022.json \
-o <output directory> \
--long-edge <rescale the image by providing the dimension of the long side; original resolution if not set> \
--n_dropout <50 to include epistemic uncertainty, 0 otherwise>
```
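
The `--n_dropout` option refers to Monte Carlo dropout: epistemic uncertainty is estimated by keeping the dropout layers active at inference time and aggregating several stochastic forward passes. A minimal sketch of the idea (the network below is a stand-in, not the actual MonoLoco++ architecture):

```
# Monte Carlo dropout sketch; the tiny network is a stand-in for illustration.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(34, 256), nn.ReLU(), nn.Dropout(0.2), nn.Linear(256, 1))
net.eval()

def mc_dropout_distance(keypoints, n_dropout=50):
    for module in net.modules():      # re-enable the dropout layers only
        if isinstance(module, nn.Dropout):
            module.train()
    with torch.no_grad():
        samples = torch.stack([net(keypoints) for _ in range(n_dropout)])
    return samples.mean(0), samples.std(0)  # distance and its epistemic spread

x = torch.randn(1, 34)                # flattened 17 COCO keypoints (x, y)
mean_z, std_z = mc_dropout_distance(x)
```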

To show all the instances estimated by MonoLoco, add the argument `--show_all` to the above command.

It is also possible to run [openpifpaf](https://github.com/vita-epfl/openpifpaf) directly
by specifying the network with the argument `--net pifpaf`. All the other pifpaf arguments are also supported
and can be checked with `python -m monoloco.run predict --help`.


### Focal Length and Camera Parameters
Absolute distances are affected by the camera intrinsic parameters.
When processing KITTI images, the network uses the intrinsic matrix provided with the dataset.
In all other cases, we use the parameters of the nuScenes cameras, with 1/1.8" CMOS sensors of size 7.2 x 5.4 mm.
The default focal length is 5.7 mm, and this parameter can be modified using the argument `--focal`.
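
For intuition, the focal length in millimeters converts to pixel units through the sensor size and the image resolution; a small sketch with the values quoted above (the image resolution is an assumed example):

```
# Focal length in pixels and a pinhole intrinsic matrix. The sensor size
# (7.2 x 5.4 mm) and focal length (5.7 mm) are from the text; the 1600 x 1200
# resolution is an assumption for illustration.
import numpy as np

def intrinsic_matrix(f_mm=5.7, sensor_w=7.2, sensor_h=5.4, width=1600, height=1200):
    fx = f_mm / sensor_w * width        # focal length in pixels, x axis
    fy = f_mm / sensor_h * height       # focal length in pixels, y axis
    cx, cy = width / 2.0, height / 2.0  # principal point at the image center
    return np.array([[fx, 0., cx],
                     [0., fy, cy],
                     [0., 0., 1.]])

print(intrinsic_matrix())  # fx = fy ~ 1266.7 for this sensor
```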
## Social Distancing
To visualize social distancing compliance, simply add the argument `--social_distance` to the predict command.

An example from the Collective Activity Dataset is provided below.

<img src="frame0038.jpg" width="500"/>

To visualize social distancing, run the command below:
```
python -m monoloco.run predict \
docs/frame0038.jpg \
--net monoloco_pp \
--social_distance \
--output_types front bird --show_all \
--model data/models/monoloco_pp-201203-1424.pkl -o <output directory>
```
<img src="out_frame0038.jpg.front.png" width="400"/>
<img src="out_frame0038.jpg.bird.png" width="400"/>
Threshold distance and radii (for F-formations) can be set using `--threshold-dist` and `--radii`, respectively.
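
Conceptually, the compliance check flags a person whenever someone else stands within the threshold distance on the ground plane (the full method also reasons about F-formations and orientation). A simplified sketch of the distance check alone, with made-up coordinates:

```
# Simplified pairwise distance check between localized people, given (x, z)
# ground coordinates in meters; illustrative only, not the library's actual
# F-formation logic.
import math

def violations(people, threshold_dist=2.0):
    flagged = set()
    for i, (xi, zi) in enumerate(people):
        for j, (xj, zj) in enumerate(people):
            if i < j and math.hypot(xi - xj, zi - zj) < threshold_dist:
                flagged |= {i, j}
    return flagged

people = [(0.0, 5.0), (1.2, 5.5), (8.0, 20.0)]  # made-up localizations
print(violations(people))  # {0, 1}: the first two stand closer than 2 m
```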
For more info, run:
`python -m monoloco.run predict --help`
### Orientation and Bounding Box dimensions
MonoLoco++ estimates orientation and box dimensions as well. Results are saved in a json file when using the command
`--output_types json`. At the moment, the only visualization including orientation is the social distancing one.
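
A sketch of post-processing such a file is below; the file name and the key names (`xyz`, `angles`) are assumptions for illustration — inspect a generated file for the actual schema:

```
# Hypothetical reader for a predictions json file; the keys are assumed,
# not the documented schema.
import json

with open('out_002282.png.monoloco.json') as f:  # example output path
    preds = json.load(f)

for xyz, angle in zip(preds.get('xyz', []), preds.get('angles', [])):
    x, y, z = xyz
    print(f'person at z={z:.1f} m, orientation={angle:.2f} rad')
```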
## Stereo 3D Localization
The predict script receives an image (or an entire folder using glob expressions),
calls PifPaf for 2D human pose detection over the image,
and runs MonStereo for 3D localization of the detected poses.

Output options include json files and/or visualization of the predictions on the image in *frontal mode*,
*birds-eye-view mode* or *multi mode*, and can be specified with `--output_types`.

### Pre-trained Models
* Download the MonStereo pre-trained model from
[Google Drive](https://drive.google.com/drive/folders/1jZToVMBEZQMdLB5BAIq2CdCLP5kzNo9t?usp=sharing)
and save it in `data/models`
(default), or in any folder, calling it through the command line option `--model <model path>`.
* The pifpaf pre-trained model will be automatically downloaded at the first run.
Three standard pretrained models are available with the command line options
`--checkpoint resnet50`, `--checkpoint resnet101` and `--checkpoint resnet152`.
Alternatively, you can download a pifpaf pre-trained model from [openpifpaf](https://github.com/vita-epfl/openpifpaf)
and call it with `--checkpoint <pifpaf model path>`. All experiments have been run with v0.8 of pifpaf.
If you'd like to use an updated version, we suggest re-training the MonStereo model as well.
* The model used for the experiments is provided in *data/models/ms-200710-1511.pkl*.

### Ground truth matching
* If you provide a ground-truth json file to compare the predictions of MonStereo with,
the script matches every detection using the intersection-over-union metric (sketched below).
The ground-truth file can be generated using the subparser `prep` and passed with the command `--path_gt`.
As this step requires running the pose detector over all the training images and saving the annotations, we
provide the resulting json file for the category *pedestrians*: download it from
[Google Drive](https://drive.google.com/file/d/1e-wXTO460ip_Je2NdXojxrOrJ-Oirlgh/view?usp=sharing)
and save it into `data/arrays`.
* If the ground-truth json file is not available, the command `--show_all` makes it possible to
show all the predictions for the image.
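
For reference, the matching criterion between two boxes in `[x1, y1, x2, y2]` format is the standard intersection over union; a minimal implementation:

```
# Minimal intersection over union between two boxes in [x1, y1, x2, y2]
# format, the criterion used to match detections with ground-truth instances.
def iou(box_a, box_b):
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

print(iou([0, 0, 10, 10], [5, 5, 15, 15]))  # 25 / 175 ~ 0.14
```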
After downloading the model and the ground-truth file, a demo can be tested with the following commands:

`python3 -m monoloco.run predict --glob docs/000840*.png --output_types multi --scale 2
--model data/models/ms-200710-1511.pkl --z_max 30 --checkpoint resnet152 --path_gt data/arrays/names-kitti-200615-1022.json
-o data/output`


`python3 -m monoloco.run predict --glob docs/005523*.png --output_types multi --scale 2
--model data/models/ms-200710-1511.pkl --z_max 30 --checkpoint resnet152 --path_gt data/arrays/names-kitti-200615-1022.json
-o data/output`


## Preprocessing

### KITTI
Annotations from a pose detector need to be stored in a folder,
for example by using [openpifpaf](https://github.com/vita-epfl/openpifpaf):
```
python -m openpifpaf.predict \
--glob "<kitti images directory>/*.png" \
--json-output <directory to contain predictions> \
--checkpoint=shufflenetv2k30 \
--instance-threshold=0.05 --seed-threshold 0.05 --force-complete-pose
```
Once the step is complete, run:

`python -m monoloco.run prep --dir_ann <directory that contains predictions> --monocular`

### Collective Activity Dataset
To evaluate on the [Collective Activity Dataset](http://vhosts.eecs.umich.edu/vision//activity-dataset.html)
(without any training), we selected 6 scenes that contain people talking to each other.
This allows for a balanced dataset, but any other configuration will work.

The expected structure for the dataset is the following:

    collective_activity
    ├── images
    ├── annotations

where the images and annotations inside follow this naming convention:

    IMAGES: seq<sequence_name>_frame<frame_name>.jpg
    ANNOTATIONS: seq<sequence_name>_annotations.txt

With respect to the original dataset, the images and annotations are moved into a single folder,
and the sequence number is added to their names. One command to do this is:

`rename -v -n 's/frame/seq14_frame/' f*.jpg`

which, for example, renames all the jpg images in that folder by adding the sequence number
(remove `-n` after checking that it works). A Python alternative is sketched below.
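
If `rename` is not available, a small Python equivalent (the sequence number is an example value):

```
# Python equivalent of the rename command above: prefix every frame*.jpg in
# the current folder with a sequence number (seq14 is an example value).
from pathlib import Path

seq = 'seq14'
for path in sorted(Path('.').glob('frame*.jpg')):
    target = path.with_name(f'{seq}_{path.name}')
    print(path, '->', target)   # dry run, like the -n flag above
    # path.rename(target)       # uncomment to apply
```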
Pifpaf annotations should also be saved in a single folder and can be created with:
```
python -m openpifpaf.predict \
--glob "data/collective_activity/images/*.jpg" \
--checkpoint=shufflenetv2k30 \
--instance-threshold=0.05 --seed-threshold 0.05 --force-complete-pose \
--json-output <directory to contain predictions>
```

Finally, to evaluate activity using a MonoLoco++ model pre-trained on either nuScenes or KITTI:
```
python -m monoloco.run eval --activity \
--net monoloco_pp --dataset collective \
--model <MonoLoco++ model path> --dir_ann <pifpaf annotations directory>
```
## Training
We train on the KITTI or nuScenes datasets, specifying the path of the input joints.

Our results are obtained with:

`python -m monoloco.run train --lr 0.001 --joints data/arrays/joints-kitti-201202-1743.json --save --monocular`

For a more extensive list of the available parameters, run:

`python -m monoloco.run train --help`

## Evaluation
### 3D Localization
We provide evaluation on KITTI for models trained on nuScenes or KITTI. We compare them with other monocular
and stereo baselines:
[MonoLoco](https://github.com/vita-epfl/monoloco),
[Mono3D](https://www.cs.toronto.edu/~urtasun/publications/chen_etal_cvpr16.pdf),
[3DOP](https://xiaozhichen.github.io/papers/nips15chen.pdf),
[MonoDepth](https://arxiv.org/abs/1609.03677),
[MonoPSR](https://github.com/kujason/monopsr) and
[MonoDIS](https://research.mapillary.com/img/publications/MonoDIS.pdf), as well as our
[Geometrical Baseline](monoloco/eval/geom_baseline.py).

* **Mono3D**: download validation files from [here](http://3dimage.ee.tsinghua.edu.cn/cxz/mono3d)
and save them into `data/kitti/m3d`
* **3DOP**: download validation files from [here](https://xiaozhichen.github.io/)
and save them into `data/kitti/3dop`
* **MonoDepth**: compute an average depth for every instance using the script
[here](https://github.com/Parrotlife/pedestrianDepth-baseline/tree/master/MonoDepth-PyTorch)
and save the results into `data/kitti/monodepth`
* **Geometrical Baseline**: a geometrical baseline comparison is provided.

The average geometrical value for comparison can be obtained by running:
```
python -m monoloco.run eval \
--dir_ann <annotation directory> \
--model <model path> \
--net monoloco_pp \
--generate
```
To also include the geometric baselines and MonoLoco, add the flag `--baselines`. For intuition, the geometrical baseline is sketched below.
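
The geometrical baseline rests on the pinhole relation: given the focal length in pixels and an assumed real-world height for a person, depth is inversely proportional to the pixel height. A sketch (the 1.75 m average-height prior and the focal value are illustrative assumptions):

```
# Pinhole-camera geometric baseline: depth from the pixel height of a person.
# The 1.75 m height prior and the focal length in pixels are illustrative.
def geometric_distance(h_pixels, f_pixels=1266.7, real_height_m=1.75):
    # similar triangles: h_pixels / f_pixels = real_height_m / z
    return f_pixels * real_height_m / h_pixels

for h in (350, 175, 70):   # shrinking pixel heights as a person moves away
    print(f'{h:4d} px -> {geometric_distance(h):5.1f} m')
```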
<img src="quantitative_mono.png" width="550"/>
Adding the argument `--save`, a few plots will be added, including the 3D localization error as a function of the distance:
<img src="results.png" width="600"/>
### Activity Estimation (Talking)
Please follow the preprocessing steps for the Collective Activity Dataset and run pifpaf over the dataset images.
Evaluation on this dataset is done with models trained on either KITTI or nuScenes.
For optimal performance, we suggest the model trained on the nuScenes teaser (TODO add link):
```
python -m monoloco.run eval \
--activity \
--dataset collective \
--net monoloco_pp \
--model <path to the model> \
--dir_ann <annotation directory>
```
### Data structure

    Data
    ├── arrays
    ├── models
    ├── kitti
    ├── figures
    ├── logs

Run the following to create the folders:
```
mkdir data
cd data
mkdir arrays models kitti figures logs
```