G. Chéron, I. Laptev, and C. Schmid, P-CNN: Pose-based CNN Features for Action Recognition, ICCV, 2015.

A. Yao, J. Gall, and L. Van Gool, Coupled action recognition and pose estimation from multiple views, International Journal of Computer Vision, vol.100, issue.1, pp.16-37, 2012.

U. Iqbal, M. Garbade, and J. Gall, Pose for action - action for pose, 2017.

I. Kokkinos, Ubernet: Training a 'universal' convolutional neural network for low-, mid-, and high-level vision using diverse datasets and limited memory, Computer Vision and Pattern Recognition (CVPR), 2017.

M. Zolfaghari, G. L. Oliveira, N. Sedaghat, and T. Brox, Chained multistream networks exploiting pose, motion, and appearance for action classification and detection, The IEEE International Conference on Computer Vision (ICCV), 2017.

V. Choutas, P. Weinzaepfel, J. Revaud, and C. Schmid, Potion: Pose motion representation for action recognition, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01764222

D. C. Luvizon, H. Tabia, and D. Picard, Human pose regression by combining indirect part detection and contextual information, Computers and Graphics, vol.85, pp.15-22, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02314445

K. M. Yi, E. Trulls, V. Lepetit, and P. Fua, LIFT: Learned Invariant Feature Transform, European Conference on Computer Vision (ECCV), 2016.

D. C. Luvizon, D. Picard, and H. Tabia, 2d/3d pose estimation and action recognition using multitask deep learning, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01815703

N. Sarafianos, B. Boteanu, B. Ionescu, and I. A. Kakadiaris, 3d human pose estimation: A review of the literature and analysis of covariates, Computer Vision and Image Understanding, vol.152, pp.1-20, 2016.

S. Herath, M. Harandi, and F. Porikli, Going deeper into action recognition: A survey, Image and Vision Computing, vol.60, pp.4-21, 2017.

M. Andriluka, S. Roth, and B. Schiele, Pictorial structures revisited: People detection and articulated pose estimation, Computer Vision and Pattern Recognition, pp.1014-1021, 2009.

M. Dantone, J. Gall, C. Leistner, and L. Van Gool, Human Pose Estimation Using Body Parts Dependent Joint Regressors, Computer Vision and Pattern Recognition (CVPR), pp.3041-3048, 2013.

L. Pishchulin, M. Andriluka, P. Gehler, and B. Schiele, Poselet Conditioned Pictorial Structures, Computer Vision and Pattern Recognition (CVPR), pp.588-595, 2013.

G. Ning, Z. Zhang, and Z. He, Knowledge-guided deep fractal neural networks for human pose estimation, IEEE Transactions on Multimedia, issue.99, pp.1-1, 2017.

I. Lifshitz, E. Fetaya, and S. Ullman, Human Pose Estimation Using Deep Consensus Voting, pp.246-260, 2016.

L. Pishchulin, E. Insafutdinov, S. Tang, B. Andres, M. Andriluka et al., DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

E. Insafutdinov, L. Pishchulin, B. Andres, M. Andriluka, and B. Schiele, DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model, European Conference on Computer Vision (ECCV), 2016.

U. Rafi, I. Kostrikov, J. Gall, and B. Leibe, An efficient convolutional network for human pose estimation, BMVC, vol.1, p.2, 2016.

S. Wei, V. Ramakrishna, T. Kanade, and Y. Sheikh, Convolutional pose machines, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

V. Belagiannis, C. Rupprecht, G. Carneiro, and N. Navab, Robust optimization for deep regression, International Conference on Computer Vision (ICCV), pp.2830-2838, 2015.

J. Tompson, R. Goroshin, A. Jain, Y. LeCun, and C. Bregler, Efficient object localization using Convolutional Networks, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.648-656, 2015.

A. Toshev and C. Szegedy, DeepPose: Human Pose Estimation via Deep Neural Networks, Computer Vision and Pattern Recognition (CVPR), pp.1653-1660, 2014.

T. Pfister, K. Simonyan, J. Charles, and A. Zisserman, Deep convolutional neural networks for efficient pose estimation in gesture videos, Asian Conference on Computer Vision (ACCV), 2014.

A. Bulat and G. Tzimiropoulos, Human pose estimation via Convolutional Part Heatmap Regression, European Conference on Computer Vision (ECCV), pp.717-732, 2016.

G. Gkioxari, A. Toshev, and N. Jaitly, Chained Predictions Using Convolutional Neural Networks, European Conference on Computer Vision (ECCV), 2016.

A. Newell, K. Yang, and J. Deng, Stacked Hourglass Networks for Human Pose Estimation, European Conference on Computer Vision (ECCV), pp.483-499, 2016.

X. Chu, W. Yang, W. Ouyang, C. Ma, A. L. Yuille et al., Multi-context attention for human pose estimation, 2017.

W. Yang, S. Li, W. Ouyang, H. Li, and X. Wang, Learning feature pyramids for human pose estimation, The IEEE International Conference on Computer Vision (ICCV), 2017.

K. Sun, B. Xiao, D. Liu, and J. Wang, Deep high-resolution representation learning for human pose estimation, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley et al., Generative adversarial nets, Advances in Neural Information Processing Systems, vol.27, pp.2672-2680, 2014.

C. Chou, J. Chien, and H. Chen, Self adversarial training for human pose estimation, CoRR, 2017.

Y. Chen, C. Shen, X. Wei, L. Liu, and J. Yang, Adversarial posenet: A structure-aware convolutional network for human pose estimation, The IEEE International Conference on Computer Vision (ICCV), 2017.

J. Carreira, P. Agrawal, K. Fragkiadaki, and J. Malik, Human pose estimation with iterative error feedback, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4733-4742, 2016.

X. Zhou, M. Zhu, G. Pavlakos, S. Leonardos, K. G. Derpanis et al., Monocap: Monocular human motion capture using a CNN coupled with a geometric prior, CoRR, 2017.

D. Tome, C. Russell, and L. Agapito, Lifting from the deep: Convolutional 3d pose estimation from a single image, CVPR, 2017.

J. Martinez, R. Hossain, J. Romero, and J. J. Little, A simple yet effective baseline for 3d human pose estimation, ICCV, 2017.

B. Tekin, P. Márquez-Neila, M. Salzmann, and P. Fua, Fusing 2d uncertainty and 3d cues for monocular body pose estimation, CoRR, 2016.

D. Mehta, H. Rhodin, D. Casas, O. Sotnychenko, W. Xu et al., Monocular 3d human pose estimation using transfer learning and improved CNN supervision, CoRR, 2016.

A. Popa, M. Zanfir, and C. Sminchisescu, Deep multitask architecture for integrated 2d and 3d human sensing, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

C. Ionescu, D. Papava, V. Olaru, and C. Sminchisescu, Human3.6m: Large scale datasets and predictive methods for 3d human sensing in natural environments, TPAMI, vol.36, issue.7, pp.1325-1339, 2014.

D. Mehta, S. Sridhar, O. Sotnychenko, H. Rhodin, M. Shafiei et al., VNect: Real-time 3d human pose estimation with a single rgb camera, ACM Transactions on Graphics, vol.36, 2017.

C. Chen and D. Ramanan, 3d human pose estimation = 2d pose estimation + matching, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

X. Sun, J. Shang, S. Liang, and Y. Wei, Compositional human pose regression, The IEEE International Conference on Computer Vision (ICCV), 2017.

G. Pavlakos, X. Zhou, K. G. Derpanis, and K. Daniilidis, Coarse-to-fine volumetric prediction for single-image 3D human pose, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

X. Sun, B. Xiao, F. Wei, S. Liang, and Y. Wei, Integral human pose regression, The European Conference on Computer Vision (ECCV), 2018.

W. Yang, W. Ouyang, X. Wang, J. S. Ren, H. Li et al., 3d human pose estimation in the wild by adversarial learning, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

U. Iqbal, P. Molchanov, T. Breuel, J. Gall, and J. Kautz, Hand pose estimation via latent 2.5d heatmap regression, The European Conference on Computer Vision (ECCV), 2018.

B. Nie, C. Xiong, and S. Zhu, Joint action recognition and pose estimation from video, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

H. Jhuang, J. Gall, S. Zuffi, C. Schmid, and M. J. Black, Towards understanding action recognition, The IEEE International Conference on Computer Vision (ICCV), 2013.
URL : https://hal.archives-ouvertes.fr/hal-00906902

C. Cao, Y. Zhang, C. Zhang, and H. Lu, Body joint guided 3d deep convolutional descriptors for action recognition, CoRR, 2017.

J. Carreira and A. Zisserman, Quo vadis, action recognition? a new model and the kinetics dataset, CVPR, 2017.

G. Varol, I. Laptev, and C. Schmid, Long-term Temporal Convolutions for Action Recognition, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01241518

W. Du, Y. Wang, and Y. Qiao, RPAN: An end-to-end recurrent pose-attention network for action recognition in videos, The IEEE International Conference on Computer Vision (ICCV), 2017.

F. Baradel, C. Wolf, J. Mille, and G. W. Taylor, Glimpse clouds: Human activity recognition from unstructured feature points, Computer Vision and Pattern Recognition (CVPR), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01713109

D. Wang, W. Ouyang, W. Li, and D. Xu, Dividing and aggregating network for multi-view action recognition, The European Conference on Computer Vision (ECCV), 2018.

M. Liu and J. Yuan, Recognizing human actions as the evolution of pose estimation maps, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

D. C. Luvizon, H. Tabia, and D. Picard, Learning features combination for human action recognition from skeleton sequences, Pattern Recognition Letters, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01515376

L. Lo Presti and M. La Cascia, 3d skeleton-based human action classification: A survey, Pattern Recognition, vol.53, pp.130-147, 2016.

J. Liu, A. Shahroudy, D. Xu, and G. Wang, Spatio-temporal lstm with trust gates for 3d human action recognition, pp.816-833, 2016.

J. Liu, G. Wang, P. Hu, L. Duan, and A. C. Kot, Global context-aware attention lstm networks for 3d action recognition, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

S. Song, C. Lan, J. Xing, W. Zeng, and J. Liu, An end-to-end spatio-temporal attention model for human action recognition from skeleton data, AAAI Conference on Artificial Intelligence, 2017.

A. Shahroudy, T. Ng, Y. Gong, and G. Wang, Deep multimodal feature analysis for action recognition in rgb+d videos, 2017.

F. Baradel, C. Wolf, and J. Mille, Pose-conditioned spatio-temporal attention for human action recognition, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01593548

F. Chollet, Xception: Deep learning with depthwise separable convolutions, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

Q. Ke, M. Bennamoun, S. An, F. Sohel, and F. Boussaid, A new representation of skeleton sequences for 3d action recognition, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele, 2D Human Pose Estimation: New Benchmark and State of the Art Analysis, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.

W. Zhang, M. Zhu, and K. G. Derpanis, From actemes to action: A strongly-supervised representation for detailed action understanding, ICCV, pp.2248-2255, 2013.

A. Shahroudy, J. Liu, T. Ng, and G. Wang, Ntu rgb+d: A large scale dataset for 3d human activity analysis, CVPR, 2016.

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society, Series B, vol.67, pp.301-320, 2005.
