"CVPR2024 Flash Report" ResearchPort Top Conference Tracking vol.13

Published June 12, 2024, 12:28

Three-line summary of this article:

● CVPR2024 set another record with 2,716 accepted papers, surpassing last year's count
● Presentations by Japanese researchers also hit a new high: 66 papers (130 author credits in total)
● 3D, Diffusion, Video, and Generation dominate the frequently appearing keywords


This is vol. 13 of our Top Conference Tracking series, and the fourth installment covering CVPR.

ResearchPort has reported on the rising submission counts at CVPR (IEEE/CVF Conference on Computer Vision and Pattern Recognition), the premier conference in computer vision, and on the excitement surrounding it, in our 2021, 2022, and 2023 (flash report) articles.

*See also:
"CVPR" ResearchPort Top Conference Tracking vol.1
"CVPR2022" ResearchPort Top Conference Tracking vol.3
"CVPR2023 Flash Report" ResearchPort Top Conference Tracking vol.9

June has arrived, and with it CVPR2024. The streak of record-setting paper counts continues unbroken this year, and presentations from Japan keep growing as well. Just before the main conference opens, we have compiled the statistics in this flash report*1.

CVPR2024 at a Glance
▶ Dates: June 17-21, 2024
▶ Location: Seattle, WA, USA
▶ Official site: https://cvpr.thecvf.com/Conferences/2024

*1 As of June 12, 2024

■CVPR2024 Summary

As in previous years, CVPR2024 set new records for both submissions and acceptances. Submissions far exceeded 10,000, more than double the 2019 figure from five years ago, an astonishing rate of growth.
For the background, please see our earlier articles; here we go straight to the numbers. The latest CVPR2024 figures are included in Table 1 and Figure 1 below.

Year #papers #orals #submissions acceptance rate oral acceptance rate Venue
1996 137 73 551 24.86% 13.25% San Francisco,CA
1997 173 62 544 31.80% 11.40% San Juan,Puerto Rico
1998 139 42 453 30.68% 9.27% Santa Barbara,CA
1999 192 73 503 38.17% 14.51% Fort Collins,CO
2000 220 66 466 47.21% 14.16% Hilton Head,SC
2001 273 78 920 29.67% 8.48% Kauai,HI
2002
2003 209 60 905 23.09% 6.63% Madison,WI
2004 260 54 873 29.78% 6.19% Washington,DC
2005 326 75 1,160 28.10% 6.47% San Diego,CA
2006 318 54 1,131 28.12% 4.77% New York,NY
2007 353 60 1,250 28.24% 4.80% Minneapolis,MN
2008 508 64 1,593 31.89% 4.02% Anchorage,AK
2009 384 61 1,464 26.23% 4.17% Miami,FL
2010 462 78 1,724 26.80% 4.52% San Francisco,CA
2011 436 59 1,677 26.00% 3.52% Colorado Springs,CO
2012 463 48 1,933 23.95% 2.48% Providence
2013 471 60 1,798 26.20% 3.34% Portland,OR
2014 540 104 1,807 29.88% 5.76% Columbus,OH
2015 602 71 2,123 28.36% 3.34% Boston,MA
2016 643 83 2,145 29.98% 3.87% Las Vegas,NV
2017 783 71 2,680 29.22% 2.65% Hawaii,HI
2018 979 70 3,359 29.15% 2.08% Salt Lake City,UT
2019 1,294 288 5,160 25.08% 5.58% Long Beach,CA
2020 1,466 335 5,865 25.00% 5.71% Seattle,WA
2021 1,660 295 7,015 23.66% 4.21% Nashville,TN
2022 2,063 342 8,161 25.28% 4.19% New Orleans, LA
2023 2,357 247 9,155 25.75% 2.70% Vancouver, Canada
2024 2,716 324 11,532 23.55% 2.81% Seattle,WA

Table 1: CVPR submission counts and acceptance rates

* The 2023 oral count combines highlights and award candidates.
* Source: https://cvpr.thecvf.com/Conferences/2024/AcceptedPapers
 

Figure 1: CVPR statistics over time

This year 11,532 papers were submitted and 2,716*2 were accepted, an acceptance rate of 23.55%. Of the accepted papers, 324 are orals, putting the oral acceptance rate at 2.81%.

Although one hears talk of the review process buckling under the load, paper counts continue to climb steadily, evidence that the community keeps growing and expanding.

[CVPR2024 statistics]
 ・Submissions: 11,532
 ・Accepted: 2,716 (including 324 orals)
 ・Acceptance rate: 23.55%
 ・Oral acceptance rate: 2.81%

*2 The conference website lists 2,719 accepted papers, but the proceedings contain 2,716; assuming withdrawals or similar, we use 2,716.
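The headline rates above follow directly from the raw counts; a minimal sketch (not an official script) of the arithmetic:

```python
# Deriving the CVPR2024 headline rates from the reported raw counts.
submissions = 11_532
accepted = 2_716
orals = 324

acceptance_rate = accepted / submissions
oral_rate = orals / submissions

print(f"acceptance rate: {acceptance_rate:.2%}")   # 23.55%
print(f"oral acceptance rate: {oral_rate:.2%}")    # 2.81%
```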

■Contributions by Japanese Researchers

As has become customary, we also examined the share of Japanese authors; the results are shown in Table 2.

Japanese authors account for 0.85% of all 15,288 authors in 2024.
Although this ratio is down from last year, the number of Japanese authors has been rising year after year, a figure that speaks to the vitality of computer-vision research in Japan.

Meanwhile, the number of papers that include a Japanese author is also inching up each year, but it has not kept pace with the overall growth in papers, so the share of papers involving Japanese authors is trending downward.
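The 2024 ratios can be reproduced directly from the counts reported in Table 2; a quick sketch:

```python
# Reproducing the 2024 row of Table 2 from the reported counts.
authors_total = 15_288    # all author credits at CVPR2024
jp_author_credits = 130   # Japanese author credits
papers_total = 2_716      # all accepted papers
jp_papers = 66            # papers with at least one Japanese author

print(f"Japanese author ratio: {jp_author_credits / authors_total:.2%}")  # 0.85%
print(f"Papers with Japanese authors: {jp_papers / papers_total:.2%}")    # 2.43%
```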

Year #papers #authors avg. authors Japanese authors Japanese author ratio Papers with Japanese authors Ratio of papers with Japanese authors
2014 540 1,881 3.48 45 2.39% 26 4.81%
2015 602 2,207 3.67 33 1.50% 20 3.32%
2016 643 2,487 3.87 45 1.81% 21 3.27%
2017 783 3,185 4.07 61 1.92% 29 3.70%
2018 979 4,214 4.30 93 2.21% 38 3.88%
2019 1,294 5,863 4.53 86 1.47% 40 3.09%
2020 1,466 6,970 4.75 65 0.93% 38 2.59%
2021 1,660 8,087 4.87 72 0.89% 42 2.53%
2022 2,063 10,874 5.27 108 0.99% 52 2.52%
2023 2,357 12,722 5.40 117 0.92% 60 2.55%
2024 2,716 15,288 5.63 130 0.85% 66 2.43%

Table 2: Share of Japanese authors among all CVPR paper authors over time

CVPR2024 accepted papers per individual
Author  #accepted
Shunsuke Saito  4
Koki Nagano  3
Ko Nishino  3
Kiyoharu Aizawa  2
Hiroyasu Akada  2
Yuki Asano  2
Yasunori Ishii  2
Yasuyuki Matsushita  2
Hajime Nagahara  2
Shohei Nobuhara  2
Fumio Okura  2
Hiroaki Santo  2
Yoichi Sato  2
Takafumi Taketomi  2
Norimichi Ukita  2
Toshihiko Yamasaki  2
Takayoshi Yamashita  2
Kazuki Adachi  1
Daiki Chijiwa  1
Yuki Endo  1
Kenji Enomoto  1
Yuto Enyo  1
Toshiaki Fujii  1
Ryo Fujita  1
Kent Fujiwara  1
Koki Fukai  1
Yasutaka Furukawa  1
Ryosuke Furuta  1
Shuji Habuchi  1
Kenji Hata  1
Yusuke Hirota  1
Yuto Horikawa  1
Daichi Horita  1
Tomoki Ichikawa  1
Satoshi Ikehata  1
Kei IKEMURA  1
Tetsugo Inada  1
Naoto Inoue  1
Naoya Iwamoto  1
Kenji Iwata  1
Sekitoshi Kanai  1
Yoshihiro Kanamori  1
Kanta Kaneda  1
Takuhiro Kaneko  1
Hisashi Kashima  1
Hirokatsu Kataoka  1
Kenji Kawaguchi  1
Kazuhiko Kawamoto  1
Hiroaki Kawashima  1
Hiroshi Kera  1
Kotaro Kikuchi  1
Akisato Kimura  1
Takumi Kobayashi  1
Satoru Koda  1
Takashi Matsubara  1
Hidenobu Matsuki  1
Fumiya Matsuzawa  1
Hajime Mihara  1
Yu Mitsuzumi  1
Ikuya Morikawa  1
Yusuke Moriuchi  1
Riku Murai  1
Yuzuru Nakamura  1
Yuta Nakashima  1
Chihiro Nakatani  1
Hideki Nakayama  1
Koichiro Niinuma  1
Kento Nishi  1
Takeru Oba  1
Masatoshi Okutomi  1
Taishi Ono  1
Mayu Otani  1
Takashi Otonari  1
Daichi Saito  1
Hiroki Sakuma  1
Takami Sato  1
Shogo Sato  1
Hiroyuki Sato  1
Satoshi Sato  1
Ryosuke Sawata  1
Hiroyuki Segawa  1
Takayuki Shimizu  1
Kaede Shiohara  1
Takahiro Shirakawa  1
Takaaki Shiratori  1
Kota Sueyoshi  1
Akihiro Sugimoto  1
Komei Sugiura  1
Kosuke Sumiyasu  1
Keita Takahashi  1
Tsubasa Takahashi  1
Ryuhei Takahashi  1
Ryuichi Takanobu  1
Hikari Takehara  1
Masato Taki  1
Towaki Takikawa  1
Yusuke Takimoto  1
Mikihiro Tanaka  1
Masatoshi Tateno  1
Masayoshi Tomizuka  1
Chihiro Tsutake  1
Seiichi Uchida  1
Takeshi Uemori  1
Yuiga Wada  1
Nobuhiko Wakai  1
Takuma Yagi  1
Kota Yamaguchi  1
Shunsuke Yasuki  1
Ryoma Yataka  1

Figure 2: CVPR2024 accepted-paper counts per Japanese researcher

■Keyword Trends in Paper Titles

Finally, we also surveyed the frequency of keywords appearing in paper titles.

As last year, "Diffusion" (diffusion models) stands out as a trending keyword, and modalities such as 3D and Video continue to attract attention. Dense-output tasks such as generation (generation / text-to-image) and segmentation are also common, indicating that research keeps moving toward harder tasks.
Particularly noteworthy, language-model keywords (language model / LLM) have increased sharply over last year, reflecting both the fusion of vision tasks with language models and their growing use during training.
Keywords such as "dataset" also remain frequent, suggesting that contributions to datasets and problem settings are valued as much as ever.
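As a rough illustration of how such percentages can be obtained, here is a toy sketch of title-keyword counting; this is NOT the actual ResearchPort extraction pipeline (see the note below the tables), and the titles are invented stand-ins:

```python
# Toy title-keyword counting: for each keyword (or stem such as
# "self supervis"), count the fraction of titles containing it.
import re

titles = [  # invented stand-ins for real accepted-paper titles
    "Diffusion Models for 3D Scene Generation",
    "Self-Supervised Video Representation Learning",
    "Efficient 3D Reconstruction from Sparse Views",
]
keywords = ["3d", "diffusion", "video", "self supervis"]

def normalize(title: str) -> str:
    # lowercase, then replace punctuation/hyphens with spaces so
    # "Self-Supervised" matches the stem "self supervis"
    return re.sub(r"[^a-z0-9 ]", " ", title.lower())

for kw in keywords:
    hits = sum(kw in normalize(t) for t in titles)
    print(f"{kw}: {hits / len(titles):.2%}")
```

Using stems ("self supervis", "spars") lets one bucket count both "self-supervised" and "self-supervision", which matches how the stemmed entries in the tables below appear.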

2022 2023 2024
learning 12.92% learning 13.39% 3d 13.14%
transformer 9.50% 3d 10.97% diffusion 12.04%
video 8.97% video 8.80% learning 8.91%
3d 8.73% transformer 6.38% video 8.73%
object detection 4.58% object detection 4.29% generation 5.82%
detection 4.39% diffusion 4.21% transformer 4.09%
estimation 4.24% estimation 4.08% segmentation 3.83%
graph 3.76% representation 3.82% detection 3.61%
self supervis 3.71% detection 3.66% estimation 3.42%
segmentation 3.62% point cloud 3.49% language model 3.39%
representation 3.09% generation 3.36% reconstruction 3.09%
semantic segmentation 3.04% training 3.27% representation 2.95%
point cloud 3.04% semantic segmentation 3.23% training 2.80%
attention 2.84% reconstruction 3.19% synthesis 2.76%
adversarial 2.75% segmentation 3.15% feature 2.58%
feature 2.70% graph 3.06% dataset 2.58%
unsupervised 2.60% self supervis 3.02% object detection 2.54%
training 2.46% feature 3.02% text to image 2.21%
few shot 2.36% synthesis 2.55% generative 2.21%
reconstruction 2.36% dataset 2.55% fusion 2.14%
dataset 2.36% unsupervised 2.47% unsupervised 2.10%
generation 2.31% adversarial 2.42% graph 2.03%
prediction 2.31% resolution 2.38% point cloud 1.99%
resolution 2.31% recognition 2.25% resolution 1.95%
label 2.27% modeling 2.25% semantic segmentation 1.92%
recognition 2.27% label 2.17% prediction 1.88%
depth 2.22% prediction 2.13% adversarial 1.88%
synthesis 2.17% semi supervised 1.96% editing 1.88%
semi supervised 2.03% attention 1.91% zero shot 1.84%
matching 1.98% few shot 1.87% attention 1.80%
weakly supervised 1.98% localization 1.87% camera 1.77%
contrastive learning 1.98% spars 1.70% multi modal 1.77%
spars 1.93% weakly supervised 1.66% interaction 1.69%
localization 1.78% space 1.62% spars 1.69%
representation learning 1.69% representation learning 1.62% distillation 1.66%
end to end 1.64% matching 1.57% benchmark 1.66%
alignment 1.64% zero shot 1.53% self supervis 1.51%
attack 1.59% generative 1.49% matching 1.51%
transfer 1.54% generalization 1.44% rendering 1.47%
camera 1.45% pre training 1.44% adaptation 1.47%
generative 1.40% depth 1.44% recognition 1.47%
space 1.40% rendering 1.44% few shot 1.44%
classification 1.30% retrieval 1.40% modeling 1.40%
cross modal 1.25% fusion 1.40% tuning 1.40%
pre training 1.25% camera 1.32% depth 1.40%
interaction 1.21% contrastive learning 1.28% alignment 1.40%
generalization 1.21% alignment 1.23% latent 1.36%
search 1.21% transfer 1.23% label 1.36%
domain adaptation 1.16% benchmark 1.23% retrieval 1.33%
modeling 1.16% attack 1.23% localization 1.29%
wild 1.16% distillation 1.19% space 1.25%
compression 1.11% adaptation 1.19% llm 1.11%
rendering 1.11% cross modal 1.15% attack 1.11%
re identification 1.06% tracking 1.11% transfer 1.03%
embedding 1.06% latent 1.11% representation learning 1.03%
tracking 1.06% editing 1.11% instruction 1.03%
retrieval 1.06% embedding 1.06% generalization 0.99%
instance segmentation 1.01% multi modal 1.06% pre training 0.92%
zero shot 1.01% domain adaptation 0.98% weakly supervised 0.92%
detector 1.01% language model 0.98% semi supervised 0.92%
multi modal 1.01% grounding 0.94% tracking 0.88%
fusion 1.01% regularization 0.89% image generation 0.85%
distillation 1.01% quantization 0.89% grounding 0.85%
benchmark 0.96% instance segmentation 0.89% enhancement 0.81%
disentangl 0.96% action recognition 0.85% boosting 0.81%
latent 0.96% interaction 0.85% reasoning 0.81%
knowledge distillation 0.96% disentangl 0.81% embedding 0.81%
clustering 0.87% restoration 0.81% federated learning 0.77%
translation 0.87% clustering 0.81% disentangl 0.77%
grounding 0.87% knowledge distillation 0.81% domain adaptation 0.74%
captioning 0.82% wild 0.81% restoration 0.74%
navigation 0.82% classification 0.77% contrastive learning 0.74%
incremental learning 0.82% detector 0.77% end to end 0.74%
restoration 0.77% text to image 0.77% 3d reconstruction 0.70%
action recognition 0.77% compression 0.77% synthetic 0.70%
quantization 0.72% continual learning 0.72% pruning 0.66%
federated learning 0.72% tuning 0.68% clustering 0.66%
adaptation 0.72% federated learning 0.64% video generation 0.63%
regularization 0.72% reasoning 0.64% cross modal 0.63%
object tracking 0.72% vision and language 0.64% navigation 0.63%
variation 0.68% enhancement 0.60% continual learning 0.59%
image generation 0.68% captioning 0.60% detector 0.59%
classifier 0.68% end to end 0.60% knowledge distillation 0.55%
regression 0.68% recovery 0.60% image classification 0.55%
editing 0.68% translation 0.55% instance segmentation 0.52%
reasoning 0.63% 3d reconstruction 0.55% re identification 0.52%
diffusion 0.63% navigation 0.55% spatio temporal 0.52%
boosting 0.63% image classification 0.55% quantization 0.52%
synthetic 0.58% boosting 0.55%
enhancement 0.58% synthetic 0.55%
3d reconstruction 0.58% re identification 0.55%
correction 0.53% image generation 0.51%
vision and language 0.53%
question answer 0.53%
continual learning 0.53%
noisy label 0.53%

Table 3: Keyword trends in paper titles (2022-2024)

2019 2020 2021
learning 18.01% learning 17.05% learning 16.81%
video 7.42% 3d 9.62% 3d 8.43%
3d 7.11% video 7.37% video 8.31%
adversarial 5.80% estimation 5.94% unsupervised 4.64%
estimation 5.02% graph 5.25% object detection 4.58%
detection 4.56% segmentation 4.71% estimation 4.34%
feature 4.41% object detection 4.30% segmentation 4.28%
graph 4.25% adversarial 4.23% graph 3.92%
unsupervised 4.10% attention 4.09% detection 3.80%
segmentation 4.02% detection 3.68% feature 3.80%
attention 3.48% unsupervised 3.55% representation 3.25%
object detection 3.32% feature 3.48% self supervis 3.19%
recognition 2.94% recognition 3.41% point cloud 2.89%
representation 2.86% representation 3.21% resolution 2.89%
generative 2.86% search 2.73% adversarial 2.83%
point cloud 2.63% reconstruction 2.73% depth 2.71%
depth 2.40% resolution 2.59% label 2.71%
semantic segmentation 2.09% point cloud 2.59% reconstruction 2.59%
prediction 2.01% depth 2.52% recognition 2.53%
reconstruction 2.01% training 2.39% space 2.53%
domain adaptation 2.01% self supervis 2.39% generative 2.35%
matching 1.93% prediction 2.32% semantic segmentation 2.35%
re identification 1.93% domain adaptation 2.05% attention 2.29%
embedding 1.93% generative 2.05% transformer 2.23%
resolution 1.86% semantic segmentation 1.91% training 2.23%
training 1.78% synthesis 1.91% domain adaptation 2.23%
dataset 1.78% dataset 1.84% few shot 2.23%
weakly supervised 1.70% attack 1.84% camera 2.17%
tracking 1.70% space 1.84% generation 2.11%
search 1.55% matching 1.84% semi supervised 2.11%
end to end 1.55% few shot 1.77% synthesis 2.11%
camera 1.55% label 1.77% search 2.05%
synthesis 1.55% classification 1.70% prediction 1.93%
transfer 1.55% semi supervised 1.70% localization 1.75%
label 1.47% re identification 1.57% representation learning 1.75%
zero shot 1.47% camera 1.57% re identification 1.69%
captioning 1.47% transfer 1.57% weakly supervised 1.69%
few shot 1.39% end to end 1.50% dataset 1.63%
fusion 1.39% tracking 1.50% transfer 1.57%
localization 1.39% compression 1.43% modeling 1.45%
retrieval 1.39% weakly supervised 1.43% tracking 1.45%
self supervis 1.39% reasoning 1.43% end to end 1.45%
classification 1.24% embedding 1.30% embedding 1.39%
action recognition 1.24% wild 1.30% matching 1.39%
reasoning 1.24% representation learning 1.30% instance segmentation 1.39%
instance segmentation 1.24% instance segmentation 1.23% alignment 1.33%
question answer 1.24% spars 1.09% attack 1.21%
metric learning 1.16% fusion 1.09% wild 1.21%
generation 1.16% translation 1.02% cross modal 1.21%
regression 1.08% disentangl 1.02% disentangl 1.21%
attack 1.08% clustering 1.02% spars 1.21%
compression 1.08% cross modal 1.02% latent 1.21%
spars 1.01% zero shot 0.96% fusion 1.15%
space 1.01% generation 0.96% detector 1.15%
clustering 1.01% action recognition 0.96% retrieval 1.15%
wild 1.01% localization 0.96% compression 1.08%
disentangl 1.01% deep neural network 0.89% rendering 1.08%
alignment 0.93% parsing 0.89% reasoning 1.08%
benchmark 0.93% interaction 0.89% classification 1.02%
regularization 0.93% retrieval 0.89% translation 1.02%
translation 0.93% modeling 0.89% benchmark 1.02%
representation learning 0.93% latent 0.82% object tracking 1.02%
cross modal 0.93% alignment 0.82% interaction 0.96%
semi supervised 0.93% multi modal 0.82% clustering 0.96%
latent 0.85% reinforcement learning 0.82% generalization 0.90%
interaction 0.85% face recognition 0.82% contrastive learning 0.90%
variation 0.85% variation 0.75% face recognition 0.84%
detector 0.85% rendering 0.75% adaptation 0.84%
face recognition 0.85% detector 0.75% pre training 0.72%
spatio temporal 0.77% transformer 0.75% grounding 0.72%
parsing 0.77% spatio temporal 0.75% distillation 0.72%
transfer learning 0.77% image classification 0.68% spatio temporal 0.72%
image classification 0.77% captioning 0.68% restoration 0.72%
adaptation 0.77% knowledge distillation 0.68% variation 0.66%
deep neural network 0.77% editing 0.68% captioning 0.66%
modeling 0.70% regularization 0.61% action recognition 0.66%
navigation 0.62% regression 0.61% zero shot 0.60%
inpainting 0.54% object tracking 0.61% style transfer 0.60%
object tracking 0.54% adaptation 0.55% one shot 0.60%
quantization 0.54% synthetic 0.55% correction 0.60%
style transfer 0.54% meta learning 0.55% regression 0.54%
correction 0.54% benchmark 0.55% classifier 0.54%
grounding 0.54% quantization 0.55% incremental learning 0.54%
pruning 0.54% pruning 0.55% regularization 0.54%
generalization 0.55% navigation 0.54%
image generation 0.55% editing 0.54%

Table 4: Keyword trends in paper titles (2019-2021)

2016 2017 2018
learning 13.38% learning 14.56% learning 19.51%
video 8.55% video 8.69% video 9.30%
3d 6.38% 3d 6.26% 3d 7.35%
estimation 5.91% detection 4.47% adversarial 6.44%
feature 5.13% estimation 4.22% estimation 5.62%
detection 4.98% feature 3.96% feature 4.80%
segmentation 4.51% segmentation 3.96% segmentation 3.78%
recognition 4.35% recognition 3.83% generative 3.57%
classification 3.27% camera 2.94% object detection 3.57%
spars 3.27% depth 2.68% attention 3.47%
object detection 3.11% classification 2.55% detection 3.27%
representation 2.64% matching 2.43% re identification 3.06%
label 2.64% space 2.43% depth 3.06%
tracking 2.64% localization 2.43% unsupervised 3.06%
matching 2.49% weakly supervised 2.43% weakly supervised 2.76%
graph 2.49% attention 2.30% recognition 2.66%
camera 2.49% graph 2.17% tracking 2.55%
prediction 2.33% adversarial 2.17% matching 2.55%
reconstruction 2.33% unsupervised 2.17% localization 2.45%
training 2.18% action recognition 2.17% representation 2.35%
dataset 2.18% representation 2.04% graph 2.35%
space 2.02% tracking 2.04% resolution 2.04%
localization 2.02% generative 2.04% synthesis 2.04%
search 2.02% captioning 2.04% prediction 2.04%
unsupervised 1.87% resolution 2.04% semantic segmentation 2.04%
depth 1.87% regression 1.92% camera 1.94%
re identification 1.71% object detection 1.92% question answer 1.74%
action recognition 1.71% semantic segmentation 1.92% end to end 1.74%
embedding 1.56% wild 1.79% point cloud 1.63%
alignment 1.56% embedding 1.79% generation 1.63%
regression 1.56% retrieval 1.66% reconstruction 1.63%
semantic segmentation 1.56% prediction 1.66% domain adaptation 1.63%
weakly supervised 1.40% re identification 1.66% embedding 1.53%
descriptor 1.40% zero shot 1.66% spars 1.53%
modeling 1.40% training 1.53% space 1.43%
clustering 1.24% modeling 1.41% zero shot 1.43%
resolution 1.24% dataset 1.41% classification 1.43%
classifier 1.24% spatio temporal 1.41% dataset 1.33%
attention 1.09% label 1.41% transfer 1.23%
question answer 1.09% deep neural network 1.41% fusion 1.23%
zero shot 1.09% question answer 1.28% action recognition 1.23%
transfer 1.09% representation learning 1.28% regression 1.23%
parsing 1.09% search 1.28% training 1.23%
image classification 1.09% variation 1.15% captioning 1.23%
deep neural network 1.09% reconstruction 1.15% modeling 1.23%
light field 1.09% image classification 1.15% wild 1.12%
latent 0.93% synthesis 1.15% disentangl 1.12%
retrieval 0.93% cross modal 1.15% reinforcement learning 1.02%
3d reconstruction 0.93% spars 1.15% detector 0.92%
captioning 0.78% low rank 1.02% alignment 0.92%
benchmark 0.78% parsing 1.02% latent 0.92%
detector 0.78% generation 1.02% self supervis 0.92%
fusion 0.78% detector 1.02% deep neural network 0.92%
wild 0.78% benchmark 1.02% retrieval 0.92%
end to end 0.78% domain adaptation 1.02% label 0.92%
labeling 0.78% end to end 1.02% translation 0.92%
metric learning 0.62% light field 0.89% style transfer 0.82%
object tracking 0.62% reinforcement learning 0.89% variation 0.82%
non rigid 0.62% alignment 0.89% compression 0.82%
recovery 0.62% metric learning 0.89% clustering 0.82%
face recognition 0.62% descriptor 0.89% quantization 0.82%
clustering 0.89% reasoning 0.72%
fusion 0.89% face recognition 0.72%
bayesian 0.89% 3d reconstruction 0.72%
3d reconstruction 0.89% light field 0.72%
labeling 0.89% adaptation 0.72%
face recognition 0.77% parsing 0.61%
regularization 0.77% descriptor 0.61%
recovery 0.77% interaction 0.61%
transfer 0.64% transfer learning 0.61%
one shot 0.64% non rigid 0.51%
latent 0.64% benchmark 0.51%
instance segmentation 0.64% synthetic 0.51%
style transfer 0.64% attack 0.51%
quantization 0.64% one shot 0.51%
minimization 0.64% enhancement 0.51%
classifier 0.51% restoration 0.51%
reasoning 0.51% pruning 0.51%
point cloud 0.51% classifier 0.51%
self supervis 0.51% grounding 0.51%
fine tuning 0.51%
object recognition 0.51%
object tracking 0.51%

Table 5: Keyword trends in paper titles (2016-2018)

2014 2015
learning 9.44% learning 9.97%
segmentation 6.67% video 7.81%
3d 6.67% estimation 6.81%
estimation 6.11% detection 5.48%
video 5.37% 3d 5.32%
tracking 5.37% feature 5.32%
feature 5.19% recognition 4.32%
recognition 4.81% segmentation 4.32%
matching 4.26% matching 3.49%
detection 3.89% spars 3.49%
classification 3.89% graph 3.49%
graph 3.33% tracking 3.32%
depth 3.15% depth 3.32%
modeling 2.96% camera 3.32%
spars 2.96% representation 3.16%
camera 2.96% retrieval 2.66%
reconstruction 2.78% space 2.49%
space 2.78% clustering 2.33%
label 2.22% descriptor 2.33%
representation 2.22% classification 2.33%
search 2.04% object detection 2.16%
parsing 2.04% search 2.16%
3d reconstruction 1.85% label 2.16%
localization 1.85% reconstruction 2.16%
action recognition 1.85% alignment 1.83%
clustering 1.67% action recognition 1.83%
unsupervised 1.67% localization 1.83%
resolution 1.67% prediction 1.66%
alignment 1.48% regression 1.66%
labeling 1.30% dataset 1.66%
object detection 1.30% low rank 1.49%
non rigid 1.11% resolution 1.49%
wild 1.11% image classification 1.33%
weakly supervised 0.93% fusion 1.33%
semi supervised 0.93% modeling 1.33%
image classification 0.93% unsupervised 1.16%
bayesian 0.93% metric learning 1.16%
regression 0.93% transfer 1.16%
object tracking 0.93% semantic segmentation 1.16%
synthesis 0.93% re identification 1.16%
variation 0.93% detector 1.00%
enhancement 0.93% face recognition 1.00%
object recognition 0.74% rgbd 1.00%
descriptor 0.74% regularization 1.00%
pedestrian detection 0.74% variation 1.00%
prediction 0.74% parsing 1.00%
transfer learning 0.74% embedding 1.00%
minimization 0.74% bayesian 0.83%
fusion 0.74% classifier 0.83%
transfer 0.74% coarse to fine 0.83%
recovery 0.74% deep neural network 0.83%
retrieval 0.74% benchmark 0.83%
low rank 0.74% light field 0.83%
re identification 0.56% weakly supervised 0.83%
boosting 0.56% wild 0.83%
semantic segmentation 0.56% 3d reconstruction 0.83%
dataset 0.56% labeling 0.83%
spatio temporal 0.56% domain adaptation 0.66%
quantization 0.56% non rigid 0.66%
metric learning 0.56% multiple instance learning 0.66%
domain adaptation 0.56% defocus 0.66%
diffusion 0.56% semi supervised 0.66%
correction 0.66%
restoration 0.66%

Table 6: Keyword trends in paper titles (2014-2015)

* Starting with this article, we have slightly changed how keywords are extracted.
  Rankings may therefore differ from those in previously published columns, and some new keywords have been added.
 

■Summary

Once again this year, we compiled and published these statistics ahead of CVPR2024 as a kind of preview (as of June 12, 2024). We hope they prove useful to everyone attending CVPR2024, on site or online, starting next week.
 

● ResearchPort membership sign-up:
https://research-p.com/mailform/
ResearchPort provides research business-development support and career support services for researchers and engineers. If you are interested, please register for ResearchPort membership.

 
Edited by: ResearchPort Division

■Contact

For questions, comments, or requests about this article, please contact ResearchPort:
https://research-p.com/contactform/
