I will be attending ICML for the first time this year and wanted to dig a bit deeper into the preparation. ICML is one of the major academic Machine Learning conferences, alongside NIPS, ICLR and others.
Last year, Tesla’s AI Director Andrej Karpathy published a similar post detailing 2017 accepted paper statistics. This was a source of inspiration for this work and can be used as a comparison point to this year’s statistics.
Most Mentioned Institutions
The first thing I did was to check which institutions were most mentioned in papers. I used Python and regular expressions, and manually resolved name collisions (i.e. collapsing “Google Inc”, “Google Brain”, etc. into “Google”). Each institute’s name was counted only once per paper, so we are looking at the number of papers that mention each institute, as opposed to the number of authors from each institute.
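As a rough illustration of the counting step described above, here is a minimal sketch. The alias patterns, the function names (canonical_name, count_mentions) and the toy affiliation strings are all assumptions made for this example; the script actually used for this post may have looked quite different.

```python
import re
from collections import Counter

# Hypothetical alias table: (pattern, canonical name). The first match wins,
# so more specific names are listed before more general ones.
ALIASES = [
    (re.compile(r"\bdeepmind\b", re.IGNORECASE), "DeepMind"),
    (re.compile(r"\bgoogle\b", re.IGNORECASE), "Google"),
    (re.compile(r"\b(?:uc|university of california,?) berkeley\b", re.IGNORECASE),
     "UC Berkeley"),
]


def canonical_name(raw):
    """Map a raw affiliation string to a canonical institution name."""
    cleaned = raw.strip()
    for pattern, name in ALIASES:
        if pattern.search(cleaned):
            return name
    # No alias matched: keep the cleaned string as its own institution.
    return cleaned


def count_mentions(papers):
    """Count institutions across papers, at most once per paper."""
    counts = Counter()
    for affiliations in papers:
        # A set removes repeated mentions within the same paper.
        counts.update({canonical_name(a) for a in affiliations})
    return counts


# Toy input (made up): one list of affiliation strings per paper.
papers = [
    ["Google Inc", "Google Brain", "UC Berkeley"],
    ["Google Brain", "DeepMind"],
    ["University of California, Berkeley"],
]
counts = count_mentions(papers)
for name, n in counts.most_common():
    print(n, name)
```

Building a set per paper is what keeps “Google Inc” and “Google Brain” on the same paper from being counted twice.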
In total, we get ~480 unique institution mentions, up from last year’s 420. The total number of accepted papers was 621, ~141 more than the number of unique institutions, so at least some institutions appear on more than one paper.
The top 30 institutes are listed below with their counts.
29 UC Berkeley
29 Carnegie Mellon University
21 University of California
19 University of Oxford
16 Cornell University
15 University of Toronto
15 UT Austin
14 ETH Zurich
14 École polytechnique fédérale de Lausanne
14 University of Cambridge
11 Georgia Tech
11 Duke University
11 Columbia University
10 University of Southern California
10 Tsinghua University
9 Purdue University
8 Johns Hopkins University
8 Yale University
Industry vs. Academia
To get a better idea of industry vs. academic representation at ICML, I looked at how the paper counts for the top 10 institutes split between industry and academia.
Of the paper counts for the top 10 institutes, 45% came from industry.
Looking specifically at Alphabet’s involvement in the paper counts for the top 10 institutes, I found that Google & DeepMind were mentioned in 28% of papers.
What’s more surprising is that Google & DeepMind were mentioned in nearly 13% of all papers accepted to ICML.
This is up from last year’s 6.3%, a clear sign that Alphabet is successfully continuing to scale its Machine Learning research efforts. This is especially notable given that, as last year, academia is mentioned in ~75% of papers, so authors from academia are roughly maintaining their strong representation at this conference.
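For readers who want to reproduce percentages like the ones above, here is a small sketch of the two calculations under one possible reading of them: the share of all papers that mention a given group of institutions, and the share of the top-n institutes’ summed paper counts that belongs to that group. The function names and the toy data are assumptions for illustration only, not the real ICML numbers.

```python
def share_mentioning(papers, targets):
    """Fraction of papers that mention at least one institution in `targets`."""
    if not papers:
        return 0.0
    hits = sum(1 for institutions in papers if targets & set(institutions))
    return hits / len(papers)


def group_share_of_top(counts, group, n=10):
    """Share of the top-n institutes' summed paper counts that belongs to `group`."""
    top = sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]
    total = sum(count for _, count in top)
    if total == 0:
        return 0.0
    return sum(count for name, count in top if name in group) / total


# Toy data (made up): one set of canonical institution names per paper.
papers = [
    {"Google", "UC Berkeley"},
    {"DeepMind"},
    {"Carnegie Mellon University"},
    {"UC Berkeley", "Carnegie Mellon University"},
]
alphabet = {"Google", "DeepMind"}
paper_share = share_mentioning(papers, alphabet)

# Toy per-institute paper counts (made up).
counts = {"Google": 4, "UC Berkeley": 3, "Carnegie Mellon University": 2, "DeepMind": 1}
top3_share = group_share_of_top(counts, alphabet, n=3)

print(f"{paper_share:.0%} of papers mention Google or DeepMind")
print(f"{top3_share:.0%} of the top-3 paper counts belong to Google or DeepMind")
```

Note that the two numbers answer different questions: the first is measured per paper, while the second is measured over per-institute counts, so a paper shared by two top institutes contributes to both of their counts.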
It would be interesting to continue posting this analysis over the years and watch trends shift. I would also like to run topic modelling on the abstracts of the accepted papers over time to uncover trends in the prevalence of different research topics.
If you find any errors as you read through this, please leave comments and I will resolve them shortly!