By Richard Gimarc
Pictured below is a word cloud built from the presentation titles of all CMG conferences, from the first conference in 1976 through the most recent in 2017.
High-frequency (large) words are in the center of the cloud and low-frequency (small) words are around the edges. To convince myself that this representation is correct, I plotted the frequency of the top 30 words in our collection of CMG presentation titles.
The frequency chart confirms what we are seeing in the word cloud:
This blog describes what we can learn about CMG by “text mining” our archive of CMG conference presentation titles.
Scope and Focus
CMG has an extensive collection of 3,995 conference papers dating back to our first conference in 1976. The chart below shows how our archive has grown over the years.
Our text mining is restricted to examining the titles of presentations. A fuller examination of paper and presentation content is out of scope.
Our goal is to see if we can answer the following three questions:
The results presented in this blog were developed using R and related text mining packages such as tm and wordcloud. We used the following steps to prepare the presentation title text for analysis:
Before: Capacity Planning with Queueing Network Models: An IMS Case Study
After: capacity plan queue network model ims case study
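The before/after transformation above can be sketched in a few lines. This is a minimal Python illustration, not the R code used for the analysis: the stop-word list and the `LEMMAS` lookup table are small hand-made assumptions standing in for the stop-word removal and stemming that a text mining package would provide, and `normalize_title` is a hypothetical helper name.

```python
import string

# Illustrative stop words and base-form table (assumptions, not a package's real lists).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}
LEMMAS = {"planning": "plan", "queueing": "queue", "models": "model"}


def normalize_title(title: str) -> str:
    """Lowercase, strip punctuation, drop stop words, reduce words to a base form."""
    # Replace each punctuation character with a space so "Models:" becomes "models".
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = title.lower().translate(table).split()
    # Drop stop words, then map the remaining words through the base-form table.
    kept = [LEMMAS.get(word, word) for word in words if word not in STOP_WORDS]
    return " ".join(kept)


print(normalize_title("Capacity Planning with Queueing Network Models: An IMS Case Study"))
# prints: capacity plan queue network model ims case study
```

A real stemmer or lemmatizer would replace the lookup table; the table is used here only so the sketch reproduces the example title without extra dependencies.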
Analysis – Word Choice
Our initial word cloud is based on the most frequent words used in all CMG presentation titles dating back to 1976. The following table shows the top 20 words and their frequency.
These words could easily be used to describe the range of job responsibilities for CMG members. It’s interesting that “performance” is twice as frequent as any other term. In fact, the word “performance” occurs in approximately one out of every three titles. I would also wager that most CMG members would use three or four of the top 20 words in their own job descriptions.
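Two different measures are in play here: how often a word occurs overall, and what share of titles contain it at least once. The following Python sketch shows both, assuming the titles have already been normalized to space-separated words. The function names and the three sample titles are made up for illustration and are not drawn from the CMG archive.

```python
from collections import Counter


def word_counts(titles):
    """Total occurrences of each word across all normalized titles."""
    counts = Counter()
    for title in titles:
        counts.update(title.split())
    return counts


def title_share(titles, word):
    """Fraction of titles that contain `word` at least once."""
    if not titles:
        return 0.0
    hits = sum(1 for title in titles if word in title.split())
    return hits / len(titles)


# Made-up, already-normalized sample titles (not real CMG data).
sample = [
    "performance tune database",
    "capacity plan case study",
    "network performance model",
]
print(word_counts(sample)["performance"])   # occurrences across all titles
print(title_share(sample, "performance"))   # share of titles containing the word
```

A statement like "one out of every three titles" corresponds to `title_share` returning roughly 0.33 for that word.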
Analysis – Top 10 by Decade
Next, we look at the presentation titles by decade to see how things have changed over time. What similarities do we see over CMG’s 40+ years? Are there any changes in recent years?
The following table shows the most frequent 10 words by decade. Note the following:
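A per-decade ranking like this one can be produced by bucketing titles by conference year before counting. The Python sketch below assumes each record is a `(year, normalized_title)` pair. That record layout, the function name `top_words_by_decade`, and the sample data are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def top_words_by_decade(titles_by_year, n=10):
    """Map each decade (e.g. 1980) to its n most frequent title words."""
    counts = defaultdict(Counter)
    for year, title in titles_by_year:
        decade = (year // 10) * 10  # e.g. 1987 -> 1980
        counts[decade].update(title.split())
    # Sort decades chronologically and keep only the words, not the counts.
    return {
        decade: [word for word, _ in counter.most_common(n)]
        for decade, counter in sorted(counts.items())
    }


# Made-up sample records (not real CMG data).
records = [
    (1976, "performance measure"),
    (1979, "performance model"),
    (2016, "cloud capacity plan"),
    (2017, "cloud performance"),
]
print(top_words_by_decade(records, n=1))
# prints: {1970: ['performance'], 2010: ['cloud']}
```

Note that a conference in 1976 lands in the 1970s bucket, so the first and last decades cover fewer conference years than the others.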
What have we learned?
The purpose of CMG is stated in our Bylaws (Article 2, #1):
Foster research and development, and the exchange and public dissemination of data pertaining to computer measurement, computer management, and computer performance evaluation, and underlying computer science.
It is encouraging to see quantitatively that the presentation titles at CMG conferences align with the organization’s stated purpose.
CMG's subject matter has changed over the years, from the mainframe through client-server and distributed systems, the emergence of the web, and today’s focus on cloud, apps, mobile and IoT. Through all of it, CMG has retained its focus: evaluating and planning for the performance and capacity of today’s applications and environments.