How We Judge The Top Programming Languages

Our interactive rankings of the most popular programming languages were first created by data journalist Nick Diakopoulos in 2013. The current version is maintained by IEEE Spectrum senior editor Stephen Cass with development support from Prachi Patel and Michael Novakovic. As no one can look over the shoulders of every programmer, we have chosen metrics that we believe are reasonable proxies of popularity. By combining metrics to synthesize a single ranking, we hope to even out statistical fluctuations, and changing the weights given to different metrics as they are combined lets us emphasize different aspects, such as what’s popular with employers in our Jobs ranking. Data is gathered through a combination of manual collection and APIs, and combined using an R script.
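To make the idea of weighted combination concrete, here is a minimal sketch in Python. It is not the R script mentioned above, and the normalization step (dividing each metric by its largest value), the metric names, the weights, and the counts for the placeholder languages A, B, and C are all assumptions chosen for illustration.

```python
def normalize(raw_counts):
    """Scale one metric's raw counts to the range 0..1 (assumed normalization)."""
    top = max(raw_counts.values())
    return {lang: (count / top if top else 0.0) for lang, count in raw_counts.items()}


def combine(metrics, weights):
    """Weighted sum of normalized metrics, rescaled so the top language scores 100."""
    languages = list(next(iter(metrics.values())))
    scores = {lang: 0.0 for lang in languages}
    for name, raw_counts in metrics.items():
        normalized = normalize(raw_counts)
        for lang in languages:
            scores[lang] += weights.get(name, 0.0) * normalized.get(lang, 0.0)
    best = max(scores.values())
    return {lang: (100 * s / best if best else 0.0) for lang, s in scores.items()}


def rank(scores):
    """Order languages from highest to lowest combined score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical counts for three placeholder languages, not real data.
metrics = {
    "search": {"A": 100, "B": 40, "C": 10},
    "jobs": {"A": 20, "B": 40, "C": 10},
}
equal_weights = {"search": 1.0, "jobs": 1.0}
jobs_weights = {"search": 0.2, "jobs": 1.0}  # emphasize employer demand

print(rank(combine(metrics, equal_weights)))
print(rank(combine(metrics, jobs_weights)))
```

With these made-up numbers, language A leads under equal weights while language B leads once the jobs metric is emphasized, which is the kind of shift a different weighting is meant to expose.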

We originally started with a list of over 300 programming languages gathered from GitHub. We looked at the volume of results found on Google when we searched for each one using the template «X programming» where «X» is the name of the language. We filtered out languages that had a very low number of search results and then went through the remaining entries by hand to narrow them down to the most interesting. Since then, each year we review the list as new languages find their footing and other languages slip into obscurity.
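The template and the first filtering pass can be sketched as follows. The cutoff value and the hit counts are hypothetical, since the actual threshold is not stated here, and the hand review that followed is not something code can reproduce.

```python
def search_query(language, template="{name} programming"):
    """Fill the «X programming» template with a language name."""
    return template.format(name=language)


def filter_by_hits(hit_counts, min_hits):
    """Keep only languages whose search-result count reaches the cutoff."""
    return [lang for lang, hits in hit_counts.items() if hits >= min_hits]


# Hypothetical result counts for placeholder languages, not real data.
hit_counts = {"LangA": 2_500_000, "LangB": 900, "LangC": 48_000}

print(search_query("Haskell"))
print(filter_by_hits(hit_counts, min_hits=10_000))  # assumed cutoff
```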

Our final set of 57 languages includes names familiar to most computer users, such as Java, stalwarts like Cobol and Fortran, and languages that thrive in niches, like Haskell. The Processing language was dropped from our rankings this year because its name is a common word even within programming (unlike, say, Python, which is a common word generally, but nearly always refers to the language within a programming context). This makes it hard to tell when the word «processing» refers to the language, and the result was a score that seemed artificially high for a niche language. We hope to attack this problem in next year’s rankings.

We gauged the popularity of languages using the following sources for a total of nine metrics.

Google Search

We measured the number of hits for each language by searching Google for the template «X programming.» This number indicates the volume of online information resources about each programming language. We took the measurement in August 2022, so it represents a snapshot of the Web at that particular moment in time. This data was gathered manually.

Twitter

We measured the number of hits on Twitter for the template «X programming» for the 7.5 months from January 2022 to mid-August 2022 using the Twitter Search API. This number indicates the amount of chatter on social media for the language and reflects the sharing of online resources like news articles or books, as well as physical social activities such as hackathons.

Stack Overflow

Stack Overflow is a popular site where programmers can ask questions about coding. We measured the number of questions posted that mention each language for the 12 months ending August 2022. Each question is tagged with the languages under discussion, and these tags are used to tabulate our measurements using the Stack Exchange API.

Reddit

Reddit is a news and information site where users post links and comments. On Reddit we measured the number of posts mentioning each of the languages, using the template «X programming» from September 2021 to August 2022 across any subreddit on the site. We collected data using the Reddit API.

IEEE Xplore Digital Library

IEEE maintains a digital library with over 3.6 million conference and journal articles covering a range of scientific and engineering disciplines. We measured the number of articles that mention each of the languages in the template «X programming» for the years 2021 and 2022. This metric captures the prevalence of the different programming languages as used and referenced in scholarship. We collected data using the IEEE Xplore API.

IEEE Jobs Site

We measured the demand for different programming languages in job postings on the IEEE Jobs Site. The IEEE Jobs Site has a large number of non-US listings. Because some of the languages we track could be ambiguous in plain text (such as D, Go, J, Ada, and R), we searched for job listings with those words in the job description and then manually examined listings. When the number of listings returned was greater than 500, 200 of the listings were examined as a sample, and the result was used to calculate the total number of matching jobs. The search was conducted in August 2022.

CareerBuilder

We measured the demand for different programming languages on the CareerBuilder job site. CareerBuilder listings were those offered within the United States. Because there is no publicly available API, we manually searched for listings including each language. Because some of the languages we track, such as Go, J, and R, could be ambiguous in plain text, we manually inspected listings to remove false positives (for example, listings looking for experience with the Americans with Disabilities Act rather than the Ada programming language). When more than 200 results were returned, 200 of the listings were examined as a sample, and the result was used to calculate the total number of matching jobs. The search was conducted in August 2022.
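The sampling step used for both job sites amounts to scaling the raw result count by the share of sampled listings that turned out to be genuine. A minimal sketch, assuming simple proportional extrapolation (the exact calculation is not spelled out here) and using a hypothetical function name and made-up numbers:

```python
def estimate_matching_jobs(total_results, sample_is_genuine):
    """Extrapolate from a manually reviewed sample to the full result count.

    total_results: number of listings the site returned for the search.
    sample_is_genuine: one bool per reviewed listing, True when the listing
        really refers to the programming language.
    """
    if not sample_is_genuine:
        return 0
    genuine_share = sum(sample_is_genuine) / len(sample_is_genuine)
    return round(total_results * genuine_share)


# Made-up example: 1,200 results, 200 reviewed, 150 of them genuine.
reviewed = [True] * 150 + [False] * 50
print(estimate_matching_jobs(1200, reviewed))  # 900
```

When every returned listing is reviewed, the sample is the whole result set and the estimate is simply the count of genuine listings.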

GitHub

GitHub is a public repository for many volunteer-driven open source software projects, and so indicates what languages coders choose to work in when they have a personal choice. We looked at two metrics from GitHub: repositories that have been «starred» by users, which reflects long-term interest, and the number of pull requests, which indicates current activity. We used data gathered by GitHut 2.0, which measures the top 50 languages used by number of repositories tagged with that language and draws from GitHub’s public API. The data covers the first quarter of 2022.

Source: IEEE Spectrum Computing