Why Does RStudio Support Python for Data Science?

People have long asked why RStudio supports Python whilst being an “R first” company. Although there isn’t a definitive answer, some surveys and anecdotes help explain it.

After conducting extensive surveys, RStudio found that its customers use both Python and R, each for different scenarios. R is used most commonly for visualisation, statistical analysis, and data transformation, while R users prefer Python for data transformation and machine learning applications.

With the above data, we can draw some conclusions:

  • R users also use Python: Of the 2,000 people surveyed, almost half use Python regularly.

  • Visualisation and statistical applications are R’s most used areas: Almost 9 out of 10 respondents use R for these, with data transformation a close third.
  • Data transformation and machine learning/A.I. are Python’s most used areas: Most data transformation and machine learning projects use Python as their language of choice. No other area comes close.

Before concluding that Python is superior or more robust, one should take the following caveats into account:

  • The start of the survey specifically mentioned that it was open to anyone who is “interested in R”. Most probably, a Python-only user would not have completed it.
  • Most of the responses were received by asking RStudio employees to encourage R’s community to fill it out. It is unlikely that the people reached this way represent the entire data science community.
  • The survey data was not weighted by demographic to represent any larger population. This means that the survey might be biased in areas like gender, ethnicity, industry or education.

The best way to think of this survey is as showing the viewpoint of a significant group of R users. Although the survey does not represent the entire community of data scientists, RStudio focuses on supporting Python to make everyone productive. Therein lies the reason why RStudio supports Python:

  • No one should be forced to choose between Python and R. It has long been known that R users use more than one language for their data science projects, and the data collected from the above survey supports this idea. By becoming R-only, RStudio could in turn harm the data science community.
  • As almost half of R’s community uses Python for data science applications, embracing it was a no-brainer. Not supporting it would have pushed those users toward other tools.
  • Accepting Python means supporting it within RStudio. Forcing programmers to switch back and forth between two different environments is inefficient and reduces productivity.

By adding support for Python, RStudio can help its users get results faster and more easily. That, in turn, also serves RStudio’s wider goal: “to enhance the production and consumption of knowledge by everyone, regardless of economic means”.

