Azure Data Science VM

Data Science VMs are Azure Virtual Machine images that come pre-installed, configured, and tested with several popular tools commonly used for data analytics, machine learning, and AI training.


I explored the DSVM and found that nearly all of the software packages considered essential for data science come preinstalled, and the VM runs fast. Here are the points I noted:
  • R Tools and Python Tools for Visual Studio (RTVS and PTVS) are preinstalled
  • An IDE, RStudio, is also preinstalled
  • Azure Storage Explorer is already installed
  • A fully configured Jupyter Notebook server is available with examples; I went through all the steps in the example notebook “Introduction to Azure ML R notebooks.ipynb”
  • In R Tools for Visual Studio, I ran all the steps from that notebook again in the R Interactive window and got exactly the same output
  • The Azure ML package was not installed for RTVS, so I installed it manually; it can now connect to any Azure Machine Learning Studio workspace using workspace_id and authorization_token
  • Similarly, Jupyter notebooks can connect to Azure Machine Learning Studio using workspace_id and authorization_token
I created a sample model and calculated the RMSE and the other error metrics, shown below:

Mean Absolute Error: 3.270863
Root Mean Squared Error: 4.679191
Relative Absolute Error: 0.492066
Relative Squared Error: 0.259357
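
For readers who want to reproduce these four metrics outside the Studio, here is a minimal Python sketch. It assumes the usual textbook definitions (relative errors are measured against a baseline that always predicts the mean of the actual values); I have not shown how the numbers above were computed, so treat this as an illustration only. The function name regression_metrics and the sample values are made up for the example and are not the data behind the results above.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, RAE and RSE for two equal-length lists of numbers.

    Assumes the textbook definitions; the relative metrics divide by the
    error of a baseline that always predicts the mean of `actual`, so they
    are undefined (division by zero) when all actual values are identical.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    n = len(actual)
    mean_actual = sum(actual) / n

    # Per-row errors of the model.
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    sq_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]

    # Total errors of the "always predict the mean" baseline.
    baseline_abs = sum(abs(a - mean_actual) for a in actual)
    baseline_sq = sum((a - mean_actual) ** 2 for a in actual)

    return {
        "Mean Absolute Error": sum(abs_errors) / n,
        "Root Mean Squared Error": math.sqrt(sum(sq_errors) / n),
        "Relative Absolute Error": sum(abs_errors) / baseline_abs,
        "Relative Squared Error": sum(sq_errors) / baseline_sq,
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    actual = [3.0, 5.0, 7.0, 9.0]
    predicted = [2.0, 5.0, 8.0, 11.0]
    for name, value in regression_metrics(actual, predicted).items():
        print(f"{name}: {value:.6f}")
```

A relative error below 1 means the model does better than always predicting the mean, which is how I read the 0.49 and 0.26 values above.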


Microsoft recommends installing and configuring the X2Go client for full graphical access to the Data Science VM.
X2Go gives remote access to a Linux system's graphical user interface. You can find details here.

Alternatively, you can build the data science workbench as a Windows server, which allows RDP access to the desktop without installing an extra client.

What I found is that if a user works with only a few tools and a small amount of data, it is better to keep those tools on a personal system and connect to Azure ML Studio.
