
Organize assets in a project

A project is how you organize your assets to achieve a particular data analysis goal. Your project assets can include:

  • Notebooks
  • RStudio files
  • Local data sets

Restriction: A project name, asset name, data source name, and remote data set name cannot contain any special characters.

You can also export or import a Data Science Experience (DSX) project as a ZIP or TAR.GZ file.

DSX provides a sample project named dsx-samples, available to all users, with sample notebooks to help you get started. In dsx-samples you can create new notebooks, models, scripts, and data sets, but you cannot add jobs, collaborators, or SPSS Modeler flows. You also cannot use local data sets from notebooks within dsx-samples.

Tasks you can perform:

Create a project

To create a project, go to the Projects list and click New Project.

For a new blank project, click the Blank tab.

To import a preexisting project from your local device, click the From File tab and upload the ZIP or TAR.GZ.

Restriction: A project name cannot contain spaces or non-ASCII characters. If your DSX Local project connects to a Git repository, then use the Git repo guidelines for project names and notebook names.
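The naming restrictions can be checked before you create or import a project. The following Python sketch is illustrative only and is not part of DSX. The function name project_name_problems is hypothetical, and because this topic does not define which characters count as "special", the allowed set (ASCII letters, digits, hyphen, underscore) is an assumption. Projects that connect to a Git repository might have further rules that this check does not cover.

```python
import re

# Assumption: "special characters" means anything other than ASCII letters,
# digits, hyphen, and underscore. The documentation does not define the set.
_ASSUMED_SAFE = re.compile(r"[A-Za-z0-9_-]+")


def project_name_problems(name):
    """Return a list of reasons why a name might be rejected (empty if none found)."""
    problems = []
    if not name:
        problems.append("name is empty")
    if any(ch.isspace() for ch in name):
        problems.append("contains spaces")
    if any(ord(ch) > 127 for ch in name):
        problems.append("contains non-ASCII characters")
    if name and not _ASSUMED_SAFE.fullmatch(name):
        problems.append("contains characters outside the assumed safe set")
    return problems


# Example: report every problem found for a candidate name.
for reason in project_name_problems("My Project"):
    print(reason)
```

An empty list means that the name passed these checks only; DSX itself remains the authority on whether a name is accepted.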

If you select the Library Project check box, the project has no repository and no collaborators (only the Admin of the library project can edit it), and it is shared across all users in the DSX system. A library project is best for storing large common data sets, code packages, and scripts. The files are stored in the global path. For example, a file is at /user-home/_global_/libraryProjects/Jdoe\ Library\ Project/datasets/cars.csv in the library project's user home, and at ../../../../_global_/libraryProjects/Jdoe\ Library\ Project/datasets/cars.csv from a notebook. Tip: In the library project, you can click Path next to a file to display its exact path.
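As a rough illustration of the two path forms in the example, the following Python sketch builds them from a project name and file name. The helper names are hypothetical. The sketch assumes that the backslashes in the example paths are shell escapes for spaces, so the Python strings use plain spaces, and that a notebook's working directory sits four levels below the common parent directory, as the relative example suggests. Use the Path link in the library project to confirm the real path.

```python
import posixpath

# Root directory taken from the example path in this topic.
LIBRARY_ROOT = "/user-home/_global_/libraryProjects"


def library_file_path(project_name, *parts):
    """Build the absolute path of a file stored in a library project."""
    return posixpath.join(LIBRARY_ROOT, project_name, *parts)


def library_file_path_from_notebook(project_name, *parts):
    """Build the relative form from the example (four levels up, then _global_)."""
    return posixpath.join(
        "..", "..", "..", "..", "_global_", "libraryProjects", project_name, *parts
    )


absolute = library_file_path("Jdoe Library Project", "datasets", "cars.csv")
relative = library_file_path_from_notebook("Jdoe Library Project", "datasets", "cars.csv")
print(absolute)
print(relative)
```

In a notebook, either string can be passed to a file reader, for example pandas.read_csv if pandas is available in the runtime.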

Project type | Collaboration privileges | Master repository | Repository copy
Standard | Managed in DSX | Master repository exists in the DSX cluster file system | Each collaborator gets a copy
GitHub | Managed outside of DSX | Master repository exists in GitHub | Each user gets a copy when the project is imported from GitHub
Library | No collaboration, anyone can view | No master repository | No repository copy

Click Create. Your new project opens and you can start adding collaborators and assets to it.

Manage assets

If you have Admin or Editor permissions on a project, you can add assets from its Assets page.

If you have Admin permissions on a project, you can delete an asset by clicking Delete next to it.

Export a project

You can download a project as a ZIP or TAR.GZ file by clicking Export as next to it. Note that the environments in the project are not exported.
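To check what an exported archive contains before you share or re-import it, you can list its members with the Python standard library. This is a minimal sketch; the function name list_project_archive is hypothetical, and the internal layout of a DSX export is not described in this topic, so the sketch only lists names and makes no assumptions about them.

```python
import tarfile
import zipfile


def list_project_archive(path):
    """Return the member names of a ZIP or TAR.GZ archive."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    if tarfile.is_tarfile(path):
        # "r:*" lets tarfile detect the compression (gzip for TAR.GZ).
        with tarfile.open(path, "r:*") as archive:
            return archive.getnames()
    raise ValueError("not a ZIP or TAR archive: %s" % path)


# Usage (the file name is a placeholder for your own export):
# for name in list_project_archive("my-project.zip"):
#     print(name)
```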

Rename a project

If you have Admin permissions on a project, you can rename it by clicking Rename next to it. This renames the project for all collaborators and automatically stops the Admin's active runtimes for that project. When the renaming completes, any access to notebooks or RStudio automatically starts the runtimes in the context of the new project name. The Admin can also start them manually in the Runtimes page.

Because the containers are not stopped for the collaborators, each collaborator must stop the runtimes associated with the old project name in the All Runtimes page. Any subsequent access to notebooks or RStudio then automatically brings up the runtimes with the correct project name context, or the collaborator can start them manually from the Project > Runtimes page. Collaborators should also verify that assets such as notebooks and scripts do not specify the project name directly, for example in paths (paths should always be relative for portability).

Recommendation: Before renaming a project, the Admin should ensure that all hard-coded project names used in paths are corrected to relative paths, and that all collaborators are forewarned of the change so that they can terminate any of their runtimes prior to the rename.
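One way to find hard-coded project names before a rename is to search the project's notebooks and scripts for the current name. The following Python sketch is an illustration under stated assumptions, not a DSX feature: the function name find_hardcoded_name is hypothetical, the list of file extensions is a guess at what a project contains, and a plain substring search can report matches that are not paths, so review each hit manually.

```python
import os

# Assumption: these extensions cover the notebooks and scripts in the project.
TEXT_EXTENSIONS = (".ipynb", ".py", ".r", ".R", ".sh")


def find_hardcoded_name(project_dir, project_name):
    """Yield (file path, line number, line text) for each line mentioning project_name."""
    for root, _dirs, files in os.walk(project_dir):
        for filename in files:
            if not filename.endswith(TEXT_EXTENSIONS):
                continue
            path = os.path.join(root, filename)
            # Notebooks are JSON text, so they can be scanned line by line too.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if project_name in line:
                        yield path, number, line.strip()


# Usage (directory and name are placeholders for your own project):
# for path, number, text in find_hardcoded_name(".", "old-project-name"):
#     print("%s:%d: %s" % (path, number, text))
```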

Delete a project

If you have Admin permissions on a standard project, you can delete it by clicking Delete next to it. This deletes the project for all of the collaborators, and deletes all assets (and the storage directories) associated with the project. If necessary, a DSX admin can manually recover deleted projects from the DSX system's recycle bin directory.

If an Admin deletes a GitHub project, then only the DSX copy of the project will be deleted (not the remote repository on GitHub).

Recommendation: Before deleting a project, ensure that the project's collaborators have saved their work, exported a copy of the project if they need a backup, and stopped all running containers and runtimes for that project. Otherwise, the project might continue to use resources that can be freed only when the DSX administrator deletes the corresponding pods. You can also forewarn the DSX administrator about the impact of the deleted project on all collaborators.

View all projects

Click the Tree View icon to view all projects in the system and expand their contents. You can click any folder, Jupyter notebook, or CSV file to preview it.
