
How to use the Dataset API / DatasetServiceClient

Introduction

Datasets are the third-level, optional resource for organizing stored data. A Dataset must be created with either a Project or a Collection as its parent. Use Datasets to group closely related objects into a logical unit and to describe them with additional metadata.

If you don't know how to create a Project, read the previous chapter on the Project API basics.

If you don't know how to create a Collection, read the previous chapter on the Collection API basics, which closely mirrors the Project API.

Create Dataset

API example for creating a new Dataset.

Required permissions

This request requires at least APPEND permission on the parent resource in which the Dataset is to be created.

Dataset naming guidelines

# Native JSON request to create a simple Dataset.
# A Dataset has exactly one parent: use "collectionId" instead of
# "projectId" to create it inside a Collection.
curl -d '
  {
    "name": "json-api-dataset",
    "title": "JSON API Dataset",
    "description": "Created with JSON over HTTP.",
    "keyValues": [],
    "relations": [],
    "dataClass": "DATA_CLASS_PUBLIC",
    "projectId": "<project-id>",
    "metadataLicenseTag": "CC-BY-4.0",
    "defaultDataLicenseTag": "CC-BY-4.0",
    "authors": []
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X POST https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets
// Create tonic/ArunaAPI request to create a Dataset
let request = CreateDatasetRequest {
    name: "rust-api-dataset".to_string(),
    title: "Rust API Dataset".to_string(),
    description: "Created with the gRPC Rust API client.".to_string(),
    key_values: vec![],
    external_relations: vec![],
    data_class: DataClass::Public as i32,
    metadata_license_tag: Some("CC-BY-4.0".to_string()),
    default_data_license_tag: Some("CC-BY-4.0".to_string()),
    parent: Some(Parent::ProjectId("<project-id>".to_string())),
    authors: vec![]
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.create_dataset(request)
                             .await
                             .unwrap() 
                             .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create gRPC request to create a new Dataset
request = CreateDatasetRequest(
    name="python-api-dataset",
    title="Python API Dataset",
    description="Created with the gRPC Python API client.",
    key_values=[], 
    external_relations=[], 
    data_class=DataClass.DATA_CLASS_PUBLIC,
    # Exactly one parent: use collection_id="<collection-id>" instead
    # to create the Dataset inside a Collection
    project_id="<project-id>",
    metadata_license_tag="CC-BY-4.0",
    default_data_license_tag="CC-BY-4.0",
    authors=[]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.CreateDataset(request=request)

# Do something with the response
print(f'{response}')
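
Since a Dataset is created under a single parent (the Rust request models this as one `parent` value), it can be convenient to assemble the JSON body in code and reject ambiguous input early. The following is a minimal sketch using only the Python standard library. `build_create_dataset_payload` is a hypothetical helper and not part of any Aruna client; the keys follow the camelCase field names used in the JSON examples, and the "exactly one parent" check is an assumption derived from the Rust example.

```python
import json
from typing import Optional


def build_create_dataset_payload(
    name: str,
    title: str,
    description: str,
    project_id: Optional[str] = None,
    collection_id: Optional[str] = None,
    data_class: str = "DATA_CLASS_PUBLIC",
    license_tag: str = "CC-BY-4.0",
) -> dict:
    """Assemble a create-Dataset JSON body with exactly one parent (hypothetical helper)."""
    # Assumption: exactly one parent must be given, mirroring the single
    # `parent` field of the Rust CreateDatasetRequest.
    if (project_id is None) == (collection_id is None):
        raise ValueError("set exactly one of project_id or collection_id")

    payload = {
        "name": name,
        "title": title,
        "description": description,
        "keyValues": [],
        "relations": [],
        "dataClass": data_class,
        "metadataLicenseTag": license_tag,
        "defaultDataLicenseTag": license_tag,
        "authors": [],
    }
    # Add whichever parent key was provided.
    if project_id is not None:
        payload["projectId"] = project_id
    else:
        payload["collectionId"] = collection_id
    return payload


body = build_create_dataset_payload(
    name="json-api-dataset",
    title="JSON API Dataset",
    description="Created with JSON over HTTP.",
    project_id="project-id-placeholder",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to `curl -d` or to any HTTP client; the helper itself never contacts a server.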

Get Dataset(s)

API examples of how to fetch information for one or multiple existing Dataset(s).

Required permissions

This request requires at least READ permissions on the Dataset or one of its parent resources.

# Native JSON request to fetch information of a Dataset
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X GET 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}'
# Native JSON request to fetch information of multiple Datasets
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X GET 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets?datasetIds=dataset-id-01&datasetIds=dataset-id-02'
// Create tonic/ArunaAPI request to fetch information of a Dataset
let request = GetDatasetRequest {
    dataset_id: "<dataset-id>".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.get_dataset(request)
                                .await
                                .unwrap()
                                .into_inner();

// Do something with the response
println!("{:#?}", response);
// Create tonic/ArunaAPI request to fetch information of multiple Datasets
let request = GetDatasetsRequest {
    dataset_ids: vec![
        "<dataset-id-01>".to_string(),
        "<dataset-id-02>".to_string(),
        "<...>".to_string(),
    ],
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.get_datasets(request)
                                .await
                                .unwrap()
                                .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create gRPC request to fetch information of a Dataset
request = GetDatasetRequest(
    dataset_id="<dataset-id>"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.GetDataset(request=request)

# Do something with the response
print(f'{response}')
# Create gRPC request to fetch information of multiple Datasets
request = GetDatasetsRequest(
    dataset_ids=[
        "<dataset-id-01>",
        "<dataset-id-02>",
        "<...>"]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.GetDatasets(request=request)

# Do something with the response
print(f'{response}')
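
The multi-Dataset JSON request repeats the `datasetIds` query parameter once per ID. As a rough illustration, the URL can be generated from a list with the standard library; `build_get_datasets_url` and the host `aruna.example.org` are placeholders invented for this sketch.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_get_datasets_url(api_base: str, dataset_ids: Sequence[str]) -> str:
    """Return the GET URL for fetching several Datasets at once (hypothetical helper)."""
    # doseq=True expands the list into one `datasetIds=` pair per ID.
    query = urlencode({"datasetIds": list(dataset_ids)}, doseq=True)
    return f"{api_base.rstrip('/')}/v2/datasets?{query}"


url = build_get_datasets_url(
    "https://aruna.example.org",  # placeholder host, not a real instance
    ["dataset-id-01", "dataset-id-02"],
)
print(url)
```

`urlencode` also percent-encodes any characters that are not URL-safe, which hand-built query strings tend to miss.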

Update Dataset

API examples of how to update individual metadata of an existing Dataset.

Required permissions
  • Name update needs at least WRITE permissions on the specific Dataset or one of its parent resources
  • Description update needs at least WRITE permissions on the specific Dataset or one of its parent resources
  • KeyValue update needs at least WRITE permissions on the specific Dataset or one of its parent resources
  • Dataclass update needs at least WRITE permissions on the specific Dataset or one of its parent resources
# Native JSON request to update the name of a Dataset
curl -d '
  {
    "name": "updated-json-api-dataset"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/name
# Native JSON request to update the title of a Dataset
curl -d '
  {
    "title": "Updated JSON API Dataset"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/title
# Native JSON request to update the description of a Dataset
curl -d '
  {
    "description": "Updated with JSON over HTTP."
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/description
# Native JSON request to update the key-values associated with a Dataset
curl -d '
  {
    "addKeyValues": [],
    "removeKeyValues": []
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/key_values

Info

The data class can only be relaxed, never tightened: Confidential > Workspace > Private > Public

# Native JSON request to update the dataclass of a Dataset
curl -d '
  {
    "dataClass": "DATA_CLASS_PUBLIC"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/data_class
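
The relaxation rule from the note above can be checked on the client before sending an update. This is only a sketch: the server remains the authority, `is_relaxation` is a hypothetical helper, and every enum name other than `DATA_CLASS_PUBLIC` is an assumption that follows the same naming pattern.

```python
# Assumed ordering, taken from the note: a higher rank is more restrictive.
# Only DATA_CLASS_PUBLIC appears in the examples; the other names are assumptions.
RESTRICTIVENESS = {
    "DATA_CLASS_PUBLIC": 0,
    "DATA_CLASS_PRIVATE": 1,
    "DATA_CLASS_WORKSPACE": 2,
    "DATA_CLASS_CONFIDENTIAL": 3,
}


def is_relaxation(current: str, target: str) -> bool:
    """Return True if `target` is strictly less restrictive than `current`."""
    return RESTRICTIVENESS[target] < RESTRICTIVENESS[current]


print(is_relaxation("DATA_CLASS_CONFIDENTIAL", "DATA_CLASS_PUBLIC"))
print(is_relaxation("DATA_CLASS_PUBLIC", "DATA_CLASS_PRIVATE"))
```

The first call describes a permitted change and the second a change the rule forbids.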
# Native JSON request to update the license of a Dataset
curl -d '
  {
    "metadataLicenseTag": "CC0",
    "defaultDataLicenseTag": "CC0"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/licenses
# Native JSON request to add an author to a Dataset
curl -d '
  {
    "addAuthors": [
      {
        "firstName": "John",
        "lastName": "Doe",
        "email": "john.doe@example.com",
        "orcid": "0000-0002-1825-0097",
        "id": "<user-id-if-registered>"
      }
    ],
    "removeAuthors": []
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/authors
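
When authors come from application data, the request body can be generated instead of written by hand. The sketch below is an illustration under assumptions: `AuthorEntry` and `build_authors_update` are hypothetical names, and leaving unset optional fields out of the body is assumed to be acceptable to the API.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class AuthorEntry:
    """Local stand-in for an author record (hypothetical, not the API's own type)."""

    first_name: str
    last_name: str
    email: Optional[str] = None
    orcid: Optional[str] = None
    id: Optional[str] = None  # user ID, only if the author is registered

    def to_json(self) -> dict:
        # Rename snake_case attributes to the camelCase keys of the JSON example.
        renames = {"first_name": "firstName", "last_name": "lastName"}
        # Assumption: unset optional fields may simply be left out of the body.
        return {
            renames.get(key, key): value
            for key, value in asdict(self).items()
            if value is not None
        }


def build_authors_update(add: List[AuthorEntry], remove: List[AuthorEntry]) -> dict:
    """Build the body for the authors update request (hypothetical helper)."""
    return {
        "addAuthors": [author.to_json() for author in add],
        "removeAuthors": [author.to_json() for author in remove],
    }


body = build_authors_update(
    add=[
        AuthorEntry(
            "John",
            "Doe",
            email="john.doe@example.com",
            orcid="0000-0002-1825-0097",
        )
    ],
    remove=[],
)
print(json.dumps(body, indent=2))
```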
// Create tonic/ArunaAPI request to update the name of a Dataset
let request = UpdateDatasetNameRequest {
    dataset_id: "<dataset-id>".to_string(),
    name: "updated-rust-api-dataset".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.update_dataset_name(request)
                             .await
                             .unwrap()
                             .into_inner();

// Do something with the response
println!("{:#?}", response);
// Create tonic/ArunaAPI request to update the title of a Dataset
let request = UpdateDatasetTitleRequest {
    dataset_id: "<dataset-id>".to_string(),
    title: "Updated Rust API Dataset".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.update_dataset_title(request)
                             .await
                             .unwrap()
                             .into_inner();

// Do something with the response
println!("{:#?}", response);
// Create tonic/ArunaAPI request to update the description of a Dataset
let request = UpdateDatasetDescriptionRequest {
    dataset_id: "<dataset-id>".to_string(),
    description: "Updated with the gRPC Rust API client.".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.update_dataset_description(request)
                             .await
                             .unwrap()
                             .into_inner();

// Do something with the response
println!("{:#?}", response);
// Create tonic/ArunaAPI request to update the key-values associated with a Dataset
let request = UpdateDatasetKeyValuesRequest {
    dataset_id: "<dataset-id>".to_string(),
    add_key_values: vec![], 
    remove_key_values: vec![]
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.update_dataset_key_values(request)
                             .await
                             .unwrap()
                             .into_inner();

// Do something with the response
println!("{:#?}", response);

Info

The data class can only be relaxed, never tightened: Confidential > Workspace > Private > Public

// Create tonic/ArunaAPI request to update the data class of a Dataset
let request = UpdateDatasetDataClassRequest {
    dataset_id: "<dataset-id>".to_string(),
    data_class: DataClass::Public as i32,
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.update_dataset_data_class(request)
                             .await
                             .unwrap()
                             .into_inner();

// Do something with the response
println!("{:#?}", response);
// Create tonic/ArunaAPI request to update the licenses of a Dataset
let request = UpdateDatasetLicensesRequest {
    dataset_id: "<dataset-id>".to_string(),
    metadata_license_tag: "CC0".to_string(),
    default_data_license_tag: "CC0".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.update_dataset_licenses(request)
                             .await
                             .unwrap()
                             .into_inner();

// Do something with the response
println!("{:#?}", response);
// Create tonic/ArunaAPI request to add an author to a Dataset
let request = UpdateDatasetAuthorsRequest {
    dataset_id: "<dataset-id>".to_string(),
    add_authors: vec![Author {
        first_name: "John".to_string(),
        last_name: "Doe".to_string(),
        email: "john.doe@example.com".to_string(),
        orcid: "0000-0002-1825-0097".to_string(),
        id: "<user-id-if-registered>".to_string(),
    }],
    remove_authors: vec![],
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.update_dataset_authors(request)
                             .await
                             .unwrap()
                             .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create gRPC request to update the name of a Dataset
request = UpdateDatasetNameRequest(
    dataset_id="<dataset-id>",
    name="updated-python-api-dataset"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetName(request=request)

# Do something with the response
print(f'{response}')
# Create gRPC request to update the title of a Dataset
request = UpdateDatasetTitleRequest(
    dataset_id="<dataset-id>",
    title="Updated Python API Dataset"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetTitle(request=request)

# Do something with the response
print(f'{response}')
# Create gRPC request to update the description of a Dataset
request = UpdateDatasetDescriptionRequest(
    dataset_id="<dataset-id>",
    description="Updated with the gRPC Python API client"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetDescription(request=request)

# Do something with the response
print(f'{response}')
# Create gRPC request to update the key-values associated with a Dataset
request = UpdateDatasetKeyValuesRequest(
    dataset_id="<dataset-id>",
    add_key_values=[],
    remove_key_values=[]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetKeyValues(request=request)

# Do something with the response
print(f'{response}')

Info

The data class can only be relaxed, never tightened: Confidential > Workspace > Private > Public

# Create gRPC request to relax the data class of a Dataset
request = UpdateDatasetDataClassRequest(
    dataset_id="<dataset-id>",
    data_class=DataClass.DATA_CLASS_PUBLIC
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetDataClass(request=request)

# Do something with the response
print(f'{response}')
# Create gRPC request to update the licenses of a Dataset
request = UpdateDatasetLicensesRequest(
    dataset_id="<dataset-id>",
    metadata_license_tag="CC0",
    default_data_license_tag="CC0"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetLicenses(request=request)

# Do something with the response
print(f'{response}')
# Create gRPC request to add an author to a Dataset
request = UpdateDatasetAuthorsRequest(
    dataset_id="<dataset-id>",
    add_authors=[Author(
        first_name="John",
        last_name="Doe",
        email="john.doe@example.com",
        orcid="0000-0002-1825-0097",
        id="<user-id-if-registered>"
    )],
    remove_authors=[]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetAuthors(request=request)

# Do something with the response
print(f'{response}')
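
Each JSON update example targets its own sub-path of the Dataset resource. As a compact overview, the sketch below derives those PATCH URLs from a field name. `dataset_update_url` is a hypothetical helper, the host is a placeholder, and the `authors` path is assumed to follow the same `/v2/datasets/{dataset-id}/...` pattern as the other updates.

```python
# Path suffixes as used by the curl update examples in this section.
UPDATE_FIELDS = (
    "name",
    "title",
    "description",
    "key_values",
    "data_class",
    "licenses",
    "authors",
)


def dataset_update_url(api_base: str, dataset_id: str, field: str) -> str:
    """Return the PATCH URL for one updatable Dataset field (hypothetical helper)."""
    if field not in UPDATE_FIELDS:
        raise ValueError(f"unknown update field: {field!r}")
    return f"{api_base.rstrip('/')}/v2/datasets/{dataset_id}/{field}"


for field in UPDATE_FIELDS:
    # Placeholder host and ID, not a real instance.
    print(dataset_update_url("https://aruna.example.org", "dataset-id-01", field))
```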

Snapshot Dataset

API examples of how to snapshot a Dataset, i.e. create an immutable clone of the Dataset and its underlying resources.

Required permissions

This request requires at least ADMIN permissions on the Dataset or one of its parent resources.

# Native JSON request to snapshot a Dataset
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
  -H 'Content-Type: application/json' \
  -X POST https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/snapshot
// Create tonic/ArunaAPI request to snapshot a Dataset
let request = SnapshotDatasetRequest {
    dataset_id: "<dataset-id>".to_string()
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.snapshot_dataset(request)
                                .await
                                .unwrap()
                                .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create gRPC request to snapshot a Dataset
request = SnapshotDatasetRequest(
    dataset_id="<dataset-id>"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.SnapshotDataset(request=request)

# Do something with the response
print(f'{response}')

Delete Dataset

API examples of how to delete a Dataset.

Info

Deletion does not remove the Dataset from the database, but sets the status of the Dataset and the underlying resources to "DELETED".

Required permissions

This request requires at least ADMIN permissions on the Dataset or one of its parent resources.

# Native JSON request to delete a Dataset
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X DELETE https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}
// Create tonic/ArunaAPI request to delete a Dataset
let request = DeleteDatasetRequest {
    dataset_id: "<dataset-id>".to_string()
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client.delete_dataset(request)
                                .await
                                .unwrap()
                                .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create gRPC request to delete a Dataset
request = DeleteDatasetRequest(
    dataset_id="<dataset-id>"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.DeleteDataset(request=request)

# Do something with the response
print(f'{response}')
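
For scripts that cannot use the gRPC client, the JSON snapshot and delete requests can also be prepared with Python's standard library. The sketch below only builds the request objects and does not send them, since both operations are hard to undo. `build_dataset_request` is a hypothetical helper and the host is a placeholder.

```python
import urllib.request


def build_dataset_request(method: str, url: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated request for a Dataset endpoint."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method=method,
    )


api_base = "https://aruna.example.org"  # placeholder host, not a real instance
dataset_id = "dataset-id-01"  # placeholder ID

snapshot_req = build_dataset_request(
    "POST", f"{api_base}/v2/datasets/{dataset_id}/snapshot", "AUTH_TOKEN"
)
delete_req = build_dataset_request(
    "DELETE", f"{api_base}/v2/datasets/{dataset_id}", "AUTH_TOKEN"
)
print(snapshot_req.get_method(), snapshot_req.full_url)
print(delete_req.get_method(), delete_req.full_url)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(delete_req) as response:
#     print(response.status)
```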