DataGEMS Cross-Dataset Discovery Service v1

The Cross-Dataset Discovery Service provides a secure API for performing natural language search queries across a pre-indexed collection of datasets.


Terms of service: https://datagems.eu/terms
License: EUPL-1.2

Servers

Description: Default server path when running behind a reverse proxy
URL: /cdd
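
As an illustration of how the server path combines with the endpoint paths in this reference, here is a minimal sketch. The host `https://example.org` and the helper name `endpoint_url` are hypothetical; only the `/cdd` prefix and the `/health` and `/search/` paths come from this document.

```python
def endpoint_url(host: str, path: str, prefix: str = "/cdd") -> str:
    """Join a host, the reverse-proxy prefix, and an endpoint path.

    `host` is a placeholder for wherever the service is deployed.
    The trailing slash of `path` is preserved, since the search
    route is documented as `/search/`.
    """
    parts = [host.rstrip("/")]
    if prefix.strip("/"):
        parts.append(prefix.strip("/"))
    parts.append(path.lstrip("/"))
    return "/".join(parts)


# Hypothetical host, used only to show the resulting shape.
search_url = endpoint_url("https://example.org", "/search/")
health_url = endpoint_url("https://example.org", "/health")
```

When the service is reached directly rather than through the reverse proxy, passing `prefix=""` yields the unprefixed path.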

Monitoring


GET /health

Health Check

Description

Verifies the operational status of the API and its dependencies (Qdrant, TEI, Database). Returns 200 OK if all dependencies are healthy, and 424 Failed Dependency when communication with a dependency fails.

Response 200 OK

{
    "message": "All dependencies are healthy.",
    "status": "ok"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "message": {
            "example": "All dependencies are healthy.",
            "type": "string"
        },
        "status": {
            "example": "ok",
            "type": "string"
        }
    },
    "type": "object"
}

Response 424 Failed Dependency

{
    "code": 104,
    "error": "error communicating with underpinning service",
    "message": {
        "correlationId": "string",
        "payload": {},
        "source": "string",
        "statusCode": 0
    }
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "code": {
            "example": 104,
            "type": "integer"
        },
        "error": {
            "example": "error communicating with underpinning service",
            "type": "string"
        },
        "message": {
            "$ref": "#/components/schemas/FailedDependencyMessage"
        }
    },
    "type": "object"
}
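
The two documented outcomes of the health check can be told apart on the client side roughly as follows. This is a sketch using only the Python standard library; the function names and the `base_url` argument are hypothetical, and the fallback for undocumented status codes is an assumption.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def interpret_health(status_code: int, body: dict) -> Tuple[bool, str]:
    """Map a /health response to (healthy, human-readable summary)."""
    if status_code == 200 and body.get("status") == "ok":
        return True, body.get("message", "")
    if status_code == 424:
        # `message` follows the FailedDependencyMessage schema.
        detail = body.get("message") or {}
        source = detail.get("source", "unknown")
        error = body.get("error", "dependency failure")
        return False, f"{error} (source: {source})"
    # Any other status is not described in this reference.
    return False, f"unexpected status {status_code}"


def check_health(base_url: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Call GET {base_url}/health; base_url is supplied by the caller."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return interpret_health(resp.status, json.load(resp))
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; the body is still readable.
        try:
            body = json.load(err)
        except ValueError:
            body = {}
        return interpret_health(err.code, body)
```

Keeping `interpret_health` separate from the network call makes the status handling easy to test without a running service.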

Search

POST /search/

Perform a Search Query

Description

Submits a natural language query and returns a ranked list of relevant results. The search can optionally be restricted to specific datasets via dataset_ids. Results are always filtered by the user's access permissions.

Input parameters

Parameter In Type Default Nullable Description
OAuth2Bearer header string N/A No JWT token for authentication, obtained from the OIDC provider.

Request body

{
    "dataset_ids": [
        "string"
    ],
    "k": 0,
    "query": "string",
    "search_mode": "sparse"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the request body
{
    "properties": {
        "dataset_ids": {
            "description": "A list of dataset UUIDs to restrict the search to.",
            "items": {
                "type": "string"
            },
            "nullable": true,
            "type": "array"
        },
        "k": {
            "default": 5,
            "description": "The number of results to return.",
            "maximum": 100,
            "minimum": 1,
            "type": "integer"
        },
        "query": {
            "description": "The natural language query string.",
            "type": "string"
        },
        "search_mode": {
            "default": "sparse",
            "description": "The retrieval strategy to use.",
            "enum": [
                "sparse",
                "dense",
                "hybrid"
            ],
            "type": "string"
        }
    },
    "required": [
        "query"
    ],
    "type": "object"
}
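
The constraints in the request schema above (required `query`, `k` between 1 and 100 with default 5, `search_mode` limited to three values) can be checked on the client before sending. The helper below is a sketch; its name and its choice to raise `ValueError` are assumptions, not part of the service.

```python
from typing import Iterable, Optional

# Allowed values copied from the `search_mode` enum in the schema.
SEARCH_MODES = ("sparse", "dense", "hybrid")


def build_search_request(
    query: str,
    k: int = 5,
    search_mode: str = "sparse",
    dataset_ids: Optional[Iterable[str]] = None,
) -> dict:
    """Build a SearchRequest body, enforcing the documented schema limits."""
    if not isinstance(query, str):
        raise ValueError("query must be a string")
    if isinstance(k, bool) or not isinstance(k, int) or not 1 <= k <= 100:
        raise ValueError("k must be an integer between 1 and 100")
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"search_mode must be one of {SEARCH_MODES}")

    body = {"query": query, "k": k, "search_mode": search_mode}
    # dataset_ids is nullable; it is omitted here when no restriction is wanted.
    if dataset_ids is not None:
        body["dataset_ids"] = list(dataset_ids)
    return body
```

Validating locally avoids a round trip that would end in a 400 response, although the server remains the authority on what is accepted.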

Response 200 OK

{
    "query_time": 10.12,
    "results": [
        {
            "content": "string",
            "dataset_id": "string",
            "object_id": "string",
            "similarity": 10.12
        }
    ]
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "query_time": {
            "description": "The time taken to process the query in milliseconds.",
            "format": "float",
            "type": "number"
        },
        "results": {
            "items": {
                "$ref": "#/components/schemas/API_SearchResult"
            },
            "type": "array"
        }
    },
    "type": "object"
}
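
A client may find it convenient to turn the response body into typed objects. The dataclasses below mirror the SearchResponse and API_SearchResult schemas; the class and function names are hypothetical, and the `query_time_ms` field name is chosen here to reflect the documented unit.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SearchResult:
    content: str
    dataset_id: str
    object_id: str
    similarity: float


@dataclass(frozen=True)
class SearchResponse:
    query_time_ms: float  # `query_time` is documented as milliseconds
    results: List[SearchResult]


def parse_search_response(payload: dict) -> SearchResponse:
    """Convert a decoded 200 OK body into typed objects, keeping result order."""
    results = [
        SearchResult(
            content=item["content"],
            dataset_id=item["dataset_id"],
            object_id=item["object_id"],
            similarity=float(item["similarity"]),
        )
        for item in payload.get("results", [])
    ]
    return SearchResponse(query_time_ms=float(payload["query_time"]), results=results)
```

The order of `results` is left exactly as returned, since the service describes the list as ranked.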

Response 400 Bad Request

{
    "code": 102,
    "error": "Validation Error",
    "message": [
        {
            "Key": "string",
            "Value": [
                "string"
            ]
        }
    ]
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "code": {
            "example": 102,
            "type": "integer"
        },
        "error": {
            "example": "Validation Error",
            "type": "string"
        },
        "message": {
            "items": {
                "$ref": "#/components/schemas/ValidationErrorDetail"
            },
            "type": "array"
        }
    },
    "type": "object"
}
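
The `message` array of a 400 response pairs each `Key` with a list of `Value` strings. A sketch of collapsing it into a dictionary for display or logging follows; the function name is hypothetical, and merging repeated keys is an assumption about how a client might want to treat them.

```python
from typing import Dict, List


def validation_errors_by_key(body: dict) -> Dict[str, List[str]]:
    """Group ValidationErrorDetail entries by their Key."""
    errors: Dict[str, List[str]] = {}
    for detail in body.get("message", []):
        # If the same Key appears more than once, its messages are merged.
        errors.setdefault(detail["Key"], []).extend(detail["Value"])
    return errors
```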

Response 401 Unauthorized

Response 403 Forbidden

Response 424 Failed Dependency

{
    "code": 104,
    "error": "error communicating with underpinning service",
    "message": {
        "correlationId": "string",
        "payload": {},
        "source": "string",
        "statusCode": 0
    }
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "code": {
            "example": 104,
            "type": "integer"
        },
        "error": {
            "example": "error communicating with underpinning service",
            "type": "string"
        },
        "message": {
            "$ref": "#/components/schemas/FailedDependencyMessage"
        }
    },
    "type": "object"
}
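
The non-200 responses listed for this endpoint can be mapped to a single client-side exception, sketched below. The exception class and function name are hypothetical. The wording for 401 and 403 follows the general HTTP meaning of those codes, because this reference does not describe their bodies.

```python
from typing import Optional


class SearchApiError(Exception):
    """Hypothetical client-side error carrying the HTTP status."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def error_for_status(status: int, body: Optional[dict] = None) -> SearchApiError:
    """Build an exception for a documented error status of POST /search/."""
    body = body or {}
    if status == 400:
        return SearchApiError(status, body.get("error", "Validation Error"))
    if status == 401:
        return SearchApiError(status, "missing or invalid bearer token")
    if status == 403:
        return SearchApiError(status, "access forbidden for this token")
    if status == 424:
        # `message` follows the FailedDependencyMessage schema.
        message = body.get("message") or {}
        source = message.get("source", "unknown")
        correlation_id = message.get("correlationId")
        return SearchApiError(
            status,
            f"dependency failure in {source} (correlationId: {correlation_id})",
        )
    return SearchApiError(status, "unexpected response")
```

Including `correlationId` in the error text may help when reporting a 424 to the service operators.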

Schemas

API_SearchResult

Name Type
content string
dataset_id string
object_id string
similarity number(float)

FailedDependencyMessage

Name Type
correlationId string | null
payload
source string
statusCode integer

FailedDependencyResponse

Name Type
code integer
error string
message FailedDependencyMessage

SearchRequest

Name Type
dataset_ids Array<string>
k integer
query string
search_mode string

SearchResponse

Name Type
query_time number(float)
results Array<API_SearchResult>

ValidationErrorDetail

Name Type
Key string
Value Array<string>

ValidationErrorResponse

Name Type
code integer
error string
message Array<ValidationErrorDetail>

Security schemes

Name Type Scheme Description
OAuth2Bearer http bearer JWT token for authentication, obtained from the OIDC provider.
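
With an HTTP bearer scheme, the JWT is sent in the `Authorization` header. The sketch below builds such a request with the Python standard library. The function name, the URL argument, and the token value are placeholders, and the `Content-Type: application/json` header is an assumption based on the JSON request body. How the token is obtained from the OIDC provider is outside the scope of this example.

```python
import json
import urllib.request


def build_search_http_request(
    search_url: str, token: str, body: dict
) -> urllib.request.Request:
    """Prepare an authenticated POST for the search endpoint without sending it."""
    return urllib.request.Request(
        search_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # HTTP bearer scheme: "Bearer " followed by the JWT.
            "Authorization": f"Bearer {token}",
            # Assumed content type for the JSON request body.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The prepared request can be sent with `urllib.request.urlopen(request)`, which raises `urllib.error.HTTPError` for non-2xx responses such as 401 or 403.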

Tags

Name Description
Search Endpoints for performing search operations.
Monitoring Endpoints for monitoring the service health.