
DataGEMS In-Dataset Discovery Service 0.3.0

The In-Dataset Discovery Service provides a secure API for performing natural language exploration queries on structured or unstructured data within datasets. It supports geospatial queries and text-to-SQL conversion using LLM capabilities.


Terms of service: https://datagems.eu/terms
License: EUPL-1.2

Servers

Description         URL
Development server  https://datagems-dev.scayle.es/in-dataset-discovery

Monitoring


GET /

Root

Description

Returns the API status.

Response 200 OK

{
    "message": "In Data Exploration API is running.",
    "status": "success"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "message": {
            "example": "In Data Exploration API is running.",
            "type": "string"
        },
        "status": {
            "example": "success",
            "type": "string"
        }
    },
    "type": "object"
}

GET /health

Health Check

Description

Checks the availability of the API service. Returns a 200 OK if the service is healthy. This endpoint does not require authentication.

Response 200 OK

{
    "status": "ok"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "status": {
            "example": "ok",
            "type": "string"
        }
    },
    "type": "object"
}
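
As an illustration of how a client might poll this endpoint, here is a minimal sketch using only the Python standard library. The base URL is the development server listed under Servers; the function names and the timeout value are hypothetical choices for this example, not part of the service.

```python
import json
import urllib.request

# Development server from the "Servers" section.
BASE_URL = "https://datagems-dev.scayle.es/in-dataset-discovery"


def is_healthy(body: bytes) -> bool:
    """Return True if a /health response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON (or not decodable text): treat as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Call GET /health (no authentication required) and interpret the body."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:
        # Covers connection failures, HTTP error statuses, and timeouts.
        return False
```

Keeping the body check in `is_healthy` separate from the network call makes it possible to test the parsing logic without reaching the server.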

Text-to-Geo


GET /text2geo

Geospatial Query

Description

Processes a natural language geospatial question, identifies the place using Wikidata, generates an OverpassQL query, and returns geospatial results including coordinates, GeoJSON data, and bounding boxes.

Input parameters

Parameter     In      Type    Default  Nullable  Description
OAuth2Bearer  header  string  N/A      No        JWT token for authentication, obtained from the OIDC provider.
question      query   string           No        The geospatial query question to be processed.

Response 200 OK

{
    "most_relevant_wikidata": {
        "place": "string",
        "reasoning": "string",
        "wiki_id": "string",
        "wiki_properties": {
            "aliases": "string",
            "coordinate location": "string",
            "country": "string",
            "description": "string",
            "found_osm_json": true,
            "instance of": "string",
            "label": "string",
            "located in the administrative territorial entity": "string",
            "part of": "string"
        }
    },
    "oql": {
        "OQL": "string",
        "reasoning": "string"
    },
    "place": "string",
    "results": {
        "bounds": {
            "maxlat": 10.12,
            "maxlon": 10.12,
            "minlat": 10.12,
            "minlon": 10.12
        },
        "center": [
            10.12
        ],
        "geojson_data": {},
        "points": [
            {
                "lat": "string",
                "lon": "string"
            }
        ]
    }
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "most_relevant_wikidata": {
            "$ref": "#/components/schemas/MostRelevantWikidata",
            "description": "The most relevant Wikidata entity related to the query."
        },
        "oql": {
            "$ref": "#/components/schemas/OQLResponse",
            "description": "The OverpassQL query generated from the geospatial question."
        },
        "place": {
            "description": "The place identified in the geospatial query.",
            "type": "string"
        },
        "results": {
            "$ref": "#/components/schemas/GeospatialResults",
            "description": "The results of the OverpassQL query, including points, bounding box, multipolygons, and centroid."
        }
    },
    "required": [
        "place",
        "most_relevant_wikidata",
        "oql",
        "results"
    ],
    "type": "object"
}
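
A request to this endpoint could be assembled as in the sketch below. It assumes the JWT is sent in the standard `Authorization: Bearer <token>` header, which is how the HTTP bearer scheme is normally transmitted; the function name and the `Accept` header are illustrative additions, and the token must be obtained from the OIDC provider separately.

```python
import urllib.parse
import urllib.request

# Development server from the "Servers" section.
BASE_URL = "https://datagems-dev.scayle.es/in-dataset-discovery"


def build_text2geo_request(
    question: str, token: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a GET /text2geo request."""
    # URL-encode the natural language question as the `question` query parameter.
    query = urllib.parse.urlencode({"question": question})
    return urllib.request.Request(
        f"{base_url}/text2geo?{query}",
        headers={
            # Assumption: bearer token carried in the Authorization header.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

The returned object can be passed to `urllib.request.urlopen`; the JSON body of a 200 response should then contain the `place`, `most_relevant_wikidata`, `oql`, and `results` fields listed in the schema above.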

Response 422 Unprocessable Content

{
    "code": 102,
    "error": "validation error",
    "message": [
        {
            "Key": "question",
            "Value": [
                "Field required"
            ]
        }
    ]
}
Schema of the response body
{
    "properties": {
        "code": {
            "example": 102,
            "type": "integer"
        },
        "error": {
            "example": "validation error",
            "type": "string"
        },
        "message": {
            "items": {
                "$ref": "#/components/schemas/ValidationErrorDetail"
            },
            "type": "array"
        }
    },
    "required": [
        "code",
        "error",
        "message"
    ],
    "type": "object"
}
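
Because the `message` array pairs each offending field (`Key`) with its error strings (`Value`), a client may find it convenient to flatten it into a mapping. The helper below is a hypothetical sketch of that step, applied to an already decoded 422 body.

```python
from typing import Dict, List


def summarize_validation_errors(body: dict) -> Dict[str, List[str]]:
    """Map each field name in a 422 body to its list of error messages."""
    summary: Dict[str, List[str]] = {}
    for detail in body.get("message", []):
        # Each item follows ValidationErrorDetail: {"Key": str, "Value": [str, ...]}.
        summary.setdefault(detail["Key"], []).extend(detail["Value"])
    return summary
```

For the example response above, this yields a single entry mapping `question` to `["Field required"]`.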

Response 500 Internal Server Error

{
    "code": 0,
    "error": "string"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "code": {
            "description": "HTTP status code",
            "type": "integer"
        },
        "error": {
            "description": "Error message",
            "type": "string"
        }
    },
    "required": [
        "code",
        "error"
    ],
    "type": "object"
}

Text-to-SQL


POST /text2sql

Text-to-SQL Query

Description

Takes a natural language question and database connection information, generates a SQL query using LLM capabilities, and executes it to return results.

Input parameters

Parameter     In      Type    Default  Nullable  Description
OAuth2Bearer  header  string  N/A      No        JWT token for authentication, obtained from the OIDC provider.

Request body

{
    "parameters": {
        "db_info": {
            "db_database": "string",
            "db_host": "string",
            "db_pass": "string",
            "db_port": 0,
            "db_schema": "string",
            "db_username": "string"
        },
        "results": {
            "points": [
                [
                    10.12
                ]
            ]
        }
    },
    "question": "What are the average mean temperatures in the coordinates (lon, lat) in year 2020?"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the request body
{
    "properties": {
        "parameters": {
            "$ref": "#/components/schemas/SQLQueryParameters",
            "description": "The parameters for the SQL query, including results."
        },
        "question": {
            "description": "The text question to be converted to SQL.",
            "example": "What are the average mean temperatures in the coordinates (lon, lat) in year 2020?",
            "type": "string"
        }
    },
    "required": [
        "question",
        "parameters"
    ],
    "type": "object"
}
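
The request body can be put together as in the following sketch. It assumes the body is sent as JSON with a `Content-Type: application/json` header and the JWT in the standard `Authorization: Bearer <token>` header; the function name is hypothetical, and the `db_info` keys are the ones shown in the example above.

```python
import json
import urllib.request
from typing import Dict, List

# Development server from the "Servers" section.
BASE_URL = "https://datagems-dev.scayle.es/in-dataset-discovery"


def build_text2sql_request(
    question: str,
    db_info: Dict[str, object],
    points: List[List[float]],
    token: str,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST /text2sql request."""
    body = {
        "question": question,
        "parameters": {
            # Keys: db_database, db_host, db_pass, db_port, db_schema, db_username.
            "db_info": db_info,
            # ResultsModel: an array of coordinate arrays.
            "results": {"points": points},
        },
    }
    return urllib.request.Request(
        f"{base_url}/text2sql",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumptions: JSON body and bearer token in the Authorization header.
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Since `db_info` carries a database password, a caller would probably want to load it from the environment or a secret store rather than hard-coding it.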

Response 200 OK

{
    "input_params": [
        {
            "lat": 10.12,
            "lon": 10.12
        }
    ],
    "message": "string",
    "model_name": "string",
    "output_params": {
        "coordinates": [
            "string"
        ]
    },
    "params": {},
    "question": "string",
    "reasoning": "string",
    "sql_pattern": "string",
    "sql_query": "string",
    "sql_results": {
        "data": [
            {}
        ],
        "status": "string"
    },
    "status": "string"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "input_params": {
            "description": "Input parameters with coordinates.",
            "items": {
                "$ref": "#/components/schemas/InputParam"
            },
            "type": "array"
        },
        "message": {
            "description": "Message describing the operation result.",
            "type": "string"
        },
        "model_name": {
            "description": "The model name used for SQL generation.",
            "type": "string"
        },
        "output_params": {
            "$ref": "#/components/schemas/OutputParams",
            "description": "Output parameter definitions."
        },
        "params": {
            "additionalProperties": true,
            "description": "Parameters used in the query.",
            "type": "object"
        },
        "question": {
            "description": "The original question that was converted to SQL.",
            "type": "string"
        },
        "reasoning": {
            "description": "Reasoning behind the SQL query generation.",
            "type": "string"
        },
        "sql_pattern": {
            "description": "The SQL pattern/template generated.",
            "type": "string"
        },
        "sql_query": {
            "description": "The final SQL query with parameters filled in.",
            "type": "string"
        },
        "sql_results": {
            "$ref": "#/components/schemas/SQLResults",
            "description": "Results from executing the SQL query."
        },
        "status": {
            "description": "Status of the operation.",
            "type": "string"
        }
    },
    "required": [
        "status",
        "message",
        "params",
        "question",
        "model_name",
        "sql_pattern",
        "input_params",
        "output_params",
        "reasoning",
        "sql_query",
        "sql_results"
    ],
    "type": "object"
}
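
Since every field in this schema is required, a client can check a decoded response for completeness before using it. The sketch below shows one way to do that and to pull out the generated SQL and the result rows; the helper names are hypothetical.

```python
from typing import List, Tuple

# The fields listed under "required" in the response schema above.
REQUIRED_FIELDS = (
    "status", "message", "params", "question", "model_name", "sql_pattern",
    "input_params", "output_params", "reasoning", "sql_query", "sql_results",
)


def missing_fields(response: dict) -> List[str]:
    """Return the required fields that are absent from a decoded response."""
    return [name for name in REQUIRED_FIELDS if name not in response]


def extract_query_and_rows(response: dict) -> Tuple[str, list]:
    """Return (sql_query, rows) or raise ValueError if the response is incomplete."""
    missing = missing_fields(response)
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    # SQLResults holds the rows under "data" next to its own "status".
    return response["sql_query"], response["sql_results"]["data"]
```

Inspecting `sql_query` and `reasoning` before trusting `sql_results` is a reasonable habit, given that the query is generated by an LLM.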

Response 422 Unprocessable Content

{
    "code": 102,
    "error": "validation error",
    "message": [
        {
            "Key": "question",
            "Value": [
                "Field required"
            ]
        },
        {
            "Key": "parameters.results.points",
            "Value": [
                "Field required"
            ]
        }
    ]
}
Schema of the response body
{
    "properties": {
        "code": {
            "example": 102,
            "type": "integer"
        },
        "error": {
            "example": "validation error",
            "type": "string"
        },
        "message": {
            "items": {
                "$ref": "#/components/schemas/ValidationErrorDetail"
            },
            "type": "array"
        }
    },
    "required": [
        "code",
        "error",
        "message"
    ],
    "type": "object"
}

Response 500 Internal Server Error

{
    "code": 0,
    "error": "string"
}
⚠️ This example has been generated automatically from the schema and it is not accurate. Refer to the schema for more information.

Schema of the response body
{
    "properties": {
        "code": {
            "description": "HTTP status code",
            "type": "integer"
        },
        "error": {
            "description": "Error message",
            "type": "string"
        }
    },
    "required": [
        "code",
        "error"
    ],
    "type": "object"
}

Schemas

Bounds

Name Type
maxlat number(float)
maxlon number(float)
minlat number(float)
minlon number(float)

ErrorResponse

Name Type
code integer
error string

FailedDependencyMessage

Name Type
correlationId string | null
payload
source string
statusCode integer

FailedDependencyResponse

Name Type
code integer
error string
message FailedDependencyMessage

GeospatialResponse

Name Type
most_relevant_wikidata MostRelevantWikidata
oql OQLResponse
place string
results GeospatialResults

GeospatialResults

Name Type
bounds Bounds
center Array<number(float)>
geojson_data object
points Array<Point>

InputParam

Name Type
lat number(float)
lon number(float)

MostRelevantWikidata

Name Type
place string
reasoning string
wiki_id string
wiki_properties WikiProperties

OQLResponse

Name Type
OQL string
reasoning string

OutputParams

Name Type
coordinates Array<string>

Point

Name Type
lat string
lon string

ResultsModel

Name Type
points Array<Array<number(float)>>
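
Note that `Point` (returned by /text2geo) stores `lat` and `lon` as strings, while `ResultsModel` (sent to /text2sql) expects arrays of floats. A conversion between the two might look like the sketch below. The ordering of each pair is an assumption: this reference does not state it, and the default of longitude first is only suggested by the "(lon, lat)" wording in the example question, so it should be confirmed against the service before relying on it.

```python
from typing import Dict, List


def points_to_pairs(points: List[Dict[str, str]], lon_first: bool = True) -> List[List[float]]:
    """Convert Point objects (string lat/lon) into numeric coordinate pairs."""
    pairs: List[List[float]] = []
    for point in points:
        # Point fields are strings in the schema, so parse them to floats.
        lat = float(point["lat"])
        lon = float(point["lon"])
        # Assumption: [lon, lat] order by default; pass lon_first=False for [lat, lon].
        pairs.append([lon, lat] if lon_first else [lat, lon])
    return pairs
```

The result can be placed under `parameters.results.points` in a /text2sql request body.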

SQLQueryParameters

Name Type
db_info Properties: db_database, db_host, db_pass, db_port, db_schema, db_username
results ResultsModel

SQLResults

Name Type
data Array<object>
status string

Text2SQLQuery

Name Type
parameters SQLQueryParameters
question string

Text2SQLResponse

Name Type
input_params Array<InputParam>
message string
model_name string
output_params OutputParams
params object
question string
reasoning string
sql_pattern string
sql_query string
sql_results SQLResults
status string

ValidationErrorDetail

Name Type
Key string
Value Array<string>

ValidationErrorResponse

Name Type
code integer
error string
message Array<ValidationErrorDetail>

WikiProperties

Name                                              Type
aliases                                           string
coordinate location                               string
country                                           string
description                                       string
found_osm_json                                    boolean
instance of                                       string
label                                             string
located in the administrative territorial entity  string
part of                                           string

Security schemes

Name          Type  Scheme  Description
OAuth2Bearer  http  bearer  JWT token for authentication, obtained from the OIDC provider.

Tags

Name         Description
Text-to-Geo  Endpoints for performing geospatial queries using natural language.
Text-to-SQL  Endpoints for converting natural language questions to SQL queries.
Monitoring   Endpoints for monitoring the service health.