Quality Assurance

Quality assurance (QA) combines automated static analysis, code metrics, vulnerability scanning, and API-level integration testing to ensure the service functions as expected.

Code Analysis

We use GitHub CodeQL for static analysis of the source code. This process helps identify potential bugs, security vulnerabilities, and quality issues before they impact production.

  • Technology: GitHub CodeQL
  • Targets: The analysis runs on both the Python source code and the GitHub Actions workflow files themselves.
  • Checks: It uses a set of queries to find common vulnerabilities (e.g., injection flaws, insecure data handling), bugs, and code quality anti-patterns.
  • Execution: The analysis is performed by the on-demand code-scan-on-demand.yml workflow. Results and alerts are available directly in the repository's "Security" tab.
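A minimal sketch of what such an on-demand CodeQL workflow can look like. The trigger, job layout, and language matrix here are assumptions based on the description above; the actual code-scan-on-demand.yml may differ:

```yaml
name: Code Scan (On-Demand)

on:
  workflow_dispatch:   # run manually from the Actions tab

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # needed to publish alerts to the Security tab
    strategy:
      matrix:
        language: [python, actions]   # Python sources and the workflow files themselves
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
      - uses: github/codeql-action/analyze@v3
```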

Code Metrics

To ensure the long-term maintainability and readability of the code, we use the Radon library to generate code metrics.

  • Technology: Radon
  • Metrics:
    • Maintainability Index (MI): A score from 0-100 indicating how easy the code is to maintain (higher is better).
    • Cyclomatic Complexity (CC): Measures the number of independent paths through the code. A lower score indicates simpler, less complex code.
    • Lines of Code (LOC): Provides raw metrics on the size of the codebase.
  • Execution: The metrics are generated by the on-demand code-metrics-on-demand.yml workflow, which produces a downloadable code-metrics-report.txt artifact.
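The same three metrics can be reproduced locally with the Radon CLI. This is a sketch; the source directory (`src/`) is an assumption and the exact flags used by code-metrics-on-demand.yml may differ:

```shell
pip install radon

# Maintainability Index (0-100, higher is better), showing the score per file
radon mi -s src/

# Cyclomatic Complexity per function/method, with a repository-wide average
radon cc -s -a src/

# Raw size metrics: LOC, SLOC, comment and blank lines, with summed totals
radon raw -s src/
```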

Vulnerability Checks

We use Trivy to scan for known vulnerabilities in our dependencies and container image, ensuring the service is secure from common threats.

  • Technology: Trivy
  • Scans:
    1. Configuration Scan: Scans the Dockerfile and other repository files for security misconfigurations.
    2. Image Scan: Scans the final Docker image for known vulnerabilities (CVEs) with CRITICAL or HIGH severity in its OS packages and Python libraries.
  • Execution: The scan is performed by the on-demand vulnerability-scan-on-demand.yml workflow. It requires an image tag as input and uploads detailed JSON reports as build artifacts.
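Both scans can be approximated locally with the Trivy CLI. The image name and output path below are illustrative; the workflow's exact arguments may differ:

```shell
# 1. Configuration scan: check the Dockerfile and other repository files
#    for security misconfigurations
trivy config .

# 2. Image scan: report only CRITICAL/HIGH CVEs in OS packages and Python
#    libraries, writing a JSON report like the workflow's build artifact
trivy image --severity CRITICAL,HIGH \
  --format json --output trivy-report.json \
  my-service:latest
```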

Testing

The service's functionality is validated through API-level integration tests using Postman and its command-line runner, Newman.

  • Test Suite: The test cases are defined in the Postman collection located at tests/cross-dataset-discovery-api-tests.postman_collection.json.

  • Test Cases: The suite includes the following tests:

    • Health Check: Verifies that the /health endpoint is available and returns a 200 OK status, indicating the service and its dependencies are healthy.
    • Get User Access Token: Simulates a user login against the OIDC provider to acquire a valid JWT, which is required for all protected endpoints.
    • Perform Search - Valid Request: Executes a valid search against the /search/ endpoint and asserts that the response is successful (200 OK) and has the correct structure.
    • Perform Search - Bad Request: Sends a request with an invalid payload (an empty dataset_ids list) and asserts that the API correctly rejects it with a 400 Bad Request status.
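Assertions like these are written as Postman test scripts inside the collection. A sketch of what the health-check test might look like (the real scripts live in the collection file; the response field name checked here is an assumption):

```javascript
// Postman test script attached to the Health Check request
pm.test("Health endpoint returns 200 OK", function () {
    pm.response.to.have.status(200);
});

pm.test("Response reports a healthy service", function () {
    // "status" is an assumed field name; adjust to the actual /health payload
    const body = pm.response.json();
    pm.expect(body.status).to.eql("ok");
});
```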

How to Run Tests

The API tests are designed to be run automatically via a GitHub Actions workflow.

  • Workflow File: .github/workflows/test-on-demand.yml
  • Trigger: The workflow is triggered manually (workflow_dispatch) from the Actions tab in the GitHub repository.
  • Process:
    1. Navigate to the "Actions" tab and select the "Test Scenario (On-Demand)" workflow.
    2. Click "Run workflow". You will be prompted to enter a tag (e.g., main or a specific version like v1.0.0) to test against.
    3. The workflow checks out the specified version of the code.
    4. It then uses the official postman/newman Docker image to execute the test collection.
    5. All necessary environment variables (API base URL, credentials, etc.) are securely injected into the test run from the repository's secrets.
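The collection can also be run locally with the same postman/newman Docker image the workflow uses. The mounted path, base URL, and variable names below are illustrative; in CI the real values come from repository secrets:

```shell
docker run --rm \
  -v "$(pwd)/tests:/etc/newman" \
  postman/newman run cross-dataset-discovery-api-tests.postman_collection.json \
  --env-var "base_url=https://api.example.org" \
  --env-var "username=$TEST_USER" \
  --env-var "password=$TEST_PASSWORD"
```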

Expected Output

A successful test run will produce the following summary table in the GitHub Actions log:

┌─────────────────────────┬─────────────────────┬────────────────────┐
│                         │            executed │             failed │
├─────────────────────────┼─────────────────────┼────────────────────┤
│              iterations │                   1 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│                requests │                   4 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│            test-scripts │                   4 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│      prerequest-scripts │                   0 │                  0 │
├─────────────────────────┼─────────────────────┼────────────────────┤
│              assertions │                   4 │                  0 │
├─────────────────────────┴─────────────────────┴────────────────────┤
│ total run duration: 1674ms                                         │
├────────────────────────────────────────────────────────────────────┤
│ total data received: 6.9kB (approx)                                │
├────────────────────────────────────────────────────────────────────┤
│ average response time: 396ms [min: 147ms, max: 645ms, s.d.: 205ms] │
└────────────────────────────────────────────────────────────────────┘