PollyKG

The PollyKG class provides an interface to the Polly Knowledge Graph (KG) API. It lets you submit and manage Cypher queries, track their status, retrieve or download their results, and inspect the summary and schema of a graph. It is intended for users who need to query and explore graph-based datasets hosted on Polly.

Parameters:

  • token (str, default: None ) –

    Authentication token from Polly.

Usage

from polly.polly_kg import PollyKG

kg = PollyKG(token)

run_query

run_query(query, query_type='CYPHER')

Execute a graph query on the specified KG version.

This function submits the query to Polly KG, tracks its status, and waits until it completes. Once complete, it returns the result.

Parameters:

  • query (str) –

    The query string to be executed.

  • query_type (str, default: 'CYPHER' ) –

    The query language type. Accepts "CYPHER".

Returns:

  • dict ( dict ) –

    Query result in parsed JSON format.

Raises:

  • InvalidParameterException

    If query is empty.

  • QueryFailedException

    If query execution fails.

submit_query

submit_query(query, query_type='CYPHER')

Submit a graph query on the specified KG version.

This function submits the query to Polly KG and returns its query ID without waiting for the query to complete.

Parameters:

  • query (str) –

    The query string to be executed.

  • query_type (str, default: 'CYPHER' ) –

    The query language type. Accepts "CYPHER".

Returns:

  • query_id ( str ) –

    Unique identifier for the submitted query.

Raises:

  • InvalidParameterException

    If query is empty.

  • QueryFailedException

    If query execution fails.

get_query_status

get_query_status(query_id)

Fetch the status of a submitted query.

Parameters:

  • query_id (str) –

    Unique ID of the submitted query.

Returns:

  • dict ( dict ) –

    A dictionary containing the current status of the query (e.g., "IN_PROGRESS", "COMPLETED").

Raises:

  • InvalidParameterException

    If query_id is not provided.

get_query_results

get_query_results(query_id)

Get the results of a query by its ID.

Parameters:

  • query_id (str) –

    The ID of the query whose results you want to get.

Returns:

  • dict ( dict ) –

    Query result in parsed JSON format.

Raises:

  • InvalidParameterException

    If the query_id is empty or None.

  • QueryFailedException

    If query execution failed.

  • RequestFailureException

    If the request fails due to an unexpected error.
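
The three methods above (submit_query, get_query_status, get_query_results) can be combined into a manual polling loop, which is roughly what run_query is described as doing. The following is a minimal sketch, not the library's implementation. It assumes the status dictionary carries a "status" key, as in the example output on this page, and that "COMPLETED", "FAILED", and "CANCELLED" are the terminal statuses. The helper name wait_for_results, the poll interval, and the timeout are hypothetical and not part of polly-python.

```python
import time

# Statuses after which polling should stop. These values are taken from
# the statuses listed on this page; treating them as terminal is an assumption.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_results(kg, query, poll_seconds=5.0, timeout_seconds=600.0):
    """Submit a query, poll until it finishes, and return its results.

    `kg` is any object exposing submit_query, get_query_status and
    get_query_results as documented above. This helper is illustrative
    and is not part of polly-python.
    """
    query_id = kg.submit_query(query)
    deadline = time.monotonic() + timeout_seconds

    while True:
        status = kg.get_query_status(query_id).get("status")
        if status in TERMINAL_STATUSES:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Query {query_id} still {status} after timeout")
        time.sleep(poll_seconds)

    if status != "COMPLETED":
        raise RuntimeError(f"Query {query_id} ended with status {status}")
    return kg.get_query_results(query_id)
```

With a PollyKG instance named kg, this could be called as wait_for_results(kg, "MATCH (n) RETURN count(*);"). Polling manually is mainly useful when the caller wants its own timeout or wants to do other work between status checks; otherwise run_query covers the same flow.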

download_query_results

download_query_results(query_id, folder='./results')

Download the results of a query by its ID. Results are saved as <query_id>.json inside the target folder (./results by default). If the folder does not exist, it is created.

Parameters:

  • query_id (str) –

    The ID of the query whose results you want to download.

  • folder (str, default: './results' ) –

    The directory where the results should be saved.

Returns:

  • str ( str ) –

    Path to the downloaded results.

Raises:

  • InvalidParameterException

    If the query_id is empty or None.

  • QueryFailedException

    If query execution failed.

  • RequestFailureException

    If the request fails due to an unexpected error.
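
Once a results file has been downloaded, it can be read back with the standard library. The sketch below assumes the file is named <query_id>.json inside the chosen folder, as described above, and that its content is JSON. The helper name load_downloaded_results is hypothetical.

```python
import json
from pathlib import Path


def load_downloaded_results(query_id, folder="./results"):
    """Read back a results file written by download_query_results.

    Assumes the file is named <query_id>.json inside `folder` and
    contains JSON. Illustrative helper, not part of polly-python.
    """
    path = Path(folder) / f"{query_id}.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Since download_query_results is documented as returning the path to the downloaded file, that return value could also be opened directly instead of rebuilding the path.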

get_summary

get_summary()

Retrieve a summary of the Polly Knowledge Graph.

Returns:

  • dict ( dict ) –

    A dictionary containing summary information about the graph, such as node counts, edge counts, and other metadata.

Raises:

  • ResourceNotFoundError

    Raised when the specified graph summary does not exist.

  • AccessDeniedError

    Raised when the user does not have permission to access the graph summary.

  • RequestFailureException

    Raised when the request fails due to an unexpected error.

get_schema

get_schema()

Retrieve the schema of the Polly Knowledge Graph.

Returns:

  • dict ( dict ) –

    A dictionary containing schema information about the graph, such as node types, relationship types, and other metadata with descriptions.

Raises:

  • ResourceNotFoundError

    Raised when the specified graph schema does not exist.

  • AccessDeniedError

    Raised when the user does not have permission to access the graph schema.

  • RequestFailureException

    Raised when the request fails due to an unexpected error.

get_all_queries

get_all_queries(status='IN_PROGRESS', page_size=100, next_token=None, instance_id=None)

Retrieve a list of queries for the specified KG version.

This function fetches queries based on their status, allowing you to filter and paginate through query results. It supports filtering by status and optionally by instance_id.

Parameters:

  • status (str, default: 'IN_PROGRESS' ) –

    Filter queries by status. Accepts "IN_PROGRESS", "COMPLETED", "FAILED", "QUEUED", or "CANCELLED". Defaults to "IN_PROGRESS".

  • page_size (int, default: 100 ) –

    Number of queries to return per page. Maximum 100. Defaults to 100.

  • next_token (str, default: None ) –

    Pagination token for retrieving the next page of results. Obtained from the 'meta' section of the previous response. Defaults to None.

  • instance_id (str, default: None ) –

    Filter queries by specific instance ID. Defaults to None.

Returns:

  • dict ( dict ) –

    A dictionary containing query data and pagination metadata:

      - data (list): List of query objects with attributes including query_id, query_string, status, created_at, and metadata.
      - meta (dict): Pagination metadata including next_token if more results exist.

    If no queries are found, returns a message string.

Raises:

  • BadRequestError

    If invalid parameters are provided.

  • UnauthorizedException

    If authentication fails.

  • RequestFailureException

    If the request fails due to an unexpected error.
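
The next_token mechanism described above lends itself to a generator that walks every page. This is a sketch under the assumptions stated in this reference: each page is a dictionary with "data" and "meta" keys, "meta" contains "next_token" only when more results exist, and a non-dictionary value (the message string) means there is nothing to return. The name iter_queries is hypothetical.

```python
def iter_queries(kg, status="IN_PROGRESS", page_size=100, instance_id=None):
    """Yield every query object matching `status`, following next_token.

    `kg` is any object exposing get_all_queries as documented above.
    Illustrative helper, not part of polly-python.
    """
    next_token = None
    while True:
        page = kg.get_all_queries(
            status=status,
            page_size=page_size,
            next_token=next_token,
            instance_id=instance_id,
        )
        # A message string is returned when no queries are found, so
        # anything that is not a dict ends the iteration.
        if not isinstance(page, dict):
            return
        yield from page.get("data", [])
        next_token = page.get("meta", {}).get("next_token")
        if not next_token:
            return
```

For example, failed_ids = [q["query_id"] for q in iter_queries(kg, status="FAILED")] would collect the IDs of all failed queries across pages.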

cancel_query

cancel_query(query_id)

Cancel a running or queued query.

This function attempts to cancel a query that is currently in "IN_PROGRESS" or "QUEUED" status. Once cancelled, the query status will be updated to "CANCELLED" and it will stop execution.

Parameters:

  • query_id (str) –

    The unique ID of the query to cancel. Must be a non-empty string.

Returns:

  • dict ( dict ) –

    A dictionary containing the cancellation response with updated query status.

Raises:

  • InvalidParameterException

    If query_id is None, empty, or not a valid string.

  • ResourceNotFoundError

    If the specified query_id does not exist.

  • BadRequestError

    If the query cannot be cancelled (e.g., already completed or failed).

  • UnauthorizedException

    If authentication fails.

  • RequestFailureException

    If the request fails due to an unexpected error.
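
Because cancel_query is documented as raising BadRequestError for queries that have already completed or failed, a caller may want to check the status first. The sketch below assumes the status dictionary has a "status" key, as in the example output on this page; the helper name cancel_if_running is hypothetical.

```python
# Statuses in which a query can be cancelled, per the description above.
CANCELLABLE_STATUSES = {"IN_PROGRESS", "QUEUED"}


def cancel_if_running(kg, query_id):
    """Cancel `query_id` only if it is still queued or running.

    Returns the cancel_query response, or None when the query had
    already reached another status and no cancellation was attempted.
    Illustrative helper, not part of polly-python.
    """
    status = kg.get_query_status(query_id).get("status")
    if status not in CANCELLABLE_STATUSES:
        return None
    return kg.cancel_query(query_id)
```

The check does not remove the need for error handling: a query can still finish between the status check and the cancel call, in which case the documented exceptions apply.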

get_all_kgs

get_all_kgs(include_unpublished=False, include_instances=False, include_terminated=False)

Retrieve all available Knowledge Graphs.

This function fetches a list of all available Knowledge Graphs (KGs) that the user has access to, including their metadata such as kg_id, kg_name, kg_description, version_id, created_at timestamp, and published status.

Parameters:

  • include_unpublished (bool, default: False ) –

    Include unpublished knowledge graphs. Defaults to False.

  • include_instances (bool, default: False ) –

    Include instance information for each KG. When True, each KG will include an 'instances' array with details about available instances (instance_id, instance_type, CPU/RAM, etc.). Defaults to False.

  • include_terminated (bool, default: False ) –

    Include terminated instances in the response. Only applies when include_instances is True. Defaults to False.

Returns:

  • list ( list ) –

    A list of dictionaries, where each dictionary contains:

      - kg_id (str): Unique identifier for the knowledge graph
      - kg_name (str): Name of the knowledge graph
      - kg_description (str): Description of the knowledge graph
      - version_id (int): Current version ID
      - created_at (int): Creation timestamp (Unix epoch in milliseconds)
      - published (bool): Whether the KG is published
      - instances (list, optional): Available when include_instances=True. Each instance contains:
          - instance_id (str): Unique identifier for the instance
          - instance_type (str): Type/size of instance (e.g., "t3.medium")
          - is_terminated (bool): Whether the instance is terminated
          - default_instance (bool): Whether this is the default instance
          - org_id (int): Organization ID
          - created_at (int): Instance creation timestamp

Raises:

  • ResourceNotFoundError

    Raised when no knowledge graphs are found.

  • AccessDeniedError

    Raised when the user does not have permission to access the KGs.

  • RequestFailureException

    Raised when the request fails due to an unexpected error.

Example

kg = PollyKG()

# Get only published KGs
kgs = kg.get_all_kgs()
for kg_info in kgs:
    print(f"{kg_info['kg_name']} (ID: {kg_info['kg_id']})")
base kg v3 (ID: 14_base_kg_v3)

# Get all KGs including unpublished ones with instance details
kgs_with_instances = kg.get_all_kgs(
    include_unpublished=True,
    include_instances=True,
    include_terminated=True
)
for kg_info in kgs_with_instances:
    print(f"{kg_info['kg_name']} - {len(kg_info.get('instances', []))} instances")

Examples

The PollyKG class of polly-python can be initialised as follows:

import os
from polly.auth import Polly
from polly.polly_kg import PollyKG
token = os.environ['POLLY_REFRESH_TOKEN']
Polly.auth(token)
kg = PollyKG()

run_query()

results = kg.run_query("MATCH (n) RETURN count(*);")
Query submitted successfully. Query ID: 6e7723bf-5019-45de-91f5-cc515e99d827.

print(results)
{'results': [{'count(*)': 2178708}]}
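
If results follow the shape printed above (a "results" list of row dictionaries keyed by the RETURN expression), a single value can be pulled out with a small helper. That shape is an assumption based on this one example, and first_value is a hypothetical name.

```python
def first_value(result, column):
    """Return `column` from the first row of a query result, or None.

    Assumes the {'results': [{column: value, ...}, ...]} shape shown in
    the example output above. Illustrative helper, not part of polly-python.
    """
    rows = result.get("results", [])
    if not rows:
        return None
    return rows[0].get(column)
```

For the query above, first_value(results, "count(*)") would return the node count as an integer.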

submit_query()

query_id = kg.submit_query("MATCH (n) RETURN count(*);")
6e7723bf-5019-45de-91f5-cc515e99d827

get_query_status()

kg.get_query_status(query_id)
{'status': 'COMPLETED'}

get_query_results()

results = kg.get_query_results(query_id)
{'results': [{'count(*)': 2178708}]}

download_query_results()

kg.download_query_results(query_id, folder="/downloads")
Results downloaded successfully to /downloads/6e7723bf-5019-45de-91f5-cc515e99d827.json

get_summary()

kg.get_summary()
{
    "data": {
        "type": "kg_summary",
        "attributes": {
            "metadata": {
                "name": "summary",
                "kg_id": "demo_kg",
                "kg_version": "1",
                "last_updated": "2025-06-25T07:28:55Z",
                "computed_at": "2025-06-25T07:28:55Z"
            },
            "node_counts": {
                "Tissue": 113,
                "Go": 47995,
                "Disease": 29976,
                "Phenotype": 29870
            },
            "edge_counts": {
                "is_a_go": 40264,
                "part_of_go": 6737,
                "regulates_go": 3132,
                "has_part_go": 636
            },
            "totals": {
                "total_nodes": 107954,
                "total_edges": 50769,
                "node_types": 4,
                "edge_types": 4
            }
        }
    }
}
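
The summary is a nested dictionary, so the per-type counts sit a few levels down. As a sketch, assuming the data → attributes → node_counts layout shown in the example output above, the node types can be ranked by size. The helper name largest_node_types is hypothetical.

```python
def largest_node_types(summary, top_n=3):
    """Return the `top_n` (label, count) pairs with the highest node counts.

    Assumes the summary layout shown in the example output above.
    Illustrative helper, not part of polly-python.
    """
    counts = summary["data"]["attributes"]["node_counts"]
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]
```

The same pattern applies to edge_counts by changing the key that is read.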

get_schema()

kg.get_schema()
{
  "data": {
    "type": "kg_schema",
    "attributes": {
      "metadata": {
        "name": "schema",
        "kg_id": "demo_kg",
        "kg_version": "1",
        "last_updated": "2025-06-25T07:28:55Z",
        "computed_at": "2025-06-25T07:28:55Z"
      },
      "node_types": [
        "Disease",
        "Drug",
        "Gene"
      ],
      "edge_types": [
        "connects",
        "associated",
        "related"
      ],
      "nodes": {
        "Disease": {
          "properties": [
            {
              "name": "id",
              "type": "STRING",
              "description": "Unique identifier for the disease"
            },
            {
              "name": "name",
              "type": "STRING",
              "description": "Name of the disease"
            }
          ],
          "description": "Represents a disease entity in the knowledge graph"
        },
        "Drug": {
          "properties": [
            {
              "name": "id",
              "type": "STRING",
              "description": "Unique identifier for the drug"
            },
            {
              "name": "name",
              "type": "STRING",
              "description": "Name of the drug"
            }
          ],
          "description": "Represents a drug or medication entity"
        },
        "Gene": {
          "properties": [
            {
              "name": "id",
              "type": "STRING",
              "description": "Unique identifier for the gene"
            },
            {
              "name": "symbol",
              "type": "STRING",
              "description": "Gene symbol or name"
            }
          ],
          "description": "Represents a gene entity"
        }
      },
      "edges": {
        "connects": {
          "properties": [
            {
              "name": "id",
              "type": "STRING",
              "description": "Unique identifier for the connection"
            },
            {
              "name": "connection_type",
              "type": "STRING",
              "description": "Type of connection between nodes"
            }
          ],
          "connections": [
            {
              "from": "Tissue",
              "to": "Tissue Level"
            }
          ],
          "description": "Represents a connection relationship between tissue entities"
        },
        "associated": {
          "properties": [
            {
              "name": "id",
              "type": "STRING",
              "description": "Unique identifier for the association"
            },
            {
              "name": "association_type",
              "type": "STRING",
              "description": "Type of association between entities"
            }
          ],
          "connections": [
            {
              "from": "Phenotype",
              "to": "Phenotype"
            }
          ],
          "description": "Represents an association relationship between phenotypes"
        },
        "related": {
          "properties": [
            {
              "name": "id",
              "type": "STRING",
              "description": "Unique identifier for the relation"
            },
            {
              "name": "relation_type",
              "type": "STRING",
              "description": "Type of relation between entities"
            }
          ],
          "connections": [
            {
              "from": "Phenotype",
              "to": "Phenotype"
            }
          ],
          "description": "Represents a general relationship between phenotypes"
        }
      }
    }
  }
}
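
When writing Cypher queries it helps to know which properties each node type exposes. The sketch below assumes the data → attributes → nodes layout shown in the example schema above and collects the property names per node type. The helper name node_property_names is hypothetical.

```python
def node_property_names(schema):
    """Map each node type to the list of its property names.

    Assumes the schema layout shown in the example output above.
    Illustrative helper, not part of polly-python.
    """
    nodes = schema["data"]["attributes"]["nodes"]
    return {
        label: [prop["name"] for prop in spec.get("properties", [])]
        for label, spec in nodes.items()
    }
```

Applied to the example schema, this would map "Gene" to ["id", "symbol"] and "Disease" to ["id", "name"].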

get_all_queries()

# Get all in-progress queries
queries = kg.get_all_queries(status="IN_PROGRESS")
print(queries)
{
    "data": [
        {
            "query_id": "6a96cb6b-60ba-451e-89a9-c58e94a272bd",
            "status": "IN_PROGRESS",
            "query_string": "MATCH (t:Tissue)-[r:POSITIVE_SELECTION]-(g:Gene) \n           WHERE g.name IN ['IL4'] AND g.tax_id = 9606 AND \n           RETURN \n           t.name AS Tissue,g.name AS gene,r",
            "query_type": "CYPHER",
            "timestamp": 1771949050521,
            "long_running": true,
            "kg_id": "1739987441_paratus_custom_kg",
            "version_id": 10
        },
        {
            "query_id": "2a966b-60ba-451e-89a9-c58e94a272bd",
            "status": "IN_PROGRESS",
            "query_string": "MATCH (t:Tissue)-[r:POSITIVE_SELECTION]-(g:Gene) \n           WHERE g.name IN ['IRF4'] AND g.tax_id = 9606 AND \n           RETURN \n           t.name AS Tissue,g.name AS gene,r",
            "query_type": "CYPHER",
            "timestamp": 1771949050521,
            "long_running": true,
            "kg_id": "1739987441_paratus_custom_kg",
            "version_id": 10
        }
    ],
    "meta": {
        "page_size": 100
    }
}

# Get failed queries with pagination - shows next_token in response
page1 = kg.get_all_queries(status="FAILED", page_size=1)
print(page1)
{
    "data": [
        {
            "query_id": "6a96cb6b-60ba-451e-89a9-c58e94a272bd",
            "status": "FAILED",
            "query_string": "MATCH (t:Tissue)-[r:POSITIVE_SELECTION]-(g:Gene) \n           WHERE g.name IN ['IL4'] AND g.tax_id = 9606 AND \n           RETURN \n           t.name AS Tissue,g.name AS gene,r",
            "query_type": "CYPHER",
            "timestamp": 1771949050521,
            "long_running": true,
            "kg_id": "1739987441_paratus_custom_kg",
            "version_id": 10
        }
    ],
    "meta": {
        "page_size": 1,
        "next_token": "eyJxdWVyeV9pZCI6ICI2YTk2Y2I2Yi02MGJhLTQ1MWUtODlhOS1jNThlOTRhMjcyYmQiLCAiYXN5bmNfcXVlcnlfcGsiOiAiaS0wMDgyMTllMjY0MDMzN2FiZSNGQUlMRUQiLCAidGltZXN0YW1wIjogMTc3MTk0OTA1MDUyMSwgInVzZXJfaWQiOiAiMTc0MDAzNjA0OCJ9"
    }
}

# Use next_token to get the next page of results
next_token = page1['meta']['next_token']
page2 = kg.get_all_queries(
    status="FAILED",
    page_size=1,
    next_token=next_token
)
print(f"Found {len(page2['data'])} more queries")
Found 1 more queries

# Filter queries by instance_id
queries = kg.get_all_queries(status="QUEUED", instance_id="inst_abc123")
print(queries)
{
    "data": [
      {
          "query_id": "6a96cb6b-60ba-451e-89a9-c58e94a272bd",
          "status": "QUEUED",
          "query_string": "MATCH (t:Tissue)-[r:POSITIVE_SELECTION]-(g:Gene) \n           WHERE g.name IN ['IL4'] AND g.tax_id = 9606 AND \n           RETURN \n           t.name AS Tissue,g.name AS gene,r",
          "query_type": "CYPHER",
          "timestamp": 1771949050521,
          "long_running": true,
          "kg_id": "1739987441_paratus_custom_kg",
          "version_id": 10
      }
    ],
    "meta": {
        "page_size": 100
    }
}

# Use all parameters together: status, page_size, instance_id, and next_token
# First, get the first page for a specific instance
page1 = kg.get_all_queries(
    status="COMPLETED",
    page_size=10,
    instance_id="inst_abc123"
)
print(f"First page: {len(page1['data'])} queries")

# If there's a next_token, get the next page with all parameters
if 'next_token' in page1.get('meta', {}):
    page2 = kg.get_all_queries(
        status="COMPLETED",
        page_size=10,
        instance_id="inst_abc123",
        next_token=page1['meta']['next_token']
    )
    print(f"Second page: {len(page2['data'])} queries")
First page: 10 queries
Second page: 10 queries

# Get queries with different status values
completed = kg.get_all_queries(status="COMPLETED", page_size=5)
in_progress = kg.get_all_queries(status="IN_PROGRESS", page_size=5)
failed = kg.get_all_queries(status="FAILED", page_size=5)
queued = kg.get_all_queries(status="QUEUED", page_size=5)
cancelled = kg.get_all_queries(status="CANCELLED", page_size=5)

print(f"Completed: {len(completed.get('data', []))} queries")
print(f"In Progress: {len(in_progress.get('data', []))} queries")
print(f"Failed: {len(failed.get('data', []))} queries")
print(f"Queued: {len(queued.get('data', []))} queries")
print(f"Cancelled: {len(cancelled.get('data', []))} queries")
Completed: 5 queries
In Progress: 2 queries
Failed: 1 queries
Queued: 0 queries
Cancelled: 3 queries

cancel_query()

# Submit a long-running query
query_id = kg.submit_query("MATCH (n) RETURN n LIMIT 10000000;")
print(f"Query submitted: {query_id}")

# Check status
status = kg.get_query_status(query_id)
print(f"Query status: {status['status']}")

# Cancel the query
result = kg.cancel_query(query_id)
print(result['data'])
Query submitted: 7c8b34cd-6130-45de-92f6-dd626e10d827
Query status: IN_PROGRESS
Query cancellation initiated successfully

# Verify cancellation
status = kg.get_query_status(query_id)
print(f"Query status after cancellation: {status['status']}")
Query status after cancellation: CANCELLED

get_all_kgs()

# Get only published KGs
kgs = kg.get_all_kgs()
print(kgs)
[
    {
        "kg_id": "14_base_kg_v3",
        "kg_name": "base kg v3",
        "kg_description": "Base KG version 3",
        "version_id": 1,
        "created_at": 1756115807076,
        "published": true
    },
    {
        "kg_id": "15_medical_kg",
        "kg_name": "medical kg",
        "kg_description": "Medical Knowledge Graph",
        "version_id": 2,
        "created_at": 1756120000000,
        "published": true
    }
]

# Get all KGs including unpublished ones with instance details
kgs_with_instances = kg.get_all_kgs(
    include_unpublished=True,
    include_instances=True,
    include_terminated=True
)
print(kgs_with_instances)
[
    {
        "kg_id": "14_base_kg_v3",
        "kg_name": "base kg v3",
        "kg_description": "Base KG version 3",
        "version_id": 1,
        "created_at": 1756115807076,
        "published": true,
        "instances": [
            {
                "instance_id": "inst_abc123",
                "instance_type": "t3.medium",
                "is_terminated": false,
                "default_instance": true,
                "org_id": 1,
                "created_at": 1756115807076
            }
        ]
    },
    {
        "kg_id": "15_medical_kg",
        "kg_name": "medical kg",
        "kg_description": "Medical Knowledge Graph",
        "version_id": 2,
        "created_at": 1756120000000,
        "published": false,
        "instances": [
            {
                "instance_id": "inst_xyz789",
                "instance_type": "t3.large",
                "is_terminated": false,
                "default_instance": true,
                "org_id": 1,
                "created_at": 1756120000000
            }
        ]
    }
]
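
The instance details returned with include_instances=True can feed the instance_id filter of get_all_queries. As a sketch, assuming the response fields shown above, the first helper picks the non-terminated default instance for each KG and the second converts the millisecond created_at value of a KG to a timezone-aware datetime. Both helper names are hypothetical.

```python
from datetime import datetime, timezone


def default_instances(kgs):
    """Map kg_id to the instance_id of its active default instance.

    KGs without an 'instances' list, or without a non-terminated
    default instance, are left out of the mapping.
    """
    mapping = {}
    for kg_info in kgs:
        for instance in kg_info.get("instances", []):
            if instance.get("default_instance") and not instance.get("is_terminated"):
                mapping[kg_info["kg_id"]] = instance["instance_id"]
                break
    return mapping


def created_at_utc(kg_info):
    """Convert the millisecond-epoch `created_at` field to an aware datetime."""
    return datetime.fromtimestamp(kg_info["created_at"] / 1000, tz=timezone.utc)
```

With the response above, default_instances(kgs_with_instances) would map "14_base_kg_v3" to "inst_abc123" and "15_medical_kg" to "inst_xyz789".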