Knowledge Graph · June 2026

Open Knowledge Format
meets RDF, Linked Data & SPARQL

Google Cloud introduced OKF v0.1 on June 12, 2026 — a Markdown+YAML format for AI knowledge sharing. This infographic situates OKF in the broader semantic web landscape, modelled as an RDF Knowledge Graph.

Source: Google Cloud Blog  ·  Authors: Sam McVeety, Amir Hormati  ·  KG curated by kg-generator, rdf-infographic-skill, and Claude Sonnet 4.6 on behalf of Kingsley Idehen

📄
What is OKF?

The Open Knowledge Format is a vendor-neutral, agent- and human-friendly format for representing metadata, context, and curated knowledge for AI systems — built from Markdown files with YAML frontmatter.

Core Unit: Concept Document

A Markdown file with YAML frontmatter. The only mandatory field is type. All other fields (title, description, resource, tags, timestamp) and body sections are producer-defined conventions.

text/markdown · application/yaml · git-friendly
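As a minimal sketch of what a consumer does with a concept document, the snippet below splits a file into frontmatter and body and enforces the one mandatory field, type. The sample document and the helper name split_frontmatter are hypothetical, and the parser only handles flat key: value lines; a real consumer would more likely hand the block to a full YAML library.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a concept document into (frontmatter fields, Markdown body).

    Only flat `key: value` lines are supported; this is not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document must start with a '---' delimiter")

    # Find the line that closes the frontmatter block.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("frontmatter block is not closed")

    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and YAML comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"unsupported frontmatter line: {line!r}")
        fields[key.strip()] = value.strip().strip('"')

    # `type` is the only field the format requires.
    if not fields.get("type"):
        raise ValueError("missing mandatory field: type")

    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


# Hypothetical sample document, for illustration only.
SAMPLE = """---
type: metric
title: Weekly Active Users
---
# Weekly Active Users

Distinct customers with at least one order in the past 7 days.
"""

fields, body = split_frontmatter(SAMPLE)
print(fields["type"], "|", fields["title"])
```

Everything other than type is passed through untouched, which mirrors the idea that the remaining fields are producer-defined conventions.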

Container: Knowledge Bundle

A portable directory hierarchy of ConceptDocuments with index.md entry points per subdirectory. Shippable as a tarball or hosted in a git repository alongside code.

portable · vendor-neutral · no-SDK-required
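Because a bundle is only a directory tree, it can be inspected with ordinary file-system code. The sketch below lists the documents in a bundle and reports subdirectories that lack an index.md entry point. The directory layout it creates is a hypothetical example written to a temporary folder, and the function names are illustrative rather than part of any OKF tooling.

```python
import tempfile
from pathlib import Path


def list_documents(bundle_root: Path) -> list[str]:
    """Return every Markdown file in the bundle as a bundle-relative path."""
    return sorted(p.relative_to(bundle_root).as_posix() for p in bundle_root.rglob("*.md"))


def missing_indexes(bundle_root: Path) -> list[str]:
    """Return bundle-relative directories that have no index.md entry point."""
    directories = [bundle_root, *sorted(p for p in bundle_root.rglob("*") if p.is_dir())]
    return [
        d.relative_to(bundle_root).as_posix()
        for d in directories
        if not (d / "index.md").is_file()
    ]


# Build a small hypothetical bundle in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "sales"
    for rel in [
        "index.md",
        "tables/index.md",
        "tables/orders.md",
        "metrics/weekly_active_users.md",  # metrics/ deliberately has no index.md
    ]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("---\ntype: example\n---\n", encoding="utf-8")

    documents = list_documents(root)
    missing = missing_indexes(root)

print(documents)
print(missing)
```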

Enrichment Agent

An AI producer that walks data sources (BigQuery datasets) and automatically drafts OKF concept docs — with a second LLM pass to add citations, join paths, and metric definitions.

AI producer · BigQuery

Static Visualizer

Converts any OKF bundle into a single self-contained interactive HTML graph file — no backend, no cloud service required. A KnowledgeConsumer that renders without modifying the bundle.

static HTML · no backend

🔑
OKF Design Principles

Three governing principles ensure OKF remains minimally opinionated, portable, and LLM-friendly.

Principle 1

Minimally Opinionated

Only type is required. Types, additional fields, and body sections are left entirely to the producer. The spec defines an interoperability surface, not the content model.

Principle 2

Producer / Consumer Independence

Knowledge writing is fully decoupled from consumption. Humans, AI enrichers, and export tools can all produce; AI agents, visualizers, and search indexes can all consume independently.

Principle 3

Format, Not Platform

No proprietary accounts, services, SDKs, or cloud providers required. OKF is a file-format spec — it lives anywhere files live.


📁
ACME Sales Knowledge Bundle

Example OKF bundle from the blog post — modelled as an okf:KnowledgeBundle in the companion RDF-Turtle instance-data file.

sales/
├── index.md                      okf:IndexDocument - entry point, links to all sub-resources
├── datasets/
│   └── orders_db.md              okf:DatasetDocument - "ACME Orders Database"
├── tables/
│   ├── orders.md                 okf:TableDocument - order_id, customer_id, order_date, revenue
│   └── customers.md              okf:TableDocument - cust_id (FK ← orders), name, email, signup_date
└── metrics/
    └── weekly_active_users.md    okf:MetricDocument - "Distinct customers with ≥1 order in past 7 days"

In the RDF knowledge graph, the customer_id column in orders carries an explicit okf:isForeignKeyTo triple pointing to the customers TableDocument — a relationship that OKF can only express in prose.


⚖️
OKF vs RDF / Linked Data / SPARQL

Both approaches share the same goals. The comparison below shows where OKF has advantages, where RDF/LD/SPARQL goes further, and where they are equivalent.

Each dimension lists the OKF (Markdown + YAML) position first, then the RDF / Linked Data / SPARQL position, with the verdict in brackets.

Entity Identity
OKF [Limitation]: Relative file paths; they break on restructuring and offer no global uniqueness
RDF / LD / SPARQL [Advantage]: HTTP IRIs; globally unique, dereferenceable, persistent across organisations

Queryability
OKF [Limitation]: File-system traversal or LLM prompting; no standard query language
RDF / LD / SPARQL [Advantage]: SPARQL 1.1, with SELECT, CONSTRUCT, aggregates, property paths, federation

Semantic Fidelity
OKF [Limitation]: Semantics implicit in prose. type is free-form, so software cannot distinguish synonyms
RDF / LD / SPARQL [Advantage]: Explicit rdf:type from OWL ontologies. XSD datatypes. Machine-verifiable

Federation
OKF [Limitation]: Not supported; combining bundles requires custom engineering
RDF / LD / SPARQL [Advantage]: SPARQL SERVICE keyword, which spans remote endpoints in one query

Inference
OKF [Limitation]: None; LLM reasoning is probabilistic and not reproducible
RDF / LD / SPARQL [Advantage]: RDFS/OWL reasoners derive new facts deterministically

Shared Vocabulary
OKF [Limitation]: Free-form strings; semantic islands across organisations
RDF / LD / SPARQL [Advantage]: schema.org, PROV-O, SKOS, FOAF; reuse = automatic interoperability

Version Control
OKF [Parity]: Markdown diffs are human-readable in GitHub PRs
RDF / LD / SPARQL [Parity]: Turtle diffs are also human-readable and semantically interpretable

Human Authoring
OKF [Advantage]: Markdown is among the most widely adopted human-readable formats; zero new tooling
RDF / LD / SPARQL [Trade-off]: Turtle is readable but has a learning curve; LLM generation reduces the gap

LLM Friendliness
OKF [Advantage]: Designed for LLM context windows; Markdown+YAML is well-represented in training data
RDF / LD / SPARQL [Advantage]: Frontier LLMs generate valid Turtle reliably; SPARQL from natural language works well

Adoption Barrier
OKF [Advantage]: Near-zero; any developer can produce OKF with a text editor
RDF / LD / SPARQL [Trade-off]: Triplestore setup needed; LLM tooling now significantly lowers the practical barrier

🔗
Tim Berners-Lee's Linked Data Principles

The four principles that turn RDF into the Web of Data — giving every entity a permanent, globally dereferenceable address.

1

Use IRIs as names for things

Every entity (a table, metric, person, or concept) must have a globally unique IRI as its identity. This makes entities unambiguous across any system. OKF uses relative file paths, which identify a document only within its own bundle.

2

Use HTTP IRIs so names can be looked up

IRIs should use the http: or https: scheme so that any agent can resolve the identifier over the Web and retrieve a description of the entity.

3

Return useful RDF information when IRIs are dereferenced

When an HTTP IRI is looked up, the server should return structured RDF data describing the entity — enabling fully machine-readable discovery of schema, relationships, and provenance.

4

Include links to other IRIs

RDF descriptions should include owl:sameAs and other linking predicates to related entities in external knowledge graphs, so agents can navigate the Web of Data.


🔍
SPARQL Queries over the OKF Knowledge Graph

These queries run against the RDF knowledge graph generated from the OKF blog post. Each demonstrates a capability that OKF's Markdown+YAML format cannot provide natively.

🔗 Named graph: okf-instance-data-claude_sonnet_4_6-1.ttl — 76 triples · endpoint: URIBurner SPARQL

Q1 Query orders joined with customer names

Joins okf:OrderRecord and okf:CustomerRecord instances via the shared okf:customerId data property — a real SQL-style JOIN over row-level instance data, not a description of the schema.

PREFIX okf: <https://linkeddata.uriburner.com/DAV/demos/daas/okf-open-knowledge-format-ontology.ttl#okf->
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?orderId ?customerName ?orderDate ?revenue
FROM <https://linkeddata.uriburner.com/DAV/demos/daas/okf-instance-data-claude_sonnet_4_6-1.ttl>
WHERE {
  ?order a okf:OrderRecord ;
         okf:orderId ?orderId ;
         okf:customerId ?custId ;
         okf:orderDate ?orderDate ;
         okf:revenueUSD ?revenue .
  ?customer a okf:CustomerRecord ;
            okf:customerId ?custId ;
            okf:customerName ?customerName .
}
ORDER BY ?orderDate

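For readers who want to send a query like this from code rather than a web form, here is a minimal sketch that prepares a SPARQL Protocol GET request using only the standard library. The endpoint URL is an assumption (the page names the endpoint only as "URIBurner SPARQL"), the short COUNT query is an illustrative stand-in built from the prefix and graph IRIs shown above, and the request is constructed but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint address; check the service documentation before relying on it.
ENDPOINT = "https://linkeddata.uriburner.com/sparql"

# Illustrative query: count the okf:OrderRecord instances in the named graph.
QUERY = """\
PREFIX okf: <https://linkeddata.uriburner.com/DAV/demos/daas/okf-open-knowledge-format-ontology.ttl#okf->
SELECT (COUNT(?order) AS ?orders)
FROM <https://linkeddata.uriburner.com/DAV/demos/daas/okf-instance-data-claude_sonnet_4_6-1.ttl>
WHERE { ?order a okf:OrderRecord . }
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a GET request that carries the query in the `query` parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON results format via content negotiation.
    headers = {"Accept": "application/sparql-results+json"}
    return Request(url, headers=headers, method="GET")


request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url[:48] + "...")
# To execute it: urllib.request.urlopen(request), then json.load() the response.
```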
Q2 Compute Weekly Active Users (Week of 2026-06-01)

Counts distinct customers who placed at least one order in the week of 2026-06-01, computing the WeeklyActiveUsers metric directly from okf:OrderRecord instance data.

PREFIX okf: <https://linkeddata.uriburner.com/DAV/demos/daas/okf-open-knowledge-format-ontology.ttl#okf->
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT (COUNT(DISTINCT ?custId) AS ?weeklyActiveUsers)
       ("2026-06-01 to 2026-06-07" AS ?weekLabel)
FROM <https://linkeddata.uriburner.com/DAV/demos/daas/okf-instance-data-claude_sonnet_4_6-1.ttl>
WHERE {
  ?order a okf:OrderRecord ;
         okf:customerId ?custId ;
         okf:orderDate ?orderDate .
  FILTER(?orderDate >= "2026-06-01"^^xsd:date && ?orderDate <= "2026-06-07"^^xsd:date)
}

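The arithmetic behind Q2 is a date-range filter followed by a distinct count. The sketch below reproduces that logic over a handful of hypothetical order rows (these are not the triples in the published graph) to show that both week boundaries are inclusive, as in the FILTER clause.

```python
from datetime import date

# Hypothetical (customer_id, order_date) rows, for illustration only.
ORDERS = [
    ("C1", date(2026, 6, 1)),   # first day of the week, counted
    ("C1", date(2026, 6, 3)),   # same customer again, not double counted
    ("C2", date(2026, 6, 7)),   # last day of the week, counted
    ("C3", date(2026, 6, 8)),   # one day too late
    ("C4", date(2026, 5, 31)),  # one day too early
]


def weekly_active_users(orders, week_start: date, week_end: date) -> int:
    """Count distinct customers with at least one order in [week_start, week_end]."""
    return len({cust for cust, day in orders if week_start <= day <= week_end})


wau = weekly_active_users(ORDERS, date(2026, 6, 1), date(2026, 6, 7))
print(wau)  # 2 (customers C1 and C2)
```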
Q3 Total revenue by customer (top spenders)

Aggregates revenue across all okf:OrderRecord instances per customer, joining to okf:CustomerRecord for names — SPARQL GROUP BY aggregation over actual row data.

PREFIX okf: <https://linkeddata.uriburner.com/DAV/demos/daas/okf-open-knowledge-format-ontology.ttl#okf->

SELECT ?customerName
       (SUM(?revenue) AS ?totalRevenue)
       (COUNT(?order) AS ?orderCount)
FROM <https://linkeddata.uriburner.com/DAV/demos/daas/okf-instance-data-claude_sonnet_4_6-1.ttl>
WHERE {
  ?order a okf:OrderRecord ;
         okf:customerId ?custId ;
         okf:revenueUSD ?revenue .
  ?customer a okf:CustomerRecord ;
            okf:customerId ?custId ;
            okf:customerName ?customerName .
}
GROUP BY ?customerName ?customer
ORDER BY DESC(?totalRevenue)
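Q3 combines an inner join on the customer identifier with a grouped SUM and COUNT. As a rough illustration of those semantics, the sketch below performs the same steps over hypothetical rows; the customer names and amounts are invented for the example and do not come from the published graph.

```python
from collections import defaultdict

# Hypothetical rows: (order_id, customer_id, revenue_usd) and customer_id -> name.
ORDERS = [
    ("O1", "C1", 120.0),
    ("O2", "C2", 80.0),
    ("O3", "C1", 30.0),
    ("O4", "C9", 55.0),  # no matching customer record, dropped by the join
]
CUSTOMERS = {"C1": "Ada", "C2": "Grace"}


def revenue_by_customer(orders, customers):
    """Return (name, total_revenue, order_count) rows, highest revenue first."""
    totals = defaultdict(lambda: [0.0, 0])
    for _order_id, customer_id, revenue in orders:
        if customer_id not in customers:
            continue  # inner join: keep only orders with a customer record
        totals[customer_id][0] += revenue
        totals[customer_id][1] += 1
    rows = [(customers[cid], total, count) for cid, (total, count) in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


top_spenders = revenue_by_customer(ORDERS, CUSTOMERS)
print(top_spenders)
```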

What RDF / LD / SPARQL Adds Beyond OKF

Eight capabilities that RDF+LD+SPARQL provide that Markdown+YAML cannot.

🌐

Global, Dereferenceable Identity

HTTP IRIs give every entity a permanent, globally unique address that any agent on the Web can look up. OKF file paths are bundle-local and break on restructuring.

🔍

Declarative Standard Query Language

SPARQL provides SELECT, CONSTRUCT, ASK, DESCRIBE, aggregates, and property paths — a complete query algebra. OKF's only query mechanism is LLM prompting or custom parsing code.

🔀

Cross-Graph Federation

A single SPARQL query can span your local graph, DBpedia, Wikidata, and any SPARQL endpoint simultaneously. OKF bundles are isolated islands requiring custom engineering to combine.

🧠

Deterministic Logical Inference

OWL and RDFS reasoners derive new facts from existing RDF automatically — e.g., symmetric join relationships, subclass membership, inverse properties. This is reproducible; LLM reasoning over OKF is probabilistic.
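To illustrate what "deterministic" means here, the following sketch forward-chains two rules to a fixed point over triples held as Python tuples: subclass membership and inverse properties. It is a toy, not an OWL or RDFS reasoner, and the class hierarchy, the okf:isReferencedBy property, and the ex: identifiers are assumptions made up for the example.

```python
TYPE, SUBCLASS, INVERSE = "rdf:type", "rdfs:subClassOf", "owl:inverseOf"


def infer(triples):
    """Apply subclass and inverse-property rules until no new facts appear."""
    facts = set(triples)
    while True:
        subclass = {(c, d) for c, p, d in facts if p == SUBCLASS}
        inverse = {(p1, p2) for p1, p, p2 in facts if p == INVERSE}
        inverse |= {(b, a) for a, b in inverse}  # inverseOf works in both directions

        derived = set()
        for s, p, o in facts:
            if p == TYPE:
                # (s type C) and (C subClassOf D) entail (s type D)
                derived |= {(s, TYPE, d) for c, d in subclass if c == o}
            # (s p o) and (p inverseOf q) entail (o q s)
            derived |= {(o, q, s) for p1, q in inverse if p1 == p}

        if derived <= facts:
            return facts  # fixed point reached
        facts |= derived


# Hypothetical input facts, for illustration only.
FACTS = [
    ("okf:TableDocument", SUBCLASS, "okf:ConceptDocument"),
    ("ex:orders", TYPE, "okf:TableDocument"),
    ("okf:isForeignKeyTo", INVERSE, "okf:isReferencedBy"),
    ("ex:orders.customer_id", "okf:isForeignKeyTo", "ex:customers"),
]

closure = infer(FACTS)
for triple in sorted(closure - set(FACTS)):
    print(triple)
```

Running the function twice on the same input always yields the same set of triples, which is the reproducibility property the comparison relies on.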

📚

Shared Vocabularies and Semantic Interoperability

Using schema.org, PROV-O, SKOS, and domain ontologies makes your knowledge automatically interoperable with every other publisher using those vocabularies. OKF's free-form type string creates semantic islands.

🕸️

Participation in the Web of Data

RDF and Linked Data connect your knowledge graph to billions of existing RDF triples in DBpedia, Wikidata, schema.org-annotated pages, government open data, and scientific datasets.

Formal, Machine-Verifiable Semantics

SHACL validators can verify that a knowledge graph conforms to a shape — every table has a name, every metric has a source table, etc. OKF "validation" means having an LLM read the prose.
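The kind of rule a shape expresses can be sketched in a few lines. The code below is not SHACL and does not use a SHACL engine; it only mimics a "required property" constraint over triples stored as tuples. The property okf:sourceTable and the ex: nodes are hypothetical names chosen for the example.

```python
TYPE = "rdf:type"

# Required properties per class; okf:sourceTable is a hypothetical property name.
SHAPES = {
    "okf:TableDocument": ["schema:name"],
    "okf:MetricDocument": ["schema:name", "okf:sourceTable"],
}


def validate(triples, shapes):
    """Return (node, missing_property) pairs for every unmet requirement."""
    violations = []
    typed_nodes = sorted((s, o) for s, p, o in triples if p == TYPE)
    for node, cls in typed_nodes:
        for required in shapes.get(cls, []):
            has_value = any(s == node and p == required for s, p, _ in triples)
            if not has_value:
                violations.append((node, required))
    return violations


# Hypothetical graph: the metric has a name but no source table.
GRAPH = [
    ("ex:orders", TYPE, "okf:TableDocument"),
    ("ex:orders", "schema:name", "orders"),
    ("ex:wau", TYPE, "okf:MetricDocument"),
    ("ex:wau", "schema:name", "Weekly Active Users"),
]

violations = validate(GRAPH, SHAPES)
print(violations)
```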

📋

First-Class Provenance

PROV-O composes natively with RDF: any entity or named graph can carry prov:wasGeneratedBy, prov:wasAttributedTo, and prov:generatedAtTime statements. OKF relies on a single timestamp YAML field.


🔄
Convergence Thesis

OKF and RDF/LD/SPARQL are complementary layers, not competitors.

OKF as the Authoring Layer; RDF as the Query Layer

OKF solves knowledge capture and authoring for teams with zero semantic web expertise. RDF/LD/SPARQL solves knowledge integration, reasoning, and Web-scale federation. The optimal architecture is OKF-style Markdown as the human authoring surface with an RDF extraction step that converts OKF bundles into queryable knowledge graphs — combining OKF's ease of production with RDF's power for consumption.

OKF → parse YAML → map to ontology → mint IRIs → load triplestore

Step 1

Extract YAML frontmatter from all OKF .md files into structured objects.

Step 2

Map fields to RDF properties: type → rdf:type; title → schema:name; resource → schema:url.

Step 3

Mint HTTP IRIs — replace relative paths with dereferenceable identifiers and load into a SPARQL triplestore.
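The three steps can be sketched end to end for a single document. This is one possible shape for the extraction step, not a reference implementation: the base IRI, the class namespace, and the sample field values are placeholders, the schema.org namespace form is an assumption, and the frontmatter is taken as an already-parsed dictionary. Output is N-Triples so that no RDF library is needed.

```python
from urllib.parse import quote

# Placeholder IRIs for illustration; neither comes from the OKF spec.
BASE_IRI = "https://example.org/okf/sales/"
CLASS_NS = "https://example.org/okf-ontology#"

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Step 2 mapping from the text: title -> schema:name, resource -> schema:url.
PROPERTY_MAP = {
    "title": "https://schema.org/name",
    "resource": "https://schema.org/url",
}


def mint_iri(relative_path: str) -> str:
    """Step 3: turn a bundle-relative path into an absolute HTTP IRI."""
    return BASE_IRI + quote(relative_path.lstrip("/"))


def literal(value: str) -> str:
    """Serialize a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_ntriples(relative_path: str, fields: dict[str, str]) -> list[str]:
    """Map parsed frontmatter (step 1 output) to N-Triples lines."""
    subject = f"<{mint_iri(relative_path)}>"
    triples = [f"{subject} <{RDF_TYPE}> <{CLASS_NS}{quote(fields['type'])}> ."]
    for key, predicate in PROPERTY_MAP.items():
        if key not in fields:
            continue
        # `resource` is assumed to hold an absolute URL, so it becomes an IRI.
        obj = f"<{fields[key]}>" if key == "resource" else literal(fields[key])
        triples.append(f"{subject} <{predicate}> {obj} .")
    return triples


# Hypothetical frontmatter for tables/orders.md.
doc_fields = {
    "type": "table",
    "title": "orders",
    "resource": "https://example.org/warehouse/orders",
}
lines = to_ntriples("tables/orders.md", doc_fields)
print("\n".join(lines))
```

The resulting lines can be loaded into any triplestore that accepts N-Triples; making the minted IRIs actually dereferenceable is a hosting concern outside this sketch.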


Three structural layers every OKF document must follow, each linked to its knowledge graph entity.

Step 1

YAML Frontmatter

Every OKF document begins with a YAML block delimited by ---. Required field: type (string, producer-defined). Optional standard fields: title, description, resource (URL), tags (list), timestamp (ISO 8601).

Step 2

Markdown Body

Following the YAML block, the document body is standard GitHub-Flavored Markdown. Typical sections include Schema (table definitions), Joins, Sample Queries, Notes, and Related Documents.

Step 3

Relative Links for Cross-Document Navigation

Documents within a bundle reference each other via relative Markdown links (e.g., [customers](/tables/customers.md)). This preserves portability across hosting environments.
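A consumer that builds a graph from a bundle has to turn these links into bundle-relative paths. The sketch below extracts Markdown links from a document body and resolves them against the linking document's location. Treating a leading slash as "relative to the bundle root" is an assumption made for this example, and the function name and sample text are hypothetical.

```python
import posixpath
import re

# Matches inline Markdown links of the form [label](target).
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def resolve_links(doc_path: str, markdown: str) -> list[str]:
    """Return bundle-relative paths for the internal links in a document."""
    base_dir = posixpath.dirname(doc_path)
    resolved = []
    for _label, target in LINK_RE.findall(markdown):
        if "://" in target or target.startswith("#"):
            continue  # skip external URLs and in-page anchors
        target = target.split("#", 1)[0]  # drop any fragment
        if target.startswith("/"):
            # Assumption: a leading slash means "from the bundle root".
            path = posixpath.normpath(target.lstrip("/"))
        else:
            path = posixpath.normpath(posixpath.join(base_dir, target))
        resolved.append(path)
    return resolved


# Hypothetical document body, for illustration only.
BODY = (
    "Computed from [orders](../tables/orders.md) joined to "
    "[customers](/tables/customers.md). See the [blog](https://example.org/post)."
)

links = resolve_links("metrics/weekly_active_users.md", BODY)
print(links)
```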


Common questions about the Open Knowledge Format and its relationship to RDF, Linked Data, and SPARQL.

OKF is a vendor-neutral, minimally opinionated specification for packaging knowledge as Markdown files with YAML frontmatter. Each file is a typed knowledge document — concept, dataset, table, metric, index, or runbook — and the only mandatory field is type. OKF is designed so that LLM producers can generate and maintain bundles without human bookkeeping overhead.

OKF addresses the fragmented context landscape: in most organizations, internal knowledge (table schemas, metric definitions, runbooks, API notices) is scattered across incompatible proprietary systems. Every AI agent builder must re-assemble context from scratch, and knowledge becomes locked behind platform-specific APIs. OKF gives any agent or tool a portable, human-readable, LLM-friendly knowledge starting point.

The only mandatory field is type, specified in the YAML frontmatter. All other fields — title, description, resource, join, columns, formula, etc. — are optional and contextually meaningful based on the declared document type. This minimally opinionated design lowers the barrier to producing OKF-compliant knowledge documents.

OKF defines six core document types: concept (narrative definitions and explanations), dataset (pointers to tabular data sources), table (relational schemas with column definitions and FK relationships), metric (business metric formulas and aggregation logic), index (bundle manifests that group related documents), and runbook (operational procedures and step-by-step guides).

Traditional data catalogs are platform-specific SaaS products with vendor APIs and proprietary storage. OKF documents are plain Markdown files readable by humans, LLMs, and any text-processing tool without licensing, SDKs, or vendor lock-in. The core design principle is Format Not Platform: the specification is a file format, not a product.

Producer/consumer independence means knowledge producers (data engineers, domain experts, LLMs) write OKF Markdown files without knowing how consumers will use them. Consumers (LLMs, BI tools, enrichment agents, SPARQL endpoints) read standard OKF documents without depending on the producer's toolchain. Each side evolves independently, reducing coupling and increasing reuse.

OKF and RDF/SPARQL are complementary layers: OKF is the authoring layer, optimized for human and LLM production via simple Markdown + YAML; RDF/SPARQL is the query and integration layer, enabling Web-scale federation, SPARQL querying with RDFS/OWL reasoning, and global linking via HTTP IRIs. OKF bundles can be extracted to RDF triples by mapping type fields to ontology classes and minting dereferenceable IRIs, which combines OKF's ease of production with RDF's power for knowledge consumption.


Key terms defined by or central to the Open Knowledge Format (OKF v0.1) specification, each linked to its knowledge graph entity.

An OKF index document that groups related typed knowledge documents into a coherent, portable unit. The bundle manifest defines scope and inter-document relationships, enabling any consumer to load complete context without platform dependencies.
An AI agent or automated pipeline that reads OKF knowledge bundles and produces derived artefacts — summaries, embeddings, additional semantic metadata, resolved FK relationships, or RDF triples. Enrichment agents are consumers that also act as knowledge producers.
Any system or agent that reads OKF-formatted documents to retrieve and apply structured knowledge: LLMs loading context, BI tools visualizing metrics, SPARQL query engines processing extracted triples, and enrichment pipelines producing derived knowledge.
The OKF design principle that delegates cross-reference maintenance and bookkeeping — the tasks that make traditional human-authored wikis go stale — to LLMs. LLMs excel at structured-update tasks: refreshing FK references, propagating renames, and updating index files across the bundle.
The structured metadata block at the top of an OKF Markdown file, delimited by ---. Contains the mandatory type field and optional fields such as title, description, resource, join, columns, and formula. YAML frontmatter is both human-editable and machine-parseable.
OKF design principle ensuring that knowledge producers write OKF files without coupling to any specific consumer toolchain, and knowledge consumers read OKF files without depending on any specific producer's system or API. Independence reduces integration friction and allows both sides to evolve separately.
OKF design principle asserting that the specification is a portable file format (Markdown + YAML), not a SaaS platform, proprietary product, or vendor API. Any tool that reads text files can produce or consume OKF knowledge without licensing fees, SDK dependencies, or platform lock-in.
