Loktak technical docs

The implementation behind the data layer.

This is the engineering reference for Loktak: the eight-stage pipeline and its failure behavior, the canonical finance schema, the append-only lineage engine and its query API, and how business rules are encoded, versioned, and improved through the knowledge layer.

Pipeline documentation

Component boundaries & pipeline mechanics

Loktak and InSightOS are two products with one strict boundary. Each layer has a single, clearly bounded responsibility, and neither crosses into the other's domain.

Data layer
Loktak
Deterministic, auditable, versioned. Raw data in, canonical finance objects out. Every transformation logged. Never probabilistic. Never writes to source systems.
Ingestion
Entity resolution
Normalisation
Knowledge encoding
Lineage tracking
Canonical schema
Execution layer
InSightOS
Agent orchestration, variance attribution, scenario simulation, decision reasoning, and UI. Reads from Loktak canonical objects only. Writes decision records back to the lineage ledger on approval.
Agent orchestration
Variance attribution
Scenario simulation
Decision UI
Audit export
Component boundaries: architecture spec
v3.4
# Loktak layer responsibilities (strictly bounded)
# Ingestion → raw source connectors, format parsers, schema validators
ingest_csv() / ingest_excel() in ingestion_service.py
# Resolve → entity deduplication, cross-system conflict resolution
infer_column_mapping() in normalization.py
# Normalize → account code mapping, currency conversion, period alignment
apply_mapping() in normalization.py
# Encode → business rules, allocation logic, IC eliminations, KPI derivations
create_object() in object_service.py
# Lineage → provenance records, transformation log, version registry
create_lineage_event() in lineage_service.py

# InSightOS layer (never crosses into raw data)
# Agents → reads canonical finance objects from loktak.schema only
execute_agent_workflow() in agent_executor.py
# Audit → appends decision records to loktak.lineage on approval
LoktakClient.log_lineage_event() in client.py

# STRICT: InSightOS cannot write to loktak.ingest or loktak.resolve
# STRICT: loktak.lineage is append-only — no modifications permitted by any process
Full pipeline run log: Q4 close cycle
Complete · 21s
[2026-11-14 09:12:04 UTC] run_id: lok_20261114_091204 · schema: v3.4
[09:12:07] Stage 01 complete · 14,842 rows parsed · 0 format errors · 0 missing fields
[09:12:09] Stage 02: integrity_score=0.997 · 2 anomalies flagged (below halt threshold)
[09:12:09] Anomaly: cost_centre CC-4821 · 0 activity for 3 consecutive periods · logged
[09:12:14] Stage 03 complete · 14,842 rows matched · 23 conflicts resolved · 0 unmatched
[09:12:15] Stage 04 complete · 0 PII fields · sensitivity_scan: clean
[09:12:18] Stage 05 complete · 847 accounts mapped · segment taxonomy v1.4 applied
[09:12:21] Stage 06 complete · 14 rules applied · IC eliminations: $2.1M · OPEX alloc: $4.8M
[09:12:24] Stage 07 complete · 14,842 provenance records written (append-only, tamper-evident)
[09:12:25] Pipeline complete · integrity: 0.997 · ready for InSightOS agents
[09:12:25] Published: finance_snapshot_20261114 · schema_version: 3.4
Stages 01–05

Hard halt on failure. Validation errors and unresolved entities stop the pipeline before any rule logic runs. Nothing partial is passed forward. The exception is a sub-threshold integrity score at Stage 02, which triggers a soft hold instead.

Stage 02

Soft hold, not a hard halt. Integrity scores below threshold route to a human review queue rather than blocking the run outright; the pipeline can proceed once reviewed.

Stage 06

Hard constraint violations (e.g. allocations not summing to pool) halt the run. Soft warnings (e.g. driver coverage below 99%) are logged and surfaced, not blocking.

Stages 07–08

No conditional failure — these stages either complete and write a full record, or the run is not considered complete and InSightOS never receives the handoff.
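The per-stage failure behaviour above can be sketched as a small gate function. This is a minimal illustration under stated assumptions, not Loktak's implementation: the `Outcome` enum, the `gate()` function, the result keys, and the 0.95 threshold are all hypothetical names and values chosen for the example. It follows the Stage 02 description in treating a low integrity score as a soft hold.

```python
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed"
    SOFT_HOLD = "soft_hold"   # route to the human review queue, resume after review
    HARD_HALT = "hard_halt"   # stop the run; nothing partial is passed forward


# Hypothetical value: the documentation does not state the real threshold.
INTEGRITY_THRESHOLD = 0.95


def gate(stage: int, result: dict) -> Outcome:
    """Decide what happens after a stage, given a hypothetical result dict."""
    # Stages 01-05: validation errors and unresolved entities are hard halts.
    if 1 <= stage <= 5 and (result.get("validation_errors", 0) > 0
                            or result.get("unresolved_entities", 0) > 0):
        return Outcome.HARD_HALT
    # Stage 02: a sub-threshold integrity score is a soft hold.
    if stage == 2 and result.get("integrity_score", 1.0) < INTEGRITY_THRESHOLD:
        return Outcome.SOFT_HOLD
    # Stage 06: only hard constraint violations halt; soft warnings are logged.
    if stage == 6 and result.get("hard_violations", 0) > 0:
        return Outcome.HARD_HALT
    # Stages 07-08: without a full record the run is not complete.
    if stage in (7, 8) and not result.get("record_written", False):
        return Outcome.HARD_HALT
    return Outcome.PROCEED
```

With the run log above, `gate(2, {"integrity_score": 0.997})` proceeds, since 0.997 is above the assumed threshold.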

Canonical schema documentation

The finance object model InSightOS agents reason on.

Loktak v3.4 defines 11 core finance object types with typed fields, relationships, and versioning semantics. InSightOS agents reason exclusively on these objects, which is what makes outputs traceable and grounding scores calculable.

Object model diagram: finance.account (code · name · type · parent_id), finance.entity (type · parent_id · jurisdiction), finance.forecast (amount · version · assumption_set_id), finance.period (fiscal_year · fiscal_quarter · calendar), finance.variance (delta · pct_delta · drivers[]), knowledge.rule (type · definition · version · effective_from), finance.entry as the atomic unit (amount · currency · period_id · source_ref), and lineage.record as the audit layer (source_ref · transformation_chain[] · hash).
Object type | Core fields | Description | Linked to
finance.account | account_id, code, name, type, classification, parent_id | Chart of accounts node. Maps to canonical account taxonomy. | finance.entry, finance.period
finance.entry | entry_id, account_id, amount, currency, fx_rate, period_id, source_ref, run_id | Single-row financial fact. Atomic unit of all reporting. | finance.account, finance.period, finance.entity
finance.period | period_id, fiscal_year, fiscal_quarter, fiscal_month, calendar_start, calendar_end | Fiscal period definition. Handles 4-4-5, 4-5-4, monthly. | All time-series objects
finance.entity | entity_id, name, type, parent_id, currency, jurisdiction | Legal entity or cost centre in the organisational hierarchy. | finance.entry, finance.elimination
finance.elimination | elimination_id, debit_entity, credit_entity, amount, period_id, rule_version | Intercompany elimination entry. Links to rule that produced it. | finance.entity, knowledge.rule
finance.forecast | forecast_id, account_id, period_id, amount, assumption_set_id, version, created_at | Forward-looking value with version and assumption linkage. | finance.assumption, finance.account
finance.assumption | assumption_id, forecast_id, driver, value, rationale, created_at, created_by | Named assumption driving a forecast value. Human-authored. | finance.forecast
finance.variance | variance_id, account_id, period_id, actual, forecast, delta, pct_delta, drivers[] | Computed variance with driver attribution array. | finance.entry, finance.forecast
finance.pipeline_metric | metric_id, account_id, period_id, arr, nrr, churn, source_ref, snapshot_at | CRM-sourced pipeline and ARR metrics. Updated on CRM sync. | finance.entry, crm.opportunity
knowledge.rule | rule_id, name, type, definition, version, effective_from, created_by | Encoded business rule. Versioned, immutable once published. | finance.elimination, finance.entry
lineage.record | record_id, output_ref, source_ref, transformation_chain[], run_id, value_hash | Provenance record. Append-only. Links output to source. | All finance objects
finance.variance: canonical object example (Enterprise Q4)
schema v3.4
{
  "object_type": "variance",
  "variance_id": "var_20261114_EMEA_REV",
  "forecast_value": 7244000,
  "actual_value": 8642000,
  "variance_amount": 1398000,
  "variance_percentage": 0.1930,
  "currency": "USD",
  "period": "2026-Q4",
  "cost_center_id": "cc_EMEA",
  "lineage_id": "lin_lok_20261114_091204_EMEA_REV",
  "explanation": "Forecast shortfall driven by DACH expansion and FX headwinds."
}
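The amount and percentage in the example follow from the actual and forecast values by simple arithmetic. The sketch below reproduces them with Python's `decimal` module. The helper name `compute_variance` and the four-decimal half-up rounding are assumptions made for illustration, not part of the Loktak schema.

```python
from decimal import Decimal, ROUND_HALF_UP


def compute_variance(actual: int, forecast: int) -> tuple[int, Decimal]:
    """Return (variance_amount, variance_percentage) relative to the forecast."""
    if forecast == 0:
        raise ValueError("percentage variance is undefined for a zero forecast")
    amount = actual - forecast
    # Assumed convention: percentage is measured against the forecast value.
    pct = (Decimal(amount) / Decimal(forecast)).quantize(
        Decimal("0.0001"), rounding=ROUND_HALF_UP
    )
    return amount, pct


amount, pct = compute_variance(actual=8_642_000, forecast=7_244_000)
print(amount, pct)  # 1398000 0.1930
```

Rounded to one decimal place as a percentage, this is the +19.3% reported in the lineage trace for the same figures.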
Schema versioning
The canonical schema is versioned independently of any single rule or pipeline run (currently v3.4). Versions follow major.minor.patch semantics: patch releases add optional fields, minor releases add new object types or relationships without breaking existing consumers, and major releases are reserved for breaking changes to existing fields. Every published canonical object and lineage record is tagged with the schema version active at write time, so InSightOS can interpret older objects correctly even after the schema evolves.
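A consumer-side compatibility check under these semantics might look like the sketch below. The function names `parse_version` and `can_read` are hypothetical, and the policy (only a major-version difference is treated as breaking) is an assumption derived from the description above rather than a documented Loktak API.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'v3.4' or '3.4.1' into (major, minor, patch); missing parts are 0."""
    parts = [int(p) for p in text.lstrip("v").split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"not a major.minor.patch version: {text!r}")
    major, minor, patch = (parts + [0, 0])[:3]
    return major, minor, patch


def can_read(consumer_version: str, object_version: str) -> bool:
    """Assumed policy: objects are readable when the major versions match."""
    return parse_version(consumer_version)[0] == parse_version(object_version)[0]
```

Because each object carries the schema version active at write time, a check like `can_read("3.4", object_version)` can be applied per object rather than per run.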
Lineage documentation

Every number. Every transformation. Every source. Permanently.

The lineage engine maintains an append-only, tamper-evident ledger of every data transformation, so any output can be traced back to its origin in a single query.

Append-only architecture
Records added, never modified or deleted
Once a provenance record is written, it cannot be modified or deleted by any user, process, or agent. This includes Loktak itself. Attempts to modify existing records are rejected and logged as integrity violations.
Storage: append-only LSM tree · write-once semantics
Tamper detection: SHA-256 hash chain · block signing
Retention: configurable, default 7 years · regulatory holds supported
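The tamper-detection idea, a SHA-256 hash chain over append-only records, can be illustrated with a toy in-memory ledger. This is a sketch only: the class name `AppendOnlyLedger`, its methods, and the record layout are hypothetical, and it does not model the LSM-tree storage or block signing mentioned above.

```python
import hashlib
import json


class AppendOnlyLedger:
    """Toy in-memory ledger; each record's hash covers the previous hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._records: list[dict] = []

    @staticmethod
    def _digest(prev_hash: str, payload: dict) -> str:
        # Canonical JSON so the same payload always hashes the same way.
        body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

    def append(self, payload: dict) -> str:
        prev_hash = self._records[-1]["hash"] if self._records else self.GENESIS
        record_hash = self._digest(prev_hash, payload)
        self._records.append({"payload": payload, "prev": prev_hash, "hash": record_hash})
        return record_hash

    def verify(self) -> bool:
        """Recompute the chain; an edited record, or one removed from the middle, breaks it."""
        prev_hash = self.GENESIS
        for record in self._records:
            if record["prev"] != prev_hash:
                return False
            if self._digest(prev_hash, record["payload"]) != record["hash"]:
                return False
            prev_hash = record["hash"]
        return True
```

The class exposes no update or delete method, which mirrors the write-once semantics. Detecting truncation of the most recent records would need an external anchor such as a signed head hash, which this sketch leaves out.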
Full transformation chain
Source row to output, every step recorded
Each chain records the source system and row identifier, each normalisation step applied, the business rule versions used, FX rates applied, period mappings, and the canonical object produced. A single query returns the full chain in milliseconds.
Query: lineage_for(output_ref) → transformation_chain[]
Performance: O(log n) via compound index on output_ref
Export: structured JSON · CSV · PDF audit report
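To show why an ordered index gives logarithmic lookup, the sketch below stores chain steps as sorted tuples and finds the first step for an `output_ref` by binary search. The class `LineageIndex` and its tuple layout are hypothetical; the real storage engine and index structure are not described in this reference.

```python
from bisect import bisect_left


class LineageIndex:
    """Sorted (output_ref, step_no, description) tuples, searched by bisection."""

    def __init__(self, steps: list[tuple[str, int, str]]) -> None:
        self._steps = sorted(steps)

    def lineage_for(self, output_ref: str) -> list[str]:
        # Binary search locates the first step for output_ref in O(log n).
        # A 1-tuple sorts before any longer tuple that shares its first element.
        i = bisect_left(self._steps, (output_ref,))
        chain = []
        # Steps for one output are adjacent and already in step order.
        while i < len(self._steps) and self._steps[i][0] == output_ref:
            chain.append(self._steps[i][2])
            i += 1
        return chain
```

Total cost is the logarithmic search plus the length of the chain returned.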
Decision layer integration
InSightOS decisions appended to the same ledger
When a human approves a recommendation, the decision record, including approver, data version, reasoning chain, and action taken, is appended to the lineage ledger alongside Loktak provenance records. The full chain from source to approved decision lives in one place.
Decision record: approver · data_version · reasoning · confidence · action · timestamp
SOX export: machine-readable report · human-readable PDF
Comparative lineage
What changed between two pipeline runs
Given two pipeline run IDs, Loktak produces a structured diff of every value that changed: what the old value was, what the new value is, and what transformation or rule change caused it. It answers "why is this month's number different?" without rebuilding the model.
lineage.diff(run_id_a, run_id_b) → changed_values[], root_causes[]
Format: structured diff · human-readable summary · InSightOS-ready variance object
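A comparison between two runs can be sketched as a keyed diff. The function `diff_runs`, the per-output dictionary shape, and the two root-cause labels are assumptions for illustration. The actual `lineage.diff` output format is only summarised above.

```python
def diff_runs(run_a: dict[str, dict], run_b: dict[str, dict]) -> list[dict]:
    """Compare two runs keyed by output_ref.

    Each value is a hypothetical dict holding 'value' and 'rule_version'.
    """
    changes = []
    for ref in sorted(run_a.keys() & run_b.keys()):
        old, new = run_a[ref], run_b[ref]
        if old["value"] == new["value"]:
            continue
        # Simplified attribution: a changed rule version is blamed first,
        # otherwise the change is attributed to the source data.
        cause = ("rule_change" if old["rule_version"] != new["rule_version"]
                 else "source_data_change")
        changes.append({"output_ref": ref, "old": old["value"],
                        "new": new["value"], "root_cause": cause})
    return changes
```

A fuller version would walk both transformation chains step by step to find the first point of divergence; this sketch only compares the final values and rule versions.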
lineage_for("var_20261114_ENTERPRISE_REV"): example output
Traced
output_ref: finance.variance.var_20261114_ENTERPRISE_REV · delta: +$1,398,000

transformation_chain:
[1] source: netsuite_gl_nov14 · row_ids: GL-84421…GL-84897 (477 rows)
[2] parsed: CSV → staging schema · 0 errors · 2026-11-14T09:12:07Z
[3] resolved: entity CC-4819 (Enterprise) via netsuite primary key exact match
[4] normalised: segment_tag: enterprise_core → Enterprise revenue rollup
[5] rule: alloc_rule_v3 · Enterprise overhead: -$142,000 · rule_version: 3.2.1
[6] rule: ic_elim_rule_v2 · intercompany: $0 applicable to Enterprise Revenue
[7] compared: fc_20261001_ENTERPRISE_base · fy2026_q4_m1 · forecast: $7,244,000
[8] variance: delta: +$1,398,000 · pct: +19.3% · 3 drivers attributed

lineage_record: lin_lok_20261114_091204_ENTERPRISE_REV
value_hash: sha256:a4f8c2e1… · tamper-evident · written: 2026-11-14T09:12:24Z
Knowledge layer documentation

Your finance logic, encoded once, applied everywhere and versioned forever.

Your organisation's specific finance intelligence lives not in a spreadsheet, not in someone's head, and not in undocumented model logic. It is encoded in Loktak, applied to every pipeline run, and versioned permanently.

Six rule types supported
Every category of finance logic encodable
Allocation (distribute cost pools by a defined driver), elimination (remove intercompany transactions), adjustment (management adjustments to GAAP), derivation (calculated metrics from base data), validation (flag constraint violations), and enrichment (add context from external sources).
Definition: YAML-based DSL · plain-English intent field · test suite required
Review: human approval required before publishing any rule
Diff: structured diff view vs. prior version on every update
Versioning semantics
Every rule version preserved and pinnable
When a rule is updated, the new version is published with an effective_from date. Prior pipeline runs retain their original rule version in the lineage record, so a reforecast comparison between November and October uses the rule versions active in each period, not the current version.
Versioning: semantic (major.minor.patch) · immutable once published
History: full version history retained indefinitely
Pinning: pipeline can be pinned to a specific rule version for comparison
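Resolving which rule version applies to a given period comes down to picking the latest version whose effective_from date is not after that period. The sketch below assumes a hypothetical helper `rule_version_for` and an in-memory list of versions; the 3.1.0 entry and its date are invented for the example.

```python
from datetime import date


def rule_version_for(versions: list[tuple[date, str]], as_of: date) -> str:
    """Pick the latest version whose effective_from is on or before as_of."""
    eligible = [(eff, ver) for eff, ver in versions if eff <= as_of]
    if not eligible:
        raise LookupError(f"no rule version effective on {as_of}")
    return max(eligible, key=lambda item: item[0])[1]


# Hypothetical history: only 3.2.1 and its date appear in this reference.
history = [(date(2026, 3, 1), "3.1.0"), (date(2026, 9, 1), "3.2.1")]
print(rule_version_for(history, date(2026, 8, 31)))   # 3.1.0
print(rule_version_for(history, date(2026, 11, 14)))  # 3.2.1
```

Pinning a comparison run then amounts to passing an explicit version instead of resolving one by date.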
LoRA-based SME encoding
Domain expertise encoded into the model layer
For complex heuristics that are difficult to express as deterministic rules, such as industry-specific anomaly thresholds and seasonality patterns unique to your business model, Loktak supports encoding domain expert knowledge via LoRA adapters trained on your historical decisions. Results from these adapters are clearly labelled as probabilistic in all outputs.
Method: LoRA fine-tuning on historical decision data
Labelling: probabilistic outputs clearly distinguished from deterministic
Update: retrain trigger on significant decision pattern shift · human approval required
Feedback loop integration
Human decisions improve the knowledge layer over time
When InSightOS surfaces a recommendation and a human overrides it, Loktak registers the override as a feedback signal. Patterns of overrides on specific rule applications surface as rule improvement suggestions for human review. Nothing in the knowledge layer is updated without explicit human approval.
Signal: override events · approval events · manual rule edits
Suggestion: pattern-detected improvements → human review queue
Gate: no automatic rule updates · human approval required for all changes
Rule definition: OPEX allocation (YAML DSL)
rule_version: 3.2.1
rule_id: alloc_rule_v3
name: "OPEX allocation to cost centres by headcount"
type: allocation
version: 3.2.1
effective_from: 2026-09-01
approved_by: eddie@phrasiq.com · 2026-08-28T14:32:00Z

source_account: acc_OPEX_shared_pool
target_accounts: [acc_OPEX_ENTERPRISE, acc_OPEX_SMB, acc_OPEX_MID_MARKET, acc_OPEX_STRATEGIC]
driver: finance.pipeline_metric.headcount_fte # from Rippling HRIS
method: proportional # (target_hc / total_hc) × pool_amount
period_scope: month_end

validation:
  - allocations_sum_to_pool: true # hard constraint; pipeline halts if violated
  - no_negative_allocations: true # hard constraint
  - driver_coverage: ">= 0.99" # soft warning if the headcount driver covers <99% of target entities

intent: "Replaces the manual spreadsheet-based allocation that ran quarterly. Now runs on every pipeline run."
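The proportional method and its three validations can be sketched in a few lines. This is an illustrative reading of the rule, not the Loktak rule engine: the function `allocate`, the rounding to cents, the remainder handling, and the interpretation of driver_coverage as the share of target accounts with a headcount value are all assumptions, and the headcount figures in the usage example are invented.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def allocate(pool: Decimal, targets: list[str], headcount: dict[str, int]):
    """Proportional allocation: (target_hc / total_hc) * pool, rounded to cents."""
    warnings = []
    covered = [t for t in targets if t in headcount]
    coverage = len(covered) / len(targets)
    if coverage < 0.99:
        # Soft warning: logged and surfaced, not blocking.
        warnings.append(f"driver_coverage {coverage:.2f} < 0.99")

    total_hc = sum(headcount[t] for t in covered)
    if total_hc <= 0:
        raise ValueError("total headcount must be positive")

    shares = {t: (pool * headcount[t] / total_hc).quantize(CENT, ROUND_HALF_UP)
              for t in covered}
    # Put any rounding remainder on the largest share so the sum matches the pool.
    largest = max(shares, key=shares.get)
    shares[largest] += pool - sum(shares.values())

    # Hard constraints from the rule definition: violating either halts the run.
    if any(amount < 0 for amount in shares.values()):
        raise ValueError("no_negative_allocations violated")
    if sum(shares.values()) != pool:
        raise ValueError("allocations_sum_to_pool violated")
    return shares, warnings
```

For example, a 4,800,000.00 pool split across headcounts of 120, 60, 40, and 20 yields 2,400,000.00, 1,200,000.00, 800,000.00, and 400,000.00, with no warnings.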
Continuous improvement

Loktak learns from every decision your team makes.

Every override, every approval, and every manual rule edit feeds back into the system, making Loktak progressively more aligned with how your finance team actually thinks.

Continuous improvement loop diagram
1. InSightOS surfaces a recommendation
Variance attributed, scenario proposed, and reforecast recommended based on the current knowledge layer state.
2. Finance team approves or overrides
Human judgment applied. If overridden, the reason is optionally captured. It can be quantitative or qualitative.
3. Loktak registers the signal
The event is written to the feedback log. Patterns of similar overrides on the same rule or driver are tracked.
4. Knowledge layer updates
When a pattern is statistically significant, a suggested rule update appears in the knowledge management UI for human review and approval. No automatic rule changes, ever.
5. Outputs improve on next cycle
The updated knowledge layer means InSightOS recommendations are more aligned with how your team thinks. No retraining sprint. No manual updates required.
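Steps 3 and 4 of the loop can be sketched as a pass over the feedback log that flags rules for human review. The function `suggest_rule_reviews`, the event shape, and both thresholds are hypothetical. A fixed override-rate cutoff stands in here for whatever significance test Loktak actually applies, which this reference does not specify.

```python
from collections import Counter

MIN_EVENTS = 20          # hypothetical minimum sample size
MIN_OVERRIDE_RATE = 0.3  # hypothetical trigger level


def suggest_rule_reviews(events: list[tuple[str, str]]) -> list[str]:
    """Return rule_ids to queue for human review.

    events are (rule_id, decision) pairs, decision in {'approved', 'overridden'}.
    """
    totals = Counter(rule_id for rule_id, _ in events)
    overrides = Counter(rule_id for rule_id, decision in events
                        if decision == "overridden")
    return sorted(
        rule_id for rule_id, total in totals.items()
        if total >= MIN_EVENTS and overrides[rule_id] / total >= MIN_OVERRIDE_RATE
    )
```

The function only produces suggestions. Consistent with the gate described above, nothing in it modifies a rule.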
The compounding advantage
The longer your team uses Loktak, the more aligned the knowledge layer becomes with your specific finance logic. Accuracy improves. The rule library deepens. The override rate decreases. It creates a compounding advantage that makes switching costs real, and it makes every PhrasIQ output progressively more defensible over time. The model layer is replicable. Your encoded finance logic, your decision history, and your rule library are not.
Integrations documentation

Custom REST API connectors

Pre-built connectors handle authentication, pagination, incremental sync, and schema mapping for every major ERP, CRM, and data warehouse. Custom sources connect via the Loktak REST API and are mapped to the canonical finance schema automatically.

Custom REST API connector: configuration (YAML)
example
connector_id: custom_erp_legacy
type: rest_api
base_url: https://erp.internal.yourco.com/api/v2
auth: bearer_token # stored in Loktak secrets vault
sync_mode: incremental
sync_frequency: daily_at_02:00_UTC

endpoints:
  - path: /gl/entries
    maps_to: finance.entry
    cursor: updated_at
  - path: /accounts
    maps_to: finance.account
    sync_mode: full_refresh
  - path: /cost_centres
    maps_to: finance.entity
    sync_mode: full_refresh

field_mappings:
  account_number: finance.entry.account_id
  posting_amount: finance.entry.amount
  posting_currency: finance.entry.currency
  fiscal_period: finance.period.period_id

# Loktak auto-generates entity resolution rules from the existing entity graph.
# The signal integrity check runs on first sync and is visible in the dashboard before agents run.
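How the field mappings in this configuration might be applied to a single source row is sketched below. The function `map_record` and the grouping of output by canonical object type are assumptions for illustration; the connector's real mapping step is not documented here, and the sample row values are invented.

```python
FIELD_MAPPINGS = {
    "account_number": "finance.entry.account_id",
    "posting_amount": "finance.entry.amount",
    "posting_currency": "finance.entry.currency",
    "fiscal_period": "finance.period.period_id",
}


def map_record(source: dict) -> dict[str, dict]:
    """Group a source row's fields by canonical object type."""
    mapped: dict[str, dict] = {}
    for source_field, target in FIELD_MAPPINGS.items():
        if source_field not in source:
            raise KeyError(f"source row is missing {source_field!r}")
        # 'finance.entry.amount' -> object type 'finance.entry', field 'amount'
        object_type, field_name = target.rsplit(".", 1)
        mapped.setdefault(object_type, {})[field_name] = source[source_field]
    return mapped


row = {"account_number": "4000", "posting_amount": "125.50",
       "posting_currency": "EUR", "fiscal_period": "2026-11"}
print(map_record(row))
```

Failing loudly on a missing source field keeps the sketch in line with the pipeline's rule that nothing partial is passed forward.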