```mermaid
flowchart LR
A[Raw Data Sources] --> B[Data Ingestion Layer]
B --> C[Normalisation Layer]
C --> D[Feature & Factor Engine]
D --> E[Valuation & Scoring Engine]
E --> F[Versioned Analytical Store]
F --> G[Dashboard Layer]
F --> H[API Layer]
F --> I[Monitoring & Alerts]
F --> J[Oracle-Compatible Output Layer]
```
The diagram above captures the basic logic of the platform. Raw data enters the system. It is standardised and transformed into comparable token representations. Analytical features are then constructed. These features feed valuation and scoring models. The results are stored in a version-aware analytical layer and then exposed through different interfaces.
This sequence is important because it prevents a common weakness in crypto analytics systems: directly exposing partially processed data to users without a stable analytical backbone.
## Data Source Layer
The first architectural block is the raw data source layer. LSDx requires multiple categories of input because no single source captures the full economic state of an LSD.
### Protocol and staking data
This category includes information directly related to the staking protocol and LSD design. Examples include:
* staking reward mechanics,
* protocol fees,
* validator set composition,
* node operator distribution,
* exchange-rate growth or rebasing behaviour,
* withdrawal and exit mechanism design,
* and protocol-level metadata.
This information is essential for understanding how value accrues and where operational or structural risk may enter.
### Market price data
Observed market price is a necessary input but not a sufficient one. LSDx uses price data not as a complete answer but as one component in the interpretation framework. Relevant inputs include:
* token spot price,
* relative price against native asset,
* historical premium or discount behaviour,
* price volatility,
* and pricing behaviour during stress.
### Liquidity and microstructure data
Because many LSD risks emerge through tradeability rather than through protocol mechanics alone, liquidity information is central. This includes:
* on-chain pool depth,
* swap slippage,
* venue concentration,
* trading volume,
* routing quality,
* and market fragility under stress.
### Redemption and exit-related data
Some LSDs are more easily redeemed or unwound than others. Redemption conditions influence both fair value and risk. Inputs in this category may include:
* expected waiting time,
* queue conditions,
* protocol exit design,
* dependence on secondary market liquidity,
* and practical conversion friction.
### Governance and dependency data
Some risks are not numerical in the narrow sense but remain analytically important. LSDx should therefore ingest or maintain structured metadata related to:
* governance centralisation,
* upgradeability,
* protocol dependency graph,
* audit maturity,
* and structural reliance on other systems.
This category does not have to begin fully automated. Some components may start as curated research inputs and later become more systematised.
## Data Ingestion Layer
The ingestion layer collects raw information from the different sources and converts it into a controlled internal data flow.
The ingestion design should satisfy four requirements:
* regularity,
* integrity,
* timestamping,
* and source traceability.
Every raw observation entering the system should carry at least:
* a source identifier,
* a timestamp,
* an asset or token identifier,
* a field definition,
* and a retrieval version.
This is important because later analytical disagreements often originate not in the model itself, but in ambiguity about what data was used and when.
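The lineage requirements above can be made concrete as a small record type. The following Python sketch is illustrative only — the field names and example values are assumptions, not a fixed LSDx schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Illustrative sketch: field names are assumptions, not a fixed LSDx schema.
@dataclass(frozen=True)
class RawObservation:
    source_id: str          # which external source produced the value
    asset_id: str           # asset or token identifier, e.g. "steth"
    field_name: str         # field definition, e.g. "spot_price_eth"
    value: float
    observed_at: datetime   # timestamp of the observation
    retrieval_version: str  # version of the retrieval logic that fetched it

obs = RawObservation(
    source_id="dex_aggregator_a",
    asset_id="steth",
    field_name="spot_price_eth",
    value=0.9987,
    observed_at=datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
    retrieval_version="r1.2.0",
)
```

Because the record is frozen, downstream code cannot silently alter a stored observation, which supports the traceability requirement.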
### Ingestion modes
The platform may operate with several ingestion modes.
#### Scheduled ingestion
This is the standard mode for periodic retrieval of market, liquidity, and protocol data. It is suitable for dashboards and daily analytics.
#### Event-triggered ingestion
This mode is useful when a major event occurs, such as sudden market dislocation, unusual premium widening, or protocol incident. In such cases, the system may refresh selected inputs more aggressively.
#### Research or manual ingestion
Some structural inputs, such as governance changes or protocol design characteristics, may initially be curated manually or semi-manually. This is acceptable if clearly versioned and documented.
### Raw data registry
A good design choice is to maintain a raw data registry that stores the unmodified form of retrieved observations before any modelling transformation takes place.
This serves several purposes:
* debugging,
* reproducibility,
* source validation,
* and later methodology revision.
A system that only stores processed outputs becomes difficult to trust once the model evolves.
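A minimal append-only registry can be sketched as follows; the in-memory list stands in for durable storage, and the interface is an assumption rather than a prescribed design:

```python
import json

# Minimal append-only registry sketch; the in-memory list stands in for
# durable storage, and records are stored as verbatim copies.
class RawDataRegistry:
    def __init__(self):
        self._records = []

    def append(self, record: dict) -> int:
        # deep-copy via JSON so callers cannot mutate the stored raw form
        self._records.append(json.loads(json.dumps(record)))
        return len(self._records) - 1  # position serves as a stable reference

    def get(self, index: int) -> dict:
        return self._records[index]

registry = RawDataRegistry()
raw = {"source": "src_a", "asset": "reth", "price": 1.102}
idx = registry.append(raw)
raw["price"] = 0.0  # mutating the caller's dict does not affect the registry
```

The design choice that matters here is append-only semantics: nothing is updated in place, so the unmodified raw form remains available for debugging and methodology revision.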
```mermaid
flowchart TD
A[External Source A] --> E[Ingestion Controller]
B[External Source B] --> E
C[External Source C] --> E
D[Research / Manual Inputs] --> E
E --> F[Raw Data Registry]
F --> G[Validation & Quality Checks]
G --> H[Normalised Data Store]
```
# 6 System Architecture and Implementation Design
## 6.1 Why Architecture Matters
A quantitative framework becomes useful only when it can be implemented in a disciplined and reproducible manner. In the context of LSDx, architecture is not a secondary engineering detail. It is part of the credibility of the analytical layer itself.
The purpose of this chapter is to translate the conceptual framework of LSDx into a concrete system design. The key question is no longer what LSDx wants to measure, but how the platform should ingest data, transform it into structured intelligence, preserve methodological consistency, and expose the resulting outputs to users and downstream systems.
This matters for several reasons.
First, LSD analytics are data-intensive. The value and risk profile of a Liquid Staking Derivative depend on protocol mechanics, market conditions, secondary liquidity, validator composition, redemption pathways, and protocol-level dependencies. These inputs come from heterogeneous sources and do not arrive in a single neat format.
Second, a serious analytical system must distinguish between raw data, transformed features, model outputs, and end-user interpretation. Without this separation, methodology becomes opaque, reproducibility weakens, and users may confuse observations with judgements.
Third, if LSDx is to evolve beyond a static research paper into an infrastructure layer, the architecture must be modular: it must be possible to incorporate new tokens, new chains, new data sources, and improved models without breaking the design.
The architectural goal of LSDx is therefore not merely to create a dashboard. It is to define a quantitative intelligence engine that is transparent, extensible, and operationally robust.
## 6.2 Architectural Objectives
The implementation design of LSDx should satisfy several objectives simultaneously.
### 6.2.1 Analytical integrity
The system should preserve a clean boundary between observed data and derived outputs. Market prices, liquidity snapshots, staking rates, and validator sets are observations. Fair value estimates, adjusted yields, regime states, and risk scores are model outputs. The system must never blur these categories.
### 6.2.2 Reproducibility
The same input state and the same methodology version should produce the same analytical outputs. This is especially important if LSDx is used later for treasury decisions, comparative research, or machine-readable policy systems.
### 6.2.3 Modularity
Each analytical block should be independently maintainable. Token normalisation, liquidity diagnostics, fair value estimation, and regime monitoring should not be tightly entangled. A modular structure allows the system to improve without requiring a complete redesign.
### 6.2.4 Extensibility
The architecture must be able to absorb:
- new LSDs,
- new chains,
- new wrappers,
- new risk factors,
- improved scoring functions,
- and new delivery surfaces such as APIs or oracle-compatible outputs.
### 6.2.5 Interpretability
A user should not only see a score. They should be able to trace where it came from. The system should retain component-level information and expose factor decomposition when needed.
### 6.2.6 Delivery flexibility
The same analytical core should support different interfaces:
- research dashboards,
- comparative reporting,
- APIs,
- alerting systems,
- and, where suitable, smart-contract-consumable outputs.
## 6.3 High-Level System View
At a high level, LSDx can be understood as a layered system with six major blocks:
- data ingestion,
- normalisation,
- feature and factor construction,
- model and scoring engine,
- storage and versioning,
- delivery and interface layer.
The architecture should be designed so that each layer performs one type of function clearly and well.
The function of the ingestion layer, for example, is not mere retrieval. It is controlled acquisition with auditable lineage.
## 6.4 Data Validation and Quality Control
No analytical system should assume that incoming crypto data is automatically reliable. The quality-control layer is therefore essential.
### 6.4.1 Why validation matters
Different sources may disagree. Fields may be missing. Liquidity data may be stale. Wrapped token mappings may be incomplete. Certain protocol metadata may change without warning. If these issues are not detected, the model outputs may appear precise while resting on flawed inputs.
### 6.4.2 Validation categories
LSDx should ideally perform validation along several dimensions.
#### 6.4.2.1 Schema validation
Incoming data should be checked for structural correctness. Required fields should be present, types should match expectation, and token mappings should be valid.
#### 6.4.2.2 Range validation
Values that lie outside plausible ranges should be flagged. For example, sudden extreme changes in reported yield, implausible fee values, or negative liquidity depth should trigger inspection.
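Range validation can be expressed as a table of plausible bounds checked against incoming records. The bounds below are invented for this sketch and are not calibrated LSDx policy:

```python
# Range-validation sketch; the plausible bounds below are invented for
# illustration and are not calibrated LSDx policy.
PLAUSIBLE_RANGES = {
    "net_yield": (0.0, 0.25),               # a 25%+ staking yield is implausible
    "pool_depth_eth": (0.0, float("inf")),  # liquidity depth can never be negative
    "protocol_fee": (0.0, 0.5),
}

def range_flags(record: dict) -> list:
    """Return the names of fields whose values fall outside plausible bounds."""
    flags = []
    for name, (lo, hi) in PLAUSIBLE_RANGES.items():
        if name in record and not (lo <= record[name] <= hi):
            flags.append(name)
    return flags
```

A record such as `{"net_yield": 0.9, "pool_depth_eth": -5}` would be flagged on both fields and routed to inspection rather than silently scored.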
#### 6.4.2.3 Cross-source consistency
Where possible, critical quantities should be cross-checked across more than one source. If observed values differ materially, the system should either reconcile them under a rule or explicitly mark the field as uncertain.
#### 6.4.2.4 Freshness validation
Some data loses usefulness quickly. The system should know whether a value is sufficiently fresh for its purpose.
#### 6.4.2.5 Completeness validation
A token should not be scored as though it were fully covered if crucial inputs are missing. In such cases, the system should either withhold outputs or mark them with reduced confidence.
### 6.4.3 Confidence-aware data state
One practical design choice is to attach a confidence state to the analytical record of each token. This may include:
- complete,
- partial but usable,
- degraded quality,
- insufficient for scoring.
This protects the system from pretending to know more than it does.
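These states can be sketched as an enumeration plus a simple classification rule. The thresholds below ("more than half of required inputs missing") are illustrative assumptions, not calibrated policy:

```python
from enum import Enum

class CoverageState(Enum):
    COMPLETE = "complete"
    PARTIAL = "partial_but_usable"
    DEGRADED = "degraded_quality"
    INSUFFICIENT = "insufficient_for_scoring"

# Illustrative classification rule; the "more than half missing" threshold
# is an assumption, not calibrated policy.
def coverage_state(present: set, required: set, stale: set) -> CoverageState:
    missing = required - present
    if missing:
        if len(missing) > len(required) // 2:
            return CoverageState.INSUFFICIENT
        return CoverageState.PARTIAL
    if stale & required:
        # all inputs present, but at least one required input is stale
        return CoverageState.DEGRADED
    return CoverageState.COMPLETE
```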
## 6.5 Token Normalisation Layer
The normalisation layer is one of the most important parts of the LSDx architecture. It converts heterogeneous token designs into a unified analytical representation.
### 6.5.1 Purpose of the layer
Without normalisation, tokens cannot be compared consistently. A rebasing token and a non-rebasing token may reflect economically similar staking exposure while differing in representation. A wrapped LSD may embed another layer of abstraction. Some tokens derive convenience value from integration rather than staking mechanics. These differences must be standardised before fair comparison can begin.
### 6.5.2 Core normalisation tasks
The normalisation layer should perform tasks such as:
- mapping token identifiers to canonical assets,
- translating balance mechanics into total-return representation,
- estimating redemption-based anchor value,
- standardising fee and carry fields,
- mapping liquidity observations to comparable dimensions,
- and attaching structural descriptors such as token type, wrapper type, or exit pathway class.
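One of these tasks — translating balance mechanics into a total-return representation — can be sketched as follows. A rebasing token grows holder balances while its exchange rate stays near one; an exchange-rate token keeps balances fixed while its rate grows. Multiplying the two puts both designs on a single comparable index (the function and its inputs are illustrative):

```python
# Sketch of total-return standardisation: the product of balance growth and
# exchange-rate growth gives one comparable index for both token designs.
def total_return_index(balance_factor: float, exchange_rate: float) -> float:
    """
    balance_factor: cumulative growth of the token balance itself
                    (>1 for rebasing tokens, exactly 1.0 otherwise).
    exchange_rate:  native-asset value of one token unit
                    (grows for exchange-rate tokens, ~1.0 for rebasing ones).
    """
    return balance_factor * exchange_rate

# 4% balance growth and 4% rate growth map to the same total-return state:
rebasing_index = total_return_index(balance_factor=1.04, exchange_rate=1.0)
exchange_index = total_return_index(balance_factor=1.0, exchange_rate=1.04)
```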
### 6.5.3 Canonical token object
A useful implementation idea is to define a canonical token object for internal use. Each LSD entering the system is translated into this standard representation before further analytics are applied.
Such an object may contain:
- token metadata,
- economic state variables,
- structural descriptors,
- and links to the relevant raw data lineage.
The goal is not to flatten all differences away. It is to represent those differences within a common structure.
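Such a canonical object might be sketched as a dataclass. The fields mirror the four categories above, but their names and types are illustrative assumptions rather than a fixed schema:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hedged sketch of a canonical LSD object; fields mirror the categories in
# the text, but names and types are illustrative, not a fixed schema.
@dataclass
class CanonicalLSD:
    # token metadata
    token_id: str
    chain: str
    token_type: str            # e.g. "rebasing", "exchange_rate", "wrapper"
    # economic state variables
    total_return_index: float
    net_yield: float
    redemption_anchor: float
    # structural descriptors
    exit_pathway: str          # e.g. "native_queue", "secondary_only"
    wrapper_of: Optional[str] = None
    # lineage: references to the raw observations used to build this object
    raw_lineage: list = field(default_factory=list)

wsteth = CanonicalLSD(
    token_id="wsteth", chain="ethereum", token_type="wrapper",
    total_return_index=1.18, net_yield=0.032, redemption_anchor=1.0,
    exit_pathway="native_queue", wrapper_of="steth", raw_lineage=[0, 3],
)
```

Note how a wrapper records what it wraps and which raw observations it was built from: differences are represented, not flattened away.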
```mermaid
flowchart LR
A[Token-Specific Raw Inputs] --> B[Canonical Mapping]
B --> C[Return / Carry Standardisation]
C --> D[Redemption Anchor Estimation]
D --> E[Liquidity & Exit Mapping]
E --> F[Canonical LSD Object]
```
This layer should be treated as a standalone analytical responsibility, not as a side effect inside scoring code.
## 6.6 Feature Engineering and Factor Construction
Once tokens have been normalised, the next task is to transform their state into analytical features and factor inputs.
### 6.6.1 Feature categories
The feature layer may be organised into several broad categories.
#### 6.6.1.1 Valuation features
These features support fair value estimation. They may include:
- carry proxies,
- net yield fields,
- premium or discount history,
- redemption anchor deviation,
- convenience proxies,
- and cost-of-friction measures.
#### 6.6.1.2 Liquidity features
These support execution-quality assessment and liquidity scoring. They may include:
- depth,
- slippage at representative trade sizes,
- venue concentration,
- stress fragility,
- and persistence of market activity.
#### 6.6.1.3 Structural risk features
These capture protocol and design-related attributes such as:
- validator concentration,
- governance concentration,
- contract complexity,
- protocol dependencies,
- and exit pathway limitations.
#### 6.6.1.4 Behavioural and regime features
These capture dynamic behaviour over time, for example:
- widening of premium/discount,
- acceleration in liquidity deterioration,
- increased volatility of dislocation,
- and persistence of abnormal states.
### 6.6.2 Feature computation principles
The feature engine should follow three principles.
#### 6.6.2.1 Deterministic computation
Given the same inputs and the same methodology version, feature computation should be deterministic. This reduces ambiguity and supports reproducibility.
#### 6.6.2.2 Explainability
Each feature should have a clear interpretation. It should be possible to explain why a feature exists and what it measures.
#### 6.6.2.3 Modular evolution
New features should be addable without forcing redesign of the entire system.
## 6.7 Valuation and Scoring Engine
This is the analytical centre of the system. It transforms structured features into outputs that users can interpret and use.
### 6.7.1 Core analytical modules
The engine may be divided into several modules.
#### 6.7.1.1 Fair value engine
This module estimates model-based fair value or fair value range using:
- redemption-based anchor,
- expected net carry,
- liquidity adjustments,
- structural risk discounts,
- exit friction adjustments,
- and convenience premium estimates.
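As an illustrative sketch, this composition can be viewed as a redemption anchor plus signed adjustments. The functional form and the magnitudes below are assumptions for exposition, not the platform's actual formula:

```python
# Illustrative fair-value composition: a redemption anchor plus the signed
# adjustments named in the text. Form and magnitudes are assumptions.
def fair_value(anchor: float,
               expected_net_carry: float,
               convenience_premium: float,
               liquidity_discount: float,
               structural_risk_discount: float,
               exit_friction_discount: float) -> float:
    """All adjustments are expressed as fractions of the anchor value."""
    adjustment = (expected_net_carry + convenience_premium
                  - liquidity_discount
                  - structural_risk_discount
                  - exit_friction_discount)
    return anchor * (1.0 + adjustment)

fv = fair_value(anchor=1.0, expected_net_carry=0.002, convenience_premium=0.005,
                liquidity_discount=0.004, structural_risk_discount=0.003,
                exit_friction_discount=0.002)
# in this example the discounts slightly outweigh carry and convenience
```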
#### 6.7.1.2 Risk factor engine
This module computes component-level risk measures across the major dimensions defined in the framework:
- peg risk,
- liquidity risk,
- redemption risk,
- validator risk,
- protocol risk,
- governance risk,
- and composability risk.
#### 6.7.1.3 Composite scoring engine
This module maps component-level risk and value information into summary metrics such as:
- composite risk score,
- adjusted yield,
- yield efficiency,
- relative-value indicator,
- and use-case-specific suitability score.
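A hedged sketch of such a mapping: a weighted average over the component risks, and a risk haircut applied to raw yield. The weights and penalty parameter are placeholders, not calibrated LSDx values:

```python
# Weighted-aggregation sketch; weights and the yield penalty are placeholder
# values, not calibrated LSDx parameters.
COMPONENT_WEIGHTS = {
    "peg": 0.20, "liquidity": 0.20, "redemption": 0.15, "validator": 0.15,
    "protocol": 0.10, "governance": 0.10, "composability": 0.10,
}

def composite_risk(components: dict) -> float:
    """Weighted average of component risk scores, each assumed in [0, 1]."""
    total = sum(COMPONENT_WEIGHTS.values())
    return sum(COMPONENT_WEIGHTS[k] * components[k] for k in COMPONENT_WEIGHTS) / total

def adjusted_yield(net_yield: float, composite: float, penalty: float = 0.5) -> float:
    """Raw carry haircut in proportion to the composite risk score."""
    return net_yield * (1.0 - penalty * composite)
```

Keeping the component scores available alongside the composite preserves the decomposition that users may trust more than any single number.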
#### 6.7.1.4 Monitoring and regime engine
This module evaluates whether a token remains in a normal, watch, stress, dislocation, or recovery regime based on time dynamics and threshold logic.
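Threshold logic of this kind might be sketched as follows; the specific thresholds, and the rule that a token leaving a stressed state passes through recovery before normal, are illustrative assumptions:

```python
# Threshold-logic sketch of the regime engine; the thresholds, and the rule
# that stress resolves through recovery, are assumptions for illustration.
def classify_regime(abs_deviation: float, liquidity_score: float,
                    prev_regime: str = "normal") -> str:
    """Map current dislocation size and liquidity quality to a regime label."""
    if abs_deviation > 0.05 or liquidity_score < 0.2:
        return "dislocation"
    if abs_deviation > 0.02 or liquidity_score < 0.4:
        return "stress"
    if abs_deviation > 0.005:
        return "watch"
    if prev_regime in ("stress", "dislocation"):
        return "recovery"   # do not jump straight back to normal
    return "normal"
```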
### 6.7.2 Why separation within the engine matters
It is tempting to write a single piece of code that takes all inputs and outputs a final score. That would be a mistake.
The engine should preserve internal separation because:
- users may trust decomposition more than summary,
- different delivery surfaces may require different subsets of output,
- methodology changes may affect one module but not another,
- and the system becomes much easier to test and govern.
```mermaid
flowchart TD
A[Canonical LSD Object] --> B[Feature Engine]
B --> C[Fair Value Engine]
B --> D[Risk Factor Engine]
B --> E[Liquidity Diagnostics Engine]
C --> F[Relative Value Module]
D --> G[Composite Risk Module]
E --> H[Stress Liquidity Module]
F --> I[Suitability Engine]
G --> I
H --> I
I --> J[Monitoring & Alert Engine]
```
The logic is hierarchical. The system should not create a final interpretation before its components are computed.
## 6.8 Methodology Versioning
A serious analytical platform requires methodological version control. This is not only a software-engineering concern. It is part of the epistemic discipline of the system.
### 6.8.1 Why versioning is essential
Models evolve. Factor weights change. A new liquidity penalty may be introduced. A redemption adjustment formula may be improved. If outputs change over time, users need to know whether the token changed, the market changed, or the methodology changed.
Without versioning, this distinction becomes blurred.
### 6.8.2 Versioned analytical records
Every published or stored analytical record should ideally contain:
- token identifier,
- timestamp,
- methodology version,
- input coverage status,
- output values,
- and confidence metadata.
This design allows:
- backtesting,
- auditability,
- research comparison across model versions,
- and transparent communication with users.
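A record carrying these fields might be sketched as a frozen dataclass; the field names and version strings are illustrative, not a fixed schema:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Sketch of a versioned analytical record; names and version strings are
# illustrative assumptions, not a fixed LSDx schema.
@dataclass(frozen=True)
class AnalyticalRecord:
    token_id: str
    computed_at: datetime
    methodology_version: str   # which model version produced these outputs
    coverage: str              # input coverage status at computation time
    outputs: dict              # e.g. fair value, composite risk, regime
    confidence: str

rec = AnalyticalRecord(
    token_id="cbeth",
    computed_at=datetime(2024, 6, 1, tzinfo=timezone.utc),
    methodology_version="m2.1.0",
    coverage="complete",
    outputs={"fair_value": 0.997, "composite_risk": 0.23},
    confidence="high",
)
```

Because the methodology version travels with every stored output, a later change in a score can always be attributed to the token, the market, or the model.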
### 6.8.3 Stability versus innovation
Methodology should improve over time, but not in an uncontrolled way. Versioning allows LSDx to remain innovative without becoming arbitrary.
## 6.9 Storage Design
The storage layer sits between the analytical engine and the delivery surfaces. It should not merely cache outputs. It should preserve the analytical state of the system in a structured way.
### 6.9.1 Storage categories
The architecture may distinguish several stores.
#### 6.9.1.1 Raw data store
This contains unprocessed retrieved observations. It supports debugging, source traceability, and research reproducibility.
#### 6.9.1.2 Normalised data store
This contains the canonical token representation after translation and standardisation.
#### 6.9.1.3 Analytical output store
This contains computed outputs such as:
- fair value estimates,
- risk factors,
- liquidity scores,
- adjusted yield measures,
- suitability outputs,
- and regime states.
#### 6.9.1.4 Metadata and methodology store
This contains:
- methodology versions,
- token taxonomy,
- source maps,
- and system configuration.
### 6.9.2 Historical persistence
LSDx should preserve history, not overwrite it. Historical analytical records are essential for:
- comparing tokens through time,
- evaluating regime transitions,
- testing whether scores were informative,
- and supporting research use cases.
## 6.10 Delivery Layer
The delivery layer translates the analytical system into user-facing or machine-facing products.
### 6.10.1 Dashboard delivery
The dashboard should expose the analytical outputs in a clear and disciplined manner. It should not overload the user with raw data that bypasses the model, nor should it hide the decomposition entirely.
A strong dashboard should allow users to see:
- current market state,
- model fair value range,
- premium or discount to model,
- component risk scores,
- adjusted yield,
- and context-specific suitability.
### 6.10.2 API delivery
The API layer allows programmatic consumption of LSDx outputs. This is essential for systematic users such as:
- research teams,
- DAO treasury tools,
- vault infrastructure,
- and risk-monitoring systems.
API responses should be structured, version-aware, and explicit about coverage and confidence.
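An illustrative response envelope follows; the specific key names are assumptions — the point is that methodology version, coverage, and confidence travel with every payload:

```python
import json

# Illustrative response envelope; key names are assumptions. The point is
# that version, coverage, and confidence accompany every payload.
def build_response(record: dict) -> str:
    envelope = {
        "methodology_version": record["methodology_version"],
        "coverage": record["coverage"],
        "confidence": record["confidence"],
        "data": record["outputs"],
    }
    return json.dumps(envelope, sort_keys=True)

response = build_response({
    "methodology_version": "m2.1.0",
    "coverage": "partial",
    "confidence": "medium",
    "outputs": {"fair_value": 0.991},
})
```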
### 6.10.3 Alert delivery
The monitoring engine should be able to trigger structured alerts when:
- deviation from fair value widens materially,
- liquidity deteriorates,
- regime state shifts,
- or one of the structural risk dimensions worsens.
This turns LSDx into an operational intelligence layer rather than a passive reporting tool.
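The four trigger conditions above can be sketched as explicit rules over two consecutive analytical states; the thresholds are assumptions for illustration, not production alert policy:

```python
# The four alert triggers expressed as rules over consecutive analytical
# states; thresholds are illustrative assumptions.
def alerts(prev: dict, curr: dict) -> list:
    fired = []
    if abs(curr["model_deviation"]) - abs(prev["model_deviation"]) > 0.01:
        fired.append("fair_value_deviation_widening")
    if curr["liquidity_score"] < prev["liquidity_score"] - 0.1:
        fired.append("liquidity_deterioration")
    if curr["regime"] != prev["regime"]:
        fired.append("regime_shift")
    if any(curr["risk"][k] > prev["risk"][k] + 0.1 for k in curr["risk"]):
        fired.append("structural_risk_worsening")
    return fired
```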
### 6.10.4 Oracle-compatible delivery
A later-stage output layer may expose a conservative subset of metrics in oracle-compatible form. This should only be done for signals that are sufficiently robust, interpretable, and manipulation-resistant.
The architecture should allow this possibility without forcing premature on-chain publication of every analytical output.
## 6.11 Confidence, Coverage, and Output Discipline
One of the strongest ways to make LSDx credible is to let the system admit uncertainty explicitly.
### 6.11.1 Coverage-aware output logic
The system should not score every token identically if data coverage differs materially. Instead, outputs should reflect whether the analytical view is:
- complete,
- partial,
- degraded,
- or insufficient.
### 6.11.2 Confidence annotation
Each major output may include a confidence indicator or coverage note. This is especially useful when:
- the token is new,
- one or more sources are stale,
- structural inputs are manually curated,
- or liquidity diagnostics are incomplete.
This design improves honesty and reduces false confidence.
## 6.12 Operational Workflow
The full operational workflow of LSDx can be described as a simple sequence:
- external sources are queried or updated,
- raw observations are stored with lineage and timestamp,
- validation checks are run,
- token-specific data is normalised into canonical form,
- features and factors are computed,
- valuation and scoring modules generate outputs,
- results are stored with methodology version and confidence state,
- dashboards, APIs, and monitoring systems consume the outputs.
The following visual summarises the process.
```mermaid
flowchart TD
A[External Data Retrieval] --> B[Raw Data Registry]
B --> C[Validation & Quality Checks]
C --> D[Canonical Token Normalisation]
D --> E[Feature & Factor Construction]
E --> F[Valuation, Scoring, Monitoring]
F --> G[Versioned Analytical Store]
G --> H[Dashboard]
G --> I[API]
G --> J[Alerts]
G --> K[Future Oracle Layer]
```
This sequence is not just a technical pipeline. It expresses the logic of the platform itself.
## 6.13 Implementation Philosophy
The implementation philosophy of LSDx should remain aligned with the intellectual philosophy of the paper.
First, build the analytical core correctly before over-expanding interfaces.
Second, preserve decomposition rather than collapsing everything into opaque summary outputs.
Third, treat methodology as an evolving but governed system.
Fourth, prioritise transparency over cosmetic complexity.
Fifth, design the system so that future extensions such as multi-chain support, broader token classes, or scenario engines can be added without conceptual inconsistency.
The objective is not only to build a tool that works today. It is to build an analytical foundation that can remain coherent as the LSD market evolves.
## 6.14 Closing Remarks
The architecture of LSDx is designed to support a disciplined translation from heterogeneous staking-token reality into structured financial intelligence.
The system begins with raw, multi-source inputs. It then validates, normalises, transforms, scores, stores, and delivers the resulting analytics through interfaces appropriate to different user classes. Throughout this process, the architecture aims to preserve traceability, interpretability, version awareness, and modularity.
This chapter therefore turns the framework from an abstract methodology into an implementable platform design.
The next step is to show why this architecture matters in practice. That is the role of the use-case chapter, where the same analytical engine is viewed through the eyes of treasuries, allocators, vault designers, collateral systems, and advanced DeFi researchers.