Why We Open-Sourced Our Pre-Consult Symptom Schema — From Stakeholder Mapping to 26 Lawyer-Confirmed Conclusions

iRehab Brief Schema v0.1 went public today. It defines 'how to author a structured pre-consult symptom questionnaire' — not a product, not a platform, not a medical device. Just a structured blueprint. This piece records how we walked from a stakeholder mapping exercise to the conclusion 'format can be open, logic stays closed forever' — why we picked Apache-2.0 + CC BY 4.0 dual licensing, why Republic of China (Taiwan) law as the governing law, what 26 lawyer-confirmed pre-draft conclusions signal to integrators, and finally a direct workflow for clinicians: hand the GitHub URL to your AI assistant and have it design a specialty-specific brief for your practice.

Starting from an awkward observation

As far as we can tell, very few clinical software companies ship a "structured pre-consult symptom questionnaire" feature today. iRehab (Taiwan) is one of them; we could not find a peer, in Taiwan or internationally, building this exact thing.

Does that mean the problem hasn't surfaced yet? No — it means the problem is about to surface.

Look at the history of medical IT: every hospital wrote its own EHR / HIS, structures didn't interoperate, decades of integration debt followed. The US ran the same script with Meaningful Use (2009) and the 21st Century Cures Act (2016) — necessary corrections, but they only happened after the fragmentation already existed. The cost of walking back fragmentation is always orders of magnitude higher than preventing it.

If "structured pre-consult intake" becomes the next ubiquitous clinical software feature — and AI proliferation makes that nearly certain — the same fragmentation pattern is likely to repeat:

  • Each vendor builds their own (first 3 years)
  • Competitive pressure forces standardization talks (years 5-7)
  • But by then internal schemas are calcified, integration debt is stacked, and migration costs are prohibitive

When we mapped this trajectory against the stakeholders who would care — HIS vendors, third-party scheduling platforms, clinical societies, individual specialty physicians, academic citers, medical device / private insurance partners — we noticed a pattern. Each constituency would eventually need a format. But every collaboration path under a proprietary regime is high-friction, slow, and expensive — for both sides.

Pursuing these collaborations one by one through bilateral business development would, by our estimate, take a single vendor roughly 10× the resources of publishing a shared format, and it would still likely fail.

So — rather than waiting for everyone to reinvent and then negotiate, we put a reference format on the table now, as a starting point for that conversation.

This blueprint may not become the final standard. But it does move the discussion onto the table.


A framework: distributor vs engine

Pull the Brief feature apart and you find two layers.

The "format" layer is JSON schema, field definitions, the trigger points for red-flag modules — technically nothing extraordinary. Anyone who's read the FHIR Questionnaire resource specification can reproduce this layer in a weekend.

The "logic" layer is where the actual work lives: how the red-flag pipeline judges (thresholds, weighting, escalation copy), how the AI SOAP integration is prompted and tuned, how the override corpus accumulates, how the Composer editor designs the physician's authoring experience.

These two layers have historically been bundled, because the SaaS instinct is "sell everything." But bundling has a cost: every HIS vendor who wants to natively support your format has to sign licenses, pay royalties, and clear legal review. Friction is too high; nobody bothers.

The moment you separate the two, the strategy clarifies:

Format can be open. Logic stays closed forever.

Format is the distributor. Logic is the engine. Keep the engine closed (commercial moat); release the distributor for free (adoption multiplier).
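
To make the split concrete, here is a minimal Python sketch. Every field name in it (brief_id, questions, red_flag_hook) is a hypothetical stand-in, not the published schema; the real definitions live in spec/v0.1/schema/. The only point is the shape: the brief is plain data that anyone can parse, while the red-flag judgment is a function that each integrator supplies and keeps private.

```python
import json
from typing import Callable, List

# Format layer: plain, publishable data. Field names are illustrative only.
BRIEF_JSON = """
{
  "brief_id": "knee-pain-demo",
  "questions": [
    {"id": "q1", "type": "scale", "red_flag_hook": "pain_severity"},
    {"id": "q2", "type": "boolean", "red_flag_hook": null}
  ]
}
"""

# Logic layer: a callable each integrator supplies; its internals stay closed.
RedFlagLogic = Callable[[str, object], bool]


def demo_logic(hook: str, answer: object) -> bool:
    """Placeholder rule for demonstration; not a clinical threshold."""
    return hook == "pain_severity" and isinstance(answer, int) and answer >= 9


def flagged_questions(brief: dict, answers: dict, logic: RedFlagLogic) -> List[str]:
    """Walk the open format and delegate every judgment to the supplied logic."""
    flagged = []
    for question in brief["questions"]:
        hook = question.get("red_flag_hook")
        if hook is not None and question["id"] in answers:
            if logic(hook, answers[question["id"]]):
                flagged.append(question["id"])
    return flagged


brief = json.loads(BRIEF_JSON)
print(flagged_questions(brief, {"q1": 9, "q2": True}, demo_logic))  # ['q1']
```

Swapping demo_logic for another function changes the judgment without touching the brief, which is the property the distributor / engine split relies on.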


What opening up unlocks

Once the format is public, the following become structurally possible. These are not things that merely get "smoother"; under proprietary terms they are effectively impossible:

  • HIS vendors can read the schema and natively integrate — no NDAs, no royalties, no legal review
  • Third-party platforms can implement format reciprocity: "your questionnaire and mine can be read by the same parser"
  • External physicians can contribute specialty packs via pull requests (OB-GYN, neurology, psychiatry, rehab, …) — a contributor channel that simply doesn't exist in closed-source form
  • Academic papers can cite the schema as a citable standard

Each of these would cost 10× more under bilateral business development, with high failure rates. Open-sourcing turns "we have to push" into "they will come."


What stays closed — forever

The boundary needs to be explicit. Without it, an open-core strategy is easily challenged with "you just haven't open-sourced enough yet."

The following six components will never enter this repository (documented in LICENSE_STRATEGY.md §6):

  1. Red-flag pipeline internal logic (thresholds, weighting, scoring algorithms, multi-question composition rules, escalation copy) — this is medical decision logic; opening it loses the quality gate, and it's also our actual moat
  2. AI SOAP integration prompts and tuning — prompt engineering is continuously evolving commercial IP
  3. Patient PII handling code — privacy compliance is not something to touch lightly; any open-source modification triggers a compliance audit
  4. iRehab Doctor PWA UI / UX — workflow design is commercial differentiation
  5. Outcome registry data pipeline — core of the B2B revenue stream
  6. Composer (the GUI editor for schema authoring) — key differentiation in physician experience

More importantly: each clinic, hospital, or service provider's needs in these areas differ too much. We don't think these decisions should be centralized at any one vendor. Leaving them to each organization and their own AI to build is the healthier design.


Dual licensing structure

Content | License
Schema files (spec/v0.1/schema/*.json), validator code | Apache-2.0 (with patent grant)
Spec documents, examples, tutorials, README | CC BY 4.0 (commercial use / adaptation / redistribution permitted with attribution)

The two licenses are not interchangeable. When you cite or reuse a file, apply the license that matches its file type.
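
For integrators who want to automate that lookup, a small sketch follows. The path patterns are assumptions drawn from the file names mentioned in this post; in particular, the validator/ directory is a hypothetical location, since the reference validator is not yet published. The identifiers are the SPDX short forms of the two licenses.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns are checked in order; the first match wins.
# fnmatch's "*" also matches "/" so "docs/*" covers nested files.
LICENSE_RULES = [
    ("spec/v0.1/schema/*.json", "Apache-2.0"),
    ("validator/*", "Apache-2.0"),  # hypothetical path for validator code
    ("spec/*.md", "CC-BY-4.0"),
    ("examples/*", "CC-BY-4.0"),
    ("docs/*", "CC-BY-4.0"),
    ("README*", "CC-BY-4.0"),
]


def license_for(path: str) -> Optional[str]:
    """Return the SPDX identifier that applies to a repo-relative path."""
    for pattern, spdx_id in LICENSE_RULES:
        if fnmatchcase(path, pattern):
            return spdx_id
    return None  # unknown file type: check the repository's license files


print(license_for("spec/v0.1/schema/brief-template.schema.json"))
print(license_for("docs/overview.md"))
```

Returning None for unmatched paths is deliberate: guessing a license for an unlisted file type is worse than forcing a manual check.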

The combination is deliberate. Apache-2.0's explicit patent grant clause protects HIS vendors and third-party platforms that adopt the schema and validator. CC BY 4.0 is a widely used international license for open content and the one academic citers are most likely to recognize.


Why governing law is Republic of China (Taiwan)

The licensor is a Taiwan company (De Novo Orthopedics 谷盺生物科技股份有限公司). Trademarks are registered in Taiwan ("iRehab 愛復健 with logo" covers Nice classes 9/10/35/44; "DE NOVO ORTHOPEDICS / 谷盺生物科技" covers 5/9/10/35/42/44). Liability and dispute resolution flow through Taiwan most directly.

The declaration is in the repository and README. It does not modify the Apache-2.0 / CC BY 4.0 license texts themselves (a modified text could no longer be presented under either license's name). Instead, it serves as the licensor's "governing law statement": a separate declaration that accompanies, but does not alter, the underlying license texts. Court of first instance: Taichung District Court, where De Novo Orthopedics is registered.

See docs/governing-law.md for the full statement.


26 lawyer-confirmed pre-draft conclusions

Before going public, we authored four groups of pre-draft legal opinions:

  • Group A: Personal Data Protection Act (9 conclusions)
  • Group B: Electronic Medical Records and Medical Care Act (6 conclusions)
  • Group C: Open-source licensing and intellectual property (8 conclusions)
  • Group D: Personal Data Protection Commission audits and miscellaneous (3 conclusions)

Our retained counsel confirmed on 2026-05-06: 26/26 conclusions confirmed, 0 corrections required.

This does not mean "no risk" — it means "the current design holds up under Taiwan's existing legal framework." The single "to-be-supplemented" item is D-1c, a written Personal Data File Security Maintenance Plan, an internal audit-backup document. Not public, not blocking release.

The signal to HIS vendors / third-party platforms / clinical societies: this repository's legal foundation has been actually reviewed by counsel — not simply published in enthusiasm.


The most direct workflow for clinicians

If you're a physician and you're reading this thinking "what could I actually do with this" — here's the most direct workflow:

Hand this GitHub URL to your AI assistant — ChatGPT / Claude / Gemini / Microsoft Copilot — and have it do the following:
  • Design an 8-10 question structured pre-consult brief for your specialty (OB-GYN, neurology, psychiatry, family medicine, emergency, pediatrics, cardiology…)
  • Convert your existing paper questionnaire into a JSON brief that matches this schema
  • Plan how to integrate the brief into your clinic / hospital systems

No engineering required. Given the repository URL, an assistant that can browse the web can read spec/v0.1/authoring-spec.md, examples/orthopedics-complete.json, docs/overview.md, and docs/for-doctors.md on its own, then walk you through the design step by step and produce a JSON file for you to review and use. If your assistant cannot open links, paste those files into the conversation instead.

In compliance-bound environments, use a local or institution-hosted LLM. Many hospitals (Taiwan medical centers, US HIPAA-bound institutions, EU GDPR-bound institutions) prohibit sending clinical content to external AI services. But designing a questionnaire format doesn't involve patient data: the schema itself is public, and what the AI receives is a format blueprint plus your specialty's needs. As long as no patient data goes into the prompt, the workflow can stay compliant even in strict environments. Options include Azure OpenAI on a hospital tenant, Ollama on a local server, or an in-house LLM provided by your HIS vendor. In our observation, Microsoft has pushed hospital-side compliant deployment the furthest; if your institution already uses Microsoft 365 / Azure, Copilot may already be wired through compliance channels, so check with your IT department.

A few sample prompts:

"I'm a psychiatrist; my outpatient cases are predominantly anxiety / insomnia / mood disorders. Using the spec at https://github.com/Denovortho/open-irehab-brief-schema, design a structured 8-10 question pre-consult brief that outputs valid JSON conforming to brief-template.schema.json. Cover PHQ-9 / GAD-7 territory in spirit, but do not directly transcribe those instruments."

"I have a 12-question paper pre-op questionnaire (attached as image). Convert it into iRehab Brief Schema v0.1 JSON (repo: https://github.com/Denovortho/open-irehab-brief-schema). Preserve the original intent, structure where possible, ensure i18n with at least zh-TW + en."

The first attempt won't be perfect, but it gets you started today — instead of waiting for "an engineer / a budget / an IT cycle."
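
Before using an AI-generated brief, a quick sanity check helps. The sketch below parses the output and reports questions that lack a zh-TW or en label, matching the i18n request in the second prompt. The questions / id / label layout is a hypothetical stand-in; adapt the keys to whatever brief-template.schema.json actually specifies.

```python
import json
from typing import Dict, List

REQUIRED_LOCALES = {"zh-TW", "en"}


def missing_locales(brief_json: str) -> Dict[str, List[str]]:
    """Map each question id to the locales its label lacks (empty dict = ok)."""
    brief = json.loads(brief_json)  # raises ValueError on malformed JSON
    problems = {}
    for question in brief.get("questions", []):
        label = question.get("label", {})
        missing = sorted(REQUIRED_LOCALES - set(label))
        if missing:
            problems[question.get("id", "?")] = missing
    return problems


# Hypothetical assistant output with one incomplete question.
ai_output = """
{"questions": [
  {"id": "q1", "label": {"zh-TW": "哪裡不舒服？", "en": "Where does it hurt?"}},
  {"id": "q2", "label": {"en": "How long has it lasted?"}}
]}
"""
print(missing_locales(ai_output))  # {'q2': ['zh-TW']}
```

A check like this only covers one property; it is not a substitute for validating against the published schema.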


Repo + Releases

Ten files: spec + dual licensing + Taiwan governing law + lawyer-opinion trail.


What's next

  • Reference validator (Apache-2.0): targeted for v0.2, enforces cross-field rules
  • More specialty examples: community contributions welcome via PR
  • HIS integration patterns documentation
  • Webhook receiver reference implementation
  • v1.0 stable release: targeted 2026 Q4, activates the forward-compatibility commitment
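
As an illustration of what "cross-field rules" in the first roadmap item could mean, here is a minimal sketch. It is not the planned validator: the rules and the field names (type, options, red_flag_triggers, question_id) are assumptions chosen for the example. A cross-field rule is one that no single-field type check can express, such as "a trigger must point at a question that exists."

```python
from typing import List


def cross_field_errors(brief: dict) -> List[str]:
    """Check relationships between fields; returns human-readable errors."""
    errors = []
    questions = brief.get("questions", [])
    question_ids = {q["id"] for q in questions}

    # Rule 1 (assumed): a single-choice question needs at least one option.
    for q in questions:
        if q.get("type") == "single_choice" and not q.get("options"):
            errors.append(f"{q['id']}: single_choice requires options")

    # Rule 2 (assumed): every trigger must reference an existing question.
    for trigger in brief.get("red_flag_triggers", []):
        target = trigger.get("question_id")
        if target not in question_ids:
            errors.append(f"trigger {trigger.get('id')}: unknown question {target!r}")

    return errors


# Hypothetical brief with one violation of each rule.
sample = {
    "questions": [
        {"id": "q1", "type": "single_choice", "options": []},
        {"id": "q2", "type": "boolean"},
    ],
    "red_flag_triggers": [
        {"id": "t1", "question_id": "q2"},
        {"id": "t2", "question_id": "q9"},
    ],
}
for message in cross_field_errors(sample):
    print(message)
```

Collecting all errors instead of stopping at the first one lets an author fix a brief in a single pass.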

If you want to contribute a specialty pack for your field, see CONTRIBUTING.md for the process.

For commercial integration / licensing / trademark inquiries: service@denovortho.com


Why this matters

Not because "open source = good."

It matters because the history of every hospital writing its own HIS, and the decades of integration debt that followed, should not repeat itself. The pain of cross-hospital records exchange in Taiwan today, the time it takes to onboard a new institution into any healthcare network, the cost of switching vendors: all of this is the price paid for letting 30 years of fragmentation happen unchecked. If "structured pre-consult intake" follows the same path, by 2035 we'll be having the standardization conversation we should be having now, with another decade of accumulated integration debt.

And not because "open source = generosity."

It matters because "the entity that puts the format on the table first" frequently turns out, in retrospect, to be the standard-setter for that domain. HTTP (Tim Berners-Lee gave it away), TCP/IP (Vint Cerf and Bob Kahn published it), FHIR (HL7 released it): none of them won by charging the highest price. They won by being first to leave the format publicly visible.

We may not end up in that position. But we are putting a proposal on the table today, to see if it can become a common starting point.

If you also believe "structured pre-consult intake" deserves a common format, we welcome critique, citations, forks, and pull requests. We also welcome challenges to our design choices — the entire purpose of the spec is to put choices on the table for debate.


Further reading

International standards and licensing

  • HL7 FHIR Questionnaire: HL7 FHIR R5 Questionnaire Resource. The same problem domain, with broader scope and more complexity. Worth comparing the design tradeoffs.
  • Apache License 2.0 full text: Apache Software Foundation. Includes the patent grant clause. The license used for the schema and validator code in this repo.
  • Creative Commons CC BY 4.0 full text: Creative Commons. A widely used international license for open content. Used for spec docs and examples in this repo.