Holochain RSM Migration Guide

This is a living document, which will be updated as Holochain RSM matures and approaches beta status. We invite and encourage you to contribute your own discoveries and fixes to this document, which lives in the holochain-open-dev/blog GitHub repo.

Cartoon drawing: A phoenix, liveried in Holochain colours and the letters 'RSM' on its breast, rises from the flames and ashes of Holochain Redux.

Holochain has always been about individuals making changes to their local state, which then transform a shared, globally visible state. First, an individual participant commits entries to her local source chain, then those entries are transformed into DHT operations and broadcast to the appropriate validation authorities. An entry can be considered ‘a thing that’s been said’, while a commit represents ‘the act of saying something’. This is a peer-to-peer application of the Event Sourcing pattern: source chains are streams of events that affect both their local states and the shared DHT. If all the source chains were replayed up to any point in their histories, they’d recreate the DHT at that point.

The biggest change with Holochain RSM is more attention to this reality. It’s now more explicit in the design of the new API. For instance, when you commit an entry, you get back a header hash, not an entry hash. What Holochain is saying is “here’s the ID of the event that represents you speaking this entry into existence”. And when you update or delete something, once again you’re acting on a commit, not an entry. This ends up making Holochain RSM better at modelling the multi-perspective nature of the real world; Alice and Bob can now both create a to-do item saying “Buy milk after work”, and Alice can mark hers as done without affecting Bob’s (or all the previous times she created that same item). No need to disambiguate with extra timestamp fields in your entry structs.
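
To make the difference between an entry hash and a header hash concrete, here is a minimal toy model in plain Rust. It is not the HDK API: `toy_hash`, `ToyHeader`, and `toy_create` are hypothetical names invented for this sketch, and real Holochain hashes are not `u64` values. It only shows that identical content gives an identical entry hash, while each act of committing it gets its own header hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a content hash (assumption: real hashes are not u64s).
fn toy_hash<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// A simplified header: who committed which entry, and where in their chain.
#[derive(Hash)]
struct ToyHeader {
    author: String,
    chain_seq: u32,
    entry_hash: u64,
}

/// Toy version of "create": records the act of committing `content`
/// and returns (header hash, entry hash).
fn toy_create(author: &str, chain_seq: u32, content: &str) -> (u64, u64) {
    let entry_hash = toy_hash(&content);
    let header = ToyHeader {
        author: author.to_string(),
        chain_seq,
        entry_hash,
    };
    (toy_hash(&header), entry_hash)
}
```

In this model, Alice and Bob creating "Buy milk after work" share one entry hash but hold different header hashes, so an update or delete aimed at Alice's header hash says nothing about Bob's.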

Holochain RSM also hides fewer things behind abstractions. For instance, canonical updates no longer exist. Now, updates and deletes are simply accumulated for your code to retrieve and make sense of. This requires a bit more thinking, but there are a couple of benefits:

  • Because there are fewer ‘leaky abstractions’ to try to make guesses about, things are clearer and you can reason about them more correctly. The dreaded ‘update loop’ bug no longer exists.
  • With access to lower-level building blocks, you’re free to create patterns that work best for your use case. Consider two styles of wiki, one that has ‘official’ versions of each article like Wikipedia and another that captures different people’s perspectives like Federated Wiki. Now you can create either kind with the same tools.
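
The two wiki styles can be sketched as two small selection policies over the same accumulated updates. This is a toy model in plain Rust rather than HDK code: `ToyUpdate`, `latest_version`, and `all_perspectives` are hypothetical names, and choosing the winner by timestamp is only one possible rule.

```rust
/// A simplified update record: which header it replaces, when, and the new content.
#[derive(Debug, Clone, PartialEq)]
struct ToyUpdate {
    header_hash: u64,
    updates_header: u64,
    timestamp: u64,
    content: String,
}

/// "Official version" policy: at each step, follow the most recent update.
fn latest_version(original: u64, updates: &[ToyUpdate]) -> Option<&ToyUpdate> {
    let mut current = original;
    let mut result = None;
    // Bounded by updates.len() so malformed, cyclic data cannot loop forever.
    for _ in 0..updates.len() {
        let next = updates
            .iter()
            .filter(|u| u.updates_header == current)
            .max_by_key(|u| u.timestamp);
        match next {
            Some(u) => {
                current = u.header_hash;
                result = Some(u);
            }
            None => break,
        }
    }
    result
}

/// "Many perspectives" policy: return every direct update side by side.
fn all_perspectives(original: u64, updates: &[ToyUpdate]) -> Vec<&ToyUpdate> {
    updates
        .iter()
        .filter(|u| u.updates_header == original)
        .collect()
}
```

Both functions read the same data; only the selection logic differs, which is the kind of choice that now lives in application code.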

Finally, the HDK is no longer necessary. It’s still strongly recommended, because getting data across the WebAssembly boundary is still fiddly, but the core API is now simple enough for you to use directly if the HDK gets in your way.

Language

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| DNA instance | Cell | A DNA paired with an agent key, running in the conductor. |
| Entry with Header pair | Element | The data structure that represents a commit, which is a header + optional entry data. Each commit to a source chain is represented as an element. |
| Commit | Create | While they’re still the same thing, we felt ‘create’ was more in line with the CRUD language you’re used to from other frameworks. |
| Remove | Delete | See above. Note that deleted things are still just marked as deleted; their data still exists on the DHT. |
| | Update | RSM now allows you to do ‘multivalent’ updates; that is, many branching non-canonical realities. |
| Update | Update with a redirect flag | The previous concept of canonical updates will be replaced with updates carrying a ‘redirect’ flag; this feature is not yet available. |

Data structures

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| Headers contain the signature of the entry. | Agents now sign the header; the (header, signature) pair is what gets distributed to the DHT. | This prevents third-party forging of source chains. |
| There is only one header type; system actions are special entry types. | There is a different header type for each action, and some system-level actions’ data (DNA, links creates/deletes, entry deletes) are completely contained in their headers. | This reduces DHT chatter. See: Header structs. |
| Entries and headers don’t have a joint struct since they are not used together very often. | An Element is defined as a Header, alongside its Entry if the header type contains one. | The majority of HDK calls accept and return Elements or their hashes. |
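
As a rough illustration of the shape described in this table, the sketch below models an element as a signed header plus optional entry data. These are not Holochain's real types: `ToyHeader`, `ToyElement`, the field names, and the `u64` hashes are all simplifications assumed for the example, and the well-formedness check is a convenience for the sketch (the real rules may be looser).

```rust
/// Simplified header kinds. Only some point at separate entry data;
/// link and delete actions are fully described by the header itself.
#[derive(Debug, Clone, PartialEq)]
enum ToyHeader {
    Create { entry_hash: u64 },
    Update { original_header: u64, entry_hash: u64 },
    Delete { deletes_header: u64 },
    CreateLink { base: u64, target: u64, tag: Vec<u8> },
    DeleteLink { link_create_header: u64 },
}

impl ToyHeader {
    /// The hash of the entry this header carries, if its type has one.
    fn entry_hash(&self) -> Option<u64> {
        match self {
            ToyHeader::Create { entry_hash } | ToyHeader::Update { entry_hash, .. } => {
                Some(*entry_hash)
            }
            _ => None,
        }
    }
}

/// An element: a signed header, plus entry data when the header type has any.
#[derive(Debug, Clone, PartialEq)]
struct ToyElement {
    header: ToyHeader,
    signature: Vec<u8>,
    entry: Option<Vec<u8>>,
}

impl ToyElement {
    /// Entry data should be present exactly when the header type calls for it.
    fn is_well_formed(&self) -> bool {
        self.header.entry_hash().is_some() == self.entry.is_some()
    }
}
```

A link creation in this model is a complete element with `entry: None`, which mirrors the point that some actions are entirely contained in their headers.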

Design

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| Links must have explicitly defined types, and contain a type and a tag string. | Links have no type, and can contain any arbitrary data as a vector of bytes. | This makes links more versatile. It also addresses some issues devs were experiencing; now you can easily create links between entry types in different zomes. |
| Entry type names are not namespaced; identical definitions in two separate zomes can clash with each other. | Entry types are represented by a numeric zome index and numeric entry type index. These indexes are ordered by appearance: zomes by their appearance in the dna.json manifest file; entry types by their appearance in the return value of your zome’s entry_defs callback. | It’s more difficult to work with numbers than names, so expect future HDK tooling to help with this. |
| You can only validate an agent with their public key and nickname. | There is a new MembraneProof entry which contains data that proves that you have permission to join the DHT: invite codes, vouches from existing members, proof of subscription, etc. The membrane proof is passed to the conductor when an agent creates a cell, then saved into the second source chain element, the AgentValidationPkg. | Validation callbacks for membrane proofs are currently unimplemented. |
| You have no way to retrieve all the headers committed by an agent. | You can ask an agent ID’s validation authorities for the agent’s entire history, including evidence of a forked chain. | This is one of the pillars of Holochain’s integrity model; peer witnessing of agent activity prevents malicious agents from counterfeiting their history. |
| Bridges (communication channels between a user’s DNA instances) must be set up at install time and cannot contain any circular relationships. | Calls to other cells are not preconfigured, can be made at any time, but are subject to the capability-based access model (that is, cells owned by one agent ID are covered by the author grant, but cells owned by different agent IDs will need an explicit grant/claim). | Now that circular relationships are allowed, discipline is required to prevent infinite recursion, especially when considering transitive dependencies with third-party DNAs (e.g., DNA A depends on DNA B depends on DNA C depends on DNA A). |
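
Until HDK tooling for entry type indexes arrives, a small lookup helper can keep the numbers out of most of your code. The following is a sketch in plain Rust, not an HDK function: `entry_type_indices` is a hypothetical name, and it assumes you maintain a list of zomes in manifest order, each with its entry type names in the order your entry_defs callback returns them. It uses `usize` for simplicity; the real index types may differ.

```rust
/// Looks up the (zome index, entry type index) pair for a named entry type,
/// given zomes in manifest order and each zome's entry types in callback order.
fn entry_type_indices(
    zomes: &[(&str, Vec<&str>)],
    zome: &str,
    entry_type: &str,
) -> Option<(usize, usize)> {
    // Position of the zome by order of appearance.
    let zome_index = zomes.iter().position(|(name, _)| *name == zome)?;
    // Position of the entry type within that zome only, so identical
    // names in different zomes cannot clash.
    let entry_index = zomes[zome_index]
        .1
        .iter()
        .position(|name| *name == entry_type)?;
    Some((zome_index, entry_index))
}
```

For a DNA with a `profiles` zome followed by a `music` zome defining `artist` and `album`, looking up `("music", "album")` yields `(1, 1)`.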

CRUD

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| Update and delete operate on an entry hash. | Update and delete operate on a header hash. | This is a big shift in focus from data to state changes, with a lot of benefits: it disambiguates between the same data written at different times by different people, allows previously deleted entries to be re-created, and avoids the confusion caused by update loops. |
| An entry can be only updated or deleted once, and that status change is canonical; conflicting changes can’t be resolved and result in an inconsistent DHT. | An entry can be updated or deleted multiple times for diverging realities, a la Git. | In the future, updates will get a ‘redirect’ flag and a conflict resolution callback to emulate canonical updates. |
| Update, delete, and get follow the update chain to the latest version of the specified entry before performing actions. | Update, delete, and get operate directly on the specified element. | This prevents redirect loops and lets you implement your own selection logic. Future update + redirect functionality will emulate the old behaviour. In the future, canonical redirects may back-propagate to earlier versions of an entry to optimise DHT lookups. |
| If you delete an entry, that entry is dead forever (tombstone set). | An entry is still alive until all headers are deleted (reference counting). | |
| Creation is called ‘commit’ and deletion is called ‘remove’. | Creation is called ‘create’ (e.g., create_entry) and deletion is called ‘delete’. | In spite of the name, keep in mind that ‘delete’ still doesn’t actually delete data; it just marks it as dead. |
| Only app entries can be updated and deleted. | App entries, agent public keys, and capability grants/claims (or rather, their elements) can be updated and deleted. | |
| Deleted data stays on the DHT. | In the future you may be able to scrub some data from the DHT: ‘withdraw’ will ask validation authorities to remove an element that you mistakenly committed, and ‘purge’ will ask validation authorities to erase unsafe/illegal content created by others. | Data scrubbing will still depend on good faith; malicious validation authorities will be able to ignore these operations and keep the data. |
| Anyone can write the same entry multiple times. | In the future, we’ll introduce simple ‘CRDT’ types for entries that should only have one author. | This is good for scarce/rivalrous resources, such as usernames. |
| Link creates and deletes operate on an entry. | Link creates operate on an entry, whereas link deletes operate on a link create element. | This means that previously deleted links can now be recreated. |
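
The reference-counting rule for entry liveness can be sketched with a toy model. This is plain Rust written for illustration, not Holochain's implementation: `ToyDht` and its methods are hypothetical, and hashes are plain `u64` values.

```rust
use std::collections::{HashMap, HashSet};

/// Toy DHT metadata: which create headers point at each entry,
/// and which headers have been marked as deleted.
#[derive(Default)]
struct ToyDht {
    creates_by_entry: HashMap<u64, Vec<u64>>, // entry hash -> create header hashes
    deleted_headers: HashSet<u64>,
}

impl ToyDht {
    /// Records that `header_hash` created the entry with `entry_hash`.
    fn create(&mut self, header_hash: u64, entry_hash: u64) {
        self.creates_by_entry
            .entry(entry_hash)
            .or_default()
            .push(header_hash);
    }

    /// Marks one create header as deleted; nothing is actually erased.
    fn delete(&mut self, header_hash: u64) {
        self.deleted_headers.insert(header_hash);
    }

    /// An entry is live while at least one of its create headers is not deleted.
    fn is_live(&self, entry_hash: u64) -> bool {
        self.creates_by_entry.get(&entry_hash).map_or(false, |headers| {
            headers.iter().any(|h| !self.deleted_headers.contains(h))
        })
    }
}
```

If Alice and Bob both create the same entry and Alice deletes her header, the entry stays live; it only becomes dead once Bob's header is deleted too, and a later create brings it back.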

Development

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| Zome interaction with the host is complicated and requires the HDK. | The host API, and the callbacks API that the host expects the zome to implement, are simple enough to work with directly if the HDK gets in the way. | It’s still preferable to use the HDK in most cases, because it hides away the boilerplate code required to work around the Rust compiler and transfer data through the WASM boundary. |
| Individual entry types are defined with a callback that is tagged with the #[entry_def] macro and returns a ValidatingEntryType struct; the entry! macro helps construct it. | Entry types are passed to the host as a wrapped vector of EntryDef structs by your zome’s entry_defs callback. | There are many ways to create the EntryDef structs; the most Rust-y is to use the #[hdk_entry] macro on your structs and enums. |
| The number of required validations cannot be specified. | The number of required validations can be specified per entry type definition. | Not currently hooked up to the DHT layer. |
| Capabilities are statically defined beside the zome function, never fully implemented. | Capabilities can be dynamically granted or revoked for any function to any agent, for enforcing security on function calls. | In the future, granted capabilities may prepopulate function parameters to limit callers’ privileges. |
| UIs make zome calls freely (behind the scenes, the conductor applies the ‘author’ capability grant). | UIs must use a valid capability claim to make a zome call. | Currently not fully enforced. |
| Nodes communicate with send/receive, passing JsonString messages to each other. | call_remote allows one agent to call another agent’s function as if it were her own, and enforces capability constraints. | You can still emulate send/receive with a receive zome function that has an unrestricted capability grant. |
| Anchors are defined in a separate library. | Anchors are available in the HDK. | Anchors are a specialization of a new ‘path’ pattern, and can be sharded to reduce DHT hotspots. See: Anchor, Path, sharding. |
| Zome functions are tagged with the #[zome_fn] macro and must be defined inside the #[zome] module. | Zome functions can be defined anywhere in your code as long as you make them externally visible. | This makes module definition, extension, and importing cleaner and easier. Managing WASM data and the Rust compiler is still tricky though; the HDK has tools to make this easier. |
| Zome functions can take multiple input parameters. | There can only be one input parameter for zome functions. | Functions that need multiple parameters should define a special serializable struct to hold them (see the SerializedBytes row below). |
| Validation callbacks are defined alongside the entries and links definitions. | Validation callbacks are defined just like zome functions, and must conform to a naming convention validate[_<action>[_<entry_type>[<app_entry_def_name>]]]. Valid actions are create, update, and delete. Valid entry types are entry (for app entries), agent, cap_claim, and cap_grant. You can specify multiple validation callbacks, and the host will try them in decreasing order of specificity until one returns a failure; e.g., validate_update_entry_album → validate_update_entry → validate_update → validate. | As with zome functions, it’s easiest to use the HDK to help you define these. Validation callbacks for capability grants and capability claims are not currently called. Validation callbacks for membrane proofs are not currently implemented. Link creation and deletion have their own validation callbacks without a specificity cascade: validate_create_link and validate_delete_link. |
| The JsonString trait is expected for input/output parameters and entry content. Types can implement this using the DefaultJson macro. | SerializedBytes is used for input/output parameters and entry content. Types used in all these cases must implement SerializedBytes. | SerializedBytes data uses MessagePack by default, is smaller than JSON, and can contain raw binary data without needing to be Base64-encoded. You can make your types implement SerializedBytes automatically with the #[derive(Serialize, Deserialize, SerializedBytes)] macro, which the #[hdk_entry] macro also applies for you. Primitive types can’t be used, but you can wrap them in a simple struct. You can use the empty tuple () for functions that don’t take parameters or return values; we’ve built in a serialisation implementation for it. |
| UIs refer to instances by an arbitrary string instance_id. | UIs refer to cells (instances) by the pair (DNA hash, agent pub key). | Referring to a cell by a nickname, defined at instantiation time, may be supported in the future. |
| All hashes and public keys are identified by the Address type. | Each type of hash has a dedicated Rust type (EntryHash, HeaderHash); you can also use AnyDhtHash when needed. | See the holo_hash crate. AnyDhtHash can be used when specifying missing validation dependencies, getting data from the DHT, and linking (as both base and target). |
| Addresses (hashes and public keys) are serialised/deserialised to/from strings. | Addresses are serialised/deserialised to/from vectors of bytes. | You can use these wrappers from the hc-utils crate to deserialise from strings in function parameters and serialise to strings in return values; this’ll make it easier to work with addresses in the UI code. |
| The only callbacks available are init, receive, validate_agent, and the validation callbacks. | There will be a lot of useful “hooks”: app_install, app_uninstall, post_commit. | |
| The instances running in the conductor are defined and maintained in the conductor-config.toml file. | The conductor-config.toml only contains initial environment settings; the DNA, agent, cell, and interface information are stored in a database. | |
| Zome functions are not transactional: an initial commit can succeed and stay committed even if a following commit from the same function call fails. | All zome functions are transactional; the call fails and rolls back all source chain writes if anything fails inside that call. | Local state available to a zome function is a snapshot of the source chain, taken at the beginning of the function call, while global state from the DHT is live, retrieved at get time. |
| The validation package for an entry is specified in its entry def, and can be either the entry only, the full chain, or just the headers. | The validation package for an entry can either be built in a callback as a vector of elements, or specified in its entry def. | Validation package types: Element only, all chain elements of the same entry type, or the full chain. |
| If get_entry fails for a required dependency in an init or validation callback, there’s no way to retry; a validation callback can either succeed or fail. | If a get fails or returns None, your init, validation, or validation package callback can return the hash of the missing dependency, and Holochain will pause the validation and retry in the future. | See: entry validation callback return types, including UnresolvedDependencies. The init, link validation, and validation package callback return types are similar. |
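
The specificity cascade for validation callback names can be sketched as a small helper. This is an illustration in plain Rust, not something the HDK provides: `validation_callback_names` is a hypothetical function, and it assumes an underscore before the app entry def name, as the validate_update_entry_album example suggests.

```rust
/// Builds the callback names the host would look for, most specific first.
/// `app_entry_def_name` is only meaningful for app entries.
fn validation_callback_names(
    action: &str,
    entry_type: &str,
    app_entry_def_name: Option<&str>,
) -> Vec<String> {
    let mut names = Vec::new();
    if let Some(def_name) = app_entry_def_name {
        // Most specific: action + entry type + app entry def name.
        names.push(format!("validate_{}_{}_{}", action, entry_type, def_name));
    }
    names.push(format!("validate_{}_{}", action, entry_type));
    names.push(format!("validate_{}", action));
    // Least specific: the catch-all callback.
    names.push("validate".to_string());
    names
}
```

For an update to an app entry whose def is named `album`, this yields validate_update_entry_album, validate_update_entry, validate_update, and validate, in that order. A zome only needs to implement the ones it cares about.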

HDK calls

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| All host API functions are complicated to use and are shadowed by Rust functions in the HDK. | All host API functions can be used directly, but are shadowed by macros in the HDK to facilitate transfer of data between host and zome. | These macros create usability problems with IDEs that support Rust code intel via RLS; they may become functions in the future to fix this issue. |
| commit_entry returns the entry hash. | create_entry returns the header (commit) hash. | |
| get_entry returns only the entry. | get on a header hash returns the full Element (Header + Entry); get on an entry hash returns the latest written Element for that entry. | |
| There are multiple variations of some host functions: get_entry, get_entry_result, get_entry_as_type. | There are two variations of get (get and get_details) and get_links (get_links and get_links_details). | The ‘details’ calls retrieve all the raw information about the item that is available in the DHT (for example, all its headers, updates/deletes, validation receipts etc.) while simple ones get you the latest element. |
| hdk::AGENT_ADDRESS gets you the initial public key of the agent. | The agent_info host API function gets you both the initial and the latest public key for the agent. | |
| hdk::DNA_ADDRESS gets you the hash of the DNA. | The zome_info host API function gets you the DNA name and hash, and the zome name. | |
| Random numbers have to come from the UI or be hacked by asking the keystore to generate a new secret. | The host API now has a random_bytes function. | |
| Timestamps in app entries have to come from the UI. | The host API now has a sys_time function. | |

UI / front-end / client / RPC

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| UIs call conductor admin functions and the DNA’s zome functions via a local JSON-RPC call over WebSocket. | UIs call functions by sending MsgPack-serialised messages over WebSocket. | See: WireMessage envelope for function calls, admin API request enum, zome call invocation struct, holochain-conductor-api for JavaScript-based UIs. |
| Signals are sent to the UI as JSON-serialised objects over WebSocket. | Signals are sent as MsgPack-serialised objects over WebSocket. | See: Signal message struct. |
| JavaScript clients can use holochain/hc-web-client to make zome or admin calls and listen for signals. | JavaScript clients can use holochain/holochain-conductor-api. | This new library has TypeScript typedefs for the zome call function, the admin API function, and all their input/output types. These typedefs document the RPC interface fairly well. |
| The conductor admin API has the functions install_dna_from_file, uninstall_dna, add_instance, remove_instance, add_interface, remove_interface, add_instance_to_interface, remove_instance_from_interface, add_agent, remove_agent, add_bridge, remove_bridge that manipulate the conductor’s state and save it to the conductor config file. | The conductor admin API has the functions GenerateAgentPubKey, InstallApp, ActivateApp, DeactivateApp, AttachAppInterface, ListActiveAppIds, ListDnas, ListCellIds, DumpState. | The holochain/holochain-conductor-api repo documents the admin interface and its input/return types. |
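
One practical consequence of moving from JSON to MessagePack is that raw bytes no longer need Base64 wrapping. The sketch below hand-encodes MessagePack's `bin 8` type to compare sizes. It illustrates the MessagePack format in general, not how Holochain lays out its messages; `msgpack_bin8` and `json_base64_len` are hypothetical helpers, and the 32-byte input is just an example size.

```rust
/// Encodes up to 255 bytes as a MessagePack `bin 8` value:
/// the marker byte 0xc4, a one-byte length, then the raw bytes.
fn msgpack_bin8(data: &[u8]) -> Option<Vec<u8>> {
    if data.len() > 255 {
        return None; // longer payloads need the bin 16 or bin 32 types
    }
    let mut out = vec![0xc4, data.len() as u8];
    out.extend_from_slice(data);
    Some(out)
}

/// Length of the same bytes as a padded Base64 string inside JSON quotes.
fn json_base64_len(data_len: usize) -> usize {
    // Base64 emits 4 characters per 3 input bytes (rounded up); +2 for the quotes.
    4 * ((data_len + 2) / 3) + 2
}
```

For a 32-byte value, the MessagePack form is 34 bytes, while the quoted Base64 string in JSON takes 46 characters.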

Developer tooling

| Holochain Redux | Holochain RSM | Comment |
| --- | --- | --- |
| hc init scaffolds a new DNA directory with a dna.json manifest file and Tryorama test template. | No DNA scaffolding function yet. | |
| hc generate scaffolds a new zome with Cargo tooling and a build script. | No zome scaffolding function yet, but zomes are simple Rust library crates and no longer need a complex build script. | The build script was necessary for optimising the compiled WASM; this is no longer needed. |
| DNAs and zomes have app.json and zome.json files that contain metadata (name, description, UUID, arbitrary properties). | There are no metadata files; DNA metadata is contained in the dna.json manifest and zome metadata is contained in its Cargo.toml file. | |
| hc package builds a DNA manifest and a collection of zomes into a DNA package. | Zomes are compiled into WASM, their build artifacts are placed in a DNA workdir, and dna-util compiles them into a DNA package along with the dna.json manifest. | The dna.json manifest now needs to specify all the included zomes by the path to their WASM file; zomes no longer have a JSON file. See: Sample dna.json manifest. |
| hc test runs the Tryorama test script tests/index.js. | You can run a Tryorama test with npm run test in the folder that contains your test script’s package.json. | There is no scaffolding tool for the Tryorama test script; take a look at a sample. |
| hc run starts up a development conductor with an instance, RPC interface, and UI server. | There is no equivalent yet. | holochain-open-dev/holochain-run-dna is a temporary solution created by the dev ecosystem. |
| Uses Rust nightly. | Uses Rust stable. | Committed to always targeting Rust stable for both Holochain and the HDK. |
| Holoscape can be used for dev diagnostics and user-friendly app management. | Currently no equivalent, but will eventually have two tray apps: one for devs and one for end-users. | |
| Keystore is integrated into Holochain. | Keystore is a separate binary and can be shared by multiple conductors, similar to ssh-agent or Pageant. | |
| hApp bundles (DNAs + UI) can be specified using a bundle.toml manifest file. | Manifest file not yet supported; the conductor admin API is used to install a hApp and its DNAs. | The holochain/holochain-conductor-api repo has fairly self-documenting code for the conductor admin API’s InstallApp function. |

Written by Guillem Córdoba, Tatsuya Sato, and Paul d'Aoust on October 30, 2020