Detailed response to Ruben's blog

Tim Berners-Lee
Date: 2023-01-01, last change: $Date: 2023/01/11 23:17:25 $
Status: personal view only. Editing status: first draft. A response, as an attempt at damage control, to Ruben's blog post, which attacks the Solid architecture's concept that Solid's linked data graphs are the content of web documents.


Here is the full text of the blog, which suffers from so many different problems at different stages that I have to just comment inline. It seems, firstly, just not to understand, or take into account, the Solid architecture, in particular the "client to client" standards which provide interoperability at the Application level (level 7). It seems to have an issue with the pod as being a set of graphs. The only valid issue

Ruben Verborgh

30 December 2022

Let’s talk about pods

A new solution space for apps emerges if we adopt a better model for thinking about Solid.

“Who writes, decides”. That is, the first app to sculpt documents and containers in your pod determines where other apps need to look for data.

No, that's not how Solid works. If the first app to write into your pod is one which has designed this data shape for the first time, and then it documents it as a spec which is then adopted by the community, then that is how it works, but in general, no: apps write according to standards, most of which still need to be written.

Figure: apps above the line, pod servers below the line (the green line and the pink line). The Solid architecture has two layers of key standards: the Client-Server architecture (green) and the Client-Client standards (pink).

Unfortunately, this creates an undesired dependency between apps, which now have to agree amongst each other on how to store things. Yet Solid promises apps that will seamlessly and independently reuse data in order to provide us with better and safer experiences. At the heart of this contradiction is that the mental model we’re using for Solid pods no longer works. This model restricts our solution space and is a main reason why apps struggle to reuse each other’s data. In this blog post, I argue why we should stop thinking of a pod as a set of documents,

The Pod, in RDF terms, is a quadstore, not a triple store. A triple store is not powerful enough. The 4th part of the quad, the ID of the graph, we call a 'Document' to make it match with the way people talk. They might be called Named Graphs or Linked Data Resources but "Documents" is simpler. The fact that the linked data in a pod is basically a set of distinct graphs is really important. (This is one of the four things you attack. The others are the structure of documents in folders, and the lack of triple-level access control).
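To make the quad idea concrete, here is a minimal sketch of a pod as a set of distinct graphs, assuming a toy in-memory store. The class names, the pod URLs, and the data are all hypothetical and are not part of any Solid specification. The point is only that the fourth element records which document makes each statement.

```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # URL of the document (named graph) that holds the statement


class QuadStore:
    """Tiny in-memory quadstore: triples grouped by the document they live in."""

    def __init__(self):
        self._by_graph = defaultdict(set)

    def add(self, subject, predicate, obj, graph):
        self._by_graph[graph].add(Quad(subject, predicate, obj, graph))

    def document(self, graph):
        """Return the triples of one document, i.e. one graph of the pod."""
        return {(q.subject, q.predicate, q.obj) for q in self._by_graph.get(graph, ())}

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching quads from every document, keeping the graph ID."""
        wanted = (subject, predicate, obj)
        return sorted(
            q
            for quads in self._by_graph.values()
            for q in quads
            if all(w is None or w == v for w, v in zip(wanted, q[:3]))
        )


# Hypothetical pod content: two documents make a statement about the same person.
store = QuadStore()
PROFILE = "https://helen.pod.example/profile/card"
IMPORTED = "https://helen.pod.example/imports/old-phone.ttl"
store.add(PROFILE + "#me", "schema:birthDate", "1984-04-03", PROFILE)
store.add(PROFILE + "#me", "schema:birthDate", "1984-03-04", IMPORTED)
store.add(PROFILE + "#me", "schema:name", "Helen", PROFILE)

for quad in store.match(predicate="schema:birthDate"):
    print(quad.obj, "according to", quad.graph)
```

A plain triple store would merge the two conflicting birth dates with no way to tell which document said what. Keeping the graph ID preserves that distinction.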

and start treating it as the hybrid graph it actually is.

(my bold). Ouch. It is one thing to propose a model which is congruent and equivalent to the existing model, but when they are actually different, to then suggest that yours is the one true model is much more destructive.

But talking about a pod as a hybrid graph is useful, if the hybrid graph has the metadata about each document at the top level and, nested within it, blobs of binary like PNG files, and also nested separate graphs of the linked data of each of the data documents, and possibly further nested graphs within N3 documents.

By adjusting our perspective, Solid apps can become more independent of variations in data—and thus more powerful for us.

For several centuries, it appeared self-evident to many that the sun and planets revolved around the earth. Supported by our intuitive notions of sunrise and sunset, basic mathematical theories for this geocentric model could adequately explain and predict planetary movements. But over time, we started observing strange movements that the model couldn’t readily explain: in an apparent retrograde motion, Mars was sometimes seen flying backwards before returning to a more sensible forward path.

When observations contradict your model, there are essentially two things you can do. Either you make the model more complex to account for those observations, or you switch to a new model that can fit them in. Hence, a first group of people attempted to explain Mars’ unusual orbit by adding epicycles to the geocentric model, letting celestial bodies wriggle around in additional circles. A second group abandoned geocentrism altogether, with proponents such as Copernicus and Galilei supporting a heliocentric model that explained the same motions better with far less trickery—by placing the sun in the center. Not only does Occam’s razor favor this simpler theory; heliocentrism was the model that could explain all observations and enable accurate predictions.

A new model does not magically make problems disappear, as many questions about the universe remain unanswered today. But the right model gives us the correct frame of reference to reason about our universe, and to devise new solutions for existing and new problems. Further corrections to the heliocentric model were indeed needed: planets follow an ellipse rather than a circle, and the moon doesn’t orbit the earth but they both orbit a common barycenter. Yet the framework within which those solutions were discovered was essential to their discovery.

Our models of reality determine the shape of our solution space.

In this blog post, I want to challenge the prevalent model we’ve used so far to think about Solid pods.

Philosophically, this is a bogus analogy: you say that Solid's model of reality is wrong, but Solid is not a model of reality. It is a new architecture, a new type of designed space. Solid, if you like, defines a new space with properties we chose. You can't fault Solid for being a bad model of reality. It is Philosophical Engineering, not Physics.

My research team demonstrated that some very common cases cannot be addressed by the current document-centric model. Even worse, we found that—no matter how creative or sophisticated your solution — holding on to the single document hierarchy means that any Solid app you develop will always be bound by certain key limitations. I will explain what those limitations are and how we can more easily solve data challenges within Solid by adopting a different model.

The article is confused about whether it is criticizing the fact that data is stored in documents, or the fact that those documents are in a hierarchy of containers.

Keeping data and apps separated

Here’s the most common question I get asked by people who build Solid use cases:

Where in a pod should I store [this particular piece of] data?

This question might seem peculiar, since they know their use case best, so why are they asking me how to store their data? To find a satisfying answer, we must understand the deeper question that implicitly underlies this particular problem.

The technological promise of Solid is the separation of data from apps.

Yes, this is done with the Solid standards: the client-server Solid Protocol and the client-client standards. The article's core fault is that it completely ignores the client-client standards.

We want to reuse data in any app where it makes sense, regardless of which app first captured that data. This independence unlocks the unprecedented creativity and innovation that Solid is all about: apps don’t need to collect your data (again) before they can make your data work for you. And the way to make our data work, is to make it seamlessly interoperable with different apps.

Indeed. Thus interop comes from standards.

Therefore, the question of where to store data is not about making one app work—it’s about how to make that data work for all other apps. In traditional client–server development, a developer’s task is to make apps run against a specific backend.

Well, currently some non-Solid systems make their own RESTful APIs, but others, like Contacts and Calendars, use standards like CardDAV and CalDAV.

In contrast, today’s Solid developers feel a responsibility for the entire ecosystem, as any decision they make affects other apps. So the question they’re really asking is:

How does the way we write data impact others’ ability to reuse that data?

And that, of course, is a much more complex matter, because we can’t predict how other apps in the future might want to reuse the data we write today.

Except you can, because we develop and publish standards.

Solid is not about making data flow better for one app, but for all current and future apps.

The fact that one app’s decision on how to store data potentially affects all other apps, is a tell-tale sign that data and apps are not really separated yet in today’s Solid landscape.

That is nonsense. The data and the apps are separated: the data goes in the pod. The server which runs the pod does not care what app the user runs. The apps use client-client standards so you can pick whichever app you like at any time to work with the same data.

Because apparently, there is some meaning, some kind of semantics in the location an app chooses to write our data, and that understanding is not shared between different apps.

Here the author is confused, maybe, about a couple of things. In general, the URIs we use for things often end up having words in them like "Profile" or "Public" or "Settings" to make life easier for users and developers. But there is no semantics in those bits of URI. The access allowed is set in the Solid ACL system, and it is that which determines which bits of a pod are public. If a folder is called "Contacts", that is a useful mnemonic but it is NOT any semantics: the software follows links from a person's identity through their pod to find things which are of class AddressBook. So in that sense there is no semantics of the location.

In another sense, though, there is semantics to the location, as the location affects who has what access to it, and hence what it is trusted for. I share with my sports coach (say) my medical test results in a folder which can only be written by the medical facilities I use. I don't share it with anything or anyone else. So if a developer asks you next time where to store some data, good questions to ask include "Where is it coming from (what is its provenance)?" and "Where is it going to (what is its destiny)?". Because the access control systems work around the folder systems, it is useful to keep together things with similar provenance and destiny. But also, of course, things which are part of the same project, same activity, same interaction, and so on.
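The way access control follows the folder structure can be sketched as below. This is a deliberate simplification, loosely modelled on how Web Access Control lets a resource inherit the ACL of its nearest enclosing container. The paths, agents, and the `ACLS` table are hypothetical, and real ACLs are RDF resources rather than Python dictionaries.

```python
from typing import Optional

# Hypothetical ACL table: path -> {agent: set of allowed modes}.
# Only some containers carry their own ACL; everything else inherits.
ACLS = {
    "/": {"owner": {"read", "write", "control"}},
    "/medical/results/": {
        "owner": {"read"},
        "clinic": {"read", "write"},  # provenance: only the clinic writes here
        "coach": {"read"},            # destiny: shared with the sports coach
    },
}


def parent(path: str) -> Optional[str]:
    """Return the enclosing container of a path, or None at the pod root."""
    if path == "/":
        return None
    trimmed = path.rstrip("/")
    return trimmed[: trimmed.rfind("/") + 1]


def effective_acl(path: str) -> dict:
    """Walk up the folder hierarchy until a container with an ACL is found."""
    current: Optional[str] = path
    while current is not None:
        if current in ACLS:
            return ACLS[current]
        current = parent(current)
    return {}


def allowed(agent: str, mode: str, path: str) -> bool:
    return mode in effective_acl(path).get(agent, set())


print(allowed("clinic", "write", "/medical/results/2023-01.ttl"))  # True
print(allowed("coach", "write", "/medical/results/2023-01.ttl"))   # False
```

Because every document placed in the folder picks up the same rules, grouping data by provenance and destiny means the access rules are written once per folder, not once per document.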

This seriously hinders interoperability and serendipitous reuse, which I consider absolutely vital to a thriving personal data ecosystem.

As we’ll discuss next, one approach aims to address issues within the current pod model by making those semantics explicit in a contract.

You mean a client-client standard? Yup that would be good.

However, we’ll argue that, no matter what we try, this model will always break because it insufficiently reflects reality.

This may be the crux of the fallacious argument. The author looks at the existing state of the Solid universe in 2022, when the client-server standards are quite developed but the client-client standards very rudimentary, and assumes that that is the "reality" which must be modelled. It assumes that there will never be those standards in place, rather than helping to put them in place.

In practice on the Internet, standards are made by a mixture of processes.

The adoption of existing domain standards and their mapping into the Solid world
The design of new apps using sound engineering and existing ontologies where they exist
Review by interested parties and consensus

The author suggests that "first app writes whatever it likes" has happened, and that can be a valid part of the second process so long as it is captured in the other parts.

That’s why we’ll discuss a new model, which is better equipped to tackle app interoperability.

A pod as a single document hierarchy

The desirable simplicity of documents

Our mental model for a Solid pod today is a single hierarchical collection of documents. That makes sense, because documents are a common abstraction for human–computer interaction: our own computers offer document-based filesystems, and so do familiar platforms such as Dropbox and Google Drive. And of course, the Web itself started off as a document system on the Internet.

It therefore comes as no surprise that the smallest unit of organization in current Solid pods is a document.

This is a bizarre argument, as the Solid pod is not a model of some reality which can be measured experimentally. It is an engineered system where these things (document model, folder structure, slash semantics) are explicit design choices. They can be reconsidered, but as a complete alternative system.

These are organized recursively in containers, which can contain documents or other containers. Many documents are documents consisting of Linked Data, such that they can point to other documents inside this pod or other pods.

Importantly, each pod has exactly one document organization; it’s the main and only entry point to a pod. The entire pod is thus its document organization and the content of those documents.

For example, Helen’s pod might be a collection of documents that could look like this:

And Jennifer’s pod might be another collection of documents:

Note how Helen and Jennifer both have a Solid pod consisting of containers, documents with Linked Data, and non-RDF documents, organized in their own ways.

The power of Solid is that these different collections of documents are exposed in a standardized way through the Solid Protocol. Therefore, a client uses the exact same steps to retrieve a document from Helen’s or Jennifer’s pod — just like a Web browser uses the same instructions () to retrieve webpages from different websites.

The differences between how Helen and Jennifer organize documents in their own pods, highlight that not everything is standardized. Each Solid app has many degrees of freedom for writing data to pods, as the Solid specifications do not stipulate:

How is data modeled inside of a document?

How is data distributed across documents?

How are documents structured across containers (and pods)?

Not coincidentally, these aspects are exactly what developers address when they define a Web API for their app, as servers and clients must agree on where and how things are stored. Consequently, the Solid Protocol is not a Web API! Rather, it allows and requires each individual app to decide where and how they store Linked Data. And the burden of this requirement is that this decision impacts every potential consumer of that data. And while non-Solid apps consider their Web API a most important and visible contract, Solid apps tend to not document their Web API at all.


Extending interoperability to the data level

This is a core challenge that any model for a pod should address: the authentication and authorization interoperability provided by the protocol must extend to the crucial data level on which Solid wants to achieve independence from apps. Neither the Solid Protocol nor the use of Linked Data are sufficient to achieve this required data-level interoperability. This is why Solid apps aren’t great yet at reusing others’ data, and hence why they can’t fulfill the Solid promise of data and app independence.

"Neither the Solid Protocol nor the use of Linked Data are sufficient" .. exactly... but then add the client-client Solid app-level standards, which this article completely ignores.

Basically, each Solid app is designing its own Web API by creating documents and containers in a way that suits this application. In doing so, the app implicitly puts meaning in the document organization that is not captured anywhere else. This forces Solid apps to resort to classical integration rather than data integration.

Of course, the simplicity of the document-centric pod is hard to resist. Therefore, several community proposals aim to make the unwritten contracts explicit by documenting a pod’s Web API and its underlying semantics within dedicated documents in each pod. Core ideas include shapes to characterize data inside of documents, and descriptions of how to distribute data across documents.

These meta-standard systems, in which the data is not standard but is described in a standard way, are interesting but much more complex than standard systems. This requires rule-level inference from all participating systems, which is not something we should introduce at this stage.

Current proposals such as Type Indexes help clients find their way by linking to pod items, and the current Solid Application Interoperability draft points to the Shape Trees specification to describe, validate, and enforce the specific Web API exposed by a pod. The aim of these proposals is allowing apps to share their data organization with each other while staying within the realm of documents, at the cost of increasingly complex structural descriptions that clients need to maintain faithfully. But even if we could trust all clients with all this bookkeeping, can it ever be enough?

The Type Indexes spec is part of the solution at the high level. It allows a user to track large domain-specific structures like Address Books, Photo Libraries, Medical Data Libraries, Calendars, Recipe Libraries, Music Libraries, and so on. It provides a simple index to start the discovery. Within each data library, further subindexes are very domain-specific: music, for example, is indexed by Artist, Track, Album, Genre; recipes by ingredients, etc. This is described in @@@ref PodStuff.

The effect of this structure, though, is that the indirect indexing of the whole pod is tractable. You have to put in the same sort of effort existing desktop or web apps do to manage photos, fitness, etc., and then add each domain's data into the pod. Then the user has a lot of functionality and the ability to share in all the various domains, also within communities of various forms. (An interesting question is how to register the extra tools one app adds to a completely different app through the Solid "anything can do anything with anything" mantra.)
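As a rough illustration of discovery by following links rather than by guessing folder names, here is a minimal sketch. It assumes a tiny in-memory set of triples. The predicate names follow the style of the Type Index proposal, but the URLs, the registrations, and the helper functions are hypothetical, and the pod's paths are deliberately opaque.

```python
# All URLs and triples below are hypothetical.
ME = "https://helen.pod.example/profile/card#me"
INDEX = "https://helen.pod.example/settings/publicTypeIndex.ttl"
BOOK = "https://helen.pod.example/x7/q2/index.ttl#this"

TRIPLES = {
    (ME, "solid:publicTypeIndex", INDEX),
    (INDEX + "#r1", "solid:forClass", "vcard:AddressBook"),
    (INDEX + "#r1", "solid:instance", BOOK),
    (INDEX + "#r2", "solid:forClass", "schema:MusicPlaylist"),
    (INDEX + "#r2", "solid:instance", "https://helen.pod.example/k3/list.ttl#this"),
}


def objects(subject, predicate):
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def subjects(predicate, obj):
    return sorted(s for s, p, o in TRIPLES if p == predicate and o == obj)


def find_instances(webid, rdf_class):
    """Follow links from the identity to the index, then to registered instances."""
    found = []
    for index in objects(webid, "solid:publicTypeIndex"):
        for registration in subjects("solid:forClass", rdf_class):
            # Only trust registrations that live inside this index document.
            if registration.startswith(index + "#"):
                found.extend(objects(registration, "solid:instance"))
    return found


print(find_instances(ME, "vcard:AddressBook"))
```

The address book is found even though nothing in its URL says "Contacts". The class in the index carries the meaning, not the path.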

One can’t rule them all

The contacts conundrum

No matter how clever or complex interoperability proposals become, we found that document-centric techniques will never be able to address several key scenarios. Current and future proposals share the limitation that the single document-centric is an insufficient model to represent the complexity of many real-world use cases. Solutions within this model will always be bound by its flaws, and they will encounter problems they can’t solve because their solution space is restricted to that same model.

I’ll demonstrate the issue with a very simple example that the document-centric model can’t account for. The idea is that, if even this toy example manages to break the model, then more complex use cases will definitely break it.

Let’s assume Helen wants to store contacts in her pod: her family, friends, co-workers, staff members, ... . There are dozens of ways of organizing these contacts across multiple documents in a pod. Fortunately, she doesn’t need to directly edit those documents, because she chooses Solid apps that do this for her:

She uses an address book app to browse her contacts one by one, to edit people’s details, and to add new contacts. Thanks to the birthday app, she gets a timely reminder when significant events are coming up.

For example, the address book app might store Helen’s contacts like this:


But the birthday app might store Helen’s contacts like this:

Now of course, if those two apps organize things differently, how will they reuse each other’s data? The core idea of Solid is that data and apps are independent, thus that data should flow freely between apps. And neither structure is necessarily more appropriate or obvious than the other: structuring contacts in documents makes as much sense as structuring events in documents. Each view of the world makes perfect sense; they’re just different reflections of the same reality.

Can we tell those apps which structure was chosen for a given pod, by making the pod’s Web API structure explicit? We could create agreements for certain topical domains. Or we could set up Type Indexes to point apps to the contacts, or Shape Trees to describe which of both organizations (or perhaps yet another) Helen has chosen. Unfortunately, that won’t help either app, as their expectations differ too much.

Any solution within the document-centric pod model cannot organize data at a more granular level than documents, because documents are the smallest organizational unit. And even with our trivial contacts example, our two apps have conflicting requirements:

If we choose the address book that creates one document per contact:

The birthday app can access data it’s not supposed to (address, phone number, ...).

Helen cannot share her birthday list within her friend group, because she’d be sharing everyone’s full details.

If we choose the birthday that stores multiple birthdays in one document:

The contacts app would have to mix the details of different contacts in one single document, which would get messy.


Helen would not be able to share individual contact details with her colleagues.

We can imagine workarounds, but none could fully achieve what we want. Copying data means that apps are not really reusing so they will get out of sync. And setting up synchronization processes leads to all kinds of corner cases. For instance, we could create a virtual document on top of the address book that aggregates birthdays from across documents. But that creates a problem when Helen wants to add new contacts via the birthday app, because it cannot just write into the aggregated document view. And things get progressively worse when Helen also wants to store birthdays of people’s partners and children, because where would they be stored?

And recall that contacts were supposed to be the easy example! What if we want to store highly sensitive data, such as medical or financial records? Clearly, you should be able to track your heart rate with an app without it having access to your glucose or hormone levels. Why can’t the current pod model allow for this?

The perfect document box doesn’t exist

These and other problems inevitably occur because the single document organization is flawed in at least two major ways:

1. There’s more than one meaningful way to map reality to a hierarchy.

Since the current pod model only offers a single hierarchy of documents, we’ll have to make arbitrary choices that sooner or later might make life more difficult.

Actually these "arbitrary choices" are the stuff of standards. That's what making a standard involves. Though often in Solid domains there will be other standards to follow.

In fact, in this case, there are common shapes for contacts, derived from the previous VCARD and VCARD/RDF standards. So in fact, the Address Book and Birthdays apps work fine:

Contacts data is in a standard form, and accessed by both apps. — @@
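A minimal sketch of two apps reading the same standard-form contact data follows. It assumes each contact document uses vCard-style properties. The document URLs, names, dates, and both functions are hypothetical, and the sketch only illustrates sharing data through a common shape. It says nothing about access control.

```python
from datetime import date

# Hypothetical contact documents in a shared, vCard-based shape.
CONTACTS = {
    "https://helen.pod.example/contacts/people/a1.ttl": {
        "vcard:fn": "Jennifer Example",
        "vcard:bday": "1984-04-03",
        "vcard:hasTelephone": "tel:+1-555-0100",
    },
    "https://helen.pod.example/contacts/people/a2.ttl": {
        "vcard:fn": "Sam Example",
        "vcard:bday": "1990-11-20",
        "vcard:hasEmail": "mailto:sam@example.org",
    },
}


def address_book_entry(doc_url):
    """Address book app: shows every detail of one contact."""
    return dict(CONTACTS[doc_url])


def upcoming_birthdays(today, within_days):
    """Birthday app: reads only the name and birthday of each contact."""
    upcoming = []
    for card in CONTACTS.values():
        if "vcard:bday" not in card:
            continue
        born = date.fromisoformat(card["vcard:bday"])
        # Leap-day birthdays are not handled in this sketch.
        next_one = born.replace(year=today.year)
        if next_one < today:
            next_one = born.replace(year=today.year + 1)
        if (next_one - today).days <= within_days:
            upcoming.append((card["vcard:fn"], next_one))
    return sorted(upcoming, key=lambda item: item[1])


print(upcoming_birthdays(date(2023, 3, 30), within_days=7))
```

Neither app needs to know which one wrote the data first. Both rely only on the agreed shape of a contact.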

2. The single hierarchical organization tries to capture multiple structures.

Often, these structures have conflicting requirements that prevent them from being combined in the same document hierarchy.

As a result, the term “document” has become heavily overloaded within Solid, because the boundaries of a document capture more aspects than just the data within. So let’s unpack the document box that apps draw around data when defining their Web API:

The document-based grouping of data implicitly assumes that all data in this group shares the same history (how it was created) and destiny (how it’s intended to be used):

The provenance indicates how the data was obtained.

The trust describes for which purposes the data can be relied upon.

The context is a logical grouping for the data items.

The permissions detail who can read and write the data.

The performance is an expectation that the data will often be used together.

The document box inseparably couples these different aspects, which explains why the model so easily breaks down. Whenever there is a mismatch between the actual boxes of aspects, we can’t model the data within the single hierarchy. So if two pieces of data logically belong together (same context), but they were created by multiple people (different provenance), the assumption of the single document box fails.

For example, a single hierarchy can never deliver Helen’s contact management because the permissions boxes conflict between different apps:

The address book app can see all details per individual contact. The birthday app can see one specific detail for many contacts.

This is an inherent limitation of the current model. So no matter where and how we draw the document boxes, one app is always going to conflict with the other. Hence, no singular document-based organization can address the needs of these apps, and similarly not those of similar or more complex cases.

Still, we might feel tempted to try different document organizations. What if we create separate documents for each person, one holding their birthdate and the second holding everything else? But then we’d have a conflict with yet another app, which only needs to see people’s email addresses but not their phone numbers. So we’d shrink the document boxes to smaller and smaller sizes until each one contains a single triple. In other words, as long as pods only have one way to put things in boxes, conflicting requirements will lead to tiny boxes that are unmanageable and inefficient.

No single view can be enough

The core issue with the current pod model is not the documents themselves, but rather that each pod needs to have one base hierarchy of documents. While document-based systems have proven their usefulness, they represent only one view of what is actually a multi-dimensional reality. That’s fine when the data is quite rigid and you only need to support specific cases. Yet Solid has the ambition to tackle use cases spanning many different domains, whose needs are much less predictable.

We’ve seen that aspects such as context and permission can differ so widely between apps, that they need to access the world through different sets of documents. In other words, different Solid apps need different Web APIs to read and write their data. Therefore, any pod model must be able to incorporate such multiple views.

This notion of multiple s per pod in itself is not new. I wrote about it in my first blog post about Solid, and last year I discussed that decentralized clients need to be prepared to interact with multiple Web s. The idea to offer a endpoint per pod dates back to the early days of Solid, and I built a prototype providing authenticated access to pod data through the more lightweight Triple Pattern Fragments () . More recently, Quad Pattern Fragments () were added as a secondary pod to the Enterprise Solid Server for reasons of discovery and performance.

In the context of the document-centric model, these discovery and performance needs can be explained again as mismatched document boxes:

The requirement of discovery indicates that apps are indeed oblivious of the hidden semantics in the pod’s document structure. Rather than relying on client-managed description mechanisms (such as Type Indexes or Shape Trees) to find documents, apps can access data through the to bypass the context boundaries formed by documents. For example, a new app might not know in which container a pod’s Web stores contacts.

The performance concerns indicate that the underlying distribution of data across documents might match the access patterns of one app, but not those of others. For example, instead of reading through dozens of contact documents to extract people’s birthdays, the birthday app could more swiftly retrieve the same selection through the interface that partitions the data differently.

But doesn’t the creation of a show that we can derive arbitrarily structured Web s from a document-centric pod? Not quite—because of a consequential subtlety: the current model still assumes that the pod has one main document hierarchy from which all others are derived:


Because the and other s are derived from this one “true” hierarchy, they still inherit the mismatch of the document boxes in other aspects. The graph component of each quad in the specifically points to the assumed main document it was derived from. The entire thereby unavoidably inherits any context, provenance, and trust mismatches. Crucially, the permissions aspect breaks—and thus so do the address book and birthday apps—as only the pod owner can access the .

These externally observable problems indeed confirm that an underlying pod model with a single document hierarchy was implicitly assumed during the design of those implementations. And the resulting issues are so deeply rooted within this model, that they cannot adequately be resolved within its limited solution space.

A pod as a hybrid graph

A graph as the underlying conceptual model

Having established that real-world use cases require multiple views to enable data-level interoperability, we need to find an appropriate mental model for a pod that makes generating such views easy. Based on the findings above, we identify the following key characteristics for such a model:

The smallest unit of organization is an elementary piece of data.

A piece of data can either be an individual statement, or a blob that is considered indivisible (such as a image or a ).

Each piece of data can have metadata associated with it.

Instead of being a single document hierarchy, our new model considers a pod a hybrid, contextualized knowledge graph. The term “knowledge graph” reflects that the pod is a collection of Linked Data. With “hybrid”, we indicate that both triples and blobs are first-class citizens. (We nonchalantly refer to , , and the likes using the word “blob” because, as discussed above, Solid overloads the term “document”.) Finally, “contextualized” indicates that each piece of data can have extra pieces of metadata associated with it, capturing aspects such as permissions, provenance, trust, etc.

We’re thus conceptually treating the pod as a knowledge graph, which serves as the source of truth when generating views:


Other than with the document-centric model, no single hierarchy is assumed or necessary, so no view is more special than any other. This is because the graph associates each piece of data with the metadata required to construct the views: all semantics are expressed within the data.

And there will be plenty of views, since we eliminate the need to compromise on one view as the lowest common denominator. So make no mistake: document-centric interfaces are here to stay. In particular, expect pods to offer at least a couple of Solid Protocol-based Web APIs. If anything, there will be more document interfaces instead of less. What changes is the role document-centric interfaces play: they are views of an underlying, richer, graph-centric model. Use cases can choose the particular views on the pod’s data that reflect their context and constraints.

Implementing graph-centric pods and views

Most pod implementations today internally use a document-based backend. We might wonder how the graph-centric model can be realized technologically. Basically, we need two things:

1. We need a way to store a hybrid, contextualized knowledge graph.
2. We need a way to describe and generate views.

As far as storage is concerned, bear in mind that the notion of a graph-centric pod is in the first place conceptual. So it does not mean that an actual graph database is strictly needed for a pod, or that graph-centric systems could not be built on top of plain documents. What we are saying is that each pod will have to behave as if it has a graph at its core—and it just so happens that graph databases are especially good at supporting such a model. But by all means, implementations are hidden behind the Solid Protocol, so the internal backend can be anything.

When considering off-the-shelf triplestores or quadstores, however, note that the hybrid and contextualized parts are an inseparable part of the model that needs to be provided. Pod implementers will likely need to create this additional functionality, since blobs and context are typically not first-class citizens. Blob storage might necessitate a dedicated backend; to keep track of context, RDF-star might be a promising candidate.

At first, view generation might work similarly to how it happens today: by manually assigning each triple to a document. The difference in a contextualized knowledge graph is that each triple can be assigned to multiple documents and thus multiple APIs. For example, the statement

<#Helen> schema:birthDate "1984-04-03"^^xsd:date.

could be part of documents in two different Web APIs:

https://jennifer.pod.example/contacts/family/helen.ttl

https://jennifer.pod.example/birthdays/personal.ttl


and thus be available to both the address book app and the birthday app.
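A minimal Python sketch of this manual assignment, under illustrative assumptions: the class name ContextualizedGraph and its methods assign and view are hypothetical, triples are plain string tuples, and the only context kept per triple is the set of documents it belongs to. A real pod would also have to track permissions, provenance, and similar metadata.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# A triple is modelled as a (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


class ContextualizedGraph:
    """Hypothetical in-memory graph where each triple records its documents."""

    def __init__(self) -> None:
        # Maps every stored triple to the set of document URLs it appears in.
        self._documents: Dict[Triple, Set[str]] = defaultdict(set)

    def assign(self, triple: Triple, document_url: str) -> None:
        """Manually assign a triple to one more document."""
        self._documents[triple].add(document_url)

    def view(self, document_url: str) -> List[Triple]:
        """Generate the contents of one document from the graph."""
        return sorted(
            triple
            for triple, documents in self._documents.items()
            if document_url in documents
        )


helen_birthday = ("<#Helen>", "schema:birthDate", '"1984-04-03"^^xsd:date')

graph = ContextualizedGraph()
# The same statement is stored once but assigned to two documents.
graph.assign(helen_birthday, "https://jennifer.pod.example/contacts/family/helen.ttl")
graph.assign(helen_birthday, "https://jennifer.pod.example/birthdays/personal.ttl")
```

Asking for either document with graph.view(...) yields the same statement, without one document being derived from the other.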

Soon, we will want more flexible mechanisms that can automatically route triples to documents within a Web API. We’ll need to create a view definition language and a view processor to instantiate the views. Whereas there are many ways to define views, they will always define some kind of mapping from URL space to the graph. For example, they might take patterns such as

https://{pod}/contacts/{group}/{nickname}.ttl

https://{pod}/birthdays/{context}.ttl

which are then associated with parameterized queries that read and write the right data for documents corresponding to each URL. And these queries can rely not just on the data, but also the associated metadata to make selections.
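As a rough sketch of what such a view processor might do for reads, the Python below turns the two patterns into regular expressions and pairs each with a parameterized query. Everything here is an assumption made for illustration: the helper names (template_to_regex, resolve), the metadata keys (group, nickname, context), and the use of plain Python functions in place of a real view definition language or query engine.

```python
import re

def template_to_regex(template):
    """Compile a pattern with {name} placeholders into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", template)  # literals at even indexes, names at odd
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += "(?P<%s>[^/]+)" % part
    return re.compile("^" + pattern + "$")

# Hypothetical graph content: each statement carries a triple plus metadata.
STATEMENTS = [
    {
        "triple": ("<#Helen>", "schema:birthDate", '"1984-04-03"^^xsd:date'),
        "meta": {"group": "family", "nickname": "helen", "context": "personal"},
    },
]

def contacts_query(statements, params):
    """Select statements about one contact, using metadata only."""
    return [
        s["triple"] for s in statements
        if s["meta"].get("group") == params["group"]
        and s["meta"].get("nickname") == params["nickname"]
    ]

def birthdays_query(statements, params):
    """Select birth dates in one context, using both data and metadata."""
    return [
        s["triple"] for s in statements
        if s["triple"][1] == "schema:birthDate"
        and s["meta"].get("context") == params["context"]
    ]

# View definitions: a URL pattern associated with a parameterized query.
VIEWS = [
    (template_to_regex("https://{pod}/contacts/{group}/{nickname}.ttl"), contacts_query),
    (template_to_regex("https://{pod}/birthdays/{context}.ttl"), birthdays_query),
]

def resolve(url, statements):
    """Return the triples for the first view whose pattern matches the URL."""
    for regex, query in VIEWS:
        match = regex.match(url)
        if match:
            return query(statements, match.groupdict())
    return None  # no view is defined for this URL
```

Both resolve("https://jennifer.pod.example/contacts/family/helen.ttl", STATEMENTS) and resolve("https://jennifer.pod.example/birthdays/personal.ttl", STATEMENTS) select the same statement from the graph. Writes, permissions, and serialization are left out of this sketch.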

View definitions might be published and shared so different apps can reuse them, like they can with data vocabularies and shapes today; views could be organized in browsable repositories. Plus, we don’t have to restrict views to classical documents, since other kinds of APIs can equally be derived from the graph. The important part is that we derive APIs from the data and metadata (as opposed to from other APIs).

A pod by any other name

Transitioning to the graph model

The nature of reality doesn’t change—only our understanding of it evolves. Whatever our method of analysis, the planets and stars will keep on moving the way they do. So we cannot model away our challenges, but we can try to find a framework that lets us address them more easily. The additional calculations required by the geocentric model ultimately became so complex, that society could no longer ignore the elegance of the (perhaps less intuitive) heliocentric model.

Similarly, I believe that the perceived simplicity of the document-centric pod is slowly turning into a complex set of descriptions. It’s not the documents themselves that are the culprit, but rather the single-API illusion that has proven unmaintainable. It simply cannot support the use cases we need it to. So back to our main question:

Where in a pod should I store my app’s data, such that other apps can reuse it?

The answer is that, in the graph-centric pod model, it doesn’t really matter. Given that different apps can have different Web APIs as views on the same underlying graph, the write decisions of one app do not affect those of others. Each app gets a virtualized environment with the appropriate permission boxes, such that they can predictably interact with data. In principle, it’s not even an issue if apps hardcode against a view, since it is virtual and server-generated anyway. Views should thus also eliminate the need for more complex index-based solutions that require client-side bookkeeping.

By now, some of you might be worried: did we get it wrong? After all, the current Solid Protocol describes a document-based interface. Is this now obsolete and should we start standardizing other APIs? Actually, the graph-centric model is 100% compatible with the Solid Protocol. The situation is similar to how HTTP was originally designed to remotely serve files from disk, but the protocol design was so universal that people could use technologies like .Net behind the scenes to serve dynamically generated documents with the exact same technology. Such technologies didn’t need to change the protocol; they just gave a different interpretation to how the existing protocol was used. Analogously, we’re interpreting the Solid Protocol as one view to a pod rather than it being the pod, and nothing in the protocol prevents us from doing so. We’re thus building on top of existing Solid technology rather than suggesting replacements.

However, in my opinion, some things we did get wrong. For example, several default pod templates will create public and private folders in the root in a further conflation of document boxes, specifically context and permissions. Some solutions like Type Indexes continue this conflation. And many apps make assumptions about pod structure that can never generalize to the entire ecosystem—but that’s precisely why we need a more powerful pod model that eliminates unsustainable dependencies on such assumptions.


The way forward

If we are right, the graph-centric model for a pod will have profound implications upon existing client and server software. Specifically, server implementations tied to document-oriented backends will have a difficult time transitioning into a graph model. On the other hand, developing client-side applications should become easier as the impact of their decisions becomes more localized because of virtualized views.

The concept of graph-centric pods is a starting point leading to fundamental questions about decentralized client–server interactions. If pods are graphs, do we still want the symmetry of resource-oriented read/write prescribed by the REST architectural style? Or can we just send messages to a pod’s inbox, given that the server generates the views anyway? And for whom are we providing arbitrary document-based Web APIs if clients will execute queries to reconstruct a local knowledge graph? Or will document-centric APIs always remain important for the user and developer experience?

Speaking of experiences, every Solid app so far has assumed that the underlying pod is a document hierarchy. This is directly reflected in most current UIs, which are either file browsers like macOS Finder, or more specialized versions thereof for a specific use case. But user interfaces for pods can become vastly different if pods are seen as graphs instead of document hierarchies. This means that users could have different windows into their pods; and perhaps they’ll need such windows to adequately express permissions in a way that the document-centric model does not account for. Clearly, these points are just the beginning of possible discussions on this topic.

Everything starts with adopting the appropriate conceptual model to think about pods. We’ve seen why the single document hierarchy is already causing breakage in real-world use cases today. Understanding the pod as a graph is key to realizing Solid’s promised independence of data and apps; holding on to the current model’s limited solution space would prevent the entire ecosystem from moving forward. Eppur si muove.

—Ruben

Thanks to these wonderful people for discussions that directly inspired this blog post: my co-authors of What’s in a pod?, Tim Berners-Lee, John Bruce, and Emmet Townsend from Inrupt, and Tom Haegemans and Wouter Termont from Digita.