
Rethinking Data Ownership

AI introduces yet another argument for the benefits of owning our data through decentralized protocols.

Each year, as the holiday season approaches, the rush of text messages commences with friends soliciting updated addresses to minimize the number of return-to-sender envelopes they get from their annual Christmas card blitz. When you're in your late 20s and 30s, a similar ritual occurs as wedding invites are sent to friends and family worldwide. This time-consuming process is annoying and also fraught with the possibility of errors and outdated information, and somehow, a solution seems out of reach.

Each time you move, or your details change, the onus is on you to update every individual or service that might need your new information. With each platform (e.g., social media, professional networks, personal contacts) potentially holding a different version of your data, there’s a significant risk of inconsistency. What if your address is updated on Amazon but not with the IRS?

Just today, I learned:

  1. I am subscribed to HP Instant Ink.

  2. Refill deliveries have been shipping to my old address for the past three years. The new homeowner now apparently sits on a big supply of authentic HP cartridges while my pages come out blank.

We could seek a centralized solution to this problem. But if we depend on centralized counterparties like Google or Facebook, those entities control your data and can exploit it for commercial gain. Moreover, the centralized nature of such databases makes them prime targets for data breaches, as has happened repeatedly.

Finally, in the current setup, individuals have limited control over who can access their personal information and how it is shared between platforms. This lack of control complicates data management and affects personal security and privacy.

The simple task of letting people and companies know where you live is not so simple anymore. This dynamic is just one example of the challenges of the current model of personal data management and centralized data systems.

Pivot to Blockchain

In a decentralized data management system, individuals become the central hub of their personal data. Imagine your core information—name, address, and other key details—securely stored on a public, domain-specific blockchain. This 'digital vault' is under your control, with you deciding who can access your data and when.

Instead of repeatedly inputting your information for each service, you maintain it once in your digital vault. When a service or individual needs your data, it requests access from you, and you can grant this securely and transparently. Because everyone reads from the same source, what they see stays current whenever you update it. Think of it as a universal 'sign-in,' where your data is always up-to-date and consistent across services.
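
To make the request-and-grant flow concrete, here is a minimal in-memory sketch. The `Vault` class, its method names, and the `printer-ink-service` requester are all hypothetical, and an ordinary Python object stands in for the blockchain, so this only illustrates the access pattern, not a real protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Vault:
    """Hypothetical user-controlled store: consumers read through grants."""
    owner: str
    _fields: dict = field(default_factory=dict)
    _grants: dict = field(default_factory=dict)  # requester -> set of field names

    def update(self, name: str, value: str) -> None:
        # The owner edits the single source of truth once.
        self._fields[name] = value

    def grant(self, requester: str, *names: str) -> None:
        self._grants.setdefault(requester, set()).update(names)

    def revoke(self, requester: str) -> None:
        self._grants.pop(requester, None)

    def read(self, requester: str, name: str) -> str:
        # Every read is checked against the owner's current grants.
        if name not in self._grants.get(requester, set()):
            raise PermissionError(f"{requester} has no grant for {name!r}")
        return self._fields[name]


vault = Vault(owner="alice")
vault.update("mailing_address", "12 Old Street")
vault.grant("printer-ink-service", "mailing_address")

vault.update("mailing_address", "98 New Avenue")  # one update after a move
current = vault.read("printer-ink-service", "mailing_address")
```

Because the service reads at delivery time instead of keeping its own copy, the single update after the move is the only one needed.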

This shift transforms your interaction with digital services. Instead of constantly feeding data into disparate systems, you manage your information centrally, and the systems pull from your controlled source as needed. It’s a world where data management is seamless, secure, and user-centered, returning control to individuals and paving the way for innovative, efficient digital services.

A new set of data protocols

This new model can take a few forms. The one I hypothesize is that many of the counterparties we rely on today will be replaced by decentralized data protocols, each serving a specific use case and built on top of a decentralized data store. I am eagerly watching to see who steps in to build the following solutions as examples of store once, populate anywhere:

  • Identity Protocol: Today, crucial identity data has two forms: basic contact details and critical identifying information. Companies like Google and Facebook currently manage these data types. These two kinds of information would be stored on a secure blockchain in a decentralized model. For instance, you could share your actual contact information, like your mailing address, to receive packages or holiday cards without unnecessarily exposing other data.

    Businesses and services that need to verify your identity could request access to specific pieces of your critical identifying information. You hold the power to grant or deny these requests. This ensures that your data is shared only as needed, giving you control over its use. Farcaster and Lens are notable entrants in this game, as is the long-running protocol Civic.

  • Work Protocol: Professionals often rely on LinkedIn to showcase their work history, but a simple list of job titles fails to represent their full range of accomplishments. Instead, imagine a world where all of your career achievements - products launched, metrics improved, deals closed - are not manually input but tracked and verified in real-time. Your professional identity would extend beyond just job titles and be accessible to those you permit, while securely stored under your control. Grappa is among the teams building in this area.

  • Credit Protocol: In the future, consolidating all your financial data on a public blockchain could help you manage your financial reputation better. Decentralized finance (DeFi) offers a preview of this world. But it becomes truly useful when you have complete control over your financial history, net worth, and other details that you can share with counterparties without compromising privacy. Lenders could verify your creditworthiness from the data you grant them access to, and loan covenants could be checked automatically against it.

  • Health Protocol: Healthcare is another area ripe for transformation. Epic’s dominance in electronic medical records (EMRs) means that our health data is centralized and often difficult to access. Apple Health is the closest approximation of the model here, as you maintain a central store of health data, encrypted on your phone. Now imagine this not being confined to your Apple device.

    Apps and wearables directly write to your health data, linking with your test history and other records. If you switch doctors or need to see a specialist, you can grant them access to your medical records without the cumbersome process of transferring files. This ensures that your healthcare providers have up-to-date information while you maintain control over your sensitive data.
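
One way to picture the selective sharing described under the Identity Protocol (a mailing address without the rest of the record) is salted hash commitments. This is a sketch under assumptions: the function names are hypothetical, the published digest stands in for whatever would actually be written on-chain, and none of the projects named above are claimed to work this way.

```python
import hashlib
import json
import secrets


def _leaf(key: str, value: str, salt: str) -> str:
    # Salting stops a verifier from guessing hidden values by brute force.
    return hashlib.sha256(json.dumps([salt, key, value]).encode()).hexdigest()


def _root(leaves: dict) -> str:
    return hashlib.sha256(json.dumps(leaves, sort_keys=True).encode()).hexdigest()


def commit(record: dict) -> tuple:
    """Return a public digest plus the private salts and per-field hashes."""
    salts = {k: secrets.token_hex(16) for k in record}
    leaves = {k: _leaf(k, v, salts[k]) for k, v in record.items()}
    return _root(leaves), {"salts": salts, "leaves": leaves}


def disclose(record: dict, private: dict, keys: list) -> dict:
    """Reveal chosen fields (value and salt) and only the hashes of the rest."""
    revealed = {k: (record[k], private["salts"][k]) for k in keys}
    hidden = {k: h for k, h in private["leaves"].items() if k not in keys}
    return {"revealed": revealed, "hidden": hidden}


def verify(root: str, disclosure: dict) -> bool:
    """Recompute the digest from revealed values and hidden hashes."""
    leaves = dict(disclosure["hidden"])
    for k, (value, salt) in disclosure["revealed"].items():
        leaves[k] = _leaf(k, value, salt)
    return _root(leaves) == root


record = {
    "name": "Alice Example",
    "mailing_address": "98 New Avenue",
    "date_of_birth": "1990-01-01",
}
root, private = commit(record)
card = disclose(record, private, ["mailing_address"])
ok = verify(root, card)
```

The recipient learns the address and can check it against the published digest, but sees only opaque hashes for the other fields. Note that field names remain visible in this simple version.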

Public but Private

The capabilities of decentralized, secure data management are amplified with Fully Homomorphic Encryption (FHE) and Zero-Knowledge Proofs (ZKP).

FHE lets a service perform operations on data while it stays encrypted, so the service can compute with your data without ever seeing it in the clear. For instance, financial services could assess your creditworthiness or healthcare providers could make diagnoses without knowing the specifics, keeping your data private and secure.
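
Full FHE schemes are too involved for a short example, so the sketch below swaps in the Paillier cryptosystem, which is only additively homomorphic: it supports adding encrypted numbers and scaling them by a known constant, not arbitrary computation. The key uses two small known primes and the function names are my own, so treat it as a toy that shows the idea of computing on ciphertexts, not as usable cryptography.

```python
import math
import secrets

# Toy key built from two known Mersenne primes; real keys use large random primes.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # modular inverse (Python 3.8+)


def encrypt(m: int) -> int:
    """Paillier encryption of 0 <= m < N with generator g = N + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    return ((pow(c, PHI, N2) - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % N2


def scale_encrypted(c: int, k: int) -> int:
    # Raising a ciphertext to the power k multiplies the plaintext by k.
    return pow(c, k, N2)


# A service sums salary and bonus, and annualizes a monthly figure,
# while seeing only ciphertexts.
enc_total = add_encrypted(encrypt(85_000), encrypt(12_000))
enc_annual = scale_encrypted(encrypt(7_000), 12)

# Only the key holder can read the results.
total = decrypt(enc_total)
annual = decrypt(enc_annual)
```

The service doing the arithmetic never holds `PHI` or `MU`, so it cannot read the inputs or the outputs.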

ZKPs allow one party to prove they know a value without sharing it. They can be used to verify data without revealing it, adding an extra layer of privacy and security. For example, a lender could verify your income is above a threshold, or a doctor could confirm your vaccination status without knowing the specifics of your medical history.
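
The first sentence above, proving you know a value without sharing it, is what a Schnorr proof does. Below is a toy non-interactive version using the Fiat-Shamir heuristic. The group parameters are deliberately tiny and the function names are hypothetical. Proving that an income exceeds a threshold needs a more elaborate construction (a range proof), which this sketch does not attempt.

```python
import hashlib
import secrets

# Toy group: GROUP_P = 2 * GROUP_Q + 1, both prime; GROUP_G has order GROUP_Q.
GROUP_P, GROUP_Q, GROUP_G = 2039, 1019, 4


def challenge(y: int, t: int) -> int:
    # Fiat-Shamir: derive the challenge by hashing the public values.
    data = f"{GROUP_G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % GROUP_Q


def prove(x: int) -> tuple:
    """Prove knowledge of x such that y = G^x mod P, without revealing x."""
    y = pow(GROUP_G, x, GROUP_P)
    k = secrets.randbelow(GROUP_Q - 1) + 1  # one-time random nonce
    t = pow(GROUP_G, k, GROUP_P)
    s = (k + challenge(y, t) * x) % GROUP_Q
    return y, t, s


def verify_proof(y: int, t: int, s: int) -> bool:
    # Accept if G^s == t * y^c (mod P); x itself never appears.
    c = challenge(y, t)
    return pow(GROUP_G, s, GROUP_P) == (t * pow(y, c, GROUP_P)) % GROUP_P


secret = secrets.randbelow(GROUP_Q - 1) + 1
y, t, s = prove(secret)
valid = verify_proof(y, t, s)
forged = verify_proof(y, t, (s + 1) % GROUP_Q)
```

The verifier only ever sees `y`, `t`, and `s`. With a group this small the secret could be found by brute force, which is why real systems use much larger parameters.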

Example: Consider applying for a loan. Today, this process is brutal.

You manually upload dozens of files (many of which the lender already has), and any time a new loan is requested, the profile needs to be rebuilt from scratch. Instead of sharing your income and credit history the old-fashioned way, you could provide encrypted data stored across all the previously mentioned protocols. The lender could perform calculations to determine your eligibility without seeing your information. If further proof is needed, such as verifying your income, you could provide a ZKP, confirming your income meets requirements, but no other details.

These technologies reduce data breach risks, streamline processes, and maintain privacy. Combined with decentralized storage, FHE and ZKPs give individuals control over their data and enable powerful new services and user experience gains.

Owning Our Memory

The next big data category is beginning to manifest itself as we speak—our personal data for AI. Over the next year, we will see the introduction of more memory-capable LLMs, where the record of our interactions and the data we can supply them becomes increasingly powerful in driving a personalized and useful user experience.

Examples are emerging to build these personal contexts, like Limitless, where a small wearable clip passively records your life and creates a complete memory store of your conversations and interactions for superpowered recall. There’s really exciting potential in this model of conversational personal memory, even if the form factor here can be debated.

Today, AI systems rely on centralized vector databases to store and manage interactions. These databases, operated by companies like OpenAI, Weaviate, Pinecone, and major cloud providers, limit user control and interoperability unless you operate at an advanced technical level. For most users, their AI data is confined within these systems, making it difficult to integrate with other services or platforms. This lack of control and flexibility often leads to inefficiencies, such as the "cold start" issue when transitioning between different AI tools. While some companies, like LangChain, are working on interoperability solutions, they still largely depend on traditional Web2 architectures and self-hosted data storage.

Decentralized AI Data Management

In the decentralized model we've discussed, your AI data would be stored on a public blockchain. This means you own your historical context data and the memory that informs your AI interactions, allowing you to control its use. Here are some practical examples of how this could revolutionize everyday technologies:

Seamlessly Transition Between Assistants: Imagine switching from ChatGPT to Claude while preserving your full historical context. In a decentralized AI data management system, the new service can instantly access your historical interaction data stored on a blockchain, including past commands, music preferences, reminders, and frequently asked questions. This lets the new assistant offer personalized responses immediately, without the usual learning curve.
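
As a rough illustration of portable memory, the sketch below keeps interaction history in a provider-neutral JSON list that a second assistant can load and query. The record schema, the function names, and the `assistant-a` and `assistant-b` labels are assumptions, and simple word overlap stands in for the embedding search a real system would use.

```python
import json
import re


def tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def recall(memories: list, query: str, k: int = 2) -> list:
    """Return up to k memories sharing the most words with the query."""
    q = tokens(query)
    scored = [(len(q & tokens(m["text"])), m) for m in memories]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]


def build_prompt(assistant: str, memories: list, query: str) -> str:
    # Any provider can prepend the recalled context to the user's request.
    context = "\n".join(f"- {m['text']}" for m in recall(memories, query))
    return f"[{assistant}] Known about user:\n{context}\nUser: {query}"


# History written while using one assistant, exported as plain JSON.
exported = json.dumps([
    {"source": "assistant-a", "text": "User is vegetarian and allergic to peanuts"},
    {"source": "assistant-a", "text": "User prefers metric units in recipes"},
    {"source": "assistant-a", "text": "User's printer is an HP model"},
])

# A different assistant loads the same history on day one.
memories = json.loads(exported)
prompt = build_prompt("assistant-b", memories, "Suggest vegetarian recipes for dinner")
```

The second assistant starts with the dietary and unit preferences in its context and leaves out the unrelated printer note.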

Enhanced Recommendation Systems: Consider AI in streaming services like Netflix or Spotify. When you switch to a new platform, it typically doesn't know your preferences and must learn your tastes over time. With decentralized AI data management, your viewing and listening history could be stored on a blockchain and shared with new service providers. A new streaming service could immediately access your watch history and start recommending content you are likely to enjoy, bypassing the "cold start" phase and providing a personalized experience from day one.
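
The difference between a cold start and a start with imported history can be shown with a very small ranking function. The catalog, the history records, and the scoring rule (genre counts with popularity as a tiebreaker) are invented for illustration and are not how any particular streaming service works.

```python
from collections import Counter


def recommend(catalog: list, history: list, k: int = 2) -> list:
    """Rank unseen titles by genre affinity; with no history, use popularity."""
    if not history:
        ranked = sorted(catalog, key=lambda item: item["popularity"], reverse=True)
        return [item["title"] for item in ranked[:k]]
    affinity = Counter(genre for item in history for genre in item["genres"])
    seen = {item["title"] for item in history}
    unseen = [item for item in catalog if item["title"] not in seen]
    ranked = sorted(
        unseen,
        key=lambda item: (sum(affinity[g] for g in item["genres"]), item["popularity"]),
        reverse=True,
    )
    return [item["title"] for item in ranked[:k]]


catalog = [
    {"title": "Space Docs", "genres": ["documentary", "science"], "popularity": 40},
    {"title": "Reality Island", "genres": ["reality"], "popularity": 95},
    {"title": "Quantum Heist", "genres": ["science", "thriller"], "popularity": 60},
    {"title": "Baking Duel", "genres": ["reality", "food"], "popularity": 80},
]

# History the user brings along from a previous service.
imported_history = [
    {"title": "Cosmos Revisited", "genres": ["documentary", "science"]},
    {"title": "Deep Sea", "genres": ["documentary"]},
]

cold = recommend(catalog, [])
warm = recommend(catalog, imported_history)
```

Without history the service can only push what is popular. With the imported history it surfaces the documentary and science titles first.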

Educational and Learning Platforms: When users move between learning platforms, they often lose valuable data about their learning pace, strengths, and preferences. If educational data were managed in a decentralized system, a learner's history, including completed courses, quiz scores, and areas of interest, could be securely stored and shared with new platforms. This would allow educational AI systems to adapt to the learner's level and preferences right away, providing a tailored educational experience that builds on past progress without redundancy.

Achieving this today typically requires self-hosting your data, running open-source LLMs, and manually integrating the two into your models and services. In the decentralized memory model, the blockchain provides a permanent, always-on storage solution that is secure and accessible from anywhere. This ensures that your AI data is always available without the hassle of managing infrastructure. New products and services can immediately offer fully customized user experiences by leveraging your complete personal context, spanning data from traditional services and the growing set of personalized memory and interaction data you build daily. With control over your data, the fear of giving it up to an AI diminishes, as the models can access knowledge without possessing it.

Conclusion

The transition to decentralized, secure data management on public blockchains promises to revolutionize the way we handle personal data. Individuals can regain control over their personal information by leveraging blockchain technology and zero-knowledge proofs, ensuring privacy, security, and interoperability.

This vision encompasses various domains, from core identity information to financial reputation, medical records, and personal AI data. By moving away from centralized custodianship, we can create a digital landscape where individuals truly own and manage their data. This enhances privacy and security and fosters innovation and efficiency in how services interact with personal data.

The rapid advancements in AI have made it more crucial and impactful to change the paradigm of data ownership. This might be the perfect time to try and succeed with this model, even though previous attempts may have been too early.

Subscribe to Factor Capital Blog and never miss a post.
#ai#blockchain