How ChainML Aims to Use Web 3 Technology to Democratize AI
Artificial Intelligence (“AI”), Machine Learning (“ML”), and Large Language Models (“LLMs”) are taking the tech world by storm. OpenAI’s LLM-powered chatbot, ChatGPT, was released on November 30, 2022, and it crossed 1 million users less than a week after its release. Per the research firm Similarweb Ltd., ChatGPT had already reached 100 million users by January 2023.
The growth of LLMs has turned compute into a precious commodity because both ML training and inference require a massive amount of compute. The computational complexity of AI systems is doubling every three months. Scarcity of affordable computing resources threatens to bottleneck the advancement and innovation of AI, and it could cause the AI space to become dominated by the few large players who can afford the necessary expensive hardware at scale. Blockchain technology is well positioned to solve this challenge because it enables the rental of idle compute power around the globe on a decentralized marketplace. We believe this represents one of the most exciting use cases for Web3. As part of our series on projects that are capitalizing on the synergies between Web3 and AI, Ruceto reached out to David Müller, co-founder and product lead of ChainML, for an exclusive look at what they are building.
ChainML was founded with the goal of keeping AI decentralized and accelerating the adoption of AI across industries. Per ChainML, “As a team, we believe that the Web3 ethos of decentralization is key to increase the democratization of AI at a time of rapid technological advancement and societal impact. By embracing the principles of decentralization, we aim to empower individuals and communities worldwide, granting them greater access and control over AI technologies. This democratization will not only foster a more inclusive AI landscape but also encourage diverse perspectives and contributions, leading to the development of AI systems that better reflect the needs and values of the global population.” ChainML’s team has over 75 years of combined experience working in AI, along with deep experience in Web3. ChainML’s CEO and co-founder, Ron Bodkin, has advised Consensys since 2018 and advised Space and Time before starting ChainML. David Müller has been involved with the blockchain space since 2017.
The initial focus of ChainML is the inference space for LLMs. In this context, inference refers to an LLM’s ability to generate a response based on a user’s input. The company is currently working on two initiatives: Council and the ChainML Protocol. Council is a platform for developing AI applications that use LLMs and for seamlessly making them available for inference. The ChainML Protocol supplies compute to Council as well as to anyone else who wants to run AI compute (or similar data-intensive types of compute) on the ChainML Protocol.
Council: Helping maintain the accuracy of LLM outputs
LLMs like GPT-4, Llama 2, and Claude 2 can make the jobs of knowledge workers much easier and more efficient, but they also run the risk of hallucinating and compounding errors. ChainML’s core offering, Council, was built to address this risk without sacrificing performance for safety.
Council is an open-source platform for the rapid development, deployment, and oversight of customized generative AI applications using teams of agents. An agent is a program that perceives its environment and performs autonomous actions, supporting activities such as customer service, self-service analytics, code generation, and market research. ChainML observed that LLMs perform better as a collaborating team of agents with complementary skills than as a single generalist agent. Use cases can include an organized team of agents with one manager and multiple specialized subordinates that perform parallel tasks. Agent teams can do anything from performing research and fact checking for a report to automating analysis and debugging, or even creating music with each agent playing a different instrument.
Agents deployed on Council have limited autonomy: they act within a budget and under human oversight to protect against the codification of harmful biases or behaviors that defy common sense. Because agents interact through natural language, their exchanges can be reviewed and refined via automated or human evaluations.
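The manager-and-specialists pattern described above can be sketched in a few lines of Python. This is an illustration only and does not use Council's actual API: the `Budget`, `Manager`, `researcher`, and `fact_checker` names are hypothetical, the specialists are stubs standing in for LLM calls, and the budget is assumed to be a simple cap on the number of agent calls.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Budget:
    """Hypothetical cap on how many agent calls a team may make."""
    max_calls: int
    used: int = 0

    def spend(self, calls: int) -> bool:
        """Reserve `calls` units; return False if that would exceed the cap."""
        if self.used + calls > self.max_calls:
            return False
        self.used += calls
        return True


# A specialist maps a task string to a text answer.
# In a real system each would wrap an LLM call; here they are stubs.
Specialist = Callable[[str], str]


def researcher(task: str) -> str:
    return f"research notes on: {task}"


def fact_checker(task: str) -> str:
    return f"claims verified for: {task}"


class Manager:
    """Fans a task out to specialists in parallel and keeps a transcript."""

    def __init__(self, team: Dict[str, Specialist], budget: Budget) -> None:
        self.team = team
        self.budget = budget
        self.transcript: List[str] = []  # natural-language log for later review

    def run(self, task: str) -> Dict[str, str]:
        # Refuse the whole task if the team cannot afford one call per agent.
        if not self.budget.spend(len(self.team)):
            self.transcript.append(f"REFUSED (over budget): {task}")
            return {}
        names = list(self.team)
        with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
            answers = list(pool.map(lambda name: self.team[name](task), names))
        results = dict(zip(names, answers))
        for name, answer in results.items():
            self.transcript.append(f"{name}: {answer}")
        return results


manager = Manager(
    {"researcher": researcher, "fact_checker": fact_checker},
    Budget(max_calls=3),
)
first = manager.run("DeFi risk report")    # uses 2 of the 3 budgeted calls
second = manager.run("follow-up report")   # refused: it would need 4 calls in total
```

Because every exchange lands in `manager.transcript` as plain text, a human reviewer or an automated evaluator can inspect what each agent said, which is the kind of oversight the paragraph above describes.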
Council will be used by customers in Web2 and Web3, and it is not necessary for users to be well versed in Web3. ChainML is seeing significant initial interest in Council from Web3 builders who want to leverage AI to improve the usability and functionality of DeFi, Web3 infrastructure, and gaming projects. Per ChainML, “One area of great interest is simplified analytics – such as analyzing Blockchain data with Space and Time, analyzing trades and predicting risk in DeFi, security events, and modeling financial data sets. Another area of great interest is technical support – generating integration code and using recent code, documentation and support information to simplify integration and resolving technical issues.” ChainML is also seeing initial demand from the Web2 space, including SaaS businesses, hardware companies, and other digital natives that seek to use LLMs to power activities such as self-service analytics and technical support.
Per ChainML, Council “will natively integrate with the ChainML Protocol [discussed below] to enable easy and robust deployment and monitoring of generative AI models and ensure they can be operated with confidence and accuracy.”
ChainML Protocol: Providing the compute to power Council and more
The ChainML Protocol enables the owners of idle hardware around the world to be compensated for renting out compute power to compute users. Demand for compute on the ChainML Protocol will not be hamstrung by any sort of Web3 learning curve because ChainML’s customers will be able to pay in fiat without connecting crypto wallets. Only compute providers offering capacity on the ChainML network will need to understand blockchain technology. Compute providers will include specialized GPU cloud providers and organizations that currently run mining operations, node operations, storage services, and data centers. Depending on application requirements, users can select compute providers on the compute marketplace ranging from anonymous participants to participants who have provided some degree of proof of identity.
There are a number of Web2 compute providers, but ChainML chose to leverage blockchain technology because, “Blockchains enable transparency and auditability of compute providers and requesters as a public store of record. Blockchain enabled tokenomics create incentives for compute providers to participate in the marketplace while staking helps secure the network ensuring its integrity and reliability.”
There are also several general compute marketplaces in Web3 that are understandably working to include the AI and ML use cases in their existing offering, but not all of them have much actual experience in the AI space like ChainML. Per the company, “We believe we best understand the needs of AI practitioners and the intricacies of running AI models in production. These include dedicated management, monitoring and quality management tools purpose built for AI, efficient handling of mission-critical infrastructure resources (such as GPUs) and a focus on performance at minimum latency.”
Just as important as creating a permissionless and decentralized compute distribution method is the ability to effectively verify off-chain work. Per ChainML, the most frequently used means of verifying the work will be an optimistic protocol. Under this mechanism, requests are routed to one validator, who then delegates the request to a single reservation that is responsible for executing the request end-to-end. The initial input of the request, the complete input, and the output response are all signed by the compute node and the validator, and the signed values are all returned to the caller to be recorded on chain. The caller has the necessary information to complete fraud proofs offline and submit a dispute if warranted.
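A rough Python sketch of this optimistic flow follows. It is not ChainML's implementation, and several choices are assumptions made for illustration: the `Receipt` structure and all function names are hypothetical, inputs are recorded as SHA-256 hashes, the service is assumed to be deterministic so that re-running it is a meaningful check, and HMAC-SHA256 stands in for a real digital signature because the Python standard library has no asymmetric signing. With a real signature scheme the caller would verify with public keys instead of holding the signers' secrets.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Callable


def digest(value: str) -> str:
    """SHA-256 hex digest of a string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def sign(key: bytes, message: str) -> str:
    """HMAC-SHA256 stand-in for a real digital signature (assumption)."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def payload(request_hash: str, full_input_hash: str, output: str) -> str:
    """Canonical string that both the compute node and the validator sign."""
    return json.dumps([request_hash, full_input_hash, output])


@dataclass(frozen=True)
class Receipt:
    request_hash: str      # hash of the caller's initial input
    full_input_hash: str   # hash of the complete input given to the service
    output: str            # response returned to the caller
    node_sig: str
    validator_sig: str


def execute(request: str, full_input: str, service: Callable[[str], str],
            node_key: bytes, validator_key: bytes) -> Receipt:
    """Run the service once and return a receipt signed by both parties."""
    output = service(full_input)
    body = payload(digest(request), digest(full_input), output)
    return Receipt(digest(request), digest(full_input), output,
                   sign(node_key, body), sign(validator_key, body))


def signatures_valid(receipt: Receipt, node_key: bytes,
                     validator_key: bytes) -> bool:
    """Check that both signatures cover exactly the values in the receipt."""
    body = payload(receipt.request_hash, receipt.full_input_hash, receipt.output)
    return (hmac.compare_digest(receipt.node_sig, sign(node_key, body))
            and hmac.compare_digest(receipt.validator_sig, sign(validator_key, body)))


def should_dispute(receipt: Receipt, full_input: str,
                   reference_service: Callable[[str], str]) -> bool:
    """Offline fraud check: re-run the work and compare with the signed output."""
    if digest(full_input) != receipt.full_input_hash:
        return True
    return reference_service(full_input) != receipt.output


def honest(text: str) -> str:
    return text.upper()


def lazy(text: str) -> str:
    return ""  # a provider that skips the work it sold


NODE_KEY, VALIDATOR_KEY = b"node-secret", b"validator-secret"
good = execute("summarize", "summarize: hello", honest, NODE_KEY, VALIDATOR_KEY)
bad = execute("summarize", "summarize: hello", lazy, NODE_KEY, VALIDATOR_KEY)

good_ok = signatures_valid(good, NODE_KEY, VALIDATOR_KEY)
dispute_good = should_dispute(good, "summarize: hello", honest)
dispute_bad = should_dispute(bad, "summarize: hello", honest)
```

The point of the signatures is accountability: the lazy provider's receipt still verifies, so the wrong output is provably attributable to the node and validator that signed it when the caller raises a dispute.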
The above-described optimistic flow is appropriate for cases where there is insignificant economic incentive to tamper with results, such as using ChainML’s protocol to power language applications like summarization or documentation lookup. For cases like these, ChainML explained, “a dishonest node operator that sought to simply avoid providing the compute they have sold would not be able to produce reasonable results that match expectations of a user and as the issue was investigated, it would be nearly certain that the tampering would be discovered. Given the risk of losing stake, reputation and possible legal consequences, it will be extremely unlikely that a provider would tamper with results in this manner.”
For higher-security use cases, ChainML will also support a pessimistic consensus protocol. Per ChainML, “in the simplest case, validation nodes will agree upon the reading of input data, select among reserved capacity for computing a service, pass the input data to compute nodes that are reserved, and return the correct compute results.” The protocol must be able to 1) execute Proof of Data to prove the validity of data read by services in the protocol; 2) execute Proof of Inference to prove the validity of computation used to calculate service outputs; and 3) confirm the validity of Zero Knowledge proofs. Users can choose to have services executed pessimistically or optimistically.
ChainML’s pessimistic consensus mechanism will include dynamic committees among validator nodes that will execute a Byzantine fault tolerant (BFT) consensus protocol based on HotStuff. This essentially means that consensus can still be reached even if some of the participating nodes are faulty or dishonest. Nodes must stake a minimum value of tokens to participate as a validator in the consensus. As an example, the consensus algorithm for computing an ML service is as follows:
| Step | Consensus Algorithm Activity |
|---|---|
| 1 | Validators form consensus on which compute nodes will compute the service call. |
| 2 | Data will be read and validated with consensus by proof of data. |
| 3 | Validated data will be passed back to the selected compute nodes from step 1 with the service request. |
| 4 | Compute nodes return the computed result by gossiping a commitment to the response to all the validator nodes in the committee, and sign it using their private key to verify that they computed it. When results are completed or timed out, they gossip a key to retrieve the committed value. |
| 5 | Validation nodes form consensus on the returned result with proof of inference. |
| 6 | The consensus protocol returns the service response which includes the result, a BLS signature for all participating validators, signed hashed data used as inputs to computation, signed results from the compute nodes, and identified problems. This data will be used to update reputations and for any disputes. |
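Steps 4 and 5 can be illustrated with a small Python sketch. ChainML has not published the details used here, so treat all of it as assumption: the commitment is modeled as a salted SHA-256 hash, with a random nonce playing the role of the "key" that later opens it, and the agreement threshold uses the standard BFT bound for HotStuff-style protocols, where a committee of n validators tolerates f faulty members as long as n >= 3f + 1. All function names are hypothetical. A commit-then-reveal step of this kind presumably keeps a compute node from simply copying another node's answer.

```python
import hashlib
import secrets
from collections import Counter
from typing import List, Optional


def commit(response: str, nonce: bytes) -> str:
    """Hash commitment: binds a node to `response` without revealing it."""
    return hashlib.sha256(nonce + response.encode("utf-8")).hexdigest()


def opens(commitment: str, response: str, nonce: bytes) -> bool:
    """Check that a revealed (response, nonce) pair matches the commitment."""
    return secrets.compare_digest(commitment, commit(response, nonce))


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1, the usual BFT fault bound."""
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Votes needed so that any two quorums overlap in an honest validator."""
    return n - max_faulty(n)


def agree(votes: List[str], n_validators: int) -> Optional[str]:
    """Return the result backed by a quorum of validator votes, if any."""
    if not votes:
        return None
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= quorum_size(n_validators) else None


# Step 4, phase 1: a compute node gossips only the commitment.
nonce = secrets.token_bytes(16)
commitment = commit("label=cat", nonce)

# Step 4, phase 2: once results are in (or timed out), it reveals the opening.
honest_reveal = opens(commitment, "label=cat", nonce)
forged_reveal = opens(commitment, "label=dog", nonce)

# Step 5: a committee of 4 validators tolerates 1 fault and needs 3 matching votes.
decided = agree(["label=cat", "label=cat", "label=cat", "label=dog"], n_validators=4)
undecided = agree(["label=cat", "label=cat", "label=dog", "label=dog"], n_validators=4)
```

In this sketch a split vote yields no decision rather than a guess, which matches the table's note that identified problems are returned with the response and used for reputation updates and disputes.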
Potential for a Future Token
There is no specific timeline for a native token yet, but ChainML is looking into ways that a future native token could successfully incentivize all stakeholders in the ChainML protocol. The team envisions the ChainML protocol eventually becoming fully autonomous, being governed via a DAO to achieve full decentralization, but this would require an incremental process.
When asked about the initial market response to Council, ChainML said, “We are excited at the response to the recent launch of Council and the vision we have articulated: we believe that generative AI models will be transformative and enabling transparency, control, and community participation is critical to achieving good outcomes for multiple stakeholders.”
This is a project that Ruceto will track closely because we are always looking for projects that have the potential to make Web3 more mainstream. If ChainML can contribute to keeping AI decentralized, both the AI and Web3 spaces will benefit massively.
Disclaimer: The content presented is for informational purposes only and does not constitute financial, investment, tax, legal, or professional advice. Nothing contained in this report is a direct or indirect recommendation or suggestion to buy, sell, make, or hold any investment, loan, commodity, security, or token, or to undertake any investment or trading strategy with respect to any investment, loan, commodity, security, token, or any issuer. Ruceto does not guarantee the accuracy, completeness, sequence, or timeliness of any of this content. Please see our Terms of Service for more information.