AI and Blockchain: A New Paradigm and Its Expression in Distributed AI


Blockchain and trust

Blockchain media hype has been subsumed by AI media hype in the past two years. These two technologies are relatively new. AI has the longer pedigree, going back in concept to the Golem and human antiquity. You might think that blockchain begins with hash functions and distributed programming. Leslie Lamport's work on distributed systems brought together the time ordering and trust that a decentralized trust solution, and therefore blockchain, requires. So, at least 40 years of history for blockchain and 80 for the current forms of AI.

Distributed computing that solves problems collaboratively requires a time ordering, as well as a means of establishing a single version of the truth across a set of computers, some of which may be faulty or malicious. Distributed computing and storage are necessary conditions for decentralization. Independent governance of the distributed machines is what gives us decentralization. Decentralization therefore rests on the nature and distribution of the entities that control a distributed compute and storage infrastructure. By these measures, even Bitcoin cannot be considered decentralized, because as few as five mining pools control mining, and a handful of large institutions, including exchanges, control the on-ramps and off-ramps to the Bitcoin ecosystem. Whales hold 93% of Bitcoin.
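
To make the time-ordering point concrete, here is a minimal sketch of Lamport logical clocks, the classic primitive from Lamport's work on ordering events in a distributed system. The process names and message flow are invented for illustration.

```python
# A Lamport logical clock: each process keeps a counter that ticks on
# every local event, and a receiver fast-forwards to the sender's time.
class Process:
    def __init__(self, name):
        self.name = name
        self.clock = 0

    def local_event(self):
        self.clock += 1

    def send(self):
        self.clock += 1
        return self.clock                  # timestamp travels with the message

    def receive(self, msg_time):
        # Lamport's rule: take the max of local and message time, then tick.
        self.clock = max(self.clock, msg_time) + 1

a, b = Process("A"), Process("B")
a.local_event()                            # A's clock: 1
t = a.send()                               # A's clock: 2; message stamped 2
b.receive(t)                               # B's clock: max(0, 2) + 1 = 3
print(a.clock, b.clock)                    # -> 2 3
```

This ordering guarantees that if event X caused event Y, X carries the smaller timestamp, which is the kind of agreed-upon history a ledger needs.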

AI challenges

Well-known AI problems include leakage of private data, unbridled energy consumption, continued training on AI's own output, the unavailability of siloed private data for building tailored solutions, and the lack of payment for private data used in training models. Some of these problems can be solved by integrating blockchains into AI. The general challenges are sketched in the initial section.

This article describes a startup, Modelx.ai. Most quotes about how it works come from an interview I conducted with Jamiel Sheikh, the CEO of Modelx.ai. Jamiel states that his startup is in the distributed camp, because its solutions are in the federated AI space.

AI has usually been controlled by single entities. By AI, we mean large language models (LLMs) based on deep learning, similar to ChatGPT; image generation as embodied in Stable Diffusion v1.5; audio-to-text and the reverse (text-to-audio); and the ultimate, video generation such as Sora or Movie Gen, since it combines image, audio, and more. In the future, AI has the potential to be a "country of geniuses in a data center." Current training methods for AI require a large amount of data: nearly everything produced by humanity that is digitized and accessible. Vast amounts of data prevent overfitting. Overfitting forces the model to specialize on the small amount of data it has seen, so it cannot predict accurately on new inputs. Open-source models have disrupted this story.
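
As an illustration of why data volume matters, here is a toy sketch of overfitting, assuming a noisy sine curve as the "domain": a degree-9 polynomial memorizes 10 training points but typically fails badly on unseen ones, while the same fit on 1,000 points generalizes far better.

```python
# Toy overfitting demo: same model class, small vs. large training set.
import numpy as np

rng = np.random.default_rng(0)

def fit_and_test(n_train, degree=9):
    x = rng.uniform(-3, 3, n_train)
    y = np.sin(x) + rng.normal(0, 0.1, n_train)   # noisy training data
    coeffs = np.polyfit(x, y, degree)             # polynomial "model"
    x_test = rng.uniform(-3, 3, 500)
    pred = np.polyval(coeffs, x_test)
    return np.mean((pred - np.sin(x_test)) ** 2)  # error on unseen data

print("test MSE, 10 points:  ", fit_and_test(10))    # typically large: overfit
print("test MSE, 1000 points:", fit_and_test(1000))  # typically small: generalizes
```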

There are problems with this data-heavy approach. First, if the data consumed by the model was itself generated by AI, its tone and content lack the nuance and variability of original content. The AI begins to eat its own output and can degrade into bias and ineffectiveness. A simple word of Greek origin, "autophagy," describes this phenomenon. Such a development is not some fanciful future scenario; it has been observed in the wild as the amount of AI-generated content has exploded. The second problem is data privacy. All data used to train AI was in the public domain or publicly accessible, even when it was protected by copyright. This includes scraped portal data that was never meant to be used this way, such as YouTube video transcripts or the entire content of the New York Times.
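
Here is a toy simulation of autophagy, under the simplifying assumption that a "model" is just a Gaussian fitted to its training data: each generation trains only on samples drawn from the previous generation's model, and the learned variability typically collapses.

```python
# Generation 0 is "real" data from N(0, 1). Each later generation fits a
# Gaussian to samples drawn from the previous fit; with small samples, the
# estimated spread typically shrinks generation after generation.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.0, 1.0
for gen in range(1, 31):
    data = rng.normal(mu, sigma, 10)        # each model sees only 10 samples
    mu, sigma = data.mean(), data.std()     # "retrain" on the model's own output
    if gen % 10 == 0:
        print(f"generation {gen}: sigma = {sigma:.3f}")
```

The shrinking sigma is the statistical analogue of AI-generated text losing the nuance and variability of the original human content.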

DeepSeek: an open AI model

Developments such as DeepSeek have shown comparable performance without as much data, the latest chips, or as much time spent in training. Time and compute for inference (actual use) do increase with DeepSeek, however. DeepSeek is also an open model.

An open model means that all of the model's source code is open. In addition, it means the model's weights are visible. Anyone can change the model and retrain or fine-tune it using their own data. This definition, from the OSI, has faced pushback because it does not require training data to be shared for a model to count as open source. According to critics, training data is the source code of AI; without sharing the training data, a model cannot be considered open source. The OSI has defended its definition.

Modelx.ai

The Modelx product depends on keeping the model's training data private while keeping the model and its weights open. It addresses the privacy conundrum: namely, how to train the best AI targeted at a particular field using publicly accessible data, and keep improving results, when some of the relevant data is private by law. In fields such as health care, hospitals are barred from sharing private data by HIPAA rules. Take a single type of data, X-rays: taking a public AI and training it further using only your own hospital's private X-rays will improve the model. A better way, however, is to use X-rays from many hospitals, which improves the model far more than any single hospital could. Modelx.ai invented a way to share this data in a federated setting without losing privacy.

The model is then trained serially across the hospitals of the federation. In other words, Hospital 1 takes an open, pre-trained model and trains it on its private data. Hospital 1 then releases its retrained model to the federation, and Hospital 2 trains the same AI on its own private data. This continues until all hospitals in the federation have trained it. The AI refined with the federation's private data is available only within the federation. The blockchain part is meant to prove the improvements, pay the contributors, and keep the private data private.
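
Based on the description above, here is a minimal sketch of that serial federated flow, with invented data standing in for hospital records: each "hospital" holds private (X, y) pairs and fine-tunes a shared linear model in turn, and only the weights ever leave a hospital. This illustrates the pattern, not Modelx.ai's actual implementation.

```python
# Serial federated fine-tuning over three "hospitals" (invented data).
# Each hospital trains the shared weights locally; only the weights move.
import numpy as np

rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0])             # hidden relationship in the data

def private_data(n):                       # stand-in for a hospital's records
    X = rng.normal(size=(n, 2))
    y = X @ true_w + rng.normal(0, 0.1, n)
    return X, y

hospitals = [private_data(50) for _ in range(3)]

w = np.zeros(2)                            # the open model's initial weights
for i, (X, y) in enumerate(hospitals, 1):
    for _ in range(100):                   # local gradient-descent steps
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= 0.05 * grad
    print(f"weights released by hospital {i}: {w.round(3)}")
```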

The model's weights are hashed and recorded on a federated ledger at each training stage. Each hospital earns tokens for the work it does. The quality of the model after each refinement is also measured, yielding a quality score and a corresponding token award. Later, when the models are used, each hospital pays tokens according to its usage. When I first heard about this in October 2024, open-source models were clumsy and did not get good results. With the arrival of DeepSeek in February 2025, such criticisms lost their bite.
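
A sketch of that ledger step as I understand it, with invented field names and a toy reward rule: the released weights are hashed, and a ledger entry records the hash, the measured quality score, and the tokens awarded for the round.

```python
# Each training round: hash the released weights, score the model, and
# append a ledger entry awarding tokens. Field names and the reward rule
# (tokens = quality * 100) are invented for illustration.
import hashlib
import json
import numpy as np

ledger = []                                 # stand-in for the federated ledger

def record_round(hospital, weights, quality):
    entry = {
        "hospital": hospital,
        "weights_sha256": hashlib.sha256(weights.tobytes()).hexdigest(),
        "quality": quality,                 # e.g., accuracy on a shared test set
        "tokens_awarded": round(quality * 100),
    }
    ledger.append(entry)
    return entry

w = np.array([2.0, -1.0])                   # weights after a training round
print(json.dumps(record_round("Hospital 1", w, quality=0.87), indent=2))
```

The hash proves exactly which weights a hospital produced at each stage without revealing anything about the private data behind them.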

A further argument is that training data can be exfiltrated or extracted from the model using certain techniques, which requires additional safety precautions. These include scrubbing personally identifiable information from the data, as well as other protections against de-anonymization techniques of the kind contemplated by the EU AI Act.
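
A minimal sketch of one such precaution: regex-based scrubbing of obvious personally identifiable information before data enters training. Real systems need much more, such as named-entity recognition and defenses against re-identification, but the idea is the same.

```python
# Replace obvious PII patterns with labeled placeholders before training.
import re

PII_PATTERNS = {
    "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "PHONE": r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b",
    "SSN":   r"\b\d{3}-\d{2}-\d{4}\b",
}

def scrub(text):
    for label, pattern in PII_PATTERNS.items():
        text = re.sub(pattern, f"[{label}]", text)
    return text

print(scrub("Contact Jane at jane.doe@example.com or 555-123-4567."))
# -> Contact Jane at [EMAIL] or [PHONE].
```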
