a16z 'Disciple' Kuzco Practical Guide II: From Solo Operations to Cluster Deployment

Written by: J1N, Techub News
Introduction: Epoch One to Two
Kuzco is a network for mining computing power for large language models (LLMs). It was selected for the fall cohort of a16z's Crypto Startup Accelerator (CSX), which launched on September 9 in New York. Projects selected for this program receive at least $500,000 in investment from a16z, along with guidance and support from the a16z operations team. The accelerator program has now concluded.
On November 16, Kuzco announced that the first phase (Epoch One) incentive program will end on November 18, 2024, all operations will be suspended, data snapshots will be permanently stored, and final point rankings will be announced on the new leaderboard.
According to official disclosures, Epoch One launched on March 6, 2024, with the number of devices peaking at over 8,000. The network ran Meta's 8B Llama-3 large language model and processed over 1 trillion tokens of inference in total.
Kuzco also announced that financing information and a development roadmap will be released in the coming weeks, and that the second-phase (Epoch Two) incentive program will start on December 9. Epoch Two brings new features such as higher throughput and reliability on NVIDIA hardware, encouragement for users to connect top-tier devices such as the A100 and H100, and support for more image generation models and multimodal vision-language models (VLMs).
There is still about half a month of preparation time before Epoch Two opens. This article will:
Share the author's personal mining practices and results, from a single machine to a cluster.
Show the entire process of securing funding through research and practice, and of building high-spec machines.
Discuss how to match hardware configuration to project requirements, and answer common questions from investors.
Epoch One Review: Solo Operations
Configuration
The author's hardware includes RTX-series graphics cards (2060, 2070S, 3080, 4060, 4060Ti, and four 4070S) as well as two Apple devices (M2 and M3). These devices are spread across several desktop hosts, laptops, and a dedicated mining machine.
Cost
It is worth mentioning that the author bought these graphics cards year by year for gaming, not specifically for mining. The cost calculation therefore excludes hardware purchase costs and counts only the actual electricity cost of the mining machine. The mining machine assembled in the first article (a16z 'Disciple' Kuzco Practical Guide: How to Efficiently Mine AI Computing Power?) is used as the example here.
This mining machine configuration:
Motherboard: z490 (to be replaced with an industrial board later)
CPU: 10th Gen I9
Graphics card: 2060, 2070s, 3080, 4060ti, 4070s
Handcrafted mining machine
The following chart shows the electricity consumption of this mining machine in October and November: 564 kWh in total, earning approximately 600 million points (KZO Points). All of the author's machines together earned about 1.1 billion points. The actual electricity cost depends on the electricity rate in each person's location, so these figures are for reference only.
The far right of the image shows a total of 1 billion points acquired
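Since the electricity cost depends on the local tariff, the arithmetic can be sketched as follows. The 564 kWh and 600 million points come from the figures above; the tariffs in the loop are hypothetical placeholders, not values from the article, and the function names are illustrative only.

```python
# Figures reported in the article for the example mining machine.
ENERGY_KWH = 564             # October + November consumption
POINTS_EARNED = 600_000_000  # approximate KZO Points earned by this machine


def electricity_cost(energy_kwh: float, rate_per_kwh: float) -> float:
    """Total electricity cost for the given consumption and tariff."""
    return energy_kwh * rate_per_kwh


def points_per_kwh(points: float, energy_kwh: float) -> float:
    """Points earned per kWh consumed."""
    return points / energy_kwh


if __name__ == "__main__":
    # HYPOTHETICAL tariffs per kWh (any currency); substitute your local rate.
    for rate in (0.08, 0.15, 0.30):
        cost = electricity_cost(ENERGY_KWH, rate)
        print(f"rate {rate:.2f}/kWh -> cost {cost:.2f}, "
              f"{POINTS_EARNED / cost:,.0f} points per currency unit")
    print(f"{points_per_kwh(POINTS_EARNED, ENERGY_KWH):,.0f} points per kWh")
```

At the reported figures this works out to roughly 1.06 million points per kWh for this machine; what a point is eventually worth is not known from the article.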
Preparing for Epoch Two: Cluster Deployment
Building on the first article and on hands-on experience in equipment assembly, debugging, and environment deployment, the author secured some funding support and invested all of it in assembling high-performance mining machines to increase computing power and operational efficiency.
From solo mining to cluster deployment
Configuration and selection logic of high-spec machines
Drawing on practical experience from Epoch One, the author reworked the choice of motherboard, CPU, graphics card, power supply, platform, and network configuration. The resulting hardware combination improves overall stability, safety, and efficiency, and the selection also favors parts that resell easily on the second-hand market. This strategy reduces the actual investment cost and offers later participants a better cost-performance ratio.
Motherboard
The author chooses industrial motherboards instead of the mainstream B85, based on a combined consideration of performance, stability, and cost-effectiveness.
In terms of performance, running Kuzco's Llama-3 model requires starting multiple Docker containers. Running them in parallel consumes a lot of CPU resources, and the CPUs compatible with B85 boards cannot meet this demand.
In addition, industrial motherboards have clear advantages in long-term stable operation, high-temperature resistance, and manufacturer warranty, and they also resell more easily on the second-hand market, which makes them the author's preferred choice.
Graphics card
The author chooses to use the 4070S as the main graphics card, primarily based on the following points:
AI computing performance: Compared with the 30 series, the 40 series improves far more in AI computing than it does in gaming. According to the author, the core reason is that AI computing power depends mainly on the number of CUDA cores in a graphics card, and 40-series cards have significantly more CUDA cores than 30-series cards.
Energy efficiency advantage: The author ran detailed tests on multiple GPUs and calculated the average number of tokens generated per watt (rated power in parentheses).
4060Ti (160W): 0.125 Tokens/W
3080 (330W): 0.22 Tokens/W
4090 (450W): 0.26 Tokens/W
4070S (220W): 0.38 Tokens/W
From the test results, the 4070S performs best in balancing performance and power consumption, and its higher energy efficiency directly reduces electricity costs, making it the most cost-effective choice.
Prices and liquidity in the second-hand market: As a mid-to-high-end graphics card, the 4070S possesses high liquidity and value retention in the second-hand market, further reducing the holding costs of the device and providing flexibility for future hardware upgrades.
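The per-watt test figures above can be compared with a short sketch. The rated power and Tokens/W values are the author's; the article does not state the time unit or whether power was measured at the rated figure, so the "implied throughput" below is an assumption and only meaningful as a relative number. Function names are illustrative.

```python
# (rated power in watts, tokens per watt) as reported in the article.
GPUS = {
    "4060Ti": (160, 0.125),
    "3080": (330, 0.22),
    "4090": (450, 0.26),
    "4070S": (220, 0.38),
}


def rank_by_efficiency(gpus):
    """Return card names sorted from most to least tokens per watt."""
    return sorted(gpus, key=lambda name: gpus[name][1], reverse=True)


def implied_throughput(gpus):
    """Tokens per unit time implied by efficiency x rated power.

    ASSUMPTION: the card draws its rated power during the test and the
    time unit is unspecified, so treat the result as relative only.
    """
    return {name: watts * eff for name, (watts, eff) in gpus.items()}


def relative_energy_per_token(gpus, baseline="4070S"):
    """Energy per token relative to the baseline card (1.0 = baseline)."""
    base_eff = gpus[baseline][1]
    return {name: base_eff / eff for name, (_, eff) in gpus.items()}


if __name__ == "__main__":
    print(rank_by_efficiency(GPUS))
    print(implied_throughput(GPUS))
    print(relative_energy_per_token(GPUS))
```

Under these assumptions the 4090 still delivers the highest implied throughput per card, while the 4060Ti spends about three times as much energy per token as the 4070S, which is consistent with the author's choice of the 4070S on electricity cost.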
CPU
As mentioned earlier, Kuzco's Llama-3 requires starting multiple Docker containers during operation, which takes up significant CPU resources. In multi-card setups in particular, CPU utilization can reach 80%-90%, so multi-core, multi-threaded processing capability is especially important. A high-performance, multi-threaded, stable CPU not only supports this multi-tasking but also keeps the entire mining process stable and efficient.
The 13th Gen i5 reaches over 70% utilization when fully loaded with graphics cards.
Network environment
The soft router is the square box in the image
The network environment is also crucial in mining. Even with a high-performance graphics card, computing power will suffer badly if the network is not optimized. According to the author's testing, insufficient network speed may cause computing power to drop to 30%, while low-quality network nodes may make it impossible to connect to the Kuzco network at all. Neither is acceptable for mining. To solve these problems, the author adopted a soft-router solution, which is easy to configure, runs with almost no manual intervention once set up, and in theory supports an unlimited number of devices. For the specific setup steps, readers are advised to consult relevant materials according to their needs.
Power supply
Classic Great Wall 2000W nuclear bomb power supply
When choosing a power supply, pay special attention to peak power consumption. This is why the author uses dual 2000W power supplies, 4000W in total, even though the rated power consumption of seven 4070S cards is only 1540W. This is not wasteful; it is a matter of operating stability and safety.
Graphics cards experience power spikes during operation: at certain moments their actual power draw may reach 1.5 times the rated power or more before dropping back to normal levels. If the power supply cannot handle such peaks, its forced shutdown protection may be triggered, potentially damaging the graphics cards. This is a fatal threat to the normal operation of a mining machine.
4070s power consumption performance
Taking the 4070S as an example, its rated power consumption is 220W, but its peak power consumption may exceed 400W. The combined peak of seven cards may total over 3000W, so dual 2000W power supplies are configured to keep the machine running stably. Users with multi-4090 configurations need to pay special attention: a single 4090 is rated at 450W, while its peak power consumption can reach 770W. In multi-card setups, two power supplies may not be enough, and three are typically required to keep the system stable.
4090 power consumption performance
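The power supply sizing described above can be sketched as a simple calculation. The per-card rated and peak figures are the author's; the assumption that all cards peak at the same moment, the seven-card count for the 4090 case, and the function name are illustrative assumptions, and the sketch ignores the draw of the CPU, motherboard, and fans.

```python
import math


def psus_needed(num_cards: int, peak_watts_per_card: float,
                psu_watts: float = 2000) -> int:
    """Minimum number of PSUs whose combined rating covers the case
    where every card hits its peak draw at the same time.

    ASSUMPTION: worst-case simultaneous peaks, GPU load only.
    """
    total_peak = num_cards * peak_watts_per_card
    return math.ceil(total_peak / psu_watts)


if __name__ == "__main__":
    # Seven 4070S at rated power: 7 x 220 W = 1540 W, one 2000 W PSU would do.
    print(psus_needed(7, 220))
    # Seven 4070S at a 400 W peak: 7 x 400 W = 2800 W, two PSUs.
    print(psus_needed(7, 400))
    # Seven 4090 (hypothetical count) at a 770 W peak: 5390 W, three PSUs.
    print(psus_needed(7, 770))
```

Note that 7 x 400W is 2800W; the "over 3000W" figure in the text corresponds to spikes somewhat above 400W per card, which the dual 2000W setup still covers with margin.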
Supplement
BIOS settings, hardware compatibility, and remote management are not covered in detail here. Plenty of free tutorials are available online, and following them solves most problems. For simplicity and efficiency, choose the tutorials that match your own hardware configuration and needs.
Risk and return
Now for the question readers care about most: how much can be mined per day? Frankly, there is no clear answer, because risk and return always coexist. The author can offer one clear view: whether in crypto or in traditional industries, if a project's daily returns can be calculated precisely, you are probably already too late to make big money from it. The exception is when you hold some monopolistic resource, such as extremely low electricity costs or very cheap mining equipment, which gives you an edge on returns. Such resources, however, are not available to everyone.
The author chooses devices with good liquidity to reduce investment risk and cost pressure. In Kuzco mining, costs consist mainly of hardware depreciation and electricity, so the maximum loss is limited to these costs. Without low costs, participation makes little sense as an investment. It should be emphasized that mining by nature offers no clear revenue expectations, but this is precisely where its potential lies.
In the author's subjective judgment, this track has enormous market potential: on one hand, Kuzco has received investment support from a16z; on the other hand, demand for large language models (LLMs) is expanding rapidly. Think about it, almost no one can do without an LLM, right? OpenAI's ChatGPT, Meta's Llama, and Musk's xAI, with their successive large financing rounds, clearly demonstrate the growth potential of this industry.
For ordinary people, participating directly in the AI industry is not easy. On one hand, the technical threshold for AI is high; on the other hand, training AI models requires massive resources and funding, which most people cannot afford. By joining the AI computing power network through Kuzco, however, ordinary people can participate in this high-growth field at a controllable cost, contributing computing power to AI while earning returns.
Additionally, the price of Bitcoin is about to break $100,000, rising from $16,000 in 2022 to its current peak, which carries a huge risk of retracement. If you choose to directly purchase tokens of AI projects, you will also face similar high volatility risks. In contrast, participating in the AI computing power network is a more robust choice: not only are costs clearly controllable, but it also allows entry into the rapidly growing AI industry with relatively low risk. This is one of the practically feasible ways for ordinary people to enter the AI field in the current environment.