Article source: Techub News

Written by: J1N, Techub News

Introduction: Epoch One to Two

Kuzco is a mining network dedicated to computing power for large language models (LLMs). This year it was selected for the fall cohort of the Crypto Startup Accelerator (CSX), which a16z launched in New York on September 9. Projects selected for the program receive at least $500,000 in investment from a16z, along with guidance and support from the a16z operations team. The accelerator program has now ended.

On November 16, Kuzco announced that the first-phase (Epoch One) incentive plan would end on November 18, 2024: all operations would be suspended, a data snapshot would be stored permanently, and the final points ranking would be published on a new leaderboard.

According to official disclosures, Epoch One launched on March 6, 2024. At its peak, more than 8,000 devices were online, running Meta's Llama-3 8B large language model and serving over 1 trillion tokens of inference in total.

Kuzco also announced that funding details and a development roadmap will be disclosed in the coming weeks, and that the second-phase (Epoch Two) incentive plan will start on December 9. Epoch Two brings new features, including higher throughput and reliability on NVIDIA hardware, incentives for users to connect top-tier devices such as the A100 and H100, and support for more image generation models and multimodal vision-language models (VLMs).

There are still two weeks left to prepare for the start of Epoch Two. This article covers:

  • My own mining practice and results, from standalone machines to a cluster.

  • The full process of securing funding through research and practice and then building high-spec machines.

  • How hardware configuration matches the project's needs, with answers to common questions from investors.

Epoch One Review: Solo Combat

Configuration

My hardware includes RTX-series graphics cards (2060, 2070S, 3080, 4060, 4060Ti), four 4070S cards, and two Apple devices with M2 and M3 chips. They are spread across several desktop PCs, laptops, and a dedicated mining machine.

Cost

It is worth noting that I originally bought these graphics cards year by year for gaming, not specifically for mining. When calculating costs, I therefore excluded the hardware purchase price and counted only the mining machine's actual electricity cost. As an example, I will use the mining machine assembled in the first article (a16z 'Disciple' Kuzco Practical Guide: How to Efficiently Engage in AI Computing Power Mining?).

This mining machine configuration:

  • Motherboard: Z490 (to be replaced with an industrial board later)

  • CPU: 10th Gen i9

  • Graphics Cards: 2060, 2070S, 3080, 4060Ti, 4070S

Handcrafted mining machine

The following chart shows this mining machine's electricity consumption in October and November: 564 kWh in total, which earned approximately 600 million points (KZO Points). All machines combined earned about 1.1 billion points. The actual electricity cost depends on the rates where you live, so these figures are for reference only.

Far right in the picture, a total of 1 billion points earned
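Since the electricity cost depends on the local tariff, the arithmetic can be sketched in a few lines of Python. The 564 kWh and 600 million points come from the figures above; the tariff of 0.10 per kWh is purely a placeholder assumption, and the function names are illustrative rather than part of any Kuzco tooling.

```python
# Figures reported in the article for the single mining machine.
ENERGY_KWH = 564               # October + November consumption
POINTS_EARNED = 600_000_000    # approximate KZO Points earned by this machine


def electricity_cost(energy_kwh: float, rate_per_kwh: float) -> float:
    """Total electricity cost for the given consumption and tariff."""
    return energy_kwh * rate_per_kwh


def cost_per_million_points(energy_kwh: float, points: float, rate_per_kwh: float) -> float:
    """Electricity cost attributable to each one million points."""
    return electricity_cost(energy_kwh, rate_per_kwh) / (points / 1_000_000)


if __name__ == "__main__":
    rate = 0.10  # ASSUMPTION: placeholder tariff, in your own currency per kWh
    print(f"Electricity cost: {electricity_cost(ENERGY_KWH, rate):.2f}")
    print(f"Cost per million points: "
          f"{cost_per_million_points(ENERGY_KWH, POINTS_EARNED, rate):.4f}")
    print(f"Points per kWh: {POINTS_EARNED / ENERGY_KWH:,.0f}")
```

Replace `rate` with your own tariff to get a figure that applies to your location.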

Preparing for Epoch Two: Cluster Deployment

Building on what I shared in the first article and on my hands-on experience in assembling, debugging, and deploying environments, I secured some funding and invested all of it in assembling high-performance mining machines to expand computing power and improve operational efficiency.

From standalone to cluster deployment

Configuration and selection logic for high-spec machines

Drawing on my experience in Epoch One, I reworked the motherboard, CPU, graphics card, power supply, platform, and network configuration and selected a more compatible hardware combination. It improves overall stability, security, and efficiency, and I also weighed how easily each component resells on the second-hand market. This approach lowers the real investment cost and gives later participants a more cost-effective reference.

Motherboard

I chose industrial motherboards over the mainstream B85 after weighing performance, stability, and cost-effectiveness.

In terms of performance, running Kuzco's Llama-3 model means launching multiple Docker processes in parallel, which consumes a large amount of CPU resources. The CPUs compatible with B85 boards cannot meet this demand.

In addition, industrial motherboards have clear advantages in long-term stable operation, high-temperature resistance, and manufacturer warranty, and they resell more easily on the second-hand market, which made them the best choice for me.

Graphics Card

I chose the 4070S as the main graphics card for the following reasons:

Advantages of AI computing performance: Compared with the 30 series, the 40 series improves far more in AI computing than it does in gaming. As I understand it, the core reason is that AI computing power relies mainly on the card's CUDA compute throughput, where the 40 series gains more over the 30 series than gaming benchmarks suggest.

Energy efficiency advantage: I ran detailed tests on multiple GPUs and calculated the average number of tokens generated per watt for each card (higher is better):

  • 4060Ti (160W): 0.125 Tokens/W

  • 3080 (330W): 0.22 Tokens/W

  • 4090 (450W): 0.26 Tokens/W

  • 4070S (220W): 0.38 Tokens/W

In these tests, the 4070S strikes the best balance between performance and power consumption. Its higher energy efficiency directly lowers electricity costs, making it the most cost-effective choice.
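The comparison can be reproduced with a short Python sketch that ranks the cards by the tokens-per-watt figures listed above. Two things here are assumptions rather than statements from the tests: multiplying tokens per watt by rated power to estimate throughput (the time unit is whatever the original measurements used), and treating energy per token as the inverse of tokens per watt.

```python
# Rated power and tokens-per-watt figures as reported in the article.
CARDS = {
    "4060Ti": {"rated_w": 160, "tokens_per_watt": 0.125},
    "3080": {"rated_w": 330, "tokens_per_watt": 0.22},
    "4090": {"rated_w": 450, "tokens_per_watt": 0.26},
    "4070S": {"rated_w": 220, "tokens_per_watt": 0.38},
}


def rank_by_efficiency(cards):
    """Return card names sorted from most to least efficient."""
    return sorted(cards, key=lambda name: cards[name]["tokens_per_watt"], reverse=True)


def implied_throughput(card):
    """ASSUMPTION: throughput at rated power = tokens/W x rated W."""
    return card["tokens_per_watt"] * card["rated_w"]


def relative_energy_per_token(cards, name, baseline="4070S"):
    """Energy per token of `name` relative to the baseline card (1.0 = equal).

    ASSUMPTION: energy per token is inversely proportional to tokens per watt.
    """
    return cards[baseline]["tokens_per_watt"] / cards[name]["tokens_per_watt"]


if __name__ == "__main__":
    for name in rank_by_efficiency(CARDS):
        card = CARDS[name]
        print(f"{name:7s} {card['tokens_per_watt']:.3f} tokens/W, "
              f"implied throughput {implied_throughput(card):.1f}, "
              f"energy per token x{relative_energy_per_token(CARDS, name):.2f} vs 4070S")
```

Under these assumptions, the 4060Ti spends roughly three times as much electricity per token as the 4070S, while the 4090 still delivers the highest absolute throughput.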

Price and liquidity in the second-hand market: As a mid-to-high-end graphics card, the 4070S sells easily and holds its value on the second-hand market, which further reduces the cost of holding the device and leaves room for future hardware upgrades.

CPU

As mentioned earlier, Kuzco's Llama-3 workload launches multiple Docker instances, which take up significant CPU resources; with multiple cards, CPU usage may reach 80%-90%. Multi-core, multi-threaded processing capability is therefore particularly important. A fast, stable, multi-threaded CPU can handle these parallel tasks and keep the whole mining process stable and efficient.

13th Gen i5 under full load can achieve over 70% GPU usage

Network Environment

Soft routing represented by the square box in the picture

The network environment is also crucial in mining. Even with high-performance graphics cards, computing power suffers badly if the network is not optimized. In my testing, insufficient internet speed can cut computing power to 30% of normal, and low-quality network nodes can make it impossible to connect to the Kuzco network at all; neither is acceptable for mining. To address this, I adopted a soft router (a software-based router), which is easy to configure, runs with almost no manual intervention once set up, and in theory supports an unlimited number of devices. For specific setup steps, I suggest readers consult materials that match their own needs.

Power Supply

Classic Great Wall 2000W nuclear bomb power supply

When choosing a power supply, pay special attention to peak power consumption. That is why, even though the rated power consumption of seven 4070S cards is only 1540W, I chose dual 2000W power supplies for a total of 4000W. This is not a waste of resources but a matter of operational stability and safety.

A graphics card can produce power spikes during operation: at certain moments its actual draw may reach 1.5 times its rated power consumption or more before dropping back to normal. If the power supply cannot absorb this peak, its forced shutdown protection may trigger, or the graphics card may even be damaged. Either outcome is a fatal threat to the normal operation of the mining machine.

Power consumption performance of 4070S

Take the 4070S as an example: although its rated power consumption is 220W, its peak can exceed 400W. The total peak power consumption of seven graphics cards may therefore exceed 3000W, so I configured dual 2000W power supplies to keep the machine stable. Users running multiple 4090s need to be especially careful: a single 4090 is rated at 450W, but its peak can reach 770W. In multi-card setups, two power supplies may not be enough, and three are usually needed to keep the system stable.

Power consumption performance of 4090
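The power supply sizing above can be checked with a minimal Python sketch. It uses the rated and peak figures quoted in this section and assumes the worst case, in which every card spikes at the same moment. The `psu_plan` helper and its `other_load_w` allowance for CPU, motherboard, and fans are hypothetical names introduced for illustration, not measurements.

```python
import math


def psu_plan(card_count, rated_w, peak_w, psu_w=2000, other_load_w=0):
    """Return (rated total, worst-case peak total, PSUs needed to cover the peak).

    Worst case: all cards hit their peak simultaneously.
    other_load_w is a hypothetical allowance for CPU, motherboard, and fans.
    """
    rated_total = card_count * rated_w
    peak_total = card_count * peak_w + other_load_w
    psus_needed = math.ceil(peak_total / psu_w)
    return rated_total, peak_total, psus_needed


if __name__ == "__main__":
    # Seven 4070S cards: 220W rated, roughly 400W peak each.
    print(psu_plan(7, 220, 400))   # (1540, 2800, 2)
    # Six 4090 cards: 450W rated, up to 770W peak each.
    print(psu_plan(6, 450, 770))   # (2700, 4620, 3)
```

With seven 4070S cards, the card peaks alone total 2800W before the rest of the system is counted, so two 2000W units cover the load with some headroom. With six 4090s, the worst-case peak passes 4000W and a third unit becomes necessary.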

Supplement

I will not go into detail here on BIOS settings, hardware compatibility, or remote management. Many free tutorials are available online, and following them solves most problems. For simplicity and efficiency, I recommend researching and handling these issues according to your own hardware configuration and needs.

Risk and Return

Let me answer the question everyone cares about most: how much can I mine each day? Frankly, there is no clear answer, because risk and return always coexist. My view is this: whether in cryptocurrency or traditional industries, if a project's daily profit can be calculated precisely, you are unlikely to see big profits from it. The exception is when you hold some monopolistic resource, such as very low electricity costs or very cheap mining equipment, which gives you an edge on profit. However, such resources are not available to everyone.

I chose devices with good liquidity to reduce investment risk and cost pressure. In Kuzco mining, costs are concentrated in hardware depreciation and electricity, so your maximum loss is limited to these two items. If you cannot participate at low cost, the investment makes little sense. It should be emphasized that mining by nature offers no clear profit expectation, but that uncertainty is also where its upside lies.
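The loss ceiling described here is simple arithmetic, sketched below in Python. Every input in the example call is a hypothetical placeholder: the purchase price, resale value, and tariff are not figures from this article, and only the 564 kWh consumption is reused from the earlier electricity reading.

```python
def max_loss(hardware_cost, resale_value, energy_kwh, rate_per_kwh):
    """Worst-case loss if the points turn out to be worth nothing.

    Depreciation is what you paid minus what the hardware resells for;
    electricity is consumption times the local tariff.
    """
    depreciation = hardware_cost - resale_value
    electricity = energy_kwh * rate_per_kwh
    return depreciation + electricity


if __name__ == "__main__":
    # ASSUMPTION: all four inputs are illustrative placeholders.
    loss = max_loss(hardware_cost=1000, resale_value=700,
                    energy_kwh=564, rate_per_kwh=0.10)
    print(f"Maximum loss: {loss:.2f}")
```

The better the hardware holds its resale value, the smaller the depreciation term, which is why liquidity on the second-hand market matters for the overall risk.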

Subjectively, I believe this track has huge market prospects. On one hand, Kuzco has received investment support from a16z; on the other, demand for large language models (LLMs) is expanding rapidly. Think about it: almost no one can do without LLMs, right? OpenAI's ChatGPT, Meta's Llama, and Musk's xAI have all attracted heavy investment, a clear sign of this industry's growth potential.

For ordinary people, directly participating in the AI industry is not easy. On one hand, the technical threshold for AI is high; on the other hand, training AI models requires massive resources and funding, which most people cannot afford. However, by joining the AI computing power network through Kuzco, ordinary people can easily participate in this high-growth field with controllable costs, contribute to AI computing power, and earn returns.

Additionally, Bitcoin is about to break $100,000, having risen from $16,000 in 2022 to its current peak, which carries a large risk of retracement. Anyone who buys AI project tokens directly faces similar volatility. In contrast, participating in an AI computing power network is a more robust choice: costs are clear and controllable, and it offers a relatively low-risk way into the fast-growing AI industry. In the current environment, it is one of the feasible ways for ordinary people to enter the AI field.