Coinbase has tested how accurately ChatGPT, the artificial intelligence language model developed by OpenAI, detects security vulnerabilities in smart contracts.
The Blockchain Security team at Coinbase compared ChatGPT’s risk scores for 20 smart contracts with those from a manual security review to determine whether ChatGPT could be integrated into the security review process. ChatGPT matched the manual review in 12 of the 20 cases. In the remaining eight, it failed to identify a high-risk asset, and in five of those it labeled the asset low-risk.
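For readers who want the figures as rates, the arithmetic can be sketched in a few lines of Python. The counts come from the article; the variable names are illustrative and are not part of Coinbase's tooling.

```python
# Counts reported in the article; names are illustrative only.
total_contracts = 20
matches = 12                      # ChatGPT agreed with the manual review
misses = total_contracts - matches
high_risk_labeled_low = 5         # misses where a high-risk asset was scored low-risk

agreement_rate = matches / total_contracts
miss_rate = misses / total_contracts
low_label_share_of_misses = high_risk_labeled_low / misses

print(f"agreement with manual review: {agreement_rate:.0%}")
print(f"disagreement: {miss_rate:.0%}")
print(f"misses labeled low-risk: {low_label_share_of_misses:.1%}")
```

On these numbers, ChatGPT agreed with the manual review in 60% of cases, and five of its eight misses were labeled low-risk.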
ChatGPT is a promising tool for improving productivity across a wide range of development and engineering tasks, including optimizing code and identifying vulnerabilities, depending on the prompts it is given. However, while it shows potential for quickly assessing smart contract risks, it does not meet the accuracy requirements for integration into Coinbase’s security review process.
The Blockchain Security team uses in-house automation tools to help security engineers review ERC20/721 smart contracts at scale. To test ChatGPT’s ability to review security risks in smart contracts, the team gave the tool a prompt specifying the risk review framework to use, so that its results could be compared with those of the manual review. However, the team noted that ChatGPT lacked the context and information needed to produce a response comparable to the manual review. Coinbase therefore had to teach ChatGPT to identify risks according to its security review framework.
Prompt engineering, a developing AI field, played a significant role in ensuring ChatGPT produced the intended results: the team had to spell out how the task should be performed. Using the prompt engineered by Coinbase, ChatGPT produced risk scores that were then compared with those from the manual review to gauge the tool’s accuracy.
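As a rough illustration of what such a prompt might look like, the sketch below assembles a review framework and a contract's source into a single instruction. It is not Coinbase's actual prompt, which the article does not reproduce. The function name, the example rules, and the LOW/MEDIUM/HIGH labels are all hypothetical.

```python
def build_review_prompt(framework_rules: list[str], contract_source: str) -> str:
    """Combine a risk framework and contract source into one prompt (hypothetical format)."""
    # Number the rules so the model can refer to them individually.
    rules = "\n".join(f"{i}. {rule}" for i, rule in enumerate(framework_rules, start=1))
    return (
        "Review the smart contract below using only this risk framework.\n"
        f"Framework:\n{rules}\n\n"
        "Answer with exactly one risk score: LOW, MEDIUM, or HIGH.\n\n"
        f"Contract source:\n{contract_source}\n"
    )


# Placeholder rules, invented for illustration; not taken from Coinbase's framework.
example_rules = [
    "Flag any function that lets the owner change balances.",
    "Flag any function that can pause transfers.",
]
example_prompt = build_review_prompt(example_rules, "contract Token { /* source here */ }")
print(example_prompt)
```

Stating the framework and the expected answer format explicitly is what makes the model's output comparable to a manual score.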
Despite ChatGPT’s efficiency, the experiment revealed limitations that impair its accuracy. ChatGPT cannot recognize when it lacks the context to perform a robust security analysis, which leaves coverage gaps where additional dependencies go unreviewed. Each review would therefore need an initial triage to scope it for the tool. ChatGPT is also inconsistent: the same question may receive different answers, and the tool can be influenced by comments in the code.
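One simple way to observe this kind of inconsistency is to ask the same question several times and tally the answers. The sketch below does that with a stand-in function in place of a real model call. The `ask` callable, the `fake_model` stub, and its canned answers are assumptions made for illustration, not measurements from the experiment.

```python
from collections import Counter
from itertools import cycle
from typing import Callable


def answer_consistency(ask: Callable[[str], str], prompt: str, runs: int = 5) -> tuple[str, float]:
    """Ask the same prompt `runs` times; return the most common answer and its share."""
    answers = Counter(ask(prompt) for _ in range(runs))
    top_answer, count = answers.most_common(1)[0]
    return top_answer, count / runs


# Stand-in for a model call: replays canned answers to mimic varying output.
_canned = cycle(["HIGH", "LOW", "HIGH", "HIGH", "LOW"])


def fake_model(prompt: str) -> str:
    return next(_canned)


top, share = answer_consistency(fake_model, "Score this contract.", runs=5)
print(top, share)
```

A share well below 1.0 means a single answer should not be taken as the tool's settled score.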
Finally, OpenAI continues to iterate on ChatGPT, and Coinbase is optimistic that future versions of the tool may be more effective in identifying security vulnerabilities in smart contracts.
This article was republished from azcoinnews.com