Code Interpreter for llm

Code Interpreter for llm

Created
Jun 9, 2024 05:41 AM
Tags
LLM Agent
Code interpreter

前言

用过openai 的Assistants ,GPTS等功能的话,应该了解openai 提供了Code Interpreter的功能。可以在python解释器环境下执行代码,对于Assistants而言,Code Interpreter 应该也属于工具的一样,当用户输入自然语言描述的时候将触发工具调用,然后调用Code Interpreter执行代码,并返回输出。
notion image
 
 
但openai的Code Interpreter 依赖于openai,无法本地部署,所以本文主要目的是创建一个简单易用的代码执行器。
 
 

Jupyter kernelgateway

Jupyter KernelGateway 是一个Web服务器,它提供了一种简单的方法来让外部服务通过网路接口与Jupyter的内核进行交互。它允许开发者和系统管理员通过REST API从远程位置执行代码,获取代码执行结果,以及管理Jupyter内核,我们这里主要的Coder server 服务主要通过Jupyter kernelgateway 来实现。
 
jupyter kernelgateway --KernelGatewayApp.ip=0.0.0.0 \ --KernelGatewayApp.port=8888 \ --KernelGatewayApp.auth_token="${TOKEN}" \ --JupyterApp.answer_yes=true \ --JupyterWebsocketPersonality.list_kernels=true
 
构建Docker进行构建,这里的python 版本和
FROM python:3.10.12-slim RUN mkdir /notebook RUN apt-get update RUN apt-get install -y graphviz COPY ./docker/server.requirement.txt /app/requirements.txt RUN pip3 install --no-cache-dir -r /app/requirements.txt WORKDIR /notebook CMD jupyter kernelgateway --KernelGatewayApp.ip=0.0.0.0 \ --KernelGatewayApp.port=8888 \ --KernelGatewayApp.auth_token="${TOKEN}" \ --JupyterApp.answer_yes=true \ --JupyterWebsocketPersonality.list_kernels=true EXPOSE 8888
 
这里安装的python版本和一些依赖版本,主要参考了下Colab
jupyter_kernel_gateway==3.0.0 ipykernel==5.5.6 matplotlib==3.7.1 matplotlib-inline==0.1.6 matplotlib-venn==0.11.10 numpy==1.25.2 pandas==1.5.3 requests==2.31.0 beautifulsoup4==4.12.3 huggingface-hub==0.20.3 mitmproxy==10.2.4 graphviz==0.20.3 networkx==3.2.1
 
镜像已经构建好了,也可以直接拉取
docker run --rm -e TOKEN="241b2687-e3f2-43b5-826b-cb91e8be6b08" -p 8888:8888 qingzhaoli/fcoder-server:latest
 
 

Coder Client

Jupyter kernelgateway 作为的代码执行期服务构建好后,主要通过客户端请求来执行代码
from fcoder import CoderClient coder_server_auth_token = "241b2687-e3f2-43b5-826b-cb91e8be6b08" client = CoderClient( server_host="127.0.0.1", server_port=8888, auth_token=coder_server_auth_token ) result = client.code_interpreter("print('hello')")
 
执行结果如下
{ "status": "ok", "output": [ { "text/plain": "hello\n" } ], "error_trace": null, "message": "" }
 
也可以用于安装依赖
result = client.code_interpreter("!pip install requests") ------ { "status": "ok", "output": [ { "text/plain": "Requirement already satisfied: requests in /usr/local/lib/python3.10/site-packages (2.31.0)\r\nRequirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests) (3.7)\r\nRequirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/site-packages (from requests) (2.2.1)\r\nRequirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/site-packages (from requests) (3.3.2)\r\nRequirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests) (2024.6.2)\r\n" }, { "text/plain": "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\r\n\u001b[0m\r\n\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\r\n\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\r\n" } ], "error_trace": null, "message": "" }
 
 

Code Interpreter

Code Interpreter 本质上还是工具调用,LLM模型需要根据描述来返回函数调用,code_interpreter 工具函数和Tool schema 如下
def code_interpreter(code: str) -> CoderExecResult: """ Code Interpreter is a code executor. Note that the execution environment of the code is Jupyter and the programming language is Python. Of course, you can also use the magic method % symbol supported by Jupyter and the ! symbol to execute system commands. % represents the line magic command. , such as %run is used to run external Python programs, the ! symbol is used to execute system commands, such as !ls is used to execute system commands, or !pip install requests means calling pip to update dependencies. Args: code: Python spec code or Linux command (pip3 only), This is the code to execute, do not include redundant comments Returns: CoderExecResult: Coder exec result """ result = coder_client.code_interpreter(code) return result coder_tool_schema = { "type": "function", "function": { "description": "Code Interpreter is a code executor. Note that the execution environment of the code is Jupyter and the programming language is Python. Of course, you can also use the magic method % symbol supported by Jupyter and the ! symbol to execute system commands. % represents the line magic command. , such as %run is used to run external Python programs, the ! symbol is used to execute system commands, such as !ls is used to execute system commands, or !pip install requests means calling pip to update dependencies.", "name": "code_interpreter", "parameters": { "type": "object", "properties": { "code": { "type": "string", "description": "Python spec code or Linux command (pip3 only), This is the code to execute, do not include redundant comments" } }, "required": [ "code" ] }, "annotation": "code_interpreter(code: str)" } }
 
用户输入,然后LLM 返回Tool call,调用本地coder interpreter执行,并获取返回结果,将结果返回给llm. 让其进行有效分析。
user> Please write a Hello World in Python. Concurrent execution tool> code_interpreter({"code": "print('Hello World')"}) exec result> Hello World user> Please analyze the execution results。 assistant> The execution result is straightforward: a Python script that outputs "Hello World" to the console. The code used is a simple print statement, and when executed, it correctly outputs the string. This confirms that the Python environment is functioning correctly and the print statement works as expected. Here's a breakdown of the process: 1. **Input Code**: The Python code provided was: ''' print('Hello World') ''' 2. **Execution**: When executed, this code invokes the `print` function, which outputs the string passed to it—in this case, "Hello World"—to the console. 3. **Output**: The result of the execution is the string "Hello World" printed to the console. This is a standard test to ensure that a Python environment is properly set up and can execute basic code. The successful output "Hello World" indicates that the environment is correctly interpreting and running Python code.
 
 
当使用Gpt-4o等模型的时候,也可以将结果图一并返回给多模态模型,让其进行分析。
user> Please draw a normal distribution graph tool> code_interpreter({ "code": "import matplotlib.pyplot as plt\nimport numpy as np\n\n# Generate data for a normal distribution\nmu, sigma = 0, 0.1 # mean and standard deviation\ns = np.random.normal(mu, sigma, 1000)\n\n# Create the histogram\ncount, bins, ignored = plt.hist(s, 30, density=True)\n\n# Plot the normal distribution curve\nplt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *\n np.exp( - (bins - mu)**2 / (2 * sigma**2) ),\n linewidth=2, color='r')\nplt.title('Normal Distribution')\nplt.xlabel('Value')\nplt.ylabel('Frequency')\nplt.show()" }) exec result> <Figure size 432x288 with 1 Axes> user> Please analyze the execution results。 assistant> The execution result is a plot showing a normal distribution graph, which includes a histogram of randomly generated data points that follow a normal distribution, overlaid with a red curve representing the theoretical normal distribution function. ### Breakdown of the Process: 1. **Data Generation**: - We generate 1000 data points from a normal distribution using the `numpy` function `np.random.normal`, with a mean (`mu`) of `0` and a standard deviation (`sigma`) of `0.1`. ''' mu, sigma = 0, 0.1 # mean and standard deviation s = np.random.normal(mu, sigma, 1000) ''' 2. **Creating the Histogram**: - The histogram of these data points is created using the `plt.hist` function. - The `density=True` parameter normalizes the histogram such that the total area under the histogram is equal to 1. ''' count, bins, ignored = plt.hist(s, 30, density=True) ''' 3. **Plotting the Normal Distribution Curve**: - The red curve overlay on the histogram represents the theoretical normal distribution function. - This is calculated using the probability density function of the normal distribution: ''' plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * np.exp( - (bins - mu)**2 / (2 * sigma**2) ), linewidth=2, color='r') ''' 4. **Adding Labels and Title**: - The plot is given a title and axis labels for better readability and context. ''' plt.title('Normal Distribution') plt.xlabel('Value') plt.ylabel('Frequency') ''' ### Analysis: - **Histogram**: - The blue bars represent the frequency of data points within each bin, normalized such that the total area under the histogram equals 1. - The shape of the histogram approximates the bell shape, which is characteristic of the normal distribution. - **Normal Distribution Curve (Red Line)**: - The red line represents the theoretical normal distribution for the given mean (`mu=0`) and standard deviation (`sigma=0.1`). - It follows the equation of the normal distribution's probability density function (PDF). - The curve fits well over the histogram, indicating that the generated data points indeed follow a normal distribution. The plot effectively demonstrates both empirical data (through the histogram) and theoretical distribution (through the red curve), providing a visual confirmation of the normal distribution properties. user>
 
notion image
 

参考链接

  1. fcoder
    liqzzUpdated Jun 9, 2024