68 KiB
安装命令
如何实现离线安装:
pip install vllm
pip download -d /tmp/vllm_pkgs vllm
root@autodl-container-c6d54aa471-4479d4d0:~# pip show vllm
Name: vllm
Version: 0.18.0
Summary: A high-throughput and memory-efficient inference and serving engine for LLMs
Home-page:
Author: vLLM Team
Author-email:
License:
Location: /root/miniconda3/lib/python3.12/site-packages
Requires: aiohttp, anthropic, blake3, cachetools, cbor2, cloudpickle, compressed-tensors, depyf, diskcache, einops, fastapi, filelock, flashinfer-python, gguf, ijson, lark, llguidance, lm-format-enforcer, mcp, mistral_common, model-hosting-container-standards, msgspec, ninja, numba, numpy, nvidia-cudnn-frontend, nvidia-cutlass-dsl, openai, openai-harmony, opencv-python-headless, opentelemetry-api, opentelemetry-exporter-otlp, opentelemetry-sdk, opentelemetry-semantic-conventions-ai, outlines_core, partial-json-parser, pillow, prometheus-fastapi-instrumentator, prometheus_client, protobuf, psutil, py-cpuinfo, pybase64, pydantic, python-json-logger, pyyaml, pyzmq, quack-kernels, regex, requests, sentencepiece, setproctitle, setuptools, six, tiktoken, tokenizers, torch, torchaudio, torchvision, tqdm, transformers, typing_extensions, watchfiles, xgrammar
Required-by:
一些建议:
conda create -n vllm python=3.12 -y
conda activate vllm
pip install vllm
安装日志
root@autodl-container-c6d54aa471-4479d4d0:~/autodl-tmp# pip install vllm
Looking in indexes: http://mirrors.aliyun.com/pypi/simple
Collecting vllm
Downloading http://mirrors.aliyun.com/pypi/packages/4f/e9/59cf9b8939b51e859d2166ac3336b353f52ec4f9ceda34228aae7b386840/vllm-0.18.0-cp38-abi3-manylinux_2_31_x86_64.whl (433.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 433.2/433.2 MB 7.0 MB/s eta 0:00:00
Collecting regex (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/9e/40/bb226f203caa22c1043c1ca79b36340156eca0f6a6742b46c3bb222a3a57/regex-2026.2.28-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (802 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 802.0/802.0 kB 13.3 MB/s eta 0:00:00
Collecting cachetools (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/06/f3/39cf3367b8107baa44f861dc802cbf16263c945b62d8265d36034fc07bea/cachetools-7.0.5-py3-none-any.whl (13 kB)
Requirement already satisfied: psutil in /root/miniconda3/lib/python3.12/site-packages (from vllm) (7.0.0)
Collecting sentencepiece (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/04/88/14f2f4a2b922d8b39be45bf63d79e6cd3a9b2f248b2fcb98a69b12af12f5/sentencepiece-0.2.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (1.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 11.1 MB/s eta 0:00:00
Requirement already satisfied: numpy in /root/miniconda3/lib/python3.12/site-packages (from vllm) (2.3.2)
Requirement already satisfied: requests>=2.26.0 in /root/miniconda3/lib/python3.12/site-packages (from vllm) (2.31.0)
Requirement already satisfied: tqdm in /root/miniconda3/lib/python3.12/site-packages (from vllm) (4.66.2)
Collecting blake3 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/5b/94/eafaa5cdddadc0c9c603a6a6d8339433475e1a9f60c8bb9c2eed2d8736b6/blake3-1.0.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (388 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 388.0/388.0 kB 30.7 MB/s eta 0:00:00
Collecting py-cpuinfo (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/e0/a9/023730ba63db1e494a271cb018dcd361bd2c917ba7004c3e49d5daf795a2/py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Collecting transformers<5,>=4.56.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/03/b8/e484ef633af3887baeeb4b6ad12743363af7cce68ae51e938e00aaa0529d/transformers-4.57.6-py3-none-any.whl (12.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.0/12.0 MB 9.5 MB/s eta 0:00:00
Collecting tokenizers>=0.21.1 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/2e/76/932be4b50ef6ccedf9d3c6639b056a967a86258c6d9200643f01269211ca/tokenizers-0.22.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 10.6 MB/s eta 0:00:00
Collecting protobuf!=6.30.*,!=6.31.*,!=6.32.*,!=6.33.0.*,!=6.33.1.*,!=6.33.2.*,!=6.33.3.*,!=6.33.4.*,>=5.29.6 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/53/1b/3b431694a4dc6d37b9f653f0c64b0a0d9ec074ee810710c0c3da21d67ba7/protobuf-7.34.1-cp310-abi3-manylinux2014_x86_64.whl (324 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 324.3/324.3 kB 32.1 MB/s eta 0:00:00
Collecting fastapi>=0.115.0 (from fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/8f/ea/18f6d0457f9efb2fc6fa594857f92810cadb03024975726db6546b3d6fcf/fastapi-0.135.2-py3-none-any.whl (117 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 117.4/117.4 kB 31.5 MB/s eta 0:00:00
Collecting aiohttp>=3.13.3 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/86/f6/a62cbbf13f0ac80a70f71b1672feba90fdb21fd7abd8dbf25c0105fb6fa3/aiohttp-3.13.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 11.9 MB/s eta 0:00:00
Collecting openai<2.25.0,>=1.99.1 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/c9/30/844dc675ee6902579b8eef01ed23917cc9319a1c9c0c14ec6e39340c96d0/openai-2.24.0-py3-none-any.whl (1.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 6.8 MB/s eta 0:00:00
Collecting pydantic>=2.12.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/5a/87/b70ad306ebb6f9b585f114d0ac2137d792b48be34d732d60e597c2f8465a/pydantic-2.12.5-py3-none-any.whl (463 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 463.6/463.6 kB 12.2 MB/s eta 0:00:00
Requirement already satisfied: prometheus_client>=0.18.0 in /root/miniconda3/lib/python3.12/site-packages (from vllm) (0.22.1)
Requirement already satisfied: pillow in /root/miniconda3/lib/python3.12/site-packages (from vllm) (11.3.0)
Collecting prometheus-fastapi-instrumentator>=7.0.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/27/72/0824c18f3bc75810f55dacc2dd933f6ec829771180245ae3cc976195dec0/prometheus_fastapi_instrumentator-7.1.0-py3-none-any.whl (19 kB)
Collecting tiktoken>=0.6.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/f4/90/3dae6cc5436137ebd38944d396b5849e167896fc2073da643a49f372dc4f/tiktoken-0.12.0-cp312-cp312-manylinux_2_28_x86_64.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 14.4 MB/s eta 0:00:00
Collecting lm-format-enforcer==0.11.3 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/a0/ef/11292bb0b85cf4c93447cab5a29f64576ed14d3ab4280e35ddd23486594a/lm_format_enforcer-0.11.3-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.4/45.4 kB 21.0 MB/s eta 0:00:00
Collecting llguidance<1.4.0,>=1.3.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/83/a8/1ff2bedb8f9acb46a2d2d603415d272bb622c142ea86f5b95445cc6e366c/llguidance-1.3.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 11.0 MB/s eta 0:00:00
Collecting outlines_core==0.2.11 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/92/c7/a65d1fddf49830ebc41422294eacde35286d9f68994a8aa905cb14f5aade/outlines_core-0.2.11-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 12.9 MB/s eta 0:00:00
Collecting diskcache==5.6.3 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB 16.4 MB/s eta 0:00:00
Requirement already satisfied: lark==1.2.2 in /root/miniconda3/lib/python3.12/site-packages (from vllm) (1.2.2)
Collecting xgrammar<1.0.0,>=0.1.32 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/69/62/65e664d861cdadf2d788c03dd8fe67f1faaa7bd4bd2317a2ab850aebee20/xgrammar-0.1.32-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (37.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 37.7/37.7 MB 11.5 MB/s eta 0:00:00
Requirement already satisfied: typing_extensions>=4.10 in /root/miniconda3/lib/python3.12/site-packages (from vllm) (4.14.1)
Requirement already satisfied: filelock>=3.16.1 in /root/miniconda3/lib/python3.12/site-packages (from vllm) (3.18.0)
Collecting partial-json-parser (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/42/32/658973117bf0fd82a24abbfb94fe73a5e86216e49342985e10acce54775a/partial_json_parser-0.2.1.1.post7-py3-none-any.whl (10 kB)
Requirement already satisfied: pyzmq>=25.0.0 in /root/miniconda3/lib/python3.12/site-packages (from vllm) (27.0.1)
Collecting msgspec (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/5c/a2/488517a43ccf5a4b6b6eca6dd4ede0bd82b043d1539dd6bb908a19f8efd3/msgspec-0.20.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (224 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 224.9/224.9 kB 44.8 MB/s eta 0:00:00
Collecting gguf>=0.17.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/5e/0c/e0f1eae7535a97476fb903f65301e35da2a66182b8161066b7eb312b2cb8/gguf-0.18.0-py3-none-any.whl (114 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 114.2/114.2 kB 45.5 MB/s eta 0:00:00
Collecting mistral_common>=1.10.0 (from mistral_common[image]>=1.10.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/87/c6/1429a0a3ab40f8530492b62b52eb792266c261b22ed62aa7f25d61d531ae/mistral_common-1.10.0-py3-none-any.whl (6.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/6.5 MB 9.6 MB/s eta 0:00:00
Collecting opencv-python-headless>=4.13.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/4b/33/b5db29a6c00eb8f50708110d8d453747ca125c8b805bc437b289dbdcc057/opencv_python_headless-4.13.0.92-cp37-abi3-manylinux_2_28_x86_64.whl (60.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.4/60.4 MB 10.7 MB/s eta 0:00:00
Requirement already satisfied: pyyaml in /root/miniconda3/lib/python3.12/site-packages (from vllm) (6.0.2)
Requirement already satisfied: six>=1.16.0 in /root/miniconda3/lib/python3.12/site-packages (from vllm) (1.17.0)
Collecting setuptools<81.0.0,>=77.0.3 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/94/b8/f1f62a5e3c0ad2ff1d189590bfa4c46b4f3b6e49cef6f26c6ee4e575394d/setuptools-80.10.2-py3-none-any.whl (1.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 14.2 MB/s eta 0:00:00
Collecting einops (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/2a/09/f8d8f8f31e4483c10a906437b4ce31bdf3d6d417b73fe33f1a8b59e34228/einops-0.8.2-py3-none-any.whl (65 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.6/65.6 kB 26.7 MB/s eta 0:00:00
Collecting compressed-tensors==0.13.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/0b/b5/61ac2563c62490922b603c09113a083fd74af3630ec3931e769484d6dcb5/compressed_tensors-0.13.0-py3-none-any.whl (192 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 192.6/192.6 kB 42.8 MB/s eta 0:00:00
Collecting depyf==0.20.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/cf/65/4df6936130b56e1429114e663e7c1576cf845f3aef1b2dd200c0a5d19dba/depyf-0.20.0-py3-none-any.whl (39 kB)
Collecting cloudpickle (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/88/39/799be3f2f0f38cc727ee3b4f1445fe6d5e4133064ec2e4115069418a5bb6/cloudpickle-3.1.2-py3-none-any.whl (22 kB)
Collecting watchfiles (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/cf/68/5707da262a119fb06fbe214d82dd1fe4a6f4af32d2d14de368d0349eb52a/watchfiles-1.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (456 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 456.8/456.8 kB 40.0 MB/s eta 0:00:00
Requirement already satisfied: python-json-logger in /root/miniconda3/lib/python3.12/site-packages (from vllm) (3.3.0)
Collecting ninja (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/ed/de/0e6edf44d6a04dabd0318a519125ed0415ce437ad5a1ec9b9be03d9048cf/ninja-1.13.0-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (180 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 180.7/180.7 kB 39.5 MB/s eta 0:00:00
Collecting pybase64 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/62/f7/965b79ff391ad208b50e412b5d3205ccce372a2d27b7218ae86d5295b105/pybase64-1.4.3-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl (71 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.6/71.6 kB 36.8 MB/s eta 0:00:00
Collecting cbor2 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/db/9d/7ede2cc42f9bb4260492e7d29d2aab781eacbbcfb09d983de1e695077199/cbor2-5.9.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (288 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 288.2/288.2 kB 19.8 MB/s eta 0:00:00
Collecting ijson (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/6d/81/2fee58f9024a3449aee83edfa7167fb5ccd7e1af2557300e28531bb68e16/ijson-3.5.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (149 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 149.7/149.7 kB 28.8 MB/s eta 0:00:00
Collecting setproctitle (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/d0/99/71630546b9395b095f4082be41165d1078204d1696c2d9baade3de3202d0/setproctitle-1.3.7-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (32 kB)
Collecting openai-harmony>=0.0.3 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/25/3f/1a192b93bb47c6b44cd98ba8cc1d3d2a9308f1bb700c3017e6352da11bda/openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 11.9 MB/s eta 0:00:00
Collecting anthropic>=0.71.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/63/5f/67db29c6e5d16c8c9c4652d3efb934d89cb750cad201539141781d8eae14/anthropic-0.86.0-py3-none-any.whl (469 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 469.4/469.4 kB 27.6 MB/s eta 0:00:00
Collecting model-hosting-container-standards<1.0.0,>=0.1.13 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/48/94/052452842d39c562237a70345c57ec213a9db22bd25bba998fd2b32d70a7/model_hosting_container_standards-0.1.14-py3-none-any.whl (121 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.4/121.4 kB 32.0 MB/s eta 0:00:00
Collecting mcp (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/fd/d9/eaa1f80170d2b7c5ba23f3b59f766f3a0bb41155fbc32a69adfa1adaaef9/mcp-1.26.0-py3-none-any.whl (233 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 233.6/233.6 kB 28.6 MB/s eta 0:00:00
Collecting opentelemetry-sdk>=1.27.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/2c/c5/6a852903d8bfac758c6dc6e9a68b015d3c33f2f1be5e9591e0f4b69c7e0a/opentelemetry_sdk-1.40.0-py3-none-any.whl (141 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.0/142.0 kB 32.3 MB/s eta 0:00:00
Collecting opentelemetry-api>=1.27.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/5f/bf/93795954016c522008da367da292adceed71cca6ee1717e1d64c83089099/opentelemetry_api-1.40.0-py3-none-any.whl (68 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.7/68.7 kB 32.4 MB/s eta 0:00:00
Collecting opentelemetry-exporter-otlp>=1.27.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/2d/fc/aea77c28d9f3ffef2fdafdc3f4a235aee4091d262ddabd25882f47ce5c5f/opentelemetry_exporter_otlp-1.40.0-py3-none-any.whl (7.0 kB)
Collecting opentelemetry-semantic-conventions-ai>=0.4.1 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/a7/18/35fec29ed6e49bcbbe629b790cc0deb5bb58da9caceee29b39b54d3d7f47/opentelemetry_semantic_conventions_ai-0.5.0-py3-none-any.whl (10.0 kB)
Collecting numba==0.61.2 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/9a/2d/e518df036feab381c23a624dac47f8445ac55686ec7f11083655eb707da3/numba-0.61.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.9/3.9 MB 10.1 MB/s eta 0:00:00
Collecting torch==2.10.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/b3/7a/abada41517ce0011775f0f4eacc79659bc9bc6c361e6bfe6f7052a6b9363/torch-2.10.0-3-cp312-cp312-manylinux_2_28_x86_64.whl (915.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 915.6/915.6 MB 4.6 MB/s eta 0:00:00
Collecting torchaudio==2.10.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/98/25/e55a30d7138f8fe56ed006df25b0a3c27681f0ec7bc9989e1778e6d559c3/torchaudio-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl (1.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 14.1 MB/s eta 0:00:00
Collecting torchvision==0.25.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/68/2f/f24b039169db474e8688f649377de082a965fbf85daf4e46c44412f1d15a/torchvision-0.25.0-cp312-cp312-manylinux_2_28_x86_64.whl (8.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.1/8.1 MB 10.9 MB/s eta 0:00:00
Collecting flashinfer-python==0.6.6 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/e0/61/385d06755f3ab66333018285657adf0daf8a90a129448231fd09e315bd2e/flashinfer_python-0.6.6-py3-none-any.whl (7.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.8/7.8 MB 11.1 MB/s eta 0:00:00
Collecting nvidia-cudnn-frontend<1.19.0,>=1.13.0 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/c6/52/08f98262e77b1cbcc834cc1a5db494d0661ea1dbdea58c2e2d51a57fdaca/nvidia_cudnn_frontend-1.18.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (2.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.2/2.2 MB 13.7 MB/s eta 0:00:00
Collecting nvidia-cutlass-dsl>=4.4.0.dev1 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/a9/03/678dab0383db1ddfc449da216220f40404189eb36eeed9d87a4fa4bdb0e6/nvidia_cutlass_dsl-4.4.2-py3-none-any.whl (10 kB)
Collecting quack-kernels>=0.2.7 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/45/e6/fb900aa5d6053069c3180382874520e7313362fa03994a034626906e7094/quack_kernels-0.3.5-py3-none-any.whl (195 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 195.7/195.7 kB 28.3 MB/s eta 0:00:00
Collecting loguru (from compressed-tensors==0.13.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/0c/29/0348de65b8cc732daa3e33e67806420b2ae89bdce2b04af740289c5c6c8c/loguru-0.7.3-py3-none-any.whl (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.6/61.6 kB 33.3 MB/s eta 0:00:00
Collecting astor (from depyf==0.20.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/c3/88/97eef84f48fa04fbd6750e62dcceafba6c63c81b7ac1420856c8dcc0a3f9/astor-0.8.1-py2.py3-none-any.whl (27 kB)
Collecting dill (from depyf==0.20.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/1e/77/dc8c558f7593132cf8fefec57c4f60c83b16941c574ac5f619abb3ae7933/dill-0.4.1-py3-none-any.whl (120 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.0/120.0 kB 2.4 MB/s eta 0:00:00
Collecting apache-tvm-ffi!=0.1.8,!=0.1.8.post0,<0.2,>=0.1.6 (from flashinfer-python==0.6.6->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/70/ef/5402da5d37f5270fd88ea0348acca78dba9be8bdbf6c2bcae0935eb03ef1/apache_tvm_ffi-0.1.9-cp312-abi3-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 10.0 MB/s eta 0:00:00
Requirement already satisfied: click in /root/miniconda3/lib/python3.12/site-packages (from flashinfer-python==0.6.6->vllm) (8.3.1)
Collecting nvidia-ml-py (from flashinfer-python==0.6.6->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/8a/24/fc256107d23597fa33d319505ce77160fa1a2349c096d01901ffc7cb7fc4/nvidia_ml_py-13.595.45-py3-none-any.whl (51 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 51.8/51.8 kB 23.7 MB/s eta 0:00:00
Collecting packaging>=24.2 (from flashinfer-python==0.6.6->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/b7/b9/c538f279a4e237a006a2c98387d081e9eb060d203d8ed34467cc0f0b9b53/packaging-26.0-py3-none-any.whl (74 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.4/74.4 kB 32.2 MB/s eta 0:00:00
Collecting tabulate (from flashinfer-python==0.6.6->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/99/55/db07de81b5c630da5cbf5c7df646580ca26dfaefa593667fc6f2fe016d2e/tabulate-0.10.0-py3-none-any.whl (39 kB)
Collecting interegular>=0.3.2 (from lm-format-enforcer==0.11.3->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/c4/01/72d6472f80651673716d1deda2a5bbb633e563ecf94f4479da5519d69d25/interegular-0.3.3-py37-none-any.whl (23 kB)
Collecting llvmlite<0.45,>=0.44.0dev0 (from numba==0.61.2->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/cb/da/8341fd3056419441286c8e26bf436923021005ece0bff5f41906476ae514/llvmlite-0.44.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.4/42.4 MB 9.8 MB/s eta 0:00:00
Collecting numpy (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/8c/3d/1e1db36cfd41f895d266b103df00ca5b3cbe965184df824dec5c08c6b803/numpy-2.2.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.5/16.5 MB 10.4 MB/s eta 0:00:00
Requirement already satisfied: sympy>=1.13.3 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (1.14.0)
Requirement already satisfied: networkx>=2.5.1 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (3.5)
Requirement already satisfied: jinja2 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (3.1.6)
Requirement already satisfied: fsspec>=0.8.5 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (2025.7.0)
Collecting cuda-bindings==12.9.4 (from torch==2.10.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/a9/c1/dabe88f52c3e3760d861401bb994df08f672ec893b8f7592dc91626adcf3/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (12.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.2/12.2 MB 10.3 MB/s eta 0:00:00
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.8.93 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (12.8.93)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.8.90 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (12.8.90)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.8.90 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (12.8.90)
Requirement already satisfied: nvidia-cudnn-cu12==9.10.2.21 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (9.10.2.21)
Requirement already satisfied: nvidia-cublas-cu12==12.8.4.1 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (12.8.4.1)
Requirement already satisfied: nvidia-cufft-cu12==11.3.3.83 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (11.3.3.83)
Requirement already satisfied: nvidia-curand-cu12==10.3.9.90 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (10.3.9.90)
Requirement already satisfied: nvidia-cusolver-cu12==11.7.3.90 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (11.7.3.90)
Requirement already satisfied: nvidia-cusparse-cu12==12.5.8.93 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (12.5.8.93)
Requirement already satisfied: nvidia-cusparselt-cu12==0.7.1 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (0.7.1)
Collecting nvidia-nccl-cu12==2.27.5 (from torch==2.10.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/6e/89/f7a07dc961b60645dbbf42e80f2bc85ade7feb9a491b11a1e973aa00071f/nvidia_nccl_cu12-2.27.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (322.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 322.3/322.3 MB 7.1 MB/s eta 0:00:00
Collecting nvidia-nvshmem-cu12==3.4.5 (from torch==2.10.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/b5/09/6ea3ea725f82e1e76684f0708bbedd871fc96da89945adeba65c3835a64c/nvidia_nvshmem_cu12-3.4.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (139.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.1/139.1 MB 8.9 MB/s eta 0:00:00
Requirement already satisfied: nvidia-nvtx-cu12==12.8.90 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (12.8.90)
Requirement already satisfied: nvidia-nvjitlink-cu12==12.8.93 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (12.8.93)
Requirement already satisfied: nvidia-cufile-cu12==1.13.1.3 in /root/miniconda3/lib/python3.12/site-packages (from torch==2.10.0->vllm) (1.13.1.3)
Collecting triton==3.6.0 (from torch==2.10.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/ab/a8/cdf8b3e4c98132f965f88c2313a4b493266832ad47fb52f23d14d4f86bb5/triton-3.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (188.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 188.3/188.3 MB 8.1 MB/s eta 0:00:00
Collecting cuda-pathfinder~=1.1 (from cuda-bindings==12.9.4->torch==2.10.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/c0/66/7b2c3d23dac4bb9629b4d9702f1f796bd41c01142c2b47be6fcfdeaf4ee4/cuda_pathfinder-1.4.4-py3-none-any.whl (48 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.8/48.8 kB 18.4 MB/s eta 0:00:00
Collecting aiohappyeyeballs>=2.5.0 (from aiohttp>=3.13.3->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/0f/15/5bf3b99495fb160b63f95972b81750f18f7f4e02ad051373b669d17d44f2/aiohappyeyeballs-2.6.1-py3-none-any.whl (15 kB)
Collecting aiosignal>=1.4.0 (from aiohttp>=3.13.3->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/fb/76/641ae371508676492379f16e2fa48f4e2c11741bd63c48be4b12a6b09cba/aiosignal-1.4.0-py3-none-any.whl (7.5 kB)
Requirement already satisfied: attrs>=17.3.0 in /root/miniconda3/lib/python3.12/site-packages (from aiohttp>=3.13.3->vllm) (25.3.0)
Collecting frozenlist>=1.1.1 (from aiohttp>=3.13.3->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/6a/bd/d91c5e39f490a49df14320f4e8c80161cfcce09f1e2cde1edd16a551abb3/frozenlist-1.8.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (242 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 242.4/242.4 kB 4.8 MB/s eta 0:00:00
Collecting multidict<7.0,>=4.5 (from aiohttp>=3.13.3->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/f3/8d/5e5be3ced1d12966fefb5c4ea3b2a5b480afcea36406559442c6e31d4a48/multidict-6.7.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (256 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 256.3/256.3 kB 14.4 MB/s eta 0:00:00
Collecting propcache>=0.2.0 (from aiohttp>=3.13.3->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/46/4b/3aae6835b8e5f44ea6a68348ad90f78134047b503765087be2f9912140ea/propcache-0.4.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (221 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 221.6/221.6 kB 17.2 MB/s eta 0:00:00
Collecting yarl<2.0,>=1.17.0 (from aiohttp>=3.13.3->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/66/3e/868e5c3364b6cee19ff3e1a122194fa4ce51def02c61023970442162859e/yarl-1.23.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (100 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.1/100.1 kB 15.2 MB/s eta 0:00:00
Requirement already satisfied: anyio<5,>=3.5.0 in /root/miniconda3/lib/python3.12/site-packages (from anthropic>=0.71.0->vllm) (4.10.0)
Requirement already satisfied: distro<2,>=1.7.0 in /root/miniconda3/lib/python3.12/site-packages (from anthropic>=0.71.0->vllm) (1.9.0)
Collecting docstring-parser<1,>=0.15 (from anthropic>=0.71.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/55/e2/2537ebcff11c1ee1ff17d8d0b6f4db75873e3b0fb32c2d4a2ee31ecb310a/docstring_parser-0.17.0-py3-none-any.whl (36 kB)
Requirement already satisfied: httpx<1,>=0.25.0 in /root/miniconda3/lib/python3.12/site-packages (from anthropic>=0.71.0->vllm) (0.28.1)
Collecting jiter<1,>=0.4.0 (from anthropic>=0.71.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/f8/4c/09b93e30e984a187bc8aaa3510e1ec8dcbdcd71ca05d2f56aac0492453aa/jiter-0.13.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (360 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 360.7/360.7 kB 16.4 MB/s eta 0:00:00
Requirement already satisfied: sniffio in /root/miniconda3/lib/python3.12/site-packages (from anthropic>=0.71.0->vllm) (1.3.1)
Collecting starlette>=0.46.0 (from fastapi>=0.115.0->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/0b/c9/584bc9651441b4ba60cc4d557d8a547b5aff901af35bda3a4ee30c819b82/starlette-1.0.0-py3-none-any.whl (72 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.7/72.7 kB 15.4 MB/s eta 0:00:00
Collecting typing-inspection>=0.4.2 (from fastapi>=0.115.0->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/dc/9b/47798a6c91d8bdb567fe2698fe81e0c6b7cb7ef4d13da4114b41d239f65d/typing_inspection-0.4.2-py3-none-any.whl (14 kB)
Requirement already satisfied: annotated-doc>=0.0.2 in /root/miniconda3/lib/python3.12/site-packages (from fastapi>=0.115.0->fastapi[standard]>=0.115.0->vllm) (0.0.4)
Collecting fastapi-cli>=0.0.8 (from fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/c7/4b/68f9fe268e535d79c76910519530026a4f994ce07189ac0dded45c6af825/fastapi_cli-0.0.24-py3-none-any.whl (12 kB)
Collecting python-multipart>=0.0.18 (from fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/1b/d0/397f9626e711ff749a95d96b7af99b9c566a9bb5129b8e4c10fc4d100304/python_multipart-0.0.22-py3-none-any.whl (24 kB)
Collecting email-validator>=2.0.0 (from fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/de/15/545e2b6cf2e3be84bc1ed85613edd75b8aea69807a71c26f4ca6a9258e82/email_validator-2.3.0-py3-none-any.whl (35 kB)
Collecting uvicorn>=0.12.0 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/0a/89/f8827ccff89c1586027a105e5630ff6139a64da2515e24dafe860bd9ae4d/uvicorn-0.42.0-py3-none-any.whl (68 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.8/68.8 kB 16.4 MB/s eta 0:00:00
Collecting pydantic-settings>=2.0.0 (from fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/00/4b/ccc026168948fec4f7555b9164c724cf4125eac006e176541483d2c959be/pydantic_settings-2.13.1-py3-none-any.whl (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.9/58.9 kB 18.5 MB/s eta 0:00:00
Collecting pydantic-extra-types>=2.0.0 (from fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/17/c1/3226e6d7f5a4f736f38ac11a6fbb262d701889802595cdb0f53a885ac2e0/pydantic_extra_types-2.11.1-py3-none-any.whl (79 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79.5/79.5 kB 25.0 MB/s eta 0:00:00
Requirement already satisfied: jsonschema>=4.21.1 in /root/miniconda3/lib/python3.12/site-packages (from mistral_common>=1.10.0->mistral_common[image]>=1.10.0->vllm) (4.25.0)
Collecting jmespath (from model-hosting-container-standards<1.0.0,>=0.1.13->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/14/2f/967ba146e6d58cf6a652da73885f52fc68001525b4197effc174321d70b4/jmespath-1.1.0-py3-none-any.whl (20 kB)
Requirement already satisfied: supervisor>=4.2.0 in /root/miniconda3/lib/python3.12/site-packages (from model-hosting-container-standards<1.0.0,>=0.1.13->vllm) (4.2.5)
Collecting nvidia-cutlass-dsl-libs-base==4.4.2 (from nvidia-cutlass-dsl>=4.4.0.dev1->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/56/98/e264964741d9cc9816625d9600d17a5249fd5cbd8c2d166fb0d0c34dfe5a/nvidia_cutlass_dsl_libs_base-4.4.2-cp312-cp312-manylinux_2_28_x86_64.whl (74.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.4/74.4 MB 8.7 MB/s eta 0:00:00
Collecting cuda-python>=12.8 (from nvidia-cutlass-dsl-libs-base==4.4.2->nvidia-cutlass-dsl>=4.4.0.dev1->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/4a/da/b4dbe129f941afe1c24a09ba53521b78875626763d96414798a74763282f/cuda_python-13.2.0-py3-none-any.whl (8.1 kB)
Collecting importlib-metadata<8.8.0,>=6.0 (from opentelemetry-api>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/fa/5e/f8e9a1d23b9c20a551a8a02ea3637b4642e22c2626e3a13a9a29cdea99eb/importlib_metadata-8.7.1-py3-none-any.whl (27 kB)
Collecting opentelemetry-exporter-otlp-proto-grpc==1.40.0 (from opentelemetry-exporter-otlp>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/96/6f/7ee0980afcbdcd2d40362da16f7f9796bd083bf7f0b8e038abfbc0300f5d/opentelemetry_exporter_otlp_proto_grpc-1.40.0-py3-none-any.whl (20 kB)
Collecting opentelemetry-exporter-otlp-proto-http==1.40.0 (from opentelemetry-exporter-otlp>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/a0/3a/8865d6754e61c9fb170cdd530a124a53769ee5f740236064816eb0ca7301/opentelemetry_exporter_otlp_proto_http-1.40.0-py3-none-any.whl (19 kB)
Collecting googleapis-common-protos~=1.57 (from opentelemetry-exporter-otlp-proto-grpc==1.40.0->opentelemetry-exporter-otlp>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/69/28/23eea8acd65972bbfe295ce3666b28ac510dfcb115fac089d3edb0feb00a/googleapis_common_protos-1.73.0-py3-none-any.whl (297 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 297.6/297.6 kB 15.8 MB/s eta 0:00:00
Requirement already satisfied: grpcio<2.0.0,>=1.63.2 in /root/miniconda3/lib/python3.12/site-packages (from opentelemetry-exporter-otlp-proto-grpc==1.40.0->opentelemetry-exporter-otlp>=1.27.0->vllm) (1.74.0)
Collecting opentelemetry-exporter-otlp-proto-common==1.40.0 (from opentelemetry-exporter-otlp-proto-grpc==1.40.0->opentelemetry-exporter-otlp>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/8b/ca/8f122055c97a932311a3f640273f084e738008933503d0c2563cd5d591fc/opentelemetry_exporter_otlp_proto_common-1.40.0-py3-none-any.whl (18 kB)
Collecting opentelemetry-proto==1.40.0 (from opentelemetry-exporter-otlp-proto-grpc==1.40.0->opentelemetry-exporter-otlp>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/b9/b2/189b2577dde745b15625b3214302605b1353436219d42b7912e77fa8dc24/opentelemetry_proto-1.40.0-py3-none-any.whl (72 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.1/72.1 kB 14.7 MB/s eta 0:00:00
Collecting protobuf!=6.30.*,!=6.31.*,!=6.32.*,!=6.33.0.*,!=6.33.1.*,!=6.33.2.*,!=6.33.3.*,!=6.33.4.*,>=5.29.6 (from vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/16/92/d1e32e3e0d894fe00b15ce28ad4944ab692713f2e7f0a99787405e43533a/protobuf-6.33.6-cp39-abi3-manylinux2014_x86_64.whl (323 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 323.4/323.4 kB 15.5 MB/s eta 0:00:00
Collecting opentelemetry-semantic-conventions==0.61b0 (from opentelemetry-sdk>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/b2/37/cc6a55e448deaa9b27377d087da8615a3416d8ad523d5960b78dbeadd02a/opentelemetry_semantic_conventions-0.61b0-py3-none-any.whl (231 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 231.6/231.6 kB 16.4 MB/s eta 0:00:00
Collecting starlette>=0.46.0 (from fastapi>=0.115.0->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/81/0d/13d1d239a25cbfb19e740db83143e95c772a1fe10202dda4b76792b114dd/starlette-0.52.1-py3-none-any.whl (74 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.3/74.3 kB 16.1 MB/s eta 0:00:00
Collecting annotated-types>=0.6.0 (from pydantic>=2.12.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl (13 kB)
Collecting pydantic-core==2.41.5 (from pydantic>=2.12.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/0d/76/941cc9f73529988688a665a5c0ecff1112b3d95ab48f81db5f7606f522d3/pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 13.5 MB/s eta 0:00:00
Collecting torch-c-dlpack-ext (from quack-kernels>=0.2.7->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/e2/79/a914539b4785f3e44f891aa012a886edb8bc10fe081c440981c57543ce21/torch_c_dlpack_ext-0.1.5-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (897 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.8/897.8 kB 17.5 MB/s eta 0:00:00
Requirement already satisfied: charset-normalizer<4,>=2 in /root/miniconda3/lib/python3.12/site-packages (from requests>=2.26.0->vllm) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /root/miniconda3/lib/python3.12/site-packages (from requests>=2.26.0->vllm) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /root/miniconda3/lib/python3.12/site-packages (from requests>=2.26.0->vllm) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in /root/miniconda3/lib/python3.12/site-packages (from requests>=2.26.0->vllm) (2024.2.2)
Requirement already satisfied: huggingface-hub<2.0,>=0.16.4 in /root/miniconda3/lib/python3.12/site-packages (from tokenizers>=0.21.1->vllm) (1.7.2)
Collecting huggingface-hub<2.0,>=0.16.4 (from tokenizers>=0.21.1->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/a8/af/48ac8483240de756d2438c380746e7130d1c6f75802ef22f3c6d49982787/huggingface_hub-0.36.2-py3-none-any.whl (566 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 566.4/566.4 kB 13.5 MB/s eta 0:00:00
Collecting safetensors>=0.4.3 (from transformers<5,>=4.56.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/a0/60/429e9b1cb3fc651937727befe258ea24122d9663e4d5709a48c9cbfceecb/safetensors-0.7.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (507 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 507.2/507.2 kB 32.9 MB/s eta 0:00:00
Collecting httpx-sse>=0.4 (from mcp->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/d2/fd/6668e5aec43ab844de6fc74927e155a3b37bf40d7c3790e49fc0406b6578/httpx_sse-0.4.3-py3-none-any.whl (9.0 kB)
Collecting pyjwt>=2.10.1 (from pyjwt[crypto]>=2.10.1->mcp->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/e5/7a/8dd906bd22e79e47397a61742927f6747fe93242ef86645ee9092e610244/pyjwt-2.12.1-py3-none-any.whl (29 kB)
Collecting sse-starlette>=1.6.1 (from mcp->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/78/e2/b8cff57a67dddf9a464d7e943218e031617fb3ddc133aeeb0602ff5f6c85/sse_starlette-3.3.3-py3-none-any.whl (14 kB)
Collecting dnspython>=2.0.0 (from email-validator>=2.0.0->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/ba/5a/18ad964b0086c6e62e2e7500f7edc89e3faa45033c71c1893d34eed2b2de/dnspython-2.8.0-py3-none-any.whl (331 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 331.1/331.1 kB 33.9 MB/s eta 0:00:00
Requirement already satisfied: typer>=0.16.0 in /root/miniconda3/lib/python3.12/site-packages (from fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) (0.24.1)
Collecting rich-toolkit>=0.14.8 (from fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/fb/3c/c923619f6d2f5fafcc96fec0aaf9550a46cd5b6481f06e0c6b66a2a4fed0/rich_toolkit-0.19.7-py3-none-any.whl (32 kB)
Collecting fastapi-cloud-cli>=0.1.1 (from fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/40/cc/1ccca747f5609be27186ea8c9219449142f40e3eded2c6089bba6a6ecc82/fastapi_cloud_cli-0.15.0-py3-none-any.whl (32 kB)
Requirement already satisfied: httpcore==1.* in /root/miniconda3/lib/python3.12/site-packages (from httpx<1,>=0.25.0->anthropic>=0.71.0->vllm) (1.0.9)
Requirement already satisfied: h11>=0.16 in /root/miniconda3/lib/python3.12/site-packages (from httpcore==1.*->httpx<1,>=0.25.0->anthropic>=0.71.0->vllm) (0.16.0)
Requirement already satisfied: hf-xet<2.0.0,>=1.1.3 in /root/miniconda3/lib/python3.12/site-packages (from huggingface-hub<2.0,>=0.16.4->tokenizers>=0.21.1->vllm) (1.4.2)
Collecting zipp>=3.20 (from importlib-metadata<8.8.0,>=6.0->opentelemetry-api>=1.27.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/2e/54/647ade08bf0db230bfea292f893923872fd20be6ac6f53b2b936ba839d75/zipp-3.23.0-py3-none-any.whl (10 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /root/miniconda3/lib/python3.12/site-packages (from jinja2->torch==2.10.0->vllm) (3.0.2)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /root/miniconda3/lib/python3.12/site-packages (from jsonschema>=4.21.1->mistral_common>=1.10.0->mistral_common[image]>=1.10.0->vllm) (2025.4.1)
Requirement already satisfied: referencing>=0.28.4 in /root/miniconda3/lib/python3.12/site-packages (from jsonschema>=4.21.1->mistral_common>=1.10.0->mistral_common[image]>=1.10.0->vllm) (0.36.2)
Requirement already satisfied: rpds-py>=0.7.1 in /root/miniconda3/lib/python3.12/site-packages (from jsonschema>=4.21.1->mistral_common>=1.10.0->mistral_common[image]>=1.10.0->vllm) (0.26.0)
Collecting pycountry>=23 (from pydantic-extra-types[pycountry]>=2.10.5->mistral_common>=1.10.0->mistral_common[image]>=1.10.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/9c/42/7703bd45b62fecd44cd7d3495423097e2f7d28bc2e99e7c1af68892ab157/pycountry-26.2.16-py3-none-any.whl (8.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 10.6 MB/s eta 0:00:00
Collecting python-dotenv>=0.21.0 (from pydantic-settings>=2.0.0->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/0b/d7/1959b9648791274998a9c3526f6d0ec8fd2233e4d4acce81bbae76b44b2a/python_dotenv-1.2.2-py3-none-any.whl (22 kB)
Requirement already satisfied: cryptography>=3.4.0 in /root/miniconda3/lib/python3.12/site-packages (from pyjwt[crypto]>=2.10.1->mcp->vllm) (42.0.5)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /root/miniconda3/lib/python3.12/site-packages (from sympy>=1.13.3->torch==2.10.0->vllm) (1.3.0)
Collecting httptools>=0.6.3 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/84/a6/b3965e1e146ef5762870bbe76117876ceba51a201e18cc31f5703e454596/httptools-0.7.1-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (517 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 517.7/517.7 kB 43.0 MB/s eta 0:00:00
Collecting uvloop>=0.15.1 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/5f/6f/e62b4dfc7ad6518e7eff2516f680d02a0f6eb62c0c212e152ca708a0085e/uvloop-0.22.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (4.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 12.6 MB/s eta 0:00:00
Collecting websockets>=10.4 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/18/29/71729b4671f21e1eaa5d6573031ab810ad2936c8175f03f97f3ff164c802/websockets-16.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (184 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 184.9/184.9 kB 22.0 MB/s eta 0:00:00
Requirement already satisfied: cffi>=1.12 in /root/miniconda3/lib/python3.12/site-packages (from cryptography>=3.4.0->pyjwt[crypto]>=2.10.1->mcp->vllm) (1.16.0)
INFO: pip is looking at multiple versions of cuda-python to determine which version is compatible with other requirements. This could take a while.
Collecting cuda-python>=12.8 (from nvidia-cutlass-dsl-libs-base==4.4.2->nvidia-cutlass-dsl>=4.4.0.dev1->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/cd/08/b5e3b9822662d72d540d830531e3ab6a7cabbda3dd56175696aabccfeb76/cuda_python-13.1.1-py3-none-any.whl (8.0 kB)
Downloading http://mirrors.aliyun.com/pypi/packages/3d/4d/876c2f87d34ccde0f11688f07e98a43cb3498cc115ee85fc7ae79711b7ae/cuda_python-13.1.0-py3-none-any.whl (7.6 kB)
Downloading http://mirrors.aliyun.com/pypi/packages/31/5f/beaa12a11b051027eec0b041df01c6690db4f02e3b2e8fadd5a0eeb4df52/cuda_python-13.0.3-py3-none-any.whl (7.6 kB)
Downloading http://mirrors.aliyun.com/pypi/packages/36/4d/d04772e5ba415aad4633796d636a3abbd8f779b438c3441d795e6bc9f172/cuda_python-13.0.2-py3-none-any.whl (7.6 kB)
Downloading http://mirrors.aliyun.com/pypi/packages/02/02/078f4cba58349faad5597306ca54bf0bf129f8c713b261e1def59468a505/cuda_python-13.0.1-py3-none-any.whl (7.6 kB)
Downloading http://mirrors.aliyun.com/pypi/packages/05/99/7df7b57a5eba85b25a76c9c247c88e79770b4902bce266dbf0fc58f21198/cuda_python-13.0.0-py3-none-any.whl (7.6 kB)
Downloading http://mirrors.aliyun.com/pypi/packages/57/69/4a79126959ad6f1653504122ee1eb22d089dd6272d3fa37694dcdeb78ba5/cuda_python-12.9.6-py3-none-any.whl (7.6 kB)
INFO: pip is still looking at multiple versions of cuda-python to determine which version is compatible with other requirements. This could take a while.
Downloading http://mirrors.aliyun.com/pypi/packages/0a/02/ce79a804a2d6ee7dc2d1637b75b7c3f01eb90a796915d4d3a1ac42e2d6e6/cuda_python-12.9.5-py3-none-any.whl (7.6 kB)
Downloading http://mirrors.aliyun.com/pypi/packages/af/f3/6b032a554019cfb3447e671798c1bd3e79b5f1af20d10253f56cea269ef2/cuda_python-12.9.4-py3-none-any.whl (7.6 kB)
Collecting rignore>=0.5.1 (from fastapi-cloud-cli>=0.1.1->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/85/e5/7f99bd0cc9818a91d0e8b9acc65b792e35750e3bdccd15a7ee75e64efca4/rignore-0.7.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (959 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 959.8/959.8 kB 9.2 MB/s eta 0:00:00
Collecting sentry-sdk>=2.20.0 (from fastapi-cloud-cli>=0.1.1->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/9a/66/20465097782d7e1e742d846407ea7262d338c6e876ddddad38ca8907b38f/sentry_sdk-2.55.0-py2.py3-none-any.whl (449 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 449.3/449.3 kB 16.4 MB/s eta 0:00:00
Collecting fastar>=0.8.0 (from fastapi-cloud-cli>=0.1.1->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm)
Downloading http://mirrors.aliyun.com/pypi/packages/41/df/d663214d35380b07a24a796c48d7d7d4dc3a28ec0756edbcb7e2a81dc572/fastar-0.9.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (819 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 819.0/819.0 kB 8.8 MB/s eta 0:00:00
Requirement already satisfied: rich>=13.7.1 in /root/miniconda3/lib/python3.12/site-packages (from rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) (14.3.3)
Requirement already satisfied: shellingham>=1.3.0 in /root/miniconda3/lib/python3.12/site-packages (from typer>=0.16.0->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) (1.5.4)
Requirement already satisfied: pycparser in /root/miniconda3/lib/python3.12/site-packages (from cffi>=1.12->cryptography>=3.4.0->pyjwt[crypto]>=2.10.1->mcp->vllm) (2.21)
Requirement already satisfied: markdown-it-py>=2.2.0 in /root/miniconda3/lib/python3.12/site-packages (from rich>=13.7.1->rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) (4.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /root/miniconda3/lib/python3.12/site-packages (from rich>=13.7.1->rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) (2.19.2)
Requirement already satisfied: mdurl~=0.1 in /root/miniconda3/lib/python3.12/site-packages (from markdown-it-py>=2.2.0->rich>=13.7.1->rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) (0.1.2)
Installing collected packages: py-cpuinfo, nvidia-ml-py, zipp, websockets, uvloop, uvicorn, typing-inspection, triton, tabulate, setuptools, setproctitle, sentry-sdk, sentencepiece, safetensors, rignore, regex, python-multipart, python-dotenv, pyjwt, pydantic-core, pycountry, pybase64, protobuf, propcache, partial-json-parser, packaging, outlines_core, nvidia-nvshmem-cu12, nvidia-nccl-cu12, nvidia-cudnn-frontend, numpy, ninja, multidict, msgspec, loguru, llvmlite, llguidance, jmespath, jiter, interegular, ijson, httpx-sse, httptools, frozenlist, fastar, einops, docstring-parser, dnspython, diskcache, dill, cuda-pathfinder, cloudpickle, cbor2, cachetools, blake3, astor, apache-tvm-ffi, annotated-types, aiohappyeyeballs, yarl, watchfiles, tiktoken, starlette, pydantic, opentelemetry-proto, opencv-python-headless, numba, importlib-metadata, huggingface-hub, googleapis-common-protos, gguf, email-validator, depyf, cuda-bindings, aiosignal, torch, tokenizers, sse-starlette, rich-toolkit, pydantic-settings, pydantic-extra-types, prometheus-fastapi-instrumentator, opentelemetry-exporter-otlp-proto-common, opentelemetry-api, openai-harmony, openai, lm-format-enforcer, fastapi, cuda-python, anthropic, aiohttp, transformers, torchvision, torchaudio, torch-c-dlpack-ext, opentelemetry-semantic-conventions, nvidia-cutlass-dsl-libs-base, model-hosting-container-standards, mcp, fastapi-cloud-cli, fastapi-cli, xgrammar, opentelemetry-sdk, nvidia-cutlass-dsl, mistral_common, compressed-tensors, quack-kernels, opentelemetry-semantic-conventions-ai, opentelemetry-exporter-otlp-proto-http, opentelemetry-exporter-otlp-proto-grpc, flashinfer-python, opentelemetry-exporter-otlp, vllm
Attempting uninstall: triton
Found existing installation: triton 3.4.0
Uninstalling triton-3.4.0:
Successfully uninstalled triton-3.4.0
Attempting uninstall: setuptools
Found existing installation: setuptools 69.5.1
Uninstalling setuptools-69.5.1:
Successfully uninstalled setuptools-69.5.1
Attempting uninstall: protobuf
Found existing installation: protobuf 6.31.1
Uninstalling protobuf-6.31.1:
Successfully uninstalled protobuf-6.31.1
Attempting uninstall: packaging
Found existing installation: packaging 23.2
Uninstalling packaging-23.2:
Successfully uninstalled packaging-23.2
Attempting uninstall: nvidia-nccl-cu12
Found existing installation: nvidia-nccl-cu12 2.27.3
Uninstalling nvidia-nccl-cu12-2.27.3:
Successfully uninstalled nvidia-nccl-cu12-2.27.3
Attempting uninstall: numpy
Found existing installation: numpy 2.3.2
Uninstalling numpy-2.3.2:
Successfully uninstalled numpy-2.3.2
Attempting uninstall: huggingface-hub
Found existing installation: huggingface_hub 1.7.2
Uninstalling huggingface_hub-1.7.2:
Successfully uninstalled huggingface_hub-1.7.2
Attempting uninstall: torch
Found existing installation: torch 2.8.0+cu128
Uninstalling torch-2.8.0+cu128:
Successfully uninstalled torch-2.8.0+cu128
Attempting uninstall: torchvision
Found existing installation: torchvision 0.23.0+cu128
Uninstalling torchvision-0.23.0+cu128:
Successfully uninstalled torchvision-0.23.0+cu128
Successfully installed aiohappyeyeballs-2.6.1 aiohttp-3.13.3 aiosignal-1.4.0 annotated-types-0.7.0 anthropic-0.86.0 apache-tvm-ffi-0.1.9 astor-0.8.1 blake3-1.0.8 cachetools-7.0.5 cbor2-5.9.0 cloudpickle-3.1.2 compressed-tensors-0.13.0 cuda-bindings-12.9.4 cuda-pathfinder-1.4.4 cuda-python-12.9.4 depyf-0.20.0 dill-0.4.1 diskcache-5.6.3 dnspython-2.8.0 docstring-parser-0.17.0 einops-0.8.2 email-validator-2.3.0 fastapi-0.135.2 fastapi-cli-0.0.24 fastapi-cloud-cli-0.15.0 fastar-0.9.0 flashinfer-python-0.6.6 frozenlist-1.8.0 gguf-0.18.0 googleapis-common-protos-1.73.0 httptools-0.7.1 httpx-sse-0.4.3 huggingface-hub-0.36.2 ijson-3.5.0 importlib-metadata-8.7.1 interegular-0.3.3 jiter-0.13.0 jmespath-1.1.0 llguidance-1.3.0 llvmlite-0.44.0 lm-format-enforcer-0.11.3 loguru-0.7.3 mcp-1.26.0 mistral_common-1.10.0 model-hosting-container-standards-0.1.14 msgspec-0.20.0 multidict-6.7.1 ninja-1.13.0 numba-0.61.2 numpy-2.2.6 nvidia-cudnn-frontend-1.18.0 nvidia-cutlass-dsl-4.4.2 nvidia-cutlass-dsl-libs-base-4.4.2 nvidia-ml-py-13.595.45 nvidia-nccl-cu12-2.27.5 nvidia-nvshmem-cu12-3.4.5 openai-2.24.0 openai-harmony-0.0.8 opencv-python-headless-4.13.0.92 opentelemetry-api-1.40.0 opentelemetry-exporter-otlp-1.40.0 opentelemetry-exporter-otlp-proto-common-1.40.0 opentelemetry-exporter-otlp-proto-grpc-1.40.0 opentelemetry-exporter-otlp-proto-http-1.40.0 opentelemetry-proto-1.40.0 opentelemetry-sdk-1.40.0 opentelemetry-semantic-conventions-0.61b0 opentelemetry-semantic-conventions-ai-0.5.0 outlines_core-0.2.11 packaging-26.0 partial-json-parser-0.2.1.1.post7 prometheus-fastapi-instrumentator-7.1.0 propcache-0.4.1 protobuf-6.33.6 py-cpuinfo-9.0.0 pybase64-1.4.3 pycountry-26.2.16 pydantic-2.12.5 pydantic-core-2.41.5 pydantic-extra-types-2.11.1 pydantic-settings-2.13.1 pyjwt-2.12.1 python-dotenv-1.2.2 python-multipart-0.0.22 quack-kernels-0.3.5 regex-2026.2.28 rich-toolkit-0.19.7 rignore-0.7.6 safetensors-0.7.0 sentencepiece-0.2.1 sentry-sdk-2.55.0 setproctitle-1.3.7 setuptools-80.10.2 sse-starlette-3.3.3 starlette-0.52.1 tabulate-0.10.0 tiktoken-0.12.0 tokenizers-0.22.2 torch-2.10.0 torch-c-dlpack-ext-0.1.5 torchaudio-2.10.0 torchvision-0.25.0 transformers-4.57.6 triton-3.6.0 typing-inspection-0.4.2 uvicorn-0.42.0 uvloop-0.22.1 vllm-0.18.0 watchfiles-1.1.1 websockets-16.0 xgrammar-0.1.32 yarl-1.23.0 zipp-3.23.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
root@autodl-container-c6d54aa471-4479d4d0:~/autodl-tmp#
一、你最终安装的核心东西是什么?
你执行的是:
pip install vllm
最终成功安装了:
- vllm 0.18.0
它不是单独一个包,而是顺带安装了一大堆依赖,主要可以分成几类:
二、最核心的大件
这几个是体积最大、最关键的:
1. vLLM 本体
- vllm-0.18.0
这是高性能大模型推理框架本身,用来部署/推理 LLM。
2. PyTorch 整套
你安装/替换了:
- torch 2.10.0
- torchvision 0.25.0
- torchaudio 2.10.0
- triton 3.6.0
这说明 vLLM 把你原来环境里的 PyTorch 升级/替换成了它需要的版本。
日志里明确显示它卸载了原有版本:
- 卸载了 torch 2.8.0+cu128
- 卸载了 torchvision 0.23.0+cu128
- 卸载了 triton 3.4.0
也就是说,你现在的推理底座已经变成:
- PyTorch 2.10.0
- CUDA 12 相关依赖配套版本
3. CUDA / NVIDIA 相关高性能组件
新装或调整了这些:
- nvidia-nccl-cu12 2.27.5
- nvidia-nvshmem-cu12 3.4.5
- nvidia-cudnn-frontend 1.18.0
- cuda-bindings 12.9.4
- cuda-python 12.9.4
- cuda-pathfinder 1.4.4
- nvidia-cutlass-dsl 4.4.2
- nvidia-cutlass-dsl-libs-base 4.4.2
- nvidia-ml-py 13.595.45
这些主要作用:
- GPU 通信
- CUDA API 绑定
- 高性能 kernel
- 多卡推理支持
- NVIDIA 底层算子优化
其中:
- NCCL:多卡通信关键组件
- NVSHMEM:高性能 GPU 间共享内存通信
- CUTLASS DSL:NVIDIA 高性能 GEMM/算子支持
- CUDA Python / Bindings:让 Python 能直接调用 CUDA 能力
4. vLLM 专用高性能推理加速组件
这几个也很关键:
- flashinfer-python 0.6.6
- quack-kernels 0.3.5
- torch-c-dlpack-ext 0.1.5
- compressed-tensors 0.13.0
- depyf 0.20.0
- apache-tvm-ffi 0.1.9
作用大概是:
- 加速 attention / decoding
- 优化张量操作
- 支持压缩权重
- 支持更底层高性能 kernel
- 一些编译/调度/FFI 能力
三、模型生态相关依赖
这些是为了支持 HuggingFace 模型、tokenizer、权重格式等:
- transformers 4.57.6
- tokenizers 0.22.2
- huggingface-hub 0.36.2
- safetensors 0.7.0
- sentencepiece 0.2.1
- tiktoken 0.12.0
- gguf 0.18.0
- mistral_common 1.10.0
- openai-harmony 0.0.8
它们分别干嘛:
- transformers:加载 HuggingFace 模型
- tokenizers / sentencepiece / tiktoken:分词器
- huggingface-hub:从 HF 下载模型
- safetensors:加载 safetensors 权重
- gguf:支持 GGUF 格式模型
- mistral_common:Mistral 模型相关支持
四、API 服务相关依赖
vLLM 很常见的用法是启动 OpenAI 兼容服务,所以安装了 Web/API 框架:
- fastapi 0.135.2
- starlette 0.52.1
- uvicorn 0.42.0
- uvloop 0.22.1
- httptools 0.7.1
- websockets 16.0
- watchfiles 1.1.1
- python-multipart 0.0.22
- email-validator 2.3.0
- fastapi-cli 0.0.24
- fastapi-cloud-cli 0.15.0
- prometheus-fastapi-instrumentator 7.1.0
作用:
- 启动 HTTP 服务
- 提供 OpenAI 风格 API
- 处理 websocket / multipart
- 自动监控接口指标
- 热更新/文件监控
也就是说,你现在环境已经具备了直接跑:
vllm serve ...
或者类似 OpenAI API 服务的能力。
五、OpenAI / Anthropic / MCP 相关客户端
你还安装了:
- openai 2.24.0
- anthropic 0.86.0
- mcp 1.26.0
- httpx-sse 0.4.3
- sse-starlette 3.3.3
- pyjwt 2.12.1
这部分主要是为了:
- 兼容 OpenAI 接口风格
- 兼容 Anthropic 风格
- 支持 MCP(Model Context Protocol)相关功能
- 支持 SSE 流式输出
- 认证/鉴权
六、约束解码 / 结构化输出相关
这些是 vLLM 比较新的一类能力,支持 JSON schema、语法约束解码等:
- lm-format-enforcer 0.11.3
- llguidance 1.3.0
- outlines_core 0.2.11
- xgrammar 0.1.32
- partial-json-parser 0.2.1.1.post7
- interegular 0.3.3
作用:
- 强制模型输出符合 JSON/语法规则
- 结构化生成
- grammar-guided decoding
- 部分 JSON 流式解析
如果你要做“模型输出必须是 JSON”这种功能,这些包就是相关支持。
七、监控 / 可观测性相关
安装了 OpenTelemetry 一整套:
- opentelemetry-api 1.40.0
- opentelemetry-sdk 1.40.0
- opentelemetry-exporter-otlp 1.40.0
- opentelemetry-exporter-otlp-proto-common 1.40.0
- opentelemetry-exporter-otlp-proto-grpc 1.40.0
- opentelemetry-exporter-otlp-proto-http 1.40.0
- opentelemetry-proto 1.40.0
- opentelemetry-semantic-conventions 0.61b0
- opentelemetry-semantic-conventions-ai 0.5.0
- googleapis-common-protos 1.73.0
- protobuf 6.33.6
作用:
- 打点
- tracing
- metrics
- 将监控数据发给 OTLP 后端
如果你部署线上服务,这部分是有用的。
八、数值计算 / 编译优化相关
- numba 0.61.2
- llvmlite 0.44.0
- numpy 2.2.6
- einops 0.8.2
- cloudpickle 3.1.2
- msgspec 0.20.0
- ijson 3.5.0
- cbor2 5.9.0
- blake3 1.0.8
这些属于:
- 数值计算
- JIT/LLVM 编译支持
- 数据序列化/反序列化
- 哈希
- JSON 流式解析
九、图像/多模态相关
你还装了:
- opencv-python-headless 4.13.0.92
- pillow 已经存在
- torchvision
- mistral_common[image] 相关依赖
这说明 vLLM 这版不只是纯文本,也带了一些多模态/图像处理支持。
十、一些工具类依赖
比如:
- regex
- cachetools
- diskcache
- py-cpuinfo
- ninja
- setproctitle
- pybase64
- loguru
- tabulate
- packaging 26.0
- setuptools 80.10.2
- importlib-metadata 8.7.1
- zipp 3.23.0
- dnspython 2.8.0
- python-dotenv 1.2.2
- docstring-parser 0.17.0
- jmespath 1.1.0
- rignore 0.7.6
- sentry-sdk 2.55.0
- fastar 0.9.0
这些一般是:
- 构建工具
- 日志
- 配置管理
- 缓存
- 监控
- 辅助解析
十一、这次安装替换了你环境里的哪些已有包?
从日志里能明确看到,以下包被替换了:
被卸载并重装/降级/升级的
- triton:3.4.0 → 3.6.0
- setuptools:69.5.1 → 80.10.2
- protobuf:6.31.1 → 6.33.6
- packaging:23.2 → 26.0
- nvidia-nccl-cu12:2.27.3 → 2.27.5
- numpy:2.3.2 → 2.2.6
- huggingface-hub:1.7.2 → 0.36.2
- torch:2.8.0+cu128 → 2.10.0
- torchvision:0.23.0+cu128 → 0.25.0
这里要特别注意两点:
1)huggingface-hub 被“降版本”了
从:
- 1.7.2 → 0.36.2
这通常是因为 transformers/tokenizers/vllm 当前依赖链要求 <1.0 风格版本。
2)numpy 被换成了 2.2.6
原来是 2.3.2,现在是 2.2.6。
这是依赖兼容性导致的,不算异常。
十二、按功能给你一个简版总结
你这次安装,本质上装了:
1. vLLM 主程序
- vllm
2. 大模型推理底座
- torch / torchvision / torchaudio / triton
3. GPU 高性能依赖
- NCCL / NVSHMEM / CUDA bindings / CUTLASS / flashinfer / quack kernels
4. HuggingFace 模型生态
- transformers / tokenizers / huggingface-hub / sentencepiece / safetensors / tiktoken
5. Web 服务框架
- fastapi / uvicorn / starlette / websockets
6. 结构化输出与语法约束
- lm-format-enforcer / llguidance / xgrammar / outlines_core
7. 监控与可观测性
- prometheus / opentelemetry 全家桶
8. 多模态支持
- opencv-python-headless / torchvision / mistral_common
十三、占空间最大的几个包
从日志看,比较大的有:
- torch-2.10.0:915.6 MB
- vllm-0.18.0:433.2 MB
- nvidia-nccl-cu12:322.3 MB
- triton-3.6.0:188.3 MB
- nvidia-nvshmem-cu12:139.1 MB
- nvidia-cutlass-dsl-libs-base:74.4 MB
- opencv-python-headless:60.4 MB
- llvmlite:42.4 MB
- xgrammar:37.7 MB
所以这次安装是很重的一次安装,不是一个轻量包。
十四、最终结论
一句话总结:
你安装的不只是 vLLM 本身,而是一整套“LLM 推理服务运行环境”,包括:
- vLLM 本体
- PyTorch 2.10 + Triton
- CUDA/NVIDIA 高性能推理组件
- HuggingFace 模型加载生态
- FastAPI/uvicorn 服务框架
- OpenAI 兼容 API 支持
- 结构化输出/语法约束生成
- 监控和可观测性组件
- 一部分多模态支持
同时它还替换了你环境中原有的 torch、torchvision、triton、numpy、protobuf、huggingface-hub 等包。
如果你愿意,我还可以进一步帮你做两件事里的任意一种:
- 把这份日志整理成“安装包清单表格”(包名 / 版本 / 作用)
- 帮你判断这个 vllm 环境是否和你当前 CUDA、显卡、驱动兼容