ClaudeBot/1.0, coming from IP 216.73.216.160, saturated my servers

So I blocked that IP: 216.73.216.160. In fact, I blocked the whole 216.73.216.0/24 and 3.141.17.0/24 ranges.
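
A minimal sketch, using Python's standard `ipaddress` module, of how to check whether a given client address falls inside the two blocked ranges. The helper name `is_blocked` is hypothetical, and the actual blocking is done elsewhere (firewall or web server); this only illustrates what a /24 covers.

```python
import ipaddress

# The two ranges named in the post.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("216.73.216.0/24"),
    ipaddress.ip_network("3.141.17.0/24"),
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside one of the blocked /24 ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("216.73.216.160", "216.73.216.78", "192.168.1.45"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

A /24 covers the 256 addresses that share the first three octets, which is why one rule catches every 216.73.216.x client.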

The traffic is visible at: https://webanalyse.cyber-neurones.org/

Rank  Hits (share)       Visitors (share)  Bandwidth (share)    IP
1     119 354 (27.47%)   9 (00.09%)        1.7 GiB (16.20%)     216.73.216.160
5     26 720 (06.15%)    4 (00.04%)        386.9 MiB (03.53%)   216.73.216.136
7     11 672 (02.69%)    1 (00.01%)        175.9 MiB (01.60%)   216.73.216.114
8     9 434 (02.17%)     1 (00.01%)        137.8 MiB (01.26%)   216.73.216.207
9     5 867 (01.35%)     1 (00.01%)        91.1 MiB (00.83%)    216.73.216.78
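
To get a feel for the combined weight of these five addresses, here is a small sketch that adds up their hits and their share of all hits. The numbers are copied from the listing above (spaces are thousands separators there); the function name `summarize` is hypothetical.

```python
# (ip, hits, share of all hits in %) copied from the listing in the post.
ROWS = [
    ("216.73.216.160", 119354, 27.47),
    ("216.73.216.136", 26720, 6.15),
    ("216.73.216.114", 11672, 2.69),
    ("216.73.216.207", 9434, 2.17),
    ("216.73.216.78", 5867, 1.35),
]


def summarize(rows):
    """Return (total hits, total share in %) for the given rows."""
    total_hits = sum(hits for _, hits, _ in rows)
    total_share = round(sum(share for _, _, share in rows), 2)
    return total_hits, total_share


if __name__ == "__main__":
    hits, share = summarize(ROWS)
    print(f"{hits} hits, {share}% of all hits")
```

Together the five addresses account for 173 047 hits, or 39.83% of all hits in the report, all from the same /24.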

Good grief.

Proxmox/Ollama: llm_benchmark (Test no. 2)

In passing

I swapped the NVIDIA card for two NVIDIA cards with 8 GB each; both are seen by the VM launched by Proxmox:

# nvidia-smi --list-gpus
GPU 0: Quadro M5000 (UUID: GPU-)
GPU 1: Quadro M4000 (UUID: GPU-)
# nvidia-smi      
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro M5000                   Off |   00000000:00:10.0 Off |                  Off |
| 38%   37C    P8             13W /  150W |       5MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro M4000                   Off |   00000000:00:11.0 Off |                  N/A |
| 46%   39C    P8             13W /  120W |       5MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

The test results are as follows:

# llm_benchmark run
-------Linux----------
{'id': '0', 'name': 'Quadro M5000', 'driver': '570.86.15', 'gpu_memory_total': '8192.0 MB', 'gpu_memory_free': '8110.0 MB', 
'gpu_memory_used': '5.0 MB', 'gpu_load': '0.0%', 'gpu_temperature': '36.0°C'}
{'id': '1', 'name': 'Quadro M4000', 'driver': '570.86.15', 'gpu_memory_total': '8192.0 MB', 'gpu_memory_free': '8110.0 MB', 
'gpu_memory_used': '5.0 MB', 'gpu_load': '0.0%', 'gpu_temperature': '38.0°C'}
At least two GPU cards
Total memory size : 61.36 GB
cpu_info: Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz
gpu_info: Quadro M5000
Quadro M4000
os_version: Ubuntu 22.04.5 LTS
ollama_version: 0.5.7
----------
....
-------Linux----------
{'id': '0', 'name': 'Quadro M5000', 'driver': '570.86.15', 'gpu_memory_total': '8192.0 MB', 'gpu_memory_free': '3277.0 MB', 
'gpu_memory_used': '4838.0 MB', 'gpu_load': '0.0%', 'gpu_temperature': '65.0°C'}
{'id': '1', 'name': 'Quadro M4000', 'driver': '570.86.15', 'gpu_memory_total': '8192.0 MB', 'gpu_memory_free': '2348.0 MB', 
'gpu_memory_used': '5767.0 MB', 'gpu_load': '0.0%', 'gpu_temperature': '76.0°C'}
At least two GPU cards
{
    "mistral:7b": "16.56",
    "llama3.1:8b": "15.71",
    "phi4:14b": "8.01",
    "qwen2:7b": "15.27",
    "gemma2:9b": "15.81",
    "llava:7b": "17.82",
    "llava:13b": "13.14",
    "uuid": "1a60faf0-e97b-5d47-8de5-03d3b22dfbbc",
    "ollama_version": "0.5.7"
}

I currently use "llama3.1:8b", so I went from 1.12 (unusable) to 15.71. Ideally the score would be above 32, so I will have to find two new cards.
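
To put the improvement in numbers, here is a short sketch comparing the two runs model by model. The old scores come from the first llm_benchmark run on the Quadro 4000 and the new ones from the run above; the target of 32 is the figure mentioned in the text. The function names are hypothetical.

```python
# Scores as reported by llm_benchmark for each run.
OLD = {
    "mistral:7b": 1.40, "llama3.1:8b": 1.12, "phi4:14b": 0.76,
    "qwen2:7b": 1.31, "gemma2:9b": 1.03, "llava:7b": 1.84, "llava:13b": 0.73,
}
NEW = {
    "mistral:7b": 16.56, "llama3.1:8b": 15.71, "phi4:14b": 8.01,
    "qwen2:7b": 15.27, "gemma2:9b": 15.81, "llava:7b": 17.82, "llava:13b": 13.14,
}
TARGET = 32.0  # score the post would ideally like to exceed


def speedups(old, new):
    """Return new/old ratio per model, rounded to two decimals."""
    return {model: round(new[model] / old[model], 2) for model in old if model in new}


def models_below(scores, target):
    """Return the models whose score is still under the target."""
    return sorted(model for model, score in scores.items() if score < target)


if __name__ == "__main__":
    for model, ratio in speedups(OLD, NEW).items():
        print(f"{model}: x{ratio}")
    print("still below target:", models_below(NEW, TARGET))
```

For "llama3.1:8b" the ratio is about 14, yet every model in the second run is still below 32.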

Good grief.

Proxmox/Ollama: llm_benchmark

In passing

I found an LLM benchmarking tool: llm_benchmark (installed via pip).

I am in last place at https://llm.aidatatools.com/results-linux.php, with "llama3.1:8b": "1.12".

# llm_benchmark run
-------Linux----------
{'id': '0', 'name': 'Quadro 4000', 'driver': '390.157', 'gpu_memory_total': '1985.0 MB',
'gpu_memory_free': '1984.0 MB', 'gpu_memory_used': '1.0 MB', 'gpu_load': '0.0%', 
'gpu_temperature': '60.0°C'}
Only one GPU card
Total memory size : 61.36 GB
cpu_info: Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz
gpu_info: Quadro 4000
os_version: Ubuntu 22.04.5 LTS
ollama_version: 0.5.7
----------
LLM models file path:/usr/local/lib/python3.10/dist-packages/llm_benchmark/data/benchmark_models_16gb_ram.yml
Checking and pulling the following LLM models
phi4:14b
qwen2:7b
gemma2:9b
mistral:7b
llama3.1:8b
llava:7b
llava:13b
----------
....
----------------------------------------
Sending the following data to a remote server
-------Linux----------
{'id': '0', 'name': 'Quadro 4000', 'driver': '390.157', 'gpu_memory_total': '1985.0 MB',
 'gpu_memory_free': '1984.0 MB', 'gpu_memory_used': '1.0 MB', 'gpu_load': '0.0%', 
'gpu_temperature': '61.0°C'}
Only one GPU card
-------Linux----------
{'id': '0', 'name': 'Quadro 4000', 'driver': '390.157', 'gpu_memory_total': '1985.0 MB',
 'gpu_memory_free': '1984.0 MB', 'gpu_memory_used': '1.0 MB', 'gpu_load': '0.0%',
 'gpu_temperature': '61.0°C'}
Only one GPU card
{
    "mistral:7b": "1.40",
    "llama3.1:8b": "1.12",
    "phi4:14b": "0.76",
    "qwen2:7b": "1.31",
    "gemma2:9b": "1.03",
    "llava:7b": "1.84",
    "llava:13b": "0.73",
    "uuid": "",
    "ollama_version": "0.5.7"
}
----------

Proxmox: Installing Ollama as an LXC container

A quick test of installing Ollama as an LXC container via a script:

bash -c "$(wget -qLO - https://github.com/tteck/Proxmox/raw/main/ct/ollama.sh)"

We will see how it goes. For now, my NVIDIA card (or the BIOS) does not support Proxmox passthrough.

root@balkany:~# dmesg | grep -e DMAR -e IOMMU | grep "enable"
[    0.333769] DMAR: IOMMU enabled

root@balkany:~# dmesg | grep 'remapping'
[    0.821036] DMAR-IR: Enabled IRQ remapping in xapic mode
[    0.821038] x2apic: IRQ remapping doesn't support X2APIC mode

# lspci -nn | grep 'NVIDIA'
0a:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF100GL [Quadro 4000] [10de:06dd] (rev a3)
0a:00.1 Audio device [0403]: NVIDIA Corporation GF100 High Definition Audio Controller [10de:0be5] (rev a1)

# cat /etc/default/grub | grep "GRUB_CMDLINE_LINUX_DEFAULT"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt video=vesafb:off video=efifb:off initcall_blacklist=sysfb_init"
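
A minimal sketch of how to check that the IOMMU-related parameters are present in a kernel command line. The string below is the GRUB value shown above; the function names are hypothetical, and only the two parameters `intel_iommu=on` and `iommu=pt` from that line are checked.

```python
import shlex

# Value of GRUB_CMDLINE_LINUX_DEFAULT as shown in the post.
CMDLINE = (
    "quiet intel_iommu=on iommu=pt video=vesafb:off video=efifb:off "
    "initcall_blacklist=sysfb_init"
)


def parse_kernel_cmdline(value: str) -> dict:
    """Split a kernel command line into {parameter: [values]}."""
    params = {}
    for token in shlex.split(value):
        key, _, val = token.partition("=")
        params.setdefault(key, []).append(val)
    return params


def iommu_flags_present(params: dict) -> bool:
    """Return True if intel_iommu=on and iommu=pt are both set."""
    return "on" in params.get("intel_iommu", []) and "pt" in params.get("iommu", [])


if __name__ == "__main__":
    print(iommu_flags_present(parse_kernel_cmdline(CMDLINE)))
```

The same parser can be pointed at the contents of `/proc/cmdline`, which reflects what the running kernel was actually booted with, whereas the GRUB file only takes effect after the boot configuration is regenerated and the host rebooted.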

# efibootmgr -v
EFI variables are not supported on this system.

# cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# cat /etc/modprobe.d/pve-blacklist.conf | grep nvidia
blacklist nvidiafb
blacklist nvidia

So I added this:

# cat  /etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1

The NVIDIA card does sit in a single IOMMU group.
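
The listing for this check is not included in the post. As a sketch of one way to inspect the groups, the snippet below walks the sysfs layout `/sys/kernel/iommu_groups/<group>/devices/<PCI address>`. The function names are hypothetical, and the `0a:00.` prefix is the NVIDIA card's address from the `lspci` output above.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled or not a Linux host
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def groups_for(groups, pci_prefix):
    """Return the group numbers holding a device whose address contains pci_prefix."""
    return sorted(
        number
        for number, devices in groups.items()
        if any(pci_prefix in device for device in devices)
    )


if __name__ == "__main__":
    all_groups = iommu_groups()
    for number in groups_for(all_groups, "0a:00."):
        print(number, all_groups[number])
```

If the VGA function (`0a:00.0`) and its audio function (`0a:00.1`) show up under the same group number and nothing else shares it, the card can be handed to a guest as a unit.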

When I run the script, it ends with an error:

 
  ____  ____
  / __ \/ / /___ _____ ___  ____ _
 / / / / / / __ `/ __ `__ \/ __ `/
/ /_/ / / / /_/ / / / / / / /_/ /
\____/_/_/\__,_/_/ /_/ /_/\__,_/

Using Default Settings
Using Distribution: ubuntu
Using ubuntu Version: 22.04
Using Container Type: 1
Using Root Password: Automatic Login
Using Container ID: 114
Using Hostname: ollama
Using Disk Size: 24GB
Allocated Cores 4
Allocated Ram 4096
Using Bridge: vmbr0
Using Static IP Address: dhcp
Using Gateway IP Address: Default
Using Apt-Cacher IP Address: Default
Disable IPv6: No
Using Interface MTU Size: Default
Using DNS Search Domain: Host
Using DNS Server Address: Host
Using MAC Address: Default
Using VLAN Tag: Default
Enable Root SSH Access: No
Enable Verbose Mode: No
Creating a Ollama LXC using the above default settings
 ✓ Using datastore2 for Template Storage.
 ✓ Using datastore2 for Container Storage.
 ✓ Updated LXC Template List
 ✓ LXC Container 114 was successfully created.
 ✓ Started LXC Container
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
 //bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
 ✓ Set up Container OS
 ✓ Network Connected: 192.168.1.45 
 ✓ IPv4 Internet Connected
 ✗ IPv6 Internet Not Connected
 ✓ DNS Resolved github.com to 140.82.121.3
 ✓ Updated Container OS
 ✓ Installed Dependencies
 ✓ Installed Golang
 ✓ Set up Intel® Repositories
 ✓ Set Up Hardware Acceleration
 ✓ Installed Intel® oneAPI Base Toolkit
 / Installing Ollama (Patience)   
[ERROR] in line 23: exit code 0: while executing command "$@" > /dev/null 2>&1
The silent function has suppressed the error, run the script with verbose mode enabled, which will provide more detailed output.

Good grief.