|
[I1021 17:32:10.125199301 debug.cpp:49] [c10d] The debug level is set to INFO. |
|
/opt/tiger/sparse_llm/lib/python3.9/site-packages/vllm/connections.py:8: RuntimeWarning: Failed to read commit hash: |
|
No module named 'vllm._version' |
|
from vllm.version import __version__ as VLLM_VERSION |
|
llama3.1 |
|
***************************** |
|
Namespace(cot_trigger_no=1, dataset='bd_math', data_path='bd_math_test.json', batch_size=64, eval_method='', model_path='../../Llama-3.1-8B/', model_type='llama3.1', output_dir='generate_result/zero_shot/bd_math/generation/llama3.1/1/', lora_path='', method='zero_shot', data_question_key='question', data_answer_key='answer', sample_num=1, cuda_ind=6, tensor_parallel=1, cuda_start=0, cuda_num=8, load_in_8bit=False, rewrite=False, use_typewriter=0, temperature=0.7, top_p=1, iter_max_new_tokens=512, init_max_new_tokens=2048, min_new_tokens=1, correct_response_format='The correct response is:', cot_trigger="Let's think step by step.") |
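The Namespace dump above records the generation settings for this run (zero-shot CoT on bd_math with Llama-3.1-8B, batch size 64, one sample per question at temperature 0.7). A minimal sketch of how these settings would typically be handed to vLLM's offline API is shown below; the wrapper script itself is not part of this log, so everything here (including the prompt placeholder) is illustrative rather than the script's actual code.

```python
# Minimal sketch (assumed, not the script's actual code): mapping the
# Namespace arguments above onto vLLM's offline-inference API.
from vllm import LLM, SamplingParams

llm = LLM(
    model="../../Llama-3.1-8B/",   # args.model_path
    tensor_parallel_size=1,        # args.tensor_parallel
    trust_remote_code=True,
    dtype="bfloat16",
    seed=0,
)

sampling_params = SamplingParams(
    n=1,               # args.sample_num
    temperature=0.7,   # args.temperature
    top_p=1.0,         # args.top_p
    max_tokens=2048,   # args.init_max_new_tokens
    min_tokens=1,      # args.min_new_tokens
)

# Each prompt is a few-shot block like the one printed later in this log,
# ending with the CoT trigger "Let's think step by step."
prompts = ["Question: If 4 daps = 7 yaps, and 5 yaps = 3 baps, "
           "how many daps equal 42 baps?\nLet's think step by step."]
outputs = llm.generate(prompts, sampling_params)
```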
|
***************************** |
|
WARNING 10-21 17:32:19 arg_utils.py:953] Chunked prefill is enabled by default for models with max_model_len > 32K. Currently, chunked prefill might not work with some features or models. If you encounter any issues, please disable chunked prefill by setting --enable-chunked-prefill=False. |
|
INFO 10-21 17:32:19 config.py:1005] Chunked prefill is enabled with max_num_batched_tokens=512. |
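As the warning above explains, chunked prefill is switched on automatically here because max_model_len (131072) exceeds 32K, with max_num_batched_tokens capped at 512. If it causes problems, it can be disabled explicitly; a short sketch follows, assuming the Python kwarg mirrors the --enable-chunked-prefill CLI flag named in the warning for this vLLM version.

```python
# Sketch: turning off automatic chunked prefill (kwarg assumed to match
# the --enable-chunked-prefill=False flag named in the warning above).
from vllm import LLM

llm = LLM(
    model="../../Llama-3.1-8B/",
    enable_chunked_prefill=False,
)
```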
|
INFO 10-21 17:32:19 llm_engine.py:237] Initializing an LLM engine (vdev) with config: model='../../Llama-3.1-8B/', speculative_config=None, tokenizer='../../Llama-3.1-8B/', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=131072, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=../../Llama-3.1-8B/, use_v2_block_manager=True, num_scheduler_steps=1, chunked_prefill_enabled=True multi_step_stream_outputs=True, enable_prefix_caching=False, use_async_output_proc=True, use_cached_outputs=False, mm_processor_kwargs=None) |
|
[I1021 17:32:33.074960847 TCPStore.cpp:312] [c10d - debug] The server has started on port = 46227. |
|
[I1021 17:32:33.074978008 TCPStoreLibUvBackend.cpp:1067] [c10d - debug] Uv main loop running |
|
[I1021 17:32:33.076109583 socket.cpp:720] [c10d - debug] The client socket will attempt to connect to an IPv6 address of (10.117.192.77, 46227). |
|
[I1021 17:32:33.076215237 socket.cpp:884] [c10d] The client socket has connected to [n117-192-077.byted.org]:46227 on [n117-192-077.byted.org]:61802. |
|
[I1021 17:32:33.079088203 TCPStore.cpp:350] [c10d - debug] TCP client connected to host 10.117.192.77:46227 |
|
[W1021 17:32:33.079563703 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator()) |
|
[I1021 17:32:33.079628503 ProcessGroupNCCL.cpp:852] [PG 0 Rank 0] ProcessGroupNCCL initialization options: size: 1, global rank: 0, TIMEOUT(ms): 600000, USE_HIGH_PRIORITY_STREAM: 0, SPLIT_FROM: 0, SPLIT_COLOR: 0, PG Name: 0 |
|
[I1021 17:32:33.079635879 ProcessGroupNCCL.cpp:861] [PG 0 Rank 0] ProcessGroupNCCL environments: NCCL version: 2.20.5, TORCH_NCCL_ASYNC_ERROR_HANDLING: 3, TORCH_NCCL_DUMP_ON_TIMEOUT: 0, TORCH_NCCL_WAIT_TIMEOUT_DUMP_MILSEC: 60000, TORCH_NCCL_DESYNC_DEBUG: 0, TORCH_NCCL_ENABLE_TIMING: 0, TORCH_NCCL_BLOCKING_WAIT: 0, TORCH_DISTRIBUTED_DEBUG: INFO, TORCH_NCCL_USE_TENSOR_REGISTER_ALLOCATOR_HOOK: 0, TORCH_NCCL_ENABLE_MONITORING: 1, TORCH_NCCL_HEARTBEAT_TIMEOUT_SEC: 600, TORCH_NCCL_TRACE_BUFFER_SIZE: 0, TORCH_NCCL_COORD_CHECK_MILSEC: 1000, TORCH_NCCL_NAN_CHECK: 0 |
|
[rank0]:[I1021 17:32:33.080124838 ProcessGroupNCCL.cpp:852] [PG 1 Rank 0] ProcessGroupNCCL initialization options: size: 1, global rank: 0, TIMEOUT(ms): 600000, USE_HIGH_PRIORITY_STREAM: 0, SPLIT_FROM: 0xac4d000, SPLIT_COLOR: 3389850942126204093, PG Name: 1 |
|
[rank0]:[I1021 17:32:33.080142722 ProcessGroupNCCL.cpp:861] [PG 1 Rank 0] ProcessGroupNCCL environments: NCCL version: 2.20.5, TORCH_NCCL_ASYNC_ERROR_HANDLING: 3, TORCH_NCCL_DUMP_ON_TIMEOUT: 0, TORCH_NCCL_WAIT_TIMEOUT_DUMP_MILSEC: 60000, TORCH_NCCL_DESYNC_DEBUG: 0, TORCH_NCCL_ENABLE_TIMING: 0, TORCH_NCCL_BLOCKING_WAIT: 0, TORCH_DISTRIBUTED_DEBUG: INFO, TORCH_NCCL_USE_TENSOR_REGISTER_ALLOCATOR_HOOK: 0, TORCH_NCCL_ENABLE_MONITORING: 1, TORCH_NCCL_HEARTBEAT_TIMEOUT_SEC: 600, TORCH_NCCL_TRACE_BUFFER_SIZE: 0, TORCH_NCCL_COORD_CHECK_MILSEC: 1000, TORCH_NCCL_NAN_CHECK: 0 |
|
[rank0]:[I1021 17:32:33.092970648 ProcessGroupNCCL.cpp:852] [PG 3 Rank 0] ProcessGroupNCCL initialization options: size: 1, global rank: 0, TIMEOUT(ms): 600000, USE_HIGH_PRIORITY_STREAM: 0, SPLIT_FROM: 0xac4d000, SPLIT_COLOR: 3389850942126204093, PG Name: 3 |
|
[rank0]:[I1021 17:32:33.092991704 ProcessGroupNCCL.cpp:861] [PG 3 Rank 0] ProcessGroupNCCL environments: NCCL version: 2.20.5, TORCH_NCCL_ASYNC_ERROR_HANDLING: 3, TORCH_NCCL_DUMP_ON_TIMEOUT: 0, TORCH_NCCL_WAIT_TIMEOUT_DUMP_MILSEC: 60000, TORCH_NCCL_DESYNC_DEBUG: 0, TORCH_NCCL_ENABLE_TIMING: 0, TORCH_NCCL_BLOCKING_WAIT: 0, TORCH_DISTRIBUTED_DEBUG: INFO, TORCH_NCCL_USE_TENSOR_REGISTER_ALLOCATOR_HOOK: 0, TORCH_NCCL_ENABLE_MONITORING: 1, TORCH_NCCL_HEARTBEAT_TIMEOUT_SEC: 600, TORCH_NCCL_TRACE_BUFFER_SIZE: 0, TORCH_NCCL_COORD_CHECK_MILSEC: 1000, TORCH_NCCL_NAN_CHECK: 0 |
|
[rank0]:[I1021 17:32:33.094497493 ProcessGroupNCCL.cpp:852] [PG 5 Rank 0] ProcessGroupNCCL initialization options: size: 1, global rank: 0, TIMEOUT(ms): 600000, USE_HIGH_PRIORITY_STREAM: 0, SPLIT_FROM: 0xac4d000, SPLIT_COLOR: 3389850942126204093, PG Name: 5 |
|
[rank0]:[I1021 17:32:33.094515803 ProcessGroupNCCL.cpp:861] [PG 5 Rank 0] ProcessGroupNCCL environments: NCCL version: 2.20.5, TORCH_NCCL_ASYNC_ERROR_HANDLING: 3, TORCH_NCCL_DUMP_ON_TIMEOUT: 0, TORCH_NCCL_WAIT_TIMEOUT_DUMP_MILSEC: 60000, TORCH_NCCL_DESYNC_DEBUG: 0, TORCH_NCCL_ENABLE_TIMING: 0, TORCH_NCCL_BLOCKING_WAIT: 0, TORCH_DISTRIBUTED_DEBUG: INFO, TORCH_NCCL_USE_TENSOR_REGISTER_ALLOCATOR_HOOK: 0, TORCH_NCCL_ENABLE_MONITORING: 1, TORCH_NCCL_HEARTBEAT_TIMEOUT_SEC: 600, TORCH_NCCL_TRACE_BUFFER_SIZE: 0, TORCH_NCCL_COORD_CHECK_MILSEC: 1000, TORCH_NCCL_NAN_CHECK: 0 |
|
INFO 10-21 17:32:33 model_runner.py:1060] Starting to load model ../../Llama-3.1-8B/... |
|
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s] |
|
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:01<00:03, 1.22s/it] |
|
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:01<00:01, 1.47it/s] |
|
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:02<00:00, 1.04it/s] |
|
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:04<00:00, 1.10s/it] |
|
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:04<00:00, 1.04s/it] |
|
|
|
INFO 10-21 17:32:38 model_runner.py:1071] Loading model weights took 14.9888 GB |
|
INFO 10-21 17:32:38 gpu_executor.py:122] # GPU blocks: 30099, # CPU blocks: 2048 |
|
INFO 10-21 17:32:38 gpu_executor.py:126] Maximum concurrency for 131072 tokens per request: 3.67x |
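The 3.67x figure is straightforward KV-cache arithmetic: assuming vLLM's default block size of 16 tokens per block (not printed in this log), 30099 GPU blocks hold 30099 × 16 = 481,584 tokens of KV cache, and 481,584 / 131,072 ≈ 3.67 maximum-length (131072-token) requests can be resident at once.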
|
INFO 10-21 17:32:42 model_runner.py:1402] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI. |
|
INFO 10-21 17:32:42 model_runner.py:1406] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage. |
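The two messages above describe the trade-off behind CUDA graph capture: roughly 1-3 GiB of extra GPU memory in exchange for lower kernel-launch overhead. A minimal sketch of the memory-saving knobs they mention follows; the values chosen here are only examples.

```python
# Sketch: the memory-saving options mentioned in the log messages above.
from vllm import LLM

llm = LLM(
    model="../../Llama-3.1-8B/",
    enforce_eager=True,           # skip CUDA graph capture entirely
    gpu_memory_utilization=0.85,  # below the 0.9 default, leaves more headroom
    max_num_seqs=128,             # fewer concurrent sequences, smaller footprint
)
```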
|
INFO 10-21 17:32:53 model_runner.py:1530] Graph capturing finished in 11 secs. |
|
../../Llama-3.1-8B/ |
|
load data |
|
Sampled Question: |
|
Question: Find the domain of the expression $\frac{\sqrt{x-2}}{\sqrt{5-x}}$. |
|
Let's think step by step. The expressions inside each square root must be non-negative. Therefore, $x-2 \ge 0$, so $x\ge2$, and $5 - x \ge 0$, so $x \le 5$. Also, the denominator cannot be equal to zero, so $5-x>0$, which gives $x<5$. Therefore, the domain of the expression is $[2,5)$. Final Answer: The answer is $[2,5)$. I hope it is correct. |
|
|
|
Question: If $\det \mathbf{A} = 2$ and $\det \mathbf{B} = 12,$ then find $\det (\mathbf{A} \mathbf{B}).$ |
|
Let's think step by step. We have that $\det (\mathbf{A} \mathbf{B}) = (\det \mathbf{A})(\det \mathbf{B}) = (2)(12) = 24.$ Final Answer: The answer is $24$. I hope it is correct. |
|
|
|
Question: Terrell usually lifts two 20-pound weights 12 times. If he uses two 15-pound weights instead, how many times must Terrell lift them in order to lift the same total weight? |
|
Let's think step by step. If Terrell lifts two 20-pound weights 12 times, he lifts a total of $2\cdot 12\cdot20=480$ pounds of weight. If he lifts two 15-pound weights instead for $n$ times, he will lift a total of $2\cdot15\cdot n=30n$ pounds of weight. Equating this to 480 pounds, we can solve for $n$:\begin{align*} |
|
30n&=480\\ |
|
\Rightarrow\qquad n&=480/30=16 |
|
\end{align*} |
|
Final Answer: The answer is $16$. I hope it is correct. |
|
|
|
Question: If the system of equations |
|
|
|
\begin{align*} |
|
6x-4y&=a,\\ |
|
6y-9x &=b. |
|
\end{align*} |
|
has a solution $(x, y)$ where $x$ and $y$ are both nonzero, find $\frac{a}{b},$ assuming $b$ is nonzero. |
|
Let's think step by step. If we multiply the first equation by $-\frac{3}{2}$, we obtain $$6y-9x=-\frac{3}{2}a.$$Since we also know that $6y-9x=b$, we have |
|
$$-\frac{3}{2}a=b\Rightarrow\frac{a}{b}=-\frac{2}{3}.$$ |
|
Final Answer: The answer is $-\frac{2}{3}$. I hope it is correct. |
|
|
|
Question: If 4 daps = 7 yaps, and 5 yaps = 3 baps, how many daps equal 42 baps? |
|
Let's think step by step. |
|
filtered 2688 already |
|
  0%|          | 0/5 [00:00<?, ?it/s]
 20%|██        | 1/5 [00:32<02:08, 32.20s/it]
 40%|████      | 2/5 [01:05<01:37, 32.65s/it]
 60%|██████    | 3/5 [01:38<01:05, 32.83s/it]
 80%|████████  | 4/5 [02:10<00:32, 32.74s/it]
100%|██████████| 5/5 [02:35<00:00, 29.71s/it]
100%|██████████| 5/5 [02:35<00:00, 31.03s/it] |
|
|