	---
	library_name: transformers
	tags:
	- unsloth
	- trl
	- sft
	license: mit
	datasets:
	- iamtarun/python_code_instructions_18k_alpaca
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
	---

	# Model Card for Fine-tuned DeepSeek-R1 Code Review Model

	## Model Details / 모델 세부 정보

	### Model Description / 모델 설명

	English:
	This model is a fine-tuned version of DeepSeek-R1-Distill-Llama-8B, adapted for reviewing Python code in Korean. It was fine-tuned with QLoRA on the [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) dataset after that dataset's code generation prompts were converted into code review prompts. The LoRA adapters have been merged into the base model, so the result is a self-contained model that can be deployed directly.

	한국어:
	이 모델은 DeepSeek-R1 Distill Llama 모델을 기반으로, 한국어 코드 리뷰 작업에 맞게 파인튜닝된 모델입니다. [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) 데이터셋의 코드 생성 프롬프트를 코드 리뷰 프롬프트로 변환하여 QLoRA 기법을 사용해 파인튜닝하였으며, LoRA 어댑터를 베이스 모델에 병합해 self-contained 형태로 제작되었습니다.

	- Developed by / 개발자: [More Information Needed / 추가 정보 필요]
	- Model type / 모델 유형: Causal language model fine-tuned for code review
	- Language(s) / 사용 언어: Korean, English
	- License / 라이센스: MIT
	- Base Model / 베이스 모델: [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)

	### Model Sources / 모델 소스

	- Repository / 리포지토리: [More Information Needed / 추가 정보 필요]
	- Paper / 논문 (optional): [More Information Needed / 추가 정보 필요]
	- Demo / 데모 (optional): [More Information Needed / 추가 정보 필요]

	---

	## Uses / 사용 용도

	### Direct Use / 직접 사용

	English:
	This model is intended for generating reviews of Python code, with feedback on code quality, style, and possible improvements.
	It was developed as a prototype for programming education.

	한국어:
	이 모델은 Python 코드를 대상으로 코드 리뷰(피드백, 스타일 개선 등)를 생성하기 위해 개발되었습니다.
	프로그래밍 교육을 위한 모델의 프로토타입으로 개발되었습니다.

	### Downstream Use / 다운스트림 사용

	English:
	It can be integrated into developer tools, code analysis platforms, or educational environments to assist in code review tasks.

	한국어:
	개발자 도구, 코드 분석 플랫폼 또는 교육 환경에 통합되어 코드 리뷰 작업을 보조하는 용도로 활용될 수 있습니다.

	### Out-of-Scope Use / 지원하지 않는 사용 예시

	English:
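	This model is not optimized for generating code from scratch, for reviewing languages other than Python (the current training dataset contains only Python), or for use in critical production environments without human oversight.
	A Korean code review dataset and a model that also covers Go and Rust are planned for a later upload.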

	한국어:
	이 모델은 코드 생성을 위한 모델이 아니며, 현재 데이터셋 상 Python 이외의 언어에 대해서는 최적화되어 있지 않습니다.
	이후 한국어 코드 리뷰 데이터셋과 Go와 Rust를 포함하는 모델은 추후에 업로드될 예정입니다.

	---

	## Bias, Risks, and Limitations / 편향, 위험 및 한계

	English:
	- The model was trained on data that may contain inherent biases, so its automatically generated reviews may be biased as well.
	- The model may not be fully optimized for Korean-language code review.

	한국어:
	- 모델이 생성한 리뷰에도 편향이 있을 수 있습니다.
	- 한국어 코드 리뷰에 완벽하게 최적화되어있지 않을 수 있습니다.

	## How to Get Started with the Model / 모델 시작하기

	```python
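	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Placeholder repository ID; replace it with the actual model path.
	model_name = "your_hf_username/merged-deepseek-r1-codereview"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name)

	# Korean instruction: "Below is some Python code. Write a concise review of
	# about 3-4 lines covering its pros and cons, improvements, and code style."
	prompt = """아래는 작성된 Python 코드입니다.
	코드의 장단점, 개선 사항, 코드 스타일 등에 대해 3~4줄 정도의 간결한 리뷰를 작성하세요.

	# 코드:
	### Python 코드
	def sum_sequence(sequence):
	    sum = 0
	    for num in sequence:
	        sum += num
	    return sum

	### 코드 리뷰:"""

	inputs = tokenizer(prompt, return_tensors="pt")
	# max_new_tokens caps only the generated review; max_length would also count the prompt.
	outputs = model.generate(**inputs, max_new_tokens=300)
	# The decoded text contains the prompt followed by the generated review.
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))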
	```
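
Because the decoded output contains the prompt as well as the generated text, it can be convenient to keep only the part after the review marker. The following is a minimal sketch, assuming the prompt ends with the `### 코드 리뷰:` marker used above; `extract_review` and the sample string are hypothetical names and data introduced here for illustration, not part of the model or of `transformers`.

```python
REVIEW_MARKER = "### 코드 리뷰:"


def extract_review(decoded: str, marker: str = REVIEW_MARKER) -> str:
    """Return the text after the first review marker.

    If the marker is missing, the whole text is returned unchanged
    (apart from surrounding whitespace).
    """
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


# Illustrative decoded output (made up for this example, not real model output).
decoded = "### Python 코드\ndef f(): pass\n\n### 코드 리뷰:\n함수 이름이 모호합니다."
review = extract_review(decoded)
print(review)
```

Splitting on the first marker keeps everything the model wrote after the prompt, even if the generated text happens to repeat the marker later.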

	## Training Details / 학습 정보

	### Training Data / 학습 데이터

	English:
	The model was fine-tuned using the iamtarun/python_code_instructions_18k_alpaca dataset. The original code generation prompts were transformed into code review prompts to suit the task.

	한국어:
	이 모델은 iamtarun/python_code_instructions_18k_alpaca 데이터셋을 사용해 파인튜닝되었습니다. 기존의 코드 생성 프롬프트를 코드 리뷰 프롬프트로 변환하여 학습에 사용하였습니다.
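
The prompt transformation described above could look roughly like the sketch below. It assumes each dataset record is an Alpaca-style dictionary with the Python solution stored under an `output` key, and it reuses the prompt template from the getting-started example. The helper name `to_review_prompt` and the sample record are illustrative assumptions. This card does not describe how the target review texts were produced, so the sketch covers only the prompt side.

```python
REVIEW_INSTRUCTION = (
    "아래는 작성된 Python 코드입니다.\n"
    "코드의 장단점, 개선 사항, 코드 스타일 등에 대해 3~4줄 정도의 간결한 리뷰를 작성하세요."
)


def to_review_prompt(record: dict) -> str:
    """Build a code review prompt from one Alpaca-style record.

    Assumption: the Python code to review is stored under record["output"].
    """
    code = record["output"].strip()
    return (
        f"{REVIEW_INSTRUCTION}\n\n"
        "# 코드:\n"
        "### Python 코드\n"
        f"{code}\n\n"
        "### 코드 리뷰:"
    )


# Illustrative record in the assumed Alpaca layout.
example = {
    "instruction": "Create a function to calculate the sum of a sequence of integers.",
    "input": "[1, 2, 3, 4, 5]",
    "output": (
        "def sum_sequence(sequence):\n"
        "    sum = 0\n"
        "    for num in sequence:\n"
        "        sum += num\n"
        "    return sum\n"
    ),
}
prompt = to_review_prompt(example)
print(prompt)
```

With the `datasets` library, a function like this could be applied to every record through `Dataset.map`; the original generation instruction is dropped because the review prompt only needs the code.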

	### Training Procedure / 학습 절차

	English:

	- Preprocessing: The dataset was preprocessed to convert the code generation prompts into a standardized code review format.
	- Fine-tuning: The base model was fine-tuned using QLoRA with 4-bit quantization for efficiency. The LoRA adapters were then merged into the base model to produce a self-contained model.

	한국어:

	- 전처리: 코드 생성 프롬프트를 코드 리뷰 형식으로 변환하기 위해 데이터셋을 전처리하였습니다.
	- 파인튜닝: 효율성을 위해 4비트 양자화를 사용하여 QLoRA 기법으로 베이스 모델을 파인튜닝하였으며, LoRA 어댑터를 병합하여 self-contained 모델로 제작하였습니다.
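
To make the merge step concrete, here is a small numerical sketch of what folding a LoRA adapter into a base weight means, using the common LoRA convention W' = W + (alpha / r) · B A. The matrices, `alpha`, `rank`, and the helper names `matmul` and `merge_lora` are toy values and hypothetical names chosen for illustration. This card does not state which tooling performed the actual merge; in practice a library routine such as PEFT's `merge_and_unload()` is typically used for this.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold a LoRA update into a base weight: W + (alpha / rank) * (B @ A)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy values for illustration only: a rank-1 adapter on a 2x2 weight.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (rank, in_features)
lora_b = [[0.5], [-1.0]]   # shape (out_features, rank)
alpha, rank = 2.0, 1

merged = merge_lora(weight, lora_a, lora_b, alpha, rank)

# The merged weight alone reproduces "base output + scaled adapter output".
x = [[3.0], [1.0]]
base_out = matmul(weight, x)
adapter_out = matmul(lora_b, matmul(lora_a, x))
separate = [[b[0] + (alpha / rank) * a[0]] for b, a in zip(base_out, adapter_out)]
assert matmul(merged, x) == separate

print(merged)  # [[2.0, 2.0], [-2.0, -3.0]]
```

After merging, inference needs only the single weight matrix, which is why the merged model can be loaded without a separate adapter.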