OPTForCausalLM(
  (model): OPTModel(
    (decoder): OPTDecoder(
      (embed_tokens): Embedding(50272, 512, padding_idx=1)
      (embed_positions): OPTLearnedPositionalEmbedding(2050, 1024)
      (project_out): Linear(in_features=1024, out_features=512, bias=False)
      (project_in): Linear(in_features=512, out_features=1024, bias=False)
      (layers): ModuleList(
        (0-23): 24 x OPTDecoderLayer(
          (self_attn): OPTAttention(
            (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (activation_fn): ReLU()
          (self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (fc1): Linear(in_features=1024, out_features=4096, bias=True)
          (fc2): Linear(in_features=4096, out_features=1024, bias=True)
          (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
    )
  )
  (lm_head): Linear(in_features=512, out_features=50272, bias=False)
)
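A printout like the one above can be reproduced with Hugging Face Transformers. The sketch below assumes the checkpoint is `facebook/opt-350m`, whose 512-d word embeddings, 1024-d hidden states, and 24 decoder layers match the dimensions shown; substitute whichever OPT checkpoint is actually in use.

```python
# Minimal sketch: load an OPT checkpoint and print its module tree.
# Assumption: the architecture above corresponds to facebook/opt-350m
# (inferred from the 512-d embeddings, 1024-d hidden size, 24 layers).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Prints the nested module description shown above: project_in / project_out
# map between the 512-d embedding space and the 1024-d decoder stack, and
# lm_head maps the projected 512-d hidden state back to the 50272-token vocab.
print(model)

# Quick size check on the loaded model.
num_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {num_params / 1e6:.1f}M")
```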