--- license: apache-2.0 language: - en metrics: - accuracy library_name: transformers pipeline_tag: text-generation tags: - python - document - code - code2doc - instruction_tuned - basemodel - pytorch - docstring - documentation - text-generation-inference widget: - text: "def get_np_array_transition_probability_matrix(int_num_states, np_array_A_matrix):print(np_array_A_matrix)np_array_A_matrix += (np.full((int_num_states, int_num_states), float_eps) - (np.identity(int_num_states) * float_eps))print(np_array_A_matrix)np_array_D_matrix = np.diag(np.sum(np_array_A_matrix, axis=1))print(np_array_D_matrix)np_array_D_matrix_inv = np.linalg.inv(np_array_D_matrix)print(np_array_D_matrix_inv)np_array_P_matrix = np.dot(np_array_D_matrix_inv, np_array_A_matrix)print(np_array_P_matrix)print(np.sum(np_array_P_matrix, axis=1))return np_array_P_matrixDocument the python code above." example_title: "example" --- # pip-code-to-doc [pipableAi](https://www.linkedin.com/company/pipable.ai/about/) [colab_notebook](https://colab.research.google.com/drive/17PyMU_3QN9LROy7x-jmaema0cuLRzBvc?usp=sharing) ## What have we built? A 1.3 bn code documentation model that outperforms most models on documenting codes and making your in-house libs ready for LLM and RAG pipelines. We have also open sourced a [parsing lib](https://github.com/PipableAI/pip-library-parser) for the same, together the lib and model can turn your codebase to functional parse tree ready to be consumed by LLMs to execute complex tasks. This is a further trained version of pip-sql-1.3b. ## How we built it? We used softmax cross entropy and a modified form of policy grad along with Q loss, optimized in an EM set up. Loss behaviour in the set up mentioned above - ## License The model is open source under apache 2.0. License ## Usage ### Library use ```python !pip3 install git+https://github.com/PipableAI/pip-library-parser !pip3 install atlassian-python-api from pip_library_parser import CodeToDocGenerator from atlassian import Jira import torch torch.set_default_device("cuda") # Instantiate the CodeToDocGenerator generator = CodeToDocGenerator() # Generate docstrings for the module's functions and methods module = Jira module_name = "atlassian.Jira" docs = generator.generate_module_docs(module, module_name) print(docs) ``` ```python from pip_library_parser import CodeToDocGenerator # Instantiate the CodeToDocGenerator generator = CodeToDocGenerator() code_snippet = """ def example_function(x): return x * 2 """ docstring = generator.generate_docstring_from_pip_model(code_snippet) print("Generated Docstring:") print(docstring) ``` ### Installation ```bash pip install transformers ``` ### Prompt ```python prompt = f"""{code} Give one line description of the python code above in natural language. """ ``` ### PyTorch ```python from transformers import AutoModelForCausalLM, AutoTokenizer device = "cuda" model = AutoModelForCausalLM.from_pretrained("PipableAI/pip-code-to-doc-1.3b").to(device) tokenizer = AutoTokenizer.from_pretrained("PipableAI/pip-code-to-doc-1.3b") prompt = f""" def example_function(x): return x * 2 Give one line description of the python code above in natural language. """ inputs = tokenizer(prompt, return_tensors="pt") outputs = model.generate(**inputs, max_new_tokens=300) tokenizer.decode(outputs[0], skip_special_tokens=True).split('')[-1].split('')[0] ``` ## Examples ### prompt ```python ########################### # Generate Analytical Model ########################### ################################################## # func: get_np_array_transition_probability_matrix ################################################## def get_np_array_transition_probability_matrix(int_num_states, np_array_A_matrix): print('np_array_A_matrix:') print(np_array_A_matrix) ##################################################### # Perturb the adjacency matrix to avoid singularities ##################################################### np_array_A_matrix += (np.full((int_num_states, int_num_states), float_eps) - (np.identity(int_num_states) * float_eps)) print('np_array_A_matrix:') print(np_array_A_matrix) print('np_array_D_matrix:') np_array_D_matrix = np.diag(np.sum(np_array_A_matrix, axis=1)) print(np_array_D_matrix) print('np_array_D_matrix_inv:') np_array_D_matrix_inv = np.linalg.inv(np_array_D_matrix) print(np_array_D_matrix_inv) print('\n\n') print('np_array_P_matrix:') np_array_P_matrix = np.dot(np_array_D_matrix_inv, np_array_A_matrix) print(np_array_P_matrix) print('np.sum(np_array_P_matrix, axis=1):') print(np.sum(np_array_P_matrix, axis=1)) print('\n\n') return np_array_P_matrix ################################################## # func: get_np_array_perron_frobenius_eigen_vector ################################################## def get_np_array_perron_frobenius_matrix(int_num_states, np_array_P_matrix): np_array_perron_frobenius_matrix = np.linalg.matrix_power(np_array_P_matrix,1000) np_array_perron_frobenius_vector = np_array_perron_frobenius_matrix[0,:] print('np_array_perron_frobenius_matrix:') print(np_array_perron_frobenius_matrix) print('np.sum(np_array_perron_frobenius_matrix, axis=1):') print(np.sum(np_array_perron_frobenius_matrix, axis=1)) print('np.sum(np_array_perron_frobenius_matrix, axis=0):') print(np.sum(np_array_perron_frobenius_matrix, axis=0)) print('np.sum(np_array_perron_frobenius_matrix, axis=0)/int_num_states:') print(np.sum(np_array_perron_frobenius_matrix, axis=0)/int_num_states) print('np.dot(np_array_perron_frobenius_vector, np_array_P_matrix):') print(np.dot(np_array_perron_frobenius_vector, np_array_P_matrix)) print('np_array_perron_frobenius_vector:') print(np_array_perron_frobenius_vector) print('\n\n') return np_array_perron_frobenius_vector, np_array_perron_frobenius_matrix ############################# # func: get_np_array_Z_matrix ############################# def get_np_array_Z_matrix(int_num_states, np_array_P_matrix, np_array_perron_frobenius_matrix): np_array_Z_matrix = np.linalg.inv(np.identity(int_num_states) - np_array_P_matrix + np_array_perron_frobenius_matrix) print('np_array_Z_matrix:') print(np_array_Z_matrix) print('\n\n') return(np_array_Z_matrix) ############################# # func: get_np_array_H_matrix ############################# def get_np_array_H_matrix(int_num_states, np_array_Z_matrix, np_array_perron_frobenius_vector): np_array_H_matrix = np.zeros([int_num_states, int_num_states]) for i in range(int_num_states): for j in range(int_num_states): np_array_H_matrix[i][j] = (np_array_Z_matrix[j][j] - np_array_Z_matrix[i][j])/np_array_perron_frobenius_vector[j] print('np_array_H_matrix:') print(np_array_H_matrix) print('\n\n') return np_array_H_matrix ########### # func: run ########### def run(np_array_A_matrix): int_num_states = len(np_array_A_matrix) np_array_P_matrix = get_np_array_transition_probability_matrix(int_num_states, np_array_A_matrix) np_array_perron_frobenius_vector, np_array_perron_frobenius_matrix = get_np_array_perron_frobenius_matrix(int_num_states, np_array_P_matrix) np_array_Z_matrix = get_np_array_Z_matrix(int_num_states, np_array_P_matrix, np_array_perron_frobenius_matrix) np_array_H_matrix = get_np_array_H_matrix(int_num_states, np_array_Z_matrix, np_array_perron_frobenius_vector) return(np_array_H_matrix) Give one line description of the python code above in natural language. ``` ### Response ```txt The given python code is a function that calculates the transition probability matrix, P, for a given adjacency matrix A, and then uses these matrices to calculate the Perron-Frobenius eigenvector and its inverse matrix Z, and finally, the H matrix which is the inverse of the Z matrix. The H matrix is then returned as the output of the function. The adjacency matrix A is a square matrix where each element at position (i, j) represents the probability of transitioning from state i to state j. The function first perturbs the adjacency matrix to avoid singularities, then calculates the transition probability matrix P, the Perron-Frobenius eigenvector and its inverse matrix Z, and finally, the H matrix. The H matrix is then returned as the output of the function. ``` ### Team Avi Kothari, Gyan Ranjan, Pratham Gupta, Ritvik Aryan Kalra, Soham Acharya