Clelia (Astra) Bertelli

as-cle-bert

AI & ML interests

Biology + Artificial Intelligence = ❤️ | AI for sustainable development, sustainable development for AI | Researching on Machine Learning Enhancement | I love automation for everyday things | Blogger | Open Source

Organizations

Social Post Explorers, Hugging Face Discord Community, GreenFit AI

as-cle-bert's activity

upvoted an article 11 days ago
posted an update 11 days ago
Hi HuggingFace community!🤗

I just published an article in which I try to articulate some counterpoints to Dario Amodei's post "On DeepSeek and Export Controls" 👉 https://huggingface.co/blog/as-cle-bert/why-we-dont-need-export-control

I address several key passages from the third section of Amodei's post (https://darioamodei.com/on-deepseek-and-export-controls), bringing my perspective on the importance of open source, open knowledge and multipolarity in a field as crucial for our future as Artificial Intelligence.

Happy reading!✨
published an article 11 days ago
replied to their post 22 days ago

Hi!

I generally use LangChain + PyPDF. Here is a code snippet:

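from langchain_community.document_loaders import PyPDFLoader

def preprocess(pdf: str) -> list[str]:
    """
    Uses LangChain's PyPDFLoader to extract the text of each page.
    """
    loader = PyPDFLoader(pdf)  # needs the pypdf package installed
    documents = loader.load()  # one Document per PDF page
    # Return the page texts so the declared list return type holds
    return [doc.page_content for doc in documents]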

This should give a more solid result :)
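If you want a single cleaned-up string instead of one text per page, something like the following could work as a follow-up step. This is only a minimal sketch: `clean_page` and `join_pages` are hypothetical helper names of mine (they are not part of LangChain or PyPDF), and the sample list just stands in for the `page_content` strings that the loader produces.

```python
import re
from typing import Iterable


def clean_page(text: str) -> str:
    """Collapse runs of whitespace (including line breaks) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def join_pages(pages: Iterable[str]) -> str:
    """Clean each page and join the non-empty ones with a blank line."""
    cleaned = (clean_page(page) for page in pages)
    return "\n\n".join(page for page in cleaned if page)


# Stand-in for the page texts extracted from a PDF (assumption: a list of str).
sample_pages = ["First   page\nof text. ", "   ", "Second page."]
full_text = join_pages(sample_pages)
print(full_text)
```

Collapsing whitespace this way also removes intentional line breaks (for example in tables or verse), so it is a trade-off that suits plain running text best.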

PS: LangChain is distributed under the MIT license, see their GitHub (https://github.com/langchain-ai/langchain)