Summarized by Aili

How I got into deep learning

🌈 Abstract

The article traces the author's path into deep learning, from initially feeling intimidated by the field's math-heavy reputation to becoming proficient and building several open-source deep learning libraries. It covers the author's background, the skills that transferred, and their approach to learning through books and by implementing research papers, along with their experience fine-tuning models and finding worthwhile problems to solve. It also touches on the author's decision to open-source their work and how that led to a job opportunity at a research lab.

🙋 Q&A

[01] My Background and Approach to Learning Deep Learning

1. What was the author's initial perception of deep learning, and how did that change over time?

  • The author initially convinced themselves that deep learning was too complicated for them, as they had learned machine learning and Python through Kaggle competitions, which left them with gaps in the fundamentals.
  • However, the author later proved this perception to be false and was able to learn deep learning effectively.

2. What were the key skills the author already had that were useful for learning deep learning?

  • Strong Python programming ability
  • Data cleaning skills (as data cleaning makes up over 70% of the author's work)
  • Pragmatism in judging when to go down a rabbit hole and when to take the faster, easier solution

3. How did the author approach learning deep learning this time, and what resources did they use?

  • The author decided to learn deep learning in a bottom-up, fundamentals-first approach.
  • They read "The Deep Learning Book" (Goodfellow, Bengio, and Courville) slowly, looking up unfamiliar terminology and math concepts as they went, and complemented it with resources like "Math for Machine Learning".
  • The author also found that teaching the skills they were learning helped solidify the concepts in their head.

[02] Implementing Research Papers and Fine-Tuning Models

1. What did the author do after reading "The Deep Learning Book"?

  • The author read some of the foundational deep learning papers from 2015-2022 and implemented them in PyTorch, using tools like Google Colab and Weights and Biases.
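A paper re-implementation typically grows out of a training loop like the one below. This is a generic sketch (a tiny MLP on synthetic data), not any specific paper the author implemented; the commented-out `wandb.log` line marks where Weights & Biases logging would hook in.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny synthetic regression task standing in for a paper's dataset.
X = torch.randn(256, 10)
y = X @ torch.randn(10, 1) + 0.1 * torch.randn(256, 1)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

first_loss = last_loss = None
for step in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    if first_loss is None:
        first_loss = loss.item()
    last_loss = loss.item()
    # wandb.log({"loss": last_loss})  # Weights & Biases logging would go here

print(f"loss: {first_loss:.4f} -> {last_loss:.4f}")
```

The same skeleton runs unchanged in a Google Colab notebook, which is why it is a common starting point for re-implementing papers.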

2. How did the author approach fine-tuning models?

  • The author found that fine-tuning pre-trained models, such as those available through the Hugging Face Transformers library, was a good entry point for training models.
  • They joined Discord communities like Nous Research and EleutherAI to stay up-to-date on the latest models and papers, and tried fine-tuning smaller models (7B or fewer parameters) using techniques like LoRA.
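LoRA, mentioned above, freezes the pretrained weights and trains only a small low-rank update. A minimal sketch of the idea in plain PyTorch follows; in practice one would use a library such as Hugging Face PEFT on a real pretrained model, and the layer sizes and hyperparameters here are illustrative only.

```python
import torch
from torch import nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (B @ A)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # Low-rank factors: A is small random, B starts at zero so the
        # layer initially behaves exactly like the frozen base layer.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable} / {total}")
```

The point of the technique is visible in the parameter counts: only the two small factor matrices train, which is what makes fine-tuning 7B-parameter models feasible on modest hardware.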

[03] Problem Discovery and Open-Sourcing

1. What problems did the author try to solve, and how did they approach solving them?

  • The author realized that high-quality data was often locked away in PDFs, so they tried to generate synthetic data and extract data from PDFs to create good training data.
  • This led the author to develop several models, such as an equation-to-LaTeX model, a text detection model, an OCR model, and a layout model, by modifying existing architectures and generating or finding the right datasets.
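To illustrate the synthetic-data side, a toy generator for an equation-to-LaTeX dataset might fill templates like this. This is a hypothetical sketch, not the author's pipeline; in a real pipeline each LaTeX string would be paired with a rendered image of the equation as the model input.

```python
import random

random.seed(0)

# Toy LaTeX templates; doubled braces are literal braces in str.format.
TEMPLATES = [
    r"\frac{{{a}}}{{{b}}}",
    r"{a}^{{{b}}} + {c}",
    r"\sqrt{{{a} + {b}}}",
]

def make_latex() -> str:
    """Generate one synthetic LaTeX target string."""
    tpl = random.choice(TEMPLATES)
    return tpl.format(
        a=random.randint(1, 9),
        b=random.randint(1, 9),
        c=random.randint(1, 9),
    )

dataset = [make_latex() for _ in range(5)]
print(dataset)
```

Scaling this idea up (richer templates, real rendering, augmentation) is one way to get training data for tasks where labeled examples are otherwise locked inside PDFs.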

2. Why did the author choose to open-source their work, and what were the benefits?

  • The author believes that the data stack is an underinvested area of AI, and that widely distributing high-quality training data can help prevent a monopoly by a few organizations.
  • Open-sourcing their work also led to the author getting exposure, which ultimately resulted in a job opportunity at a research lab.
Shared by Daniel Chen
© 2024 NewMotor Inc.